steel-dev/leaderboard

Latest commit

ddf7867 · Mar 5, 2025

Browser Agent Leaderboard

This repository presents the current standings of various web agents evaluated on the WebVoyager benchmark (paper). The WebVoyager benchmark comprises 643 tasks across 15 popular websites, assessing agents' abilities to perform diverse web navigation and interaction tasks.


Steel.dev - Open-source Browser API for AI Agents & Apps

Steel is an open-source browser API purpose-built for AI agents.

Leaderboard

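| Rank | Model | Organization | WebVoyager Score | Source | Open Source | New | SOTA |
|------|-------|--------------|------------------|--------|-------------|-----|------|
| 1 | Browser Use | Browser Use | 89.1% | Source | Yes | Yes | Yes |
| 2 | Operator | OpenAI | 87% | Source | No | Yes | |
| 3 | Kura | Kura | 87% | Source | No | Yes | |
| 4 | Skyvern 2.0 | Skyvern | 85.85% | Source | Yes | Yes | |
| 5 | Project Mariner | Google | 83.5% | Source | No | | |
| 6 | Proxy | Convergence AI | 82% | Source | No | | |
| 7 | Agent-E | Emergence AI | 73.1% | Source | No | | |
| 8 | Runner H 0.1 | H Company | 67% | Source | No | | |
| 9 | WILBUR | Academic Research | 60.6% | Source | No | | |
| 10 | WebVoyager | Academic Research | 59.1% | Source | Yes | | |
| 11 | Computer Use | Anthropic | 52% | Source | No | | |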

Notes:

  • Open Source: Indicates whether the agent's source code is publicly available.
  • New: Denotes recently introduced models.
  • SOTA: Signifies models that have achieved state-of-the-art performance.

Contributing

We encourage contributions to keep this leaderboard up to date. If you have information about new models or updated scores, please submit a pull request or open an issue.
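When proposing a new entry, it can help to check where it would land in the table. The following is only an illustrative sketch, not tooling that ships with this repository: the `Entry` class, the `rank` helper, and the "Example Agent" row are hypothetical, and it assumes ranks follow from sorting by WebVoyager score with tied entries keeping their existing order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    organization: str
    score: float  # WebVoyager score in percent
    open_source: bool


def rank(entries):
    """Return (rank, entry) pairs ordered by score, highest first.

    sorted() is stable even with reverse=True, so entries with equal
    scores keep the order they had in the input list.
    """
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    return list(enumerate(ordered, start=1))


# The first four rows of the current table, followed by a made-up submission.
entries = [
    Entry("Browser Use", "Browser Use", 89.1, True),
    Entry("Operator", "OpenAI", 87.0, False),
    Entry("Kura", "Kura", 87.0, False),
    Entry("Skyvern 2.0", "Skyvern", 85.85, True),
    Entry("Example Agent", "Example Org", 86.0, True),  # hypothetical entry
]

ranked = rank(entries)
for position, entry in ranked:
    print(f"{position}. {entry.model} ({entry.organization}): {entry.score}%")
```

Under these assumptions the hypothetical entry would slot in between Kura and Skyvern 2.0, while Operator and Kura, tied at 87%, would keep their current relative order.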

License

This project is licensed under the MIT License.