Browser Agent Benchmark: Comparing LLM models for web automation (browser-use.com)
11 points by MagMueller 2 days ago | 6 comments
  • wiradikusuma 2 days ago

    Since we're on this topic, can anyone suggest a good AI-based tool for exploratory (fuzzy?) web testing?

  • pixel_popping 2 days ago

    It's lacking the best model (Opus 4.5) on the benchmark tho.

    • djohnston a day ago

      Yeah but then their own product might not score the highest.

      • pixel_popping 9 hours ago

        Exactly why I'm pointing it out, which feels a bit corrupt, but understandable.

        • djohnston 7 hours ago

          tbh i was a bit cranky yesterday - even if they are #2 on a legit benchmark that would be impressive

  • MagMueller 2 days ago

    [dead]