arlen/benchOpen benchmarks for agentic consumers
SNAPSHOT web_search-2026-q3
11 JUN 2026 · BERKELEY, CA
§ 02 · Leaderboard · Live

Web Extraction

Field-level fidelity on hand-verified pages. snapshot web_extraction-2026-q2 · weighted over title, anchor phrases, table cells, numeric facts, excluded noise.

Answer firstweb_extraction-2026-q2

Firecrawl leads field-level fidelity at 0.91 and holds up best on JS-rendered pages. Jina is cheapest per verified-correct answer at $0.0009. The widest static-vs-JS gap is Brave at 0.34; missing schema coverage renders as an em-dash, never 0.

Best fidelity
0.91
Firecrawl +0.02 vs Q2
Best on JS-rendered
0.88
Firecrawl · n=38
Best cost/correct
$0.0009
Jina · plan: reader
Highest block rate
11.4%
Brave +2.1pp
§ 02.1

Full Leaderboard

click any column to sort
Vendor fidelity 0–1 JS gap Δ block rate % cost/correct $ schema validity %
1Firecrawl agent-ready0.910.034.1$0.003196.2
2Jina0.860.096.0$0.0009
3Apify0.840.053.3$0.007291.7
4Tavily agent-ready0.790.187.6$0.0044
5Exa agent-ready0.740.228.8$0.005884.0
6Serper0.690.279.5$0.0027
7Brave0.630.3411.4$0.0036

Fidelity is the weighted field score, 0–1. JS gap = fidelity(static) − fidelity(JS-rendered); smaller is better. = vendor returns no schema-validatable payload; excluded from that column, never scored 0.

Machine surface: /bench/api/web_extraction-2026-q2.json — the verified snapshot aggregate (CC-BY-4.0).