Vendor Profile · Illustrative prototype
Tavily
Tavily on the arlen/bench leaderboards — agent web-search accuracy, extraction fidelity, freshness, cost per verified-correct answer, and agent-readiness, scored against golden truth.
Answer first · illustrative
Tavily ranks #2 of 6 for agent web-search accuracy (84.0% hit@5) and #4 of 7 for extraction fidelity (0.79). It is also agent-ready: in live trials, an autonomous agent obtained a working API key with no human involvement.
| Metric | Value | Source |
|---|---|---|
| hit@5 | 84.0% | web search |
| cost / correct | $0.0059 | web search |
| extraction fidelity | 0.79 | web extraction |
| agent-ready | Yes | onboarding harness |
§ A
Web Search — Tavily
| Metric | Tavily | Leaderboard best | Direction |
|---|---|---|---|
| hit@1 % | 68.4 | Exa 71.2 | higher is better |
| hit@5 % | 84.0 | Exa 87.6 | higher is better |
| fresh<30d % | 86.4 | leads | higher is better |
| retrievability h | 7.8 | leads | lower is better |
| cost/correct $ | $0.0059 | Serper $0.0041 | lower is better |
| p50 latency ms | 688 | Brave 301 | lower is better |
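The page does not spell out how hit@k or cost per correct answer are computed. The sketch below shows one common reading, offered as an assumption rather than the leaderboard's actual method: hit@k is the share of queries whose golden item appears among the top k results, and cost per correct is total spend divided by the number of verified-correct answers. The function names and sample data are hypothetical.

```python
from typing import Sequence


def hit_at_k(ranked: Sequence[Sequence[str]], golden: Sequence[str], k: int) -> float:
    """Percentage of queries whose golden item appears in the top-k results.

    Assumed definition; the leaderboard's own scoring rule is not given.
    """
    if len(ranked) != len(golden):
        raise ValueError("ranked and golden must have the same length")
    if not golden:
        raise ValueError("need at least one query")
    hits = sum(1 for results, truth in zip(ranked, golden) if truth in results[:k])
    return 100.0 * hits / len(golden)


def cost_per_correct(total_cost_usd: float, correct_answers: int) -> float:
    """Total spend divided by the number of verified-correct answers (assumed)."""
    if correct_answers <= 0:
        raise ValueError("cost per correct is undefined with no correct answers")
    return total_cost_usd / correct_answers


# Hypothetical sample: four queries, each with a ranked result list.
sample_ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"], ["m", "n", "o"]]
sample_golden = ["a", "z", "q", "missing"]

print(hit_at_k(sample_ranked, sample_golden, k=1))
print(hit_at_k(sample_ranked, sample_golden, k=5))
print(cost_per_correct(0.75, correct_answers=3))
```

Under this reading, hit@5 can never be lower than hit@1 for the same run, which matches the ordering of the two rows in the table.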
§ B
Web Extraction — Tavily
| Metric | Tavily | Leaderboard best | Direction |
|---|---|---|---|
| fidelity 0-1 | 0.79 | Firecrawl 0.91 | higher is better |
| JS gap Δ | 0.18 | Firecrawl 0.03 | lower is better |
| block rate % | 7.6 | Apify 3.3 | lower is better |
| cost/correct $ | $0.0044 | Jina $0.0009 | lower is better |
| schema validity % | — | Firecrawl 96.2 | higher is better |
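The Direction column decides how a gap to the leaderboard best should be read. As a minimal sketch, assuming the gap is expressed as "how far behind the leader" so that a positive number always means worse, the helper below flips the subtraction for lower-is-better metrics. The function name is hypothetical, and the inputs are the placeholder figures from the table, not measured results.

```python
def gap_to_best(value: float, best: float, higher_is_better: bool) -> float:
    """Return how far `value` trails `best`; positive means behind the leader."""
    return best - value if higher_is_better else value - best


# Placeholder figures from the extraction table (illustrative only).
fidelity_gap = gap_to_best(0.79, 0.91, higher_is_better=True)
block_rate_gap = gap_to_best(7.6, 3.3, higher_is_better=False)

print(round(fidelity_gap, 2))
print(round(block_rate_gap, 1))
```

A metric with no value, such as the schema validity row, has no gap to compute and would need to be skipped before calling the helper.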
Illustrative prototype. No verified vendor run has been published yet; every figure here is a placeholder and must not be cited as a measured result. Numbers are replaced when a snapshot’s first full run lands.