Vendor × Web Extraction · Verified
Jina — Web Extraction
How Jina performs on the arlen/bench web extraction leaderboard, based on the verified WCXB run (n=150).
Answer first · web_extraction-2026-q2
Jina ranks #4 of 4 on the web extraction leaderboard with fidelity 0.54. Cost per verified-correct is not yet priced (—).
§
Metrics
| Metric | Jina | vs field |
|---|---|---|
| fidelity 0-1 | 0.54 (95% CI 0.511–0.573) | higher better |
| phrase recall | 0.59 | higher better |
| boilerplate excl. | 0.26 | higher better |
| cost/correct $ | — | lower better |
| coverage % | 99.3 | leads |
Full leaderboard: Web Extraction · snapshot JSON: /bench/api/web_extraction-2026-q2.json
§
Strengths & Weaknesses
Strongest: service (0.6), article (0.59). Weakest: product (0.48), listing (0.4).
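The strongest/weakest split above can be reproduced from per-type fidelity scores with a simple sort. The sketch below is illustrative only: it uses just the four page types and scores quoted in this section, and the names `fidelity_by_type` and `strongest_and_weakest` are hypothetical, not part of the arlen/bench tooling. How the leaderboard itself derives these groupings is not stated here.

```python
# Per-type fidelity scores as quoted in this section (only the four types named).
fidelity_by_type = {
    "service": 0.60,
    "article": 0.59,
    "product": 0.48,
    "listing": 0.40,
}


def strongest_and_weakest(scores, k=2):
    """Return the k highest- and k lowest-scoring page types.

    Both lists are ordered from best to worst fidelity.
    Hypothetical helper for illustration; not the leaderboard's own code.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k], ranked[-k:]


strongest, weakest = strongest_and_weakest(fidelity_by_type)
print("Strongest:", strongest)
print("Weakest:", weakest)
```

With more page types in the input (for example documentation, which appears in the sample rows but has no per-type score listed here), the same function would simply pick the top and bottom `k` entries.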
§
Sample Rows
| Type | Page | fidelity |
|---|---|---|
| listing | All Latest News | 0.07 |
| product | Men's Wool Runner | 0.3 |
| documentation | Using the Fetch API | 0.45 |
§
Vendor Right of Reply
Jina has not yet been sent its rows for pre-publication review (notifications pending). Right of reply is standing; any response will be published verbatim here and linked from the leaderboard row. No commercial relationship; vendors cannot pay for placement.