TableQA2 (pdf)
10 runs · 5 models · evaluated by HybridEvaluator.
| # | Model | Variant | Mode | Score | Avg. duration | Tokens | Date |
|---|---|---|---|---|---|---|---|
| 1 | claude-opus-4-6 | tools,high | file | 0.880 | 46.3s | 13.7M | 2026-03-23 |
| 2 | gemini-3-pro-preview | — | file | 0.880 | 41.8s | 946.0k | 2026-02-03 |
| 3 | gemini-3-pro-preview | tools,high | file | 0.880 | 53.3s | 957.3k | 2026-02-03 |
| 4 | claude-opus-4-5 | tools,high | file | 0.850 | 29.6s | 7.9M | 2026-03-22 |
| 5 | gpt-5-2 | tools,high | file | 0.840 | 4.1m | 8.6M | 2026-02-03 |
| 6 | gpt-5-2-pro | — | file | 0.820 | 2.4m | 9.9M | 2026-02-03 |
| 7 | claude-opus-4-6 | — | file | 0.810 | 38.2s | 16.8M | 2026-03-20 |
| 8 | gpt-5-2-pro | tools,high | file | 0.800 | 4.0m | 9.0M | 2026-02-03 |
| 9 | gpt-5-2 | — | file | 0.760 | 1.9m | 10.0M | 2026-02-03 |
| 10 | claude-opus-4-5 | — | file | 0.750 | 21.9s | 8.4M | 2026-03-20 |