Edison Labs
Benchmarks / LabBench2

SourceQuality

10 runs · 5 models · evaluated by HybridEvaluator.

| # | Model | Variant | Mode | Score | Avg. duration | Tokens | Date |
|---|-------|---------|------|-------|---------------|--------|------|
| 1 | gemini-3-pro-preview | code,high | file | 0.900 | 45.1s | 762.6k | 2026-03-13 |
| 2 | gemini-3-pro-preview | | file | 0.887 | 35.6s | 759.8k | 2026-03-13 |
| 3 | gpt-5-2-pro | | file | 0.807 | 1.6m | 9.0M | 2026-03-13 |
| 4 | gpt-5-2 | | file | 0.793 | 29.9s | 6.9M | 2026-03-13 |
| 5 | gpt-5-2-pro | code,high | file | 0.760 | 2.6m | 10.3M | 2026-03-13 |
| 6 | gpt-5-2 | code,high | file | 0.720 | 3.1m | 12.4M | 2026-03-13 |
| 7 | claude-opus-4-6 | code,high | file | 0.673 | 1.3m | 43.3M | 2026-03-23 |
| 8 | claude-opus-4-5 | | file | 0.660 | 16.5s | 6.2M | 2026-03-20 |
| 9 | claude-opus-4-5 | code,high | file | 0.620 | 30.2s | 8.7M | 2026-03-23 |
| 10 | claude-opus-4-6 | | file | 0.607 | 19.0s | 11.8M | 2026-03-20 |
