Edison Labs

Benchmarks

Public benchmark suites and their leaderboards.

LabBench2

Real-world capabilities of AI systems on scientific research tasks.

15 sub-benchmarks, 9 models, 201 runs

BixBench

Coming soon

Bioinformatics workflows under realistic constraints.

HLE Gold

Coming soon

A curated gold-set subset of Humanity's Last Exam.