AI2 Reasoning Challenge (Easy set) — grade-school science questions.
Data from LayerLens
As of April 18, 2026, the top score on ARC Easy is 99.7%, shared by two Claude Opus 4 entries, followed by Qwen3 32B at 99.1%. In total, 40 models have been evaluated on this benchmark.
Last updated: April 18, 2026
Models: 40
Best Score: 99.7
Average: 97.9
Std Dev: 3.3
| Input $/M | Output $/M | ARC Easy |
|---|---|---|
| $15.000 | $75.000 | 99.7 |
| $15.000 | $75.000 | 99.7 |
| $0.080 | $0.240 | 99.1 |
| $0.080 | $0.240 | 99.1 |
| $0.400 | $2.000 | 99.1 |
| $3.000 | $15.000 | 99.1 |
| $3.000 | $15.000 | 99.1 |
| $3.000 | $15.000 | 99.0 |
| $3.000 | $15.000 | 99.0 |
| $2.000 | $8.000 | 99.0 |
| $0.065 | $0.140 | 98.9 |
| $1.100 | $4.400 | 98.9 |
| $0.400 | $1.760 | 98.9 |
| $0.400 | $1.760 | 98.9 |
| $0.300 | $2.500 | 98.9 |
| $0.300 | $2.500 | 98.9 |
| $0.300 | $2.500 | 98.9 |
| $0.300 | $2.500 | 98.9 |
| $0.300 | $0.500 | 98.9 |
| $0.800 | $3.200 | 98.8 |
| $2.500 | $10.000 | 98.8 |
| $0.100 | $0.400 | 98.8 |
| $0.150 | $0.580 | 98.7 |
| $0.150 | $0.580 | 98.7 |
| $0.500 | $2.150 | 98.7 |
| $0.550 | $2.200 | 98.6 |
| $0.080 | $0.300 | 98.6 |
| $0.150 | $0.600 | 98.6 |
| $0.014 | $0.028 | 98.6 |
| $0.080 | $0.160 | 98.2 |
| $0.550 | $2.000 | 97.9 |
| $0.075 | $0.200 | 97.8 |
| $0.300 | $0.300 | 97.6 |
| $0.060 | $0.240 | 97.5 |
| $2.500 | $10.000 | 97.2 |
| $0.070 | $0.280 | 97.1 |
| $2.500 | $10.000 | 96.6 |
| $0.035 | $0.140 | 95.8 |
| $0.060 | $0.120 | 93.4 |
| $0.030 | $0.050 | 78.6 |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
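The summary figures at the top of the page (40 models, average 97.9, standard deviation 3.3) can be checked against the ARC Easy column of the table. The sketch below is one way to do that with Python's standard library. The `scores` list is transcribed by hand from the table, and the page does not say whether its standard deviation is the population or the sample form, so both are computed; with these values they agree to one decimal place.

```python
import statistics

# ARC Easy scores transcribed from the leaderboard table, highest to lowest.
scores = [
    99.7, 99.7,
    99.1, 99.1, 99.1, 99.1, 99.1,
    99.0, 99.0, 99.0,
    98.9, 98.9, 98.9, 98.9, 98.9, 98.9, 98.9, 98.9, 98.9,
    98.8, 98.8, 98.8,
    98.7, 98.7, 98.7,
    98.6, 98.6, 98.6, 98.6,
    98.2, 97.9, 97.8, 97.6, 97.5, 97.2, 97.1, 96.6, 95.8, 93.4, 78.6,
]

model_count = len(scores)
best_score = max(scores)
average = statistics.mean(scores)

# Assumption: the page does not state which estimator it uses, so compute both.
population_sd = statistics.pstdev(scores)
sample_sd = statistics.stdev(scores)

print(f"Models:     {model_count}")
print(f"Best score: {best_score:.1f}")
print(f"Average:    {average:.1f}")
print(f"Std dev:    {population_sd:.1f} (population), {sample_sd:.1f} (sample)")
```

Most of the spread comes from the single 78.6 entry at the bottom of the table; the other 39 scores all fall between 93.4 and 99.7.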
This leaderboard shows all models with ARC Easy benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
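One way to compare performance against cost is to pick a minimum acceptable score and look for the cheapest entry that meets it. The following is a minimal sketch of that idea, not a method used by this page. It uses a handful of rows copied from the table, and the names `Row`, `blended_price`, and `cheapest_at_or_above` are hypothetical. The 75/25 split between input and output tokens is also an assumption; a real workload may look quite different.

```python
from typing import NamedTuple, Optional, Sequence


class Row(NamedTuple):
    """One leaderboard row: prices in USD per million tokens, score in percent."""
    input_per_m: float
    output_per_m: float
    arc_easy: float


# A few rows copied from the table above (illustrative subset, not the full list).
ROWS = [
    Row(15.000, 75.000, 99.7),
    Row(0.080, 0.240, 99.1),
    Row(0.400, 2.000, 99.1),
    Row(3.000, 15.000, 99.1),
    Row(0.014, 0.028, 98.6),
    Row(0.030, 0.050, 78.6),
]


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted price per million tokens.

    Assumption: input_share is the fraction of tokens that are input tokens.
    The 0.75 default is a placeholder, not a figure from the leaderboard.
    """
    return input_share * row.input_per_m + (1.0 - input_share) * row.output_per_m


def cheapest_at_or_above(
    rows: Sequence[Row], min_score: float, input_share: float = 0.75
) -> Optional[Row]:
    """Return the lowest blended-price row scoring at least min_score, or None."""
    eligible = [r for r in rows if r.arc_easy >= min_score]
    if not eligible:
        return None
    return min(eligible, key=lambda r: blended_price(r, input_share))


if __name__ == "__main__":
    for threshold in (99.5, 99.0, 98.0):
        best = cheapest_at_or_above(ROWS, threshold)
        if best is not None:
            print(f">= {threshold}: {best} at ${blended_price(best):.4f}/M blended")
```

Because most scores on this benchmark sit within about a point of each other, small changes to the threshold can change the cheapest qualifying entry by orders of magnitude in price.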
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.