Mathematics benchmark covering algebra, geometry, number theory, and calculus problems.
Data from LayerLens
As of March 16, 2026, the top score on Mathematics is 95.6%, held by two Claude Opus 4.6 entries, followed by o4 Mini High at 94.6%. In total, 34 models have been evaluated on this benchmark.
Last updated: March 16, 2026
Models: 34
Best Score: 95.6
Average: 84.2
Std Dev: 12.0
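The summary figures can be cross-checked against the Mathematics column of the leaderboard table. The snippet below is a minimal sketch: the scores are copied by hand from the table, and the use of the population standard deviation is an assumption, since the page does not say which form it reports.

```python
from statistics import fmean, pstdev

# Mathematics scores copied from the leaderboard table, highest to lowest.
scores = [
    95.6, 95.6, 94.6, 94.0, 94.0, 93.1, 93.0, 93.0, 92.7, 92.1,
    92.1, 92.0, 91.2, 91.2, 90.7, 90.3, 90.3, 89.0, 89.0, 86.8,
    84.9, 84.2, 83.1, 80.0, 78.2, 77.2, 74.8, 74.4, 73.6, 70.8,
    70.1, 68.5, 67.0, 36.9,
]

summary = {
    "models": len(scores),
    "best": max(scores),
    "average": round(fmean(scores), 1),
    # Assumption: population standard deviation (divide by N, not N - 1).
    "std_dev": round(pstdev(scores), 1),
}
print(summary)
```

The lowest score, 36.9, sits far below the rest of the field and contributes heavily to the spread, so the standard deviation says more about that one outlier than about the cluster of models scoring above 90.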
| Provider | Model | Input $/M tokens | Output $/M tokens | Mathematics |
|---|---|---|---|---|
| | Claude Opus 4.6 | $5.000 | $25.000 | 95.6 |
| | Claude Opus 4.6 | $5.000 | $25.000 | 95.6 |
| | o4 Mini High | $1.100 | $4.400 | 94.6 |
| | | $0.720 | $2.300 | 94.0 |
| | | $0.720 | $2.300 | 94.0 |
| | | $1.100 | $4.400 | 93.1 |
| | | $0.080 | $0.280 | 93.0 |
| | | $0.080 | $0.280 | 93.0 |
| | | $0.550 | $2.190 | 92.7 |
| | | $0.150 | $0.400 | 92.1 |
| | | $0.150 | $0.400 | 92.1 |
| | | $3.000 | $15.000 | 92.0 |
| | | $15.000 | $75.000 | 91.2 |
| | | $15.000 | $75.000 | 91.2 |
| | | $0.100 | $0.400 | 90.7 |
| | | $3.000 | $15.000 | 90.3 |
| | | $3.000 | $15.000 | 90.3 |
| | | $3.000 | $15.000 | 89.0 |
| | | $3.000 | $15.000 | 89.0 |
| | | $0.150 | $0.600 | 86.8 |
| | | $0.030 | $0.110 | 84.9 |
| | | $0.400 | $2.000 | 84.2 |
| | | $0.014 | $0.028 | 83.1 |
| | | $0.080 | $0.300 | 80.0 |
| | | $0.060 | $0.140 | 78.2 |
| | | $2.500 | $10.000 | 77.2 |
| | | $0.800 | $3.200 | 74.8 |
| | | $0.030 | $0.050 | 74.4 |
| | | $0.800 | $4.000 | 73.6 |
| | | $1.000 | $10.000 | 70.8 |
| | | $0.060 | $0.240 | 70.1 |
| | | $2.000 | $6.000 | 68.5 |
| | | $0.035 | $0.140 | 67.0 |
| | | $2.500 | $10.000 | 36.9 |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
This leaderboard shows all models with Mathematics benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
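One way to compare performance against cost is to keep only the rows that no cheaper row matches or beats on score. The sketch below is illustrative rather than anything the site itself does: the `Row` type and `pareto_frontier` helper are hypothetical names, the rows are a hand-picked subset of the table, and ranking by output price alone is an assumption (a blended input/output price would work just as well).

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_price: float   # $ per million input tokens
    output_price: float  # $ per million output tokens
    score: float         # Mathematics benchmark score


# A hand-picked subset of rows from the leaderboard table.
rows = [
    Row(5.000, 25.000, 95.6),
    Row(1.100, 4.400, 94.6),
    Row(0.720, 2.300, 94.0),
    Row(0.080, 0.280, 93.0),
    Row(3.000, 15.000, 92.0),
    Row(0.014, 0.028, 83.1),
    Row(2.500, 10.000, 36.9),
]


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Return rows not dominated on (output price, score).

    Assumption: cost is measured by output price only.
    """
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on a price tie, try the higher score first.
    for row in sorted(rows, key=lambda r: (r.output_price, -r.score)):
        # A row survives only if it beats every cheaper row on score.
        if row.score > best_score:
            frontier.append(row)
            best_score = row.score
    return frontier


frontier = pareto_frontier(rows)
for row in frontier:
    print(f"${row.output_price:.3f} output -> {row.score}")
```

In this subset, the rows priced at $10.000 and $15.000 per million output tokens drop out because a cheaper row already scores higher.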
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.