Price Per Token

AIME 2024 Leaderboard

American Invitational Mathematics Examination 2024 problems testing olympiad-level mathematical reasoning.

Data from Artificial Analysis

As of April 4, 2026, the top-scoring model on AIME 2024 is GPT-5 at 95.7%, followed by Grok 4 at 94.3% and o4 Mini at 94.0%. 118 models have been evaluated on this benchmark.

Last updated: April 4, 2026

Models: 118
Best Score: 95.7
Average: 40.4
Std Dev: 30.8
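The Average and Std Dev figures above are summary statistics over the score column. A minimal sketch of how such figures are computed, using a hypothetical handful of scores (the site does not state whether it uses the population or sample standard deviation; population is assumed here):

```python
from statistics import mean, pstdev

# Hypothetical subset of AIME 2024 scores, for illustration only —
# not the full 118-model score column.
scores = [95.7, 50.0, 24.3, 9.3]

print(round(mean(scores), 2))    # arithmetic mean of the sample
print(round(pstdev(scores), 2))  # population standard deviation
```

Swap in `statistics.stdev` if the sample (n−1) definition is wanted instead.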

Categories: Mathematical Problem Solving
| Model | Input $/M | Output $/M | AIME 2024 |
|---|---|---|---|
| GPT-5 | $0.625 | $5.000 | 95.7 |
| Grok 4 | $3.000 | $15.000 | 94.3 |
| o4 Mini | $0.550 | $2.200 | 94.0 |
| — | $0.149 | $0.900 | 94.0 |
| — | $0.250 | $0.500 | 93.3 |
| — | $0.625 | $5.000 | 91.7 |
| — | $2.000 | $8.000 | 90.3 |
| — | $0.450 | $2.150 | 89.3 |
| — | $1.000 | $10.000 | 88.7 |
| — | $0.600 | $2.200 | 87.3 |
| — | $1.250 | $10.000 | 87.0 |
| — | $1.100 | $4.400 | 86.0 |
| — | $0.100 | $0.400 | 86.0 |
| — | $1.250 | $10.000 | 84.3 |
| — | $0.400 | $0.800 | 84.0 |
| — | $0.625 | $5.000 | 83.0 |
| — | $0.300 | $2.500 | 82.3 |
| — | $0.400 | $1.760 | 81.3 |
| — | $0.080 | $0.240 | 80.7 |
| — | $2.000 | $8.000 | 79.0 |
| — | $0.150 | $0.580 | 78.0 |
| — | $3.000 | $15.000 | 77.3 |
| — | $1.100 | $4.400 | 77.0 |
| — | $0.060 | $0.200 | 76.3 |
| — | $15.000 | $75.000 | 75.7 |
| — | $0.080 | $0.280 | 75.3 |
| — | $0.050 | $0.200 | 74.7 |
| — | $0.090 | $0.300 | 72.7 |
| — | $15.000 | $60.000 | 72.3 |
| — | $0.071 | $0.100 | 71.7 |
| — | $0.100 | $0.400 | 70.3 |
| — | $0.550 | $2.200 | 69.3 |
| — | $0.550 | $2.200 | 69.3 |
| — | $0.290 | $0.290 | 68.7 |
| — | $0.550 | $2.000 | 68.3 |
| — | $0.130 | $0.850 | 67.3 |
| — | $0.700 | $0.800 | 67.0 |
| — | $0.200 | $0.200 | 65.7 |
| — | $15.000 | $75.000 | 56.3 |
| — | $0.200 | $0.770 | 52.0 |
| — | $0.100 | $0.200 | 51.0 |
| — | $0.300 | $2.500 | 50.0 |
| — | $0.100 | $0.400 | 50.0 |
| — | $0.280 | $0.900 | 49.3 |
| — | $1.000 | $1.000 | 48.7 |
| — | $3.000 | $15.000 | 48.7 |
| — | $0.220 | $0.900 | 47.7 |
| — | $0.150 | $0.580 | 45.3 |
| — | $0.400 | $2.000 | 44.0 |
| — | $2.000 | $8.000 | 43.7 |
| — | $0.200 | $0.800 | 43.0 |
| — | $3.000 | $15.000 | 40.7 |
| — | $0.150 | $0.600 | 39.0 |
| — | $0.625 | $5.000 | 36.7 |
| — | $0.100 | $0.400 | 33.0 |
| — | $3.000 | $15.000 | 33.0 |
| — | $0.400 | $0.800 | 32.7 |
| — | $0.075 | $0.200 | 32.3 |
| — | $0.080 | $0.240 | 30.3 |
| — | $0.100 | $0.320 | 30.0 |
| — | $0.070 | $0.270 | 29.7 |
| — | $3.000 | $15.000 | 29.0 |
| — | $0.080 | $0.300 | 28.3 |
| — | $0.060 | $0.200 | 28.0 |
| — | $0.075 | $0.300 | 27.7 |
| — | $0.080 | $0.280 | 26.0 |
| — | $0.080 | $0.160 | 25.3 |
| — | $0.200 | $0.770 | 25.3 |
| — | $0.900 | $0.900 | 24.7 |
| — | $0.050 | $0.200 | 24.3 |
| — | $0.050 | $0.200 | 23.7 |
| — | $1.040 | $4.160 | 23.3 |
| — | $3.000 | $15.000 | 22.3 |
| — | $0.040 | $0.130 | 22.0 |
| — | $0.900 | $0.900 | 21.3 |
| — | $0.200 | $0.200 | 21.3 |
| — | $0.200 | $0.200 | 21.3 |
| — | $0.200 | $0.200 | 21.3 |
| — | $0.340 | $0.390 | 17.3 |
| — | $2.500 | $12.500 | 17.0 |
| — | $0.120 | $0.390 | 16.0 |
| — | $3.000 | $15.000 | 15.7 |
| — | $10.000 | $30.000 | 15.0 |
| — | $0.065 | $0.140 | 14.3 |
| — | $0.020 | $0.040 | 13.7 |
| — | $0.100 | $0.400 | 13.7 |
| — | $0.200 | $0.600 | 13.0 |
| — | $0.660 | $0.800 | 12.0 |
| — | $0.033 | $0.130 | 12.0 |
| — | $2.500 | $10.000 | 11.7 |
| — | $0.150 | $0.600 | 11.7 |
| — | $2.000 | $6.000 | 11.0 |
| — | $0.200 | $0.600 | 11.0 |
| — | $0.800 | $3.200 | 10.7 |
| — | $0.060 | $0.240 | 10.7 |
| — | $2.500 | $10.000 | 9.7 |
| — | $0.049 | $0.049 | 9.3 |
| — | $2.000 | $6.000 | 9.3 |
| — | $0.030 | $0.110 | 9.3 |
| — | $0.035 | $0.140 | 8.0 |
| — | $0.050 | $0.080 | 8.0 |
| — | $0.020 | $0.050 | 7.7 |
| — | $2.000 | $6.000 | 7.0 |
| — | $0.030 | $0.050 | 6.7 |
| — | $0.400 | $2.000 | 6.7 |
| — | $0.040 | $0.080 | 6.3 |
| — | $2.000 | $8.000 | 5.7 |
| — | $0.030 | $0.090 | 5.3 |
| — | $0.800 | $4.000 | 3.3 |
| — | $0.300 | $0.300 | 2.3 |
| — | $0.250 | $1.250 | 1.0 |
| — | $0.070 | $0.280 | 0.3 |
| — | $0.140 | $0.420 | - |
| — | $0.500 | $1.500 | - |
| — | $1.200 | $1.200 | - |
| — | $0.510 | $0.740 | - |
| — | $0.030 | $0.040 | - |
| — | $0.020 | $0.020 | - |

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

About AIME 2024

This leaderboard shows all models with AIME 2024 benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
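Since prices are quoted per million tokens, the cost of a single request is a weighted sum of input and output token counts. A minimal sketch (the `request_cost` helper and the token counts are illustrative; the prices are GPT-5's listed rates from the table above):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request, given prices in $ per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply at GPT-5's listed
# $0.625 input / $5.000 output rates:
print(round(request_cost(2_000, 500, 0.625, 5.000), 5))  # → 0.00375
```

Output tokens usually dominate the bill for reasoning models, so a cheap input rate alone says little about real cost.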

Frequently Asked Questions

What is AIME 2024?
American Invitational Mathematics Examination 2024 problems testing olympiad-level mathematical reasoning.

Which model scores highest on AIME 2024?
As of April 4, 2026, GPT-5 leads the AIME 2024 leaderboard with a score of 95.7. Rankings change as new models are released and evaluated.

How many models have been evaluated?
Currently 118 models have been evaluated on AIME 2024, with an average score of 40.4 and a standard deviation of 30.8.

How often is the data updated?
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.