Price Per Token

Humanity's Last Exam Leaderboard

Humanity's Last Exam is a benchmark of extremely challenging questions designed to test the upper limits of AI capability across diverse domains.

Data from Artificial Analysis

As of April 18, 2026, the top-scoring model on Humanity's Last Exam is Gemini 3.1 Pro Preview at 44.7%, followed by GPT-5.4 at 41.6% and GPT-5.3 Codex at 39.9%. 262 models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 262
Best Score: 44.7
Average: 10.7
Std Dev: 8.6

Categories: Reasoning and Logic
Input $/M | Output $/M | Humanity's Last Exam
$2.000 | $12.000 | 44.7
$2.500 | $15.000 | 41.6
$1.750 | $14.000 | 39.9
$2.000 | $12.000 | 37.2
$5.000 | $25.000 | 36.7
$10.500 | $84.000 | 35.4
$0.500 | $3.000 | 34.7
$1.750 | $14.000 | 33.5
$0.207 | $0.828 | 33.4
$3.000 | $15.000 | 30.0
$0.383 | $1.720 | 29.4
$5.000 | $25.000 | 28.4
$0.300 | $1.200 | 28.1
$0.950 | $3.150 | 28.0
$2.000 | $12.000 | 27.6
$0.390 | $0.900 | 27.3
$0.720 | $2.300 | 27.2
$1.250 | $10.000 | 26.5
$1.250 | $10.000 | 26.5
$0.780 | $3.900 | 26.2
$0.400 | $1.200 | 26.1
$1.250 | $10.000 | 25.6
$0.950 | $3.150 | 25.6
$1.200 | $4.000 | 25.4
$0.390 | $1.750 | 25.1
$0.875 | $7.000 | 24.9
$3.000 | $15.000 | 23.9
$1.250 | $10.000 | 23.5
$1.250 | $10.000 | 23.4
$0.260 | $2.080 | 23.4
$0.130 | $0.380 | 22.7
$0.100 | $0.300 | 22.6
$0.550 | $2.200 | 22.3
$0.260 | $0.380 | 22.2
$0.290 | $0.950 | 22.2
$0.195 | $0.900 | 22.2
$1.000 | $10.000 | 21.1
$0.090 | $0.290 | 21.1
$2.000 | $8.000 | 20.0
$0.090 | $0.290 | 20.0
$0.400 | $2.000 | 19.9
$0.125 | $1.000 | 19.7
$0.163 | $0.900 | 19.7
$0.100 | $0.300 | 19.1
$0.118 | $0.950 | 19.1
$0.390 | $0.900 | 18.8
$5.000 | $25.000 | 18.6
$0.039 | $0.100 | 18.5
$1.250 | $10.000 | 18.4
$0.070 | $0.350 | 18.3
$0.200 | $0.500 | 17.6
$0.550 | $2.200 | 17.5
$3.000 | $15.000 | 17.3
$1.250 | $10.000 | 17.1
$0.200 | $0.500 | 17.0
$0.250 | $2.000 | 16.9
$0.250 | $1.500 | 16.2
$1.200 | $4.000 | 15.8
$0.250 | $0.750 | 15.5
$1.250 | $10.000 | 15.4
$0.210 | $0.790 | 15.2
$0.130 | $0.600 | 15.0
$0.500 | $2.150 | 14.9
$0.260 | $2.080 | 14.8
$0.125 | $1.000 | 14.6
$0.500 | $3.000 | 14.1
$0.270 | $0.410 | 13.8
$0.390 | $1.740 | 13.3
$0.040 | $0.150 | 13.3
$3.000 | $15.000 | 13.2
$0.195 | $0.900 | 13.2
$0.150 | $0.750 | 13.0
$5.000 | $25.000 | 12.9
$0.163 | $0.900 | 12.8
$0.255 | $1.000 | 12.5
$1.100 | $4.400 | 12.3
$0.383 | $1.720 | 12.3
$0.600 | $2.200 | 12.2
$0.200 | $1.100 | 12.1
$0.780 | $3.900 | 12.0
$15.000 | $75.000 | 11.9
$0.455 | $0.900 | 11.7
$15.000 | $75.000 | 11.7
$0.098 | $0.300 | 11.7
$0.300 | $2.500 | 11.6
$0.250 | $0.500 | 11.1
$0.300 | $2.500 | 11.1
$0.780 | $3.900 | 11.1
$0.900 | $0.900 | 11.0
$3.000 | $15.000 | 10.8
$0.070 | $0.350 | 10.7
$0.071 | $0.100 | 10.6
$2.500 | $15.000 | 10.6
$0.260 | $0.380 | 10.5
$3.000 | $15.000 | 10.3
$1.000 | $3.000 | 10.3
$0.050 | $0.200 | 10.2
$0.260 | $0.900 | 10.1
$0.030 | $0.100 | 9.8
$0.080 | $0.300 | 9.8
$1.000 | $5.000 | 9.7
$3.000 | $15.000 | 9.6
$0.104 | $0.416 | 9.6
$0.550 | $2.000 | 9.3
$0.780 | $3.900 | 9.3
$0.150 | $0.800 | 9.3
$0.300 | $0.900 | 8.9
$0.550 | $2.200 | 8.7
$0.130 | $0.600 | 8.7
$0.270 | $0.410 | 8.6
$0.040 | $0.150 | 8.6
$0.210 | $0.790 | 8.4
$0.080 | $0.240 | 8.3
$0.150 | $0.580 | 8.2
$0.400 | $1.760 | 8.2
$0.050 | $0.400 | 8.2
$0.090 | $0.290 | 8.0
$3.000 | $15.000 | 7.9
$0.130 | $0.400 | 7.9
$15.000 | $60.000 | 7.7
$0.050 | $0.400 | 7.6
$0.400 | $1.760 | 7.5
$0.200 | $1.500 | 7.5
$1.000 | $1.000 | 7.3
$0.090 | $0.780 | 7.3
$0.875 | $7.000 | 7.3
$0.720 | $2.300 | 7.2
$3.000 | $15.000 | 7.1
$0.060 | $0.400 | 7.1
$0.550 | $2.200 | 7.0
$0.130 | $0.850 | 6.8
$0.090 | $0.300 | 6.8
$0.100 | $0.400 | 6.8
$0.080 | $0.280 | 6.6
$0.100 | $0.400 | 6.6
$0.100 | $0.400 | 6.5
$0.100 | $0.400 | 6.4
$0.130 | $0.520 | 6.4
$0.150 | $0.750 | 6.3
$0.400 | $2.000 | 6.3
$0.200 | $0.880 | 6.3
$0.104 | $0.416 | 6.3
$0.700 | $0.800 | 6.1
$0.390 | $1.750 | 6.1
$0.150 | $0.500 | 6.0
$0.200 | $0.200 | 5.9
$15.000 | $75.000 | 5.9
$0.600 | $1.800 | 5.9
$0.150 | $0.500 | 5.9
$1.250 | $10.000 | 5.8
$0.100 | $0.200 | 5.8
$0.120 | $0.200 | 5.7
$0.290 | $0.290 | 5.5
$1.250 | $10.000 | 5.4
$0.020 | $0.020 | 5.3
$0.100 | $0.400 | 5.3
$0.200 | $0.200 | 5.3
$0.030 | $0.050 | 5.2
$0.060 | $0.060 | 5.2
$0.040 | $0.080 | 5.2
$0.200 | $0.770 | 5.2
$0.039 | $0.100 | 5.2
$0.390 | $1.740 | 5.2
$1.250 | $10.000 | 5.2
$0.030 | $0.040 | 5.1
$0.020 | $0.050 | 5.1
$0.200 | $0.200 | 5.1
$3.000 | $15.000 | 5.1
$0.300 | $2.500 | 5.1
$0.030 | $0.100 | 5.1
$0.300 | $2.500 | 5.0
$0.200 | $0.500 | 5.0
$0.200 | $0.500 | 5.0
$0.010 | $0.020 | 4.9
$0.200 | $0.600 | 4.9
$0.060 | $0.400 | 4.9
$3.000 | $15.000 | 4.8
$0.150 | $0.580 | 4.8
$0.040 | $0.130 | 4.8
$0.100 | $0.300 | 4.8
$0.150 | $0.600 | 4.8
$0.200 | $0.200 | 4.8
$0.035 | $0.140 | 4.7
$0.080 | $0.160 | 4.7
$0.455 | $0.900 | 4.7
$0.200 | $0.200 | 4.7
$2.500 | $12.500 | 4.7
$0.340 | $0.390 | 4.6
$0.900 | $0.900 | 4.6
$0.060 | $0.240 | 4.6
$2.500 | $10.000 | 4.6
$0.200 | $0.800 | 4.6
$2.000 | $8.000 | 4.6
$0.080 | $0.280 | 4.6
$0.040 | $0.160 | 4.6
$0.100 | $0.400 | 4.6
$0.200 | $0.200 | 4.6
$0.050 | $0.200 | 4.6
$0.140 | $0.420 | 4.5
$1.040 | $4.160 | 4.5
$0.200 | $0.200 | 4.5
$0.510 | $0.740 | 4.4
$0.075 | $0.300 | 4.4
$0.060 | $0.120 | 4.4
$0.220 | $0.900 | 4.4
$0.400 | $2.000 | 4.4
$0.200 | $0.200 | 4.4
$0.080 | $0.300 | 4.3
$0.080 | $0.240 | 4.3
$0.060 | $0.200 | 4.3
$0.400 | $2.000 | 4.3
$0.075 | $0.200 | 4.3
$0.100 | $0.400 | 4.3
$1.000 | $5.000 | 4.3
$0.150 | $0.150 | 4.3
$0.900 | $0.900 | 4.2
$0.120 | $0.390 | 4.2
$0.033 | $0.130 | 4.2
$0.060 | $0.200 | 4.2
$0.050 | $0.200 | 4.2
$1.000 | $3.000 | 4.2
$1.200 | $1.200 | 4.1
$0.300 | $0.300 | 4.1
$0.065 | $0.140 | 4.1
$0.050 | $0.080 | 4.1
$0.200 | $0.600 | 4.1
$0.050 | $0.400 | 4.1
$0.150 | $0.600 | 4.0
$2.000 | $6.000 | 4.0
$0.100 | $0.320 | 4.0
$3.000 | $15.000 | 4.0
$0.070 | $0.270 | 4.0
$0.040 | $0.160 | 4.0
$0.250 | $1.250 | 3.9
$3.000 | $15.000 | 3.9
$0.100 | $0.400 | 3.9
$0.660 | $0.800 | 3.8
$0.200 | $0.600 | 3.8
$0.400 | $2.000 | 3.8
$2.000 | $8.000 | 3.8
$0.050 | $0.200 | 3.7
$0.200 | $0.200 | 3.7
$0.070 | $0.280 | 3.7
$0.100 | $0.400 | 3.7
$0.300 | $0.900 | 3.7
$2.000 | $6.000 | 3.6
$0.075 | $0.300 | 3.6
$0.200 | $0.770 | 3.6
$0.600 | $1.800 | 3.6
$0.130 | $0.400 | 3.6
$0.400 | $0.900 | 3.6
$0.800 | $4.000 | 3.5
$0.280 | $0.900 | 3.5
$0.100 | $0.400 | 3.5
$0.500 | $1.500 | 3.4
$0.800 | $3.200 | 3.4
$5.000 | $15.000 | 3.3
$0.117 | $1.365 | 3.3
$2.000 | $6.000 | 3.2
$0.300 | $2.500 | 3.0
$0.080 | $0.200 | 2.9
$0.050 | $0.200 | 2.8

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

About Humanity's Last Exam

This leaderboard shows all models with Humanity's Last Exam benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
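One way to compare performance against cost is to divide each score by a blended price per million tokens. The sketch below is illustrative only: the 3:1 input-to-output token mix, the `Row` type, and the function names `blended_price` and `points_per_dollar` are assumptions, not part of the leaderboard. The four sample rows are copied from the table (the top three entries plus the $0.207 / $0.828 entry that scores 33.4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    score: float         # Humanity's Last Exam score


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted price per million tokens.

    input_share=0.75 (a 3:1 input:output mix) is an assumption;
    adjust it to match your own workload.
    """
    return input_share * row.input_per_m + (1.0 - input_share) * row.output_per_m


def points_per_dollar(row: Row, input_share: float = 0.75) -> float:
    """Benchmark points per blended dollar (hypothetical metric)."""
    return row.score / blended_price(row, input_share)


# Sample rows copied from the leaderboard table.
rows = [
    Row(2.000, 12.000, 44.7),
    Row(2.500, 15.000, 41.6),
    Row(1.750, 14.000, 39.9),
    Row(0.207, 0.828, 33.4),
]

# Highest points-per-dollar first.
ranked = sorted(rows, key=points_per_dollar, reverse=True)

for row in ranked:
    print(
        f"score={row.score:5.1f}  "
        f"blended=${blended_price(row):.3f}/M  "
        f"points per $={points_per_dollar(row):.1f}"
    )
```

Under this assumed token mix, the $0.207 / $0.828 row ranks well ahead of the three top-scoring rows on points per dollar, even though its raw score is lower. A different input share will change the ordering, so treat the ratio as a rough screening aid rather than a ranking.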

Frequently Asked Questions

As of April 18, 2026, Gemini 3.1 Pro Preview leads the Humanity's Last Exam leaderboard with a score of 44.7. Rankings change as new models are released and evaluated.
Currently 262 models have been evaluated on Humanity's Last Exam, with an average score of 10.7 and standard deviation of 8.6.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.