Price Per Token

MMLU-Pro Leaderboard

MMLU-Pro is a harder successor to the Massive Multitask Language Understanding (MMLU) benchmark. It tests knowledge and reasoning with ten-option multiple-choice questions across 14 subject areas spanning STEM, humanities, social sciences, and professional domains.

Data from Artificial Analysis

As of April 3, 2026, the top-scoring model on MMLU-Pro is Gemini 3 Pro Preview at 89.8%, followed by Claude Opus 4.5 at 89.5% and Gemini 3 Flash Preview at 89.0%. 202 models have been evaluated on this benchmark.

Last updated: April 3, 2026

Models: 202
Best Score: 89.8
Average: 74.4
Std Dev: 11.7

Categories: General Knowledge
Input $/M   Output $/M   MMLU-Pro
$2.000      $12.000      89.8
$5.000      $25.000      89.5
$0.500      $3.000       89.0
$5.000      $25.000      88.9
$0.500      $3.000       88.2
$15.000     $75.000      88.0
$3.000      $15.000      87.5
$0.270      $0.950       87.5
$10.500     $84.000      87.4
$15.000     $75.000      87.3
$0.625      $5.000       87.1
$0.625      $5.000       87.0
$0.625      $5.000       86.7
$3.000      $15.000      86.6
$1.250      $10.000      86.5
$0.400      $1.200       86.3
$1.000      $10.000      86.2
$0.260      $0.380       86.2
$15.000     $75.000      86.0
$0.625      $5.000       86.0
$3.000      $15.000      86.0
$1.250      $10.000      86.0
$1.250      $10.000      85.8
$0.390      $1.750       85.6
$0.200      $0.500       85.4
$2.000      $8.000       85.3
$0.150      $0.750       85.1
$0.210      $0.790       85.1
$0.200      $0.500       85.0
$0.270      $0.410       85.0
$0.450      $2.150       84.9
$0.550      $2.200       84.8
$0.550      $2.000       84.4
$0.149      $0.900       84.3
$0.090      $0.290       84.3
$3.000      $15.000      84.2
$15.000     $60.000      84.1
$0.780      $3.900       84.1
$0.780      $3.900       83.8
$3.000      $15.000      83.7
$1.250      $10.000      83.7
$3.000      $15.000      83.7
$0.125      $1.000       83.7
$0.260      $0.380       83.7
$0.210      $0.790       83.6
$0.270      $0.410       83.6
$0.600      $2.200       83.5
$0.150      $0.750       83.3
$0.550      $2.200       83.2
$0.300      $2.500       83.2
$1.000      $3.000       82.9
$0.390      $1.700       82.9
$0.400      $0.800       82.8
$0.250      $0.500       82.8
$0.071      $0.100       82.8
$0.780      $3.900       82.4
$0.550      $2.200       82.4
$0.780      $3.900       82.4
$0.200      $0.880       82.3
$0.200      $1.100       82.2
$0.255      $1.000       82.0
$0.250      $2.000       82.0
$0.200      $0.770       81.9
$0.400      $2.000       81.9
$0.090      $0.780       81.9
$0.130      $0.850       81.5
$0.100      $0.400       81.4
$1.750      $14.000      81.4
$0.207      $0.828       81.3
$0.130      $0.400       81.1
$0.150      $0.600       80.9
$0.300      $2.500       80.9
$0.400      $1.760       80.8
$0.039      $0.100       80.8
$0.100      $0.400       80.8
$2.000      $8.000       80.6
$0.625      $5.000       80.6
$3.000      $15.000      80.3
$1.100      $4.400       80.2
$0.625      $5.000       80.1
$1.000      $5.000       80.0
$3.000      $15.000      79.9
$0.300      $0.900       79.9
$0.080      $0.240       79.8
$0.100      $0.400       79.6
$0.700      $0.800       79.5
$0.050      $0.200       79.4
$0.390      $1.750       79.4
$0.200      $1.500       79.3
$1.100      $4.400       79.1
$0.104      $0.416       79.1
$0.220      $0.900       78.8
$0.600      $1.800       78.8
$0.390      $1.700       78.4
$0.200      $0.800       78.1
$0.050      $0.400       78.0
$0.100      $0.400       77.9
$0.080      $0.280       77.7
$0.090      $0.300       77.7
$0.280      $0.900       77.6
$0.039      $0.100       77.5
$0.060      $0.200       77.4
$3.000      $15.000      77.2
$0.050      $0.400       77.2
$0.150      $0.580       76.4
$0.130      $0.520       76.4
$0.150      $0.500       76.3
$1.040      $4.160       76.2
$0.400      $0.800       76.2
$0.400      $0.900       76.2
$0.400      $2.000       76.0
$1.000      $5.000       76.0
$0.100      $0.400       75.9
$0.200      $0.200       75.9
$0.150      $0.500       75.9
$3.000      $15.000      75.5
$0.200      $0.770       75.2
$0.080      $0.300       75.2
$0.300      $0.900       75.2
$0.600      $1.800       75.1
$0.030      $0.100       74.8
$0.090      $0.290       74.4
$0.090      $0.290       74.4
$0.050      $0.200       74.3
$0.200      $0.200       74.3
$0.200      $0.500       74.3
$0.300      $2.500       74.3
$0.040      $0.160       74.2
$0.290      $0.290       73.9
$0.040      $0.160       73.9
$2.500      $12.500      73.3
$0.900      $0.900       73.2
$0.200      $0.500       73.0
$1.000      $3.000       72.9
$0.080      $0.240       72.7
$0.075      $0.300       72.4
$0.100      $0.400       72.4
$0.120      $0.390       72.0
$0.065      $0.140       71.4
$0.100      $0.320       71.3
$2.500      $10.000      71.2
$0.080      $0.280       71.0
$0.400      $2.000       70.8
$0.070      $0.270       70.6
$2.000      $6.000       70.1
$2.000      $6.000       69.7
$0.200      $0.600       69.7
$0.200      $0.200       69.6
$10.000     $30.000      69.4
$0.200      $0.200       69.3
$0.100      $0.400       69.2
$0.800      $3.200       69.1
$0.900      $0.900       69.0
$1.000      $1.000       68.9
$0.080      $0.200       68.6
$2.000      $6.000       68.3
$0.400      $2.000       68.3
$0.075      $0.200       68.1
$0.340      $0.390       67.6
$0.060      $0.200       67.5
$0.200      $0.200       67.2
$0.080      $0.160       66.9
$0.100      $0.200       66.9
$0.130      $0.400       66.4
$0.030      $0.110       65.9
$0.050      $0.200       65.7
$0.120      $0.200       65.5
$0.050      $0.080       65.2
$0.200      $0.200       64.9
$0.150      $0.600       64.8
$0.150      $0.580       64.8
$0.050      $0.200       64.3
$0.150      $0.150       64.2
$0.660      $0.800       63.5
$0.800      $4.000       63.4
$0.033      $0.130       63.3
$0.070      $0.280       62.2
$0.200      $0.600       61.1
$0.040      $0.130       59.5
$0.060      $0.240       59.0
$0.200      $0.200       58.6
$0.050      $0.200       57.9
$2.000      $8.000       57.7
$0.510      $0.740       57.4
$0.300      $0.300       57.1
$0.050      $0.400       55.6
$1.200      $1.200       53.7
$0.035      $0.140       53.1
$0.100      $0.200       52.2
$0.500      $1.500       51.5
$0.050      $0.200       51.1
$0.010      $0.020       50.5
$0.020      $0.040       48.8
$0.020      $0.050       47.6
$0.030      $0.090       47.3
$0.049      $0.049       46.4
$0.500      $1.500       46.2
$0.040      $0.080       41.7
$0.030      $0.040       40.5
$0.140      $0.420       38.7
$0.030      $0.050       34.7
$0.020      $0.020       20.0

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

About MMLU-Pro

This leaderboard shows all models with MMLU-Pro benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
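One way to compare performance against cost is to collapse the two prices into a single blended figure and keep only the rows that no other row beats on both price and score. The sketch below is illustrative only: the 75/25 input-to-output token mix is an assumed workload, the `Row` class, the helper names, and the "row N" labels are hypothetical, and the four sample rows are the top four price and score entries from the table on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    label: str           # hypothetical label; model names are not part of this sketch
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    score: float         # MMLU-Pro score


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted price per million tokens; input_share is an assumed workload mix."""
    return input_share * row.input_per_m + (1.0 - input_share) * row.output_per_m


def pareto_front(rows: list[Row]) -> list[Row]:
    """Keep rows that no other row matches or beats on both price and score."""
    front = []
    for r in rows:
        dominated = any(
            blended_price(o) <= blended_price(r)
            and o.score >= r.score
            and (blended_price(o) < blended_price(r) or o.score > r.score)
            for o in rows
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=blended_price)


# Top four rows of the leaderboard table (input $/M, output $/M, MMLU-Pro).
ROWS = [
    Row("row 1", 2.000, 12.000, 89.8),
    Row("row 2", 5.000, 25.000, 89.5),
    Row("row 3", 0.500, 3.000, 89.0),
    Row("row 4", 5.000, 25.000, 88.9),
]

if __name__ == "__main__":
    for row in pareto_front(ROWS):
        print(f"{row.label}: ${blended_price(row):.3f}/M blended, score {row.score}")
```

With these four rows, only rows 3 and 1 survive: rows 2 and 4 cost more than row 1 under the assumed mix while scoring lower. A different token mix can change which rows remain, so the ratio is worth adjusting to match the actual workload.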

Frequently Asked Questions

As of April 3, 2026, Gemini 3 Pro Preview leads the MMLU-Pro leaderboard with a score of 89.8. Rankings change as new models are released and evaluated.
Currently 202 models have been evaluated on MMLU-Pro, with an average score of 74.4 and standard deviation of 11.7.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
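The average and standard deviation quoted above can be used to place an individual score relative to the rest of the field. The following is a minimal sketch, assuming the published figures (mean 74.4, standard deviation 11.7) are taken as given; the `z_score` helper is a hypothetical name, and the score distribution is not assumed to be normal.

```python
MEAN = 74.4      # average MMLU-Pro score reported on this page
STD_DEV = 11.7   # standard deviation reported on this page


def z_score(score: float, mean: float = MEAN, std_dev: float = STD_DEV) -> float:
    """Number of standard deviations a score sits above (+) or below (-) the mean."""
    return (score - mean) / std_dev


if __name__ == "__main__":
    # Best and lowest scores listed in the leaderboard table.
    for score in (89.8, 20.0):
        print(f"score {score}: z = {z_score(score):+.2f}")
```

The top score of 89.8 sits about 1.32 standard deviations above the mean, while the lowest score of 20.0 sits more than 4.6 below it, which suggests the scores are skewed toward the low end rather than spread symmetrically.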