Price Per Token

MMLU Leaderboard

Massive Multitask Language Understanding (MMLU) is a benchmark that tests knowledge across 57 subjects.

Data from LayerLens

As of March 15, 2026, the top score on MMLU is 91.7%, held by GLM 5, which is listed twice at that score. R1 0528 follows at 90.5%. A total of 34 models have been evaluated on this benchmark.

Last updated: March 15, 2026

Models: 34
Best Score: 91.7
Average: 78.8
Std Dev: 17.6

Categories: General Knowledge

Input $/M    Output $/M    MMLU
$0.720       $2.300        91.7
$0.720       $2.300        91.7
$0.450       $2.150        90.5
$0.300       $0.500        89.2
$0.550       $2.200        88.9
$0.550       $2.200        88.3
$0.550       $2.200        88.3
$0.039       $0.100        87.6
$0.039       $0.100        87.6
$0.150       $0.400        85.9
$0.150       $0.400        85.9
$0.300       $2.500        85.7
$0.300       $2.500        85.7
$0.150       $0.600        85.5
$3.000       $15.000       85.3
$3.000       $15.000       85.3
$0.080       $0.280        85.3
$0.080       $0.280        85.3
$0.100       $0.400        84.8
$2.000       $8.000        84.6
$0.280       $0.900        84.5
$2.500       $10.000       84.1
$0.014       $0.028        83.4
$0.400       $2.000        82.6
$0.800       $3.200        78.3
$0.060       $0.140        77.6
$2.000       $6.000        77.2
$0.060       $0.180        76.0
$0.070       $0.280        73.5
$0.800       $4.000        72.8
$0.035       $0.140        68.9
$0.900       $0.900        37.1
$0.080       $0.300        24.6
$0.030       $0.050        15.3

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

108 out of our 483 tracked models have had a price change in March.

About MMLU

This leaderboard shows all models with MMLU benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
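
One way to make that comparison concrete is to collapse the two prices into a single blended figure and pick the cheapest row that clears a target score. The sketch below is only an illustration. The 3:1 input-to-output token mix, the `Row` type, and the `blended_price` and `cheapest_at_least` helpers are assumptions, not part of the leaderboard. The six rows are a sample of price and score values copied from the table.

```python
from typing import NamedTuple, Optional, Sequence


class Row(NamedTuple):
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    mmlu: float          # MMLU score, in percent


# A sample of (input price, output price, score) rows copied from the table.
ROWS = [
    Row(0.720, 2.300, 91.7),
    Row(0.450, 2.150, 90.5),
    Row(0.300, 0.500, 89.2),
    Row(0.039, 0.100, 87.6),
    Row(3.000, 15.000, 85.3),
    Row(0.014, 0.028, 83.4),
]


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted USD price per million tokens.

    input_share=0.75 encodes an assumed 3:1 input:output token mix.
    """
    return input_share * row.input_per_m + (1.0 - input_share) * row.output_per_m


def cheapest_at_least(rows: Sequence[Row], min_score: float) -> Optional[Row]:
    """Return the row with the lowest blended price among those scoring
    at least min_score, or None if no row qualifies."""
    eligible = [r for r in rows if r.mmlu >= min_score]
    return min(eligible, key=blended_price, default=None)


if __name__ == "__main__":
    for r in sorted(ROWS, key=blended_price):
        print(f"MMLU {r.mmlu:5.1f}  blended ${blended_price(r):.4f}/M")
    print("Cheapest with MMLU >= 85:", cheapest_at_least(ROWS, 85.0))
```

With this sample and an 85-point floor, the pick is the $0.039 / $0.100 row at 87.6. That row is cheapest on both input and output among the eligible rows, so the assumed token mix does not affect the result here. It can matter when one row is cheaper on input and another on output.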

Frequently Asked Questions

As of March 15, 2026, GLM 5 leads the MMLU leaderboard with a score of 91.7. Rankings change as new models are released and evaluated.
Currently 34 models have been evaluated on MMLU, with an average score of 78.8 and standard deviation of 17.6.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
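
The model count, best score, average, and standard deviation quoted on this page can be checked against the 34 scores in the table. The sketch below uses Python's standard `statistics` module. The `summarize` helper is hypothetical. Reading the published figure as a population standard deviation rather than a sample one is an assumption, made because that form comes out closer to 17.6 on these values.

```python
import statistics

# The 34 MMLU scores from the leaderboard table, highest to lowest.
SCORES = [
    91.7, 91.7, 90.5, 89.2, 88.9, 88.3, 88.3, 87.6, 87.6, 85.9,
    85.9, 85.7, 85.7, 85.5, 85.3, 85.3, 85.3, 85.3, 84.8, 84.6,
    84.5, 84.1, 83.4, 82.6, 78.3, 77.6, 77.2, 76.0, 73.5, 72.8,
    68.9, 37.1, 24.6, 15.3,
]


def summarize(scores):
    """Return count, best score, mean, and both standard-deviation forms."""
    return {
        "count": len(scores),
        "best": max(scores),
        "mean": statistics.fmean(scores),
        "pstdev": statistics.pstdev(scores),  # population form (divide by n)
        "stdev": statistics.stdev(scores),    # sample form (divide by n - 1)
    }


if __name__ == "__main__":
    for name, value in summarize(SCORES).items():
        print(f"{name:>6}: {value:.2f}")
```

The three lowest scores (37.1, 24.6, and 15.3) account for most of the spread. The remaining 31 scores all fall between 68.9 and 91.7.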