Price Per Token

MMMU Leaderboard

Multimodal Understanding benchmark testing vision-language models on expert-level tasks.

Data from LayerLens

As of April 18, 2026, the top-scoring model on MMMU is o4 Mini High at 79.2%, followed by GPT-5 at 79.1%. 64 models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 64
Best Score: 79.2
Average: 60.1
Std Dev: 15.5

Categories: Multimodal
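The summary statistics above can be derived directly from the column of MMMU scores. A minimal sketch, using a small illustrative sample of scores from the table rather than the full set of 64 entries, and assuming the site reports a population (rather than sample) standard deviation:

```python
# Sketch: deriving leaderboard summary statistics from MMMU scores.
# The score list is a small illustrative sample from the table, not
# the full 64-model set, so the printed values differ from the page's.
import statistics

scores = [79.2, 79.1, 78.1, 60.7, 42.0, 21.8]  # example scores from the table

print(f"Models: {len(scores)}")
print(f"Best Score: {max(scores):.1f}")
print(f"Average: {statistics.mean(scores):.1f}")
print(f"Std Dev: {statistics.pstdev(scores):.1f}")  # population std dev (assumption)
```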
| Provider | Model | Input $/M | Output $/M | MMMU |
|----------|-------|-----------|------------|------|
| — | — | $1.100 | $4.400 | 79.2 |
| — | — | $1.250 | $10.000 | 79.1 |
| — | — | $1.250 | $10.000 | 79.1 |
| — | — | $1.250 | $10.000 | 79.1 |
| — | — | $1.250 | $10.000 | 79.1 |
| — | — | $0.260 | $2.080 | 78.1 |
| — | — | $0.260 | $2.080 | 78.1 |
| — | — | $0.195 | $0.900 | 77.6 |
| — | — | $0.195 | $0.900 | 77.6 |
| — | — | $0.065 | $0.260 | 76.8 |
| — | — | $0.163 | $0.900 | 76.4 |
| — | — | $0.163 | $0.900 | 76.4 |
| — | — | $5.000 | $25.000 | 76.3 |
| — | — | $5.000 | $25.000 | 76.3 |
| — | — | $0.125 | $1.000 | 75.3 |
| — | — | $0.125 | $1.000 | 75.3 |
| — | — | $3.000 | $15.000 | 75.3 |
| — | — | $3.000 | $15.000 | 75.3 |
| — | — | $3.000 | $15.000 | 75.3 |
| — | — | $3.000 | $15.000 | 72.9 |
| — | — | $3.000 | $15.000 | 71.7 |
| — | — | $2.000 | $8.000 | 69.3 |
| — | — | $0.100 | $0.400 | 69.0 |
| — | — | $0.200 | $0.880 | 68.2 |
| — | — | $3.000 | $15.000 | 66.9 |
| — | — | $3.000 | $15.000 | 66.9 |
| — | — | $1.000 | $5.000 | 65.2 |
| — | — | $1.000 | $5.000 | 65.2 |
| — | — | $0.875 | $7.000 | 62.3 |
| — | — | $0.875 | $7.000 | 62.3 |
| — | — | $1.250 | $10.000 | 60.7 |
| — | — | $1.250 | $10.000 | 60.7 |
| — | — | $0.200 | $0.500 | 60.4 |
| — | — | $0.400 | $2.000 | 58.7 |
| — | — | $0.400 | $2.000 | 58.1 |
| — | — | $0.780 | $3.900 | 57.4 |
| — | — | $0.780 | $3.900 | 57.4 |
| — | — | $0.075 | $0.200 | 55.9 |
| — | — | $0.800 | $4.000 | 54.3 |
| — | — | $0.080 | $0.160 | 53.9 |
| — | — | $0.300 | $2.500 | 53.0 |
| — | — | $0.300 | $2.500 | 53.0 |
| — | — | $0.300 | $2.500 | 53.0 |
| — | — | $0.300 | $2.500 | 53.0 |
| — | — | $0.150 | $0.580 | 52.9 |
| — | — | $0.150 | $0.580 | 52.9 |
| — | — | $0.300 | $0.500 | 52.4 |
| — | — | $0.080 | $0.240 | 52.1 |
| — | — | $0.080 | $0.240 | 52.1 |
| — | — | $0.550 | $2.000 | 50.7 |
| — | — | $0.800 | $3.200 | 50.1 |
| — | — | $3.000 | $15.000 | 49.1 |
| — | — | $0.080 | $0.280 | 49.0 |
| — | — | $0.080 | $0.280 | 49.0 |
| — | — | $0.014 | $0.028 | 48.2 |
| — | — | $0.900 | $0.900 | 45.2 |
| — | — | $0.080 | $0.300 | 42.3 |
| — | — | $2.500 | $10.000 | 42.0 |
| — | — | $0.065 | $0.140 | 39.7 |
| — | — | $2.500 | $10.000 | 37.9 |
| — | — | $0.150 | $0.600 | 31.3 |
| — | — | $0.390 | $0.900 | 21.8 |
| — | — | $0.390 | $0.900 | 21.8 |
| — | — | $0.260 | $1.560 | 13.1 |

Pricing from OpenRouter. Benchmarks from Artificial Analysis.


About MMMU

Multimodal Understanding benchmark testing vision-language models on expert-level tasks.

This leaderboard shows all models with MMMU benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
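One way to combine the two columns into a single comparison is a score-per-dollar ratio over a blended token price. A minimal sketch, using example rows from the table above; the 3:1 input-to-output token ratio is an illustrative assumption, not a convention of this site:

```python
# Sketch: comparing MMMU score against price for rows from the table.
# Prices are in $ per million tokens. The 75/25 input/output token
# mix below is an illustrative assumption.

def blended_cost(input_per_m: float, output_per_m: float,
                 input_share: float = 0.75) -> float:
    """Cost per million tokens at the given input/output mix."""
    return input_per_m * input_share + output_per_m * (1 - input_share)

rows = [
    # (input $/M, output $/M, MMMU score) -- values from the table
    (1.100, 4.400, 79.2),
    (0.065, 0.260, 76.8),
    (3.000, 15.000, 75.3),
]

for inp, out, score in rows:
    cost = blended_cost(inp, out)
    print(f"${cost:.3f}/M blended -> {score / cost:.1f} MMMU points per $/M")
```

Under this mix, a cheap mid-scoring model can easily beat the top scorer on points per dollar, which is the trade-off the pricing columns are there to expose.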

Frequently Asked Questions

What is MMMU?
Multimodal Understanding benchmark testing vision-language models on expert-level tasks.

Which model has the highest MMMU score?
As of April 18, 2026, o4 Mini High leads the MMMU leaderboard with a score of 79.2. Rankings change as new models are released and evaluated.

How many models have been evaluated on MMMU?
Currently 64 models have been evaluated on MMMU, with an average score of 60.1 and a standard deviation of 15.5.

How often is the data updated?
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.