MMMU is a multimodal understanding benchmark that tests vision-language models on expert-level tasks.
Data from LayerLens
As of April 18, 2026, the top-scoring model on MMMU is o4 Mini High at 79.2%, followed by GPT-5 at 79.1%. A total of 64 models have been evaluated on this benchmark.
Last updated: April 18, 2026
Models: 64
Best Score: 79.2
Average: 60.1
Std Dev: 15.5
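The summary figures above can be recomputed from the MMMU column of the leaderboard table. The following is a minimal sketch, assuming the standard deviation is the population form (the page does not say which form it uses); the variable names are illustrative only.

```python
import statistics

# MMMU scores copied from the leaderboard table, highest to lowest.
scores = [
    79.2, 79.1, 79.1, 79.1, 79.1, 78.1, 78.1, 77.6,
    77.6, 76.8, 76.4, 76.4, 76.3, 76.3, 75.3, 75.3,
    75.3, 75.3, 75.3, 72.9, 71.7, 69.3, 69.0, 68.2,
    66.9, 66.9, 65.2, 65.2, 62.3, 62.3, 60.7, 60.7,
    60.4, 58.7, 58.1, 57.4, 57.4, 55.9, 54.3, 53.9,
    53.0, 53.0, 53.0, 53.0, 52.9, 52.9, 52.4, 52.1,
    52.1, 50.7, 50.1, 49.1, 49.0, 49.0, 48.2, 45.2,
    42.3, 42.0, 39.7, 37.9, 31.3, 21.8, 21.8, 13.1,
]

mean = statistics.fmean(scores)
# Assumption: population standard deviation (divide by N, not N - 1).
pstdev = statistics.pstdev(scores)

print(len(scores), max(scores), round(mean, 1), round(pstdev, 1))
```

Under that assumption the script prints `64 79.2 60.1 15.5`, which agrees with the summary. The sample form (`statistics.stdev`) gives a slightly larger value.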
| Provider | Model | Input $/M | Output $/M | MMMU |
|---|---|---|---|---|
| | | $1.100 | $4.400 | 79.2 |
| | | $1.250 | $10.000 | 79.1 |
| | | $1.250 | $10.000 | 79.1 |
| | | $1.250 | $10.000 | 79.1 |
| | | $1.250 | $10.000 | 79.1 |
| | | $0.260 | $2.080 | 78.1 |
| | | $0.260 | $2.080 | 78.1 |
| | | $0.195 | $0.900 | 77.6 |
| | | $0.195 | $0.900 | 77.6 |
| | | $0.065 | $0.260 | 76.8 |
| | | $0.163 | $0.900 | 76.4 |
| | | $0.163 | $0.900 | 76.4 |
| | | $5.000 | $25.000 | 76.3 |
| | | $5.000 | $25.000 | 76.3 |
| | | $0.125 | $1.000 | 75.3 |
| | | $0.125 | $1.000 | 75.3 |
| | | $3.000 | $15.000 | 75.3 |
| | | $3.000 | $15.000 | 75.3 |
| | | $3.000 | $15.000 | 75.3 |
| | | $3.000 | $15.000 | 72.9 |
| | | $3.000 | $15.000 | 71.7 |
| | | $2.000 | $8.000 | 69.3 |
| | | $0.100 | $0.400 | 69.0 |
| | | $0.200 | $0.880 | 68.2 |
| | | $3.000 | $15.000 | 66.9 |
| | | $3.000 | $15.000 | 66.9 |
| | | $1.000 | $5.000 | 65.2 |
| | | $1.000 | $5.000 | 65.2 |
| | | $0.875 | $7.000 | 62.3 |
| | | $0.875 | $7.000 | 62.3 |
| | | $1.250 | $10.000 | 60.7 |
| | | $1.250 | $10.000 | 60.7 |
| | | $0.200 | $0.500 | 60.4 |
| | | $0.400 | $2.000 | 58.7 |
| | | $0.400 | $2.000 | 58.1 |
| | | $0.780 | $3.900 | 57.4 |
| | | $0.780 | $3.900 | 57.4 |
| | | $0.075 | $0.200 | 55.9 |
| | | $0.800 | $4.000 | 54.3 |
| | | $0.080 | $0.160 | 53.9 |
| | | $0.300 | $2.500 | 53.0 |
| | | $0.300 | $2.500 | 53.0 |
| | | $0.300 | $2.500 | 53.0 |
| | | $0.300 | $2.500 | 53.0 |
| | | $0.150 | $0.580 | 52.9 |
| | | $0.150 | $0.580 | 52.9 |
| | | $0.300 | $0.500 | 52.4 |
| | | $0.080 | $0.240 | 52.1 |
| | | $0.080 | $0.240 | 52.1 |
| | | $0.550 | $2.000 | 50.7 |
| | | $0.800 | $3.200 | 50.1 |
| | | $3.000 | $15.000 | 49.1 |
| | | $0.080 | $0.280 | 49.0 |
| | | $0.080 | $0.280 | 49.0 |
| | | $0.014 | $0.028 | 48.2 |
| | | $0.900 | $0.900 | 45.2 |
| | | $0.080 | $0.300 | 42.3 |
| | | $2.500 | $10.000 | 42.0 |
| | | $0.065 | $0.140 | 39.7 |
| | | $2.500 | $10.000 | 37.9 |
| | | $0.150 | $0.600 | 31.3 |
| | | $0.390 | $0.900 | 21.8 |
| | | $0.390 | $0.900 | 21.8 |
| | | $0.260 | $1.560 | 13.1 |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
This leaderboard shows all models with MMMU benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
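One way to compare performance against cost is to keep only the entries that no cheaper entry beats on score (a Pareto frontier). The sketch below is illustrative rather than anything the leaderboard itself computes. It assumes `$/M` means US dollars per million tokens and blends input and output prices at a 3:1 ratio, which is an arbitrary choice. `Row`, `blended_price`, and `pareto_frontier` are hypothetical names, and the five rows are a sample taken from the table.

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_price: float   # assumed: $ per million input tokens
    output_price: float  # assumed: $ per million output tokens
    mmmu: float          # MMMU score


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted price; the 0.75 input share is an assumption, not a page value."""
    return input_share * row.input_price + (1 - input_share) * row.output_price


def pareto_frontier(rows: list[Row], input_share: float = 0.75) -> list[Row]:
    """Return rows that no cheaper-or-equal row matches or beats on score."""
    frontier: list[Row] = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on price ties, higher score first.
    ordered = sorted(rows, key=lambda r: (blended_price(r, input_share), -r.mmmu))
    for row in ordered:
        if row.mmmu > best_score:
            frontier.append(row)
            best_score = row.mmmu
    return frontier


# Sample rows copied from the leaderboard table.
rows = [
    Row(1.100, 4.400, 79.2),
    Row(1.250, 10.000, 79.1),
    Row(0.065, 0.260, 76.8),
    Row(5.000, 25.000, 76.3),
    Row(0.014, 0.028, 48.2),
]

frontier = pareto_frontier(rows)
for row in frontier:
    print(f"${blended_price(row):.4f} blended -> MMMU {row.mmmu}")
```

For this sample the frontier keeps the 48.2, 76.8, and 79.2 entries. The $1.250 and $5.000 rows drop out because a cheaper row scores higher. Changing `input_share` changes the blended prices and can change which rows survive.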
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.