Price Per Token

GAIA Leaderboard

GAIA (General AI Assistants) is a benchmark that tests multi-step, real-world tasks.

Data from LayerLens

As of April 18, 2026, GPT-5 Mini holds the top two entries on GAIA at 44.8% each, followed by Claude 3.7 Sonnet at 43.9%. Twelve models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 12
Best Score: 44.8
Average: 27.5
Std Dev: 13.6

Categories: Multi-turn

Input $/M    Output $/M    GAIA
$0.125       $1.000        44.8
$0.125       $1.000        44.8
$3.000       $15.000       43.9
$3.000       $15.000       43.9
$1.000       $10.000       33.3
$0.500       $2.150        27.9
$0.400       $2.000        23.3
$0.090       $0.400        20.6
$0.080       $0.240        12.3
$0.080       $0.240        12.3
$0.150       $0.750        11.5
$0.150       $0.750        11.5

Pricing from OpenRouter. Benchmarks from Artificial Analysis.
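
The summary figures (12 models, best score 44.8, average 27.5, standard deviation 13.6) can be checked against the GAIA column of the table. The following is a minimal sketch, assuming the twelve scores listed on this page are the full data set; the variable names are illustrative and not part of any Price Per Token tooling. It computes both the population and the sample standard deviation, since the page does not say which one it reports.

```python
import statistics

# GAIA scores copied from the leaderboard table, highest to lowest.
gaia_scores = [44.8, 44.8, 43.9, 43.9, 33.3, 27.9,
               23.3, 20.6, 12.3, 12.3, 11.5, 11.5]

model_count = len(gaia_scores)
best_score = max(gaia_scores)
average = statistics.mean(gaia_scores)

# Population form divides by n; sample form divides by n - 1.
population_sd = statistics.pstdev(gaia_scores)
sample_sd = statistics.stdev(gaia_scores)

print(f"Models:              {model_count}")
print(f"Best score:          {best_score:.1f}")
print(f"Average:             {average:.1f}")
print(f"Std dev (population): {population_sd:.1f}")
print(f"Std dev (sample):     {sample_sd:.1f}")
```

On these twelve values the population form rounds to the published 13.6, while the sample form rounds to 14.2, which suggests the page reports the population standard deviation.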


About GAIA


This leaderboard shows all models with GAIA benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
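
One way to compare performance against cost is to divide each GAIA score by a blended token price. The sketch below is only an illustration: the 75% input / 25% output token mix is an assumed workload, not a figure from Price Per Token or OpenRouter, and `blended_price` is a hypothetical helper. Rows are identified by their prices because the listing here carries no model names.

```python
# (input $/M, output $/M, GAIA score) rows copied from the leaderboard table.
rows = [
    (0.125, 1.000, 44.8), (0.125, 1.000, 44.8),
    (3.000, 15.000, 43.9), (3.000, 15.000, 43.9),
    (1.000, 10.000, 33.3), (0.500, 2.150, 27.9),
    (0.400, 2.000, 23.3), (0.090, 0.400, 20.6),
    (0.080, 0.240, 12.3), (0.080, 0.240, 12.3),
    (0.150, 0.750, 11.5), (0.150, 0.750, 11.5),
]


def blended_price(input_price, output_price, input_share=0.75):
    """Weighted $/M price for an ASSUMED mix of input and output tokens."""
    return input_share * input_price + (1 - input_share) * output_price


# Collapse rows with identical prices and score, keeping table order.
unique_rows = list(dict.fromkeys(rows))

# Rank by GAIA points per blended dollar, best value first.
ranked = sorted(
    unique_rows,
    key=lambda r: r[2] / blended_price(r[0], r[1]),
    reverse=True,
)

for input_price, output_price, score in ranked:
    price = blended_price(input_price, output_price)
    print(
        f"${input_price:.3f} in / ${output_price:.3f} out  "
        f"GAIA {score:4.1f}  {score / price:6.1f} points per blended $"
    )
```

Under this assumed mix, the $0.125 / $1.000 entry with a score of 44.8 ranks first on value as well as on raw score, and the $3.000 / $15.000 entry ranks last on value despite its 43.9 score. A different input share would shift the ordering, so the weighting should be set to match the workload being priced.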

Frequently Asked Questions

As of April 18, 2026, GPT-5 Mini leads the GAIA leaderboard with a score of 44.8. Rankings change as new models are released and evaluated.
Currently, 12 models have been evaluated on GAIA, with an average score of 27.5 and a standard deviation of 13.6.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.