AGIEval English: human-level reasoning tasks drawn from standardized exams such as the SAT, the LSAT, and civil service exams.
Data from LayerLens
As of April 18, 2026, the top-scoring model on AGIEval English is Gemini 3.1 Pro Preview at 94.0%, followed by two Gemini 3 Pro Preview entries tied at 93.2%. In total, 121 models have been evaluated on this benchmark.
Last updated: April 18, 2026
Models: 121
Best Score: 94.0
Average: 79.0
Std Dev: 11.9
| Provider | Model | Input $/M | Output $/M | AGIEval English |
|---|---|---|---|---|
| | Gemini 3.1 Pro Preview | $2.000 | $12.000 | 94.0 |
| | Gemini 3 Pro Preview | $2.000 | $12.000 | 93.2 |
| | Gemini 3 Pro Preview | $2.000 | $12.000 | 93.2 |
| | | $0.390 | $0.900 | 91.4 |
| | | $0.390 | $0.900 | 91.4 |
| | | $1.250 | $10.000 | 91.4 |
| | | $1.250 | $10.000 | 91.4 |
| | | $1.250 | $10.000 | 91.4 |
| | | $1.250 | $10.000 | 91.4 |
| | | $1.000 | $10.000 | 91.1 |
| | | $2.000 | $8.000 | 90.9 |
| | | $0.195 | $0.900 | 90.6 |
| | | $0.195 | $0.900 | 90.6 |
| | | $0.260 | $2.080 | 90.3 |
| | | $0.260 | $2.080 | 90.3 |
| | | $0.390 | $1.750 | 90.1 |
| | | $0.390 | $1.750 | 90.1 |
| | | $3.000 | $15.000 | 89.5 |
| | | $3.000 | $15.000 | 89.5 |
| | | $3.000 | $15.000 | 89.5 |
| | | $0.065 | $0.260 | 89.5 |
| | | $0.163 | $0.900 | 89.5 |
| | | $0.163 | $0.900 | 89.5 |
| | | $3.000 | $15.000 | 89.3 |
| | | $0.720 | $2.300 | 89.1 |
| | | $0.720 | $2.300 | 89.1 |
| | | $0.500 | $2.150 | 89.0 |
| | | $0.270 | $0.410 | 89.0 |
| | | $0.200 | $0.500 | 88.9 |
| | | $0.200 | $0.500 | 88.6 |
| | | $0.390 | $1.740 | 88.6 |
| | | $0.390 | $1.740 | 88.6 |
| | | $0.150 | $0.580 | 88.0 |
| | | $0.150 | $0.580 | 88.0 |
| | | $1.100 | $4.400 | 87.8 |
| | | $0.550 | $2.000 | 87.6 |
| | | $3.000 | $15.000 | 87.5 |
| | | $0.125 | $1.000 | 87.1 |
| | | $0.125 | $1.000 | 87.1 |
| | | $0.200 | $1.100 | 86.4 |
| | | $3.000 | $15.000 | 86.0 |
| | | $3.000 | $15.000 | 85.7 |
| | | $0.270 | $0.410 | 85.7 |
| | | $0.250 | $0.500 | 85.2 |
| | | $0.255 | $1.000 | 85.1 |
| | | $0.090 | $0.400 | 84.8 |
| | | $0.210 | $0.790 | 84.7 |
| | | $0.210 | $0.790 | 84.7 |
| | | $0.500 | $3.000 | 84.2 |
| | | $0.500 | $3.000 | 84.2 |
| | | $3.000 | $15.000 | 83.9 |
| | | $3.000 | $15.000 | 83.9 |
| | | $15.000 | $75.000 | 83.4 |
| | | $15.000 | $75.000 | 83.4 |
| | | $0.080 | $0.280 | 83.1 |
| | | $0.080 | $0.280 | 83.1 |
| | | $0.550 | $2.200 | 83.0 |
| | | $0.550 | $2.200 | 83.0 |
| | | $0.039 | $0.100 | 82.7 |
| | | $0.039 | $0.100 | 82.7 |
| | | $15.000 | $75.000 | 82.0 |
| | | $15.000 | $75.000 | 82.0 |
| | | $0.550 | $2.200 | 81.9 |
| | | $0.400 | $1.760 | 81.5 |
| | | $0.400 | $1.760 | 81.5 |
| | | $0.600 | $2.200 | 80.9 |
| | | $0.200 | $0.880 | 80.7 |
| | | $0.400 | $2.000 | 80.7 |
| | | $0.150 | $0.750 | 79.8 |
| | | $0.150 | $0.750 | 79.8 |
| | | $0.030 | $0.100 | 79.6 |
| | | $0.030 | $0.100 | 79.6 |
| | | $0.780 | $3.900 | 79.1 |
| | | $0.780 | $3.900 | 79.1 |
| | | $0.118 | $0.950 | 78.3 |
| | | $0.260 | $1.560 | 77.8 |
| | | $1.000 | $5.000 | 76.9 |
| | | $1.000 | $5.000 | 76.9 |
| | | $0.150 | $0.600 | 76.7 |
| | | $0.071 | $0.100 | 76.7 |
| | | $0.014 | $0.028 | 76.2 |
| | | $0.300 | $2.500 | 74.2 |
| | | $0.300 | $2.500 | 74.2 |
| | | $0.300 | $2.500 | 74.2 |
| | | $0.300 | $2.500 | 74.2 |
| | | $0.280 | $0.900 | 74.1 |
| | | $3.000 | $15.000 | 74.0 |
| | | $0.500 | $1.500 | 74.0 |
| | | $0.100 | $0.400 | 73.4 |
| | | $0.400 | $2.000 | 71.9 |
| | | $0.875 | $7.000 | 71.7 |
| | | $0.875 | $7.000 | 71.7 |
| | | $3.000 | $15.000 | 71.3 |
| | | $3.000 | $15.000 | 71.2 |
| | | $3.000 | $15.000 | 71.2 |
| | | $0.400 | $2.000 | 70.3 |
| | | $2.500 | $10.000 | 70.1 |
| | | $2.000 | $8.000 | 70.0 |
| | | $1.250 | $10.000 | 69.4 |
| | | $1.250 | $10.000 | 69.4 |
| | | $0.065 | $0.140 | 68.2 |
| | | $0.200 | $0.500 | 67.4 |
| | | $0.200 | $0.500 | 67.0 |
| | | $0.130 | $0.850 | 66.4 |
| | | $0.800 | $4.000 | 66.2 |
| | | $0.060 | $0.240 | 65.8 |
| | | $0.800 | $3.200 | 65.5 |
| | | $0.080 | $0.160 | 65.1 |
| | | $2.000 | $6.000 | 64.7 |
| | | $2.000 | $6.000 | 64.5 |
| | | $0.900 | $0.900 | 63.7 |
| | | $0.300 | $0.300 | 62.5 |
| | | $0.075 | $0.200 | 62.3 |
| | | $2.500 | $10.000 | 60.4 |
| | | $0.070 | $0.280 | 60.0 |
| | | $0.100 | $0.400 | 58.0 |
| | | $0.100 | $0.400 | 58.0 |
| | | $0.035 | $0.140 | 55.6 |
| | | $0.060 | $0.120 | 53.1 |
| | | $0.080 | $0.300 | 27.4 |
| | | $0.030 | $0.050 | 26.5 |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
This leaderboard shows all models with AGIEval English benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
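One way to compare performance against cost is to keep only the rows that no cheaper row matches or beats on score. The Python sketch below is illustrative rather than how this page ranks anything: the `Row` type, the `blended_price` helper, and its 75/25 input-to-output token mix are assumptions made for the example. The six sample rows are price and score triples copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    score: float         # AGIEval English score, in percent


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Single price figure; the 75/25 token mix is an assumed workload."""
    return input_share * row.input_per_m + (1 - input_share) * row.output_per_m


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Keep rows whose score beats every cheaper (or equally priced) row."""
    # Cheapest first; at equal price, the higher score comes first.
    ordered = sorted(rows, key=lambda r: (blended_price(r), -r.score))
    frontier: list[Row] = []
    best_score = float("-inf")
    for row in ordered:
        if row.score > best_score:
            frontier.append(row)
            best_score = row.score
    return frontier


# Sample rows copied from the leaderboard table (prices in USD per million tokens).
rows = [
    Row(2.000, 12.000, 94.0),
    Row(0.390, 0.900, 91.4),
    Row(1.250, 10.000, 91.4),
    Row(3.000, 15.000, 89.5),
    Row(0.065, 0.260, 89.5),
    Row(0.030, 0.050, 26.5),
]

for row in pareto_frontier(rows):
    print(f"${blended_price(row):.3f}/M blended -> {row.score:.1f}")
```

Changing `input_share` shifts the ordering toward input-heavy or output-heavy workloads, so the frontier should be recomputed for the token mix you actually expect.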
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.