Price Per Token

AGIEval English Leaderboard

AGIEval English — human-level reasoning tasks from standardized exams like SAT, LSAT, and civil service exams.

Data from LayerLens

As of April 18, 2026, the top-scoring model on AGIEval English is Gemini 3.1 Pro Preview at 94.0%, followed by two Gemini 3 Pro Preview entries tied at 93.2%. A total of 121 models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 121
Best Score: 94.0
Average: 79.0
Std Dev: 11.9

Categories: Reasoning and Logic
Input $/M    Output $/M    AGIEval English
$2.000       $12.000       94.0
$2.000       $12.000       93.2
$2.000       $12.000       93.2
$0.390       $0.900        91.4
$0.390       $0.900        91.4
$1.250       $10.000       91.4
$1.250       $10.000       91.4
$1.250       $10.000       91.4
$1.250       $10.000       91.4
$1.000       $10.000       91.1
$2.000       $8.000        90.9
$0.195       $0.900        90.6
$0.195       $0.900        90.6
$0.260       $2.080        90.3
$0.260       $2.080        90.3
$0.390       $1.750        90.1
$0.390       $1.750        90.1
$3.000       $15.000       89.5
$3.000       $15.000       89.5
$3.000       $15.000       89.5
$0.065       $0.260        89.5
$0.163       $0.900        89.5
$0.163       $0.900        89.5
$3.000       $15.000       89.3
$0.720       $2.300        89.1
$0.720       $2.300        89.1
$0.500       $2.150        89.0
$0.270       $0.410        89.0
$0.200       $0.500        88.9
$0.200       $0.500        88.6
$0.390       $1.740        88.6
$0.390       $1.740        88.6
$0.150       $0.580        88.0
$0.150       $0.580        88.0
$1.100       $4.400        87.8
$0.550       $2.000        87.6
$3.000       $15.000       87.5
$0.125       $1.000        87.1
$0.125       $1.000        87.1
$0.200       $1.100        86.4
$3.000       $15.000       86.0
$3.000       $15.000       85.7
$0.270       $0.410        85.7
$0.250       $0.500        85.2
$0.255       $1.000        85.1
$0.090       $0.400        84.8
$0.210       $0.790        84.7
$0.210       $0.790        84.7
$0.500       $3.000        84.2
$0.500       $3.000        84.2
$3.000       $15.000       83.9
$3.000       $15.000       83.9
$15.000      $75.000       83.4
$15.000      $75.000       83.4
$0.080       $0.280        83.1
$0.080       $0.280        83.1
$0.550       $2.200        83.0
$0.550       $2.200        83.0
$0.039       $0.100        82.7
$0.039       $0.100        82.7
$15.000      $75.000       82.0
$15.000      $75.000       82.0
$0.550       $2.200        81.9
$0.400       $1.760        81.5
$0.400       $1.760        81.5
$0.600       $2.200        80.9
$0.200       $0.880        80.7
$0.400       $2.000        80.7
$0.150       $0.750        79.8
$0.150       $0.750        79.8
$0.030       $0.100        79.6
$0.030       $0.100        79.6
$0.780       $3.900        79.1
$0.780       $3.900        79.1
$0.118       $0.950        78.3
$0.260       $1.560        77.8
$1.000       $5.000        76.9
$1.000       $5.000        76.9
$0.150       $0.600        76.7
$0.071       $0.100        76.7
$0.014       $0.028        76.2
$0.300       $2.500        74.2
$0.300       $2.500        74.2
$0.300       $2.500        74.2
$0.300       $2.500        74.2
$0.280       $0.900        74.1
$3.000       $15.000       74.0
$0.500       $1.500        74.0
$0.100       $0.400        73.4
$0.400       $2.000        71.9
$0.875       $7.000        71.7
$0.875       $7.000        71.7
$3.000       $15.000       71.3
$3.000       $15.000       71.2
$3.000       $15.000       71.2
$0.400       $2.000        70.3
$2.500       $10.000       70.1
$2.000       $8.000        70.0
$1.250       $10.000       69.4
$1.250       $10.000       69.4
$0.065       $0.140        68.2
$0.200       $0.500        67.4
$0.200       $0.500        67.0
$0.130       $0.850        66.4
$0.800       $4.000        66.2
$0.060       $0.240        65.8
$0.800       $3.200        65.5
$0.080       $0.160        65.1
$2.000       $6.000        64.7
$2.000       $6.000        64.5
$0.900       $0.900        63.7
$0.300       $0.300        62.5
$0.075       $0.200        62.3
$2.500       $10.000       60.4
$0.070       $0.280        60.0
$0.100       $0.400        58.0
$0.100       $0.400        58.0
$0.035       $0.140        55.6
$0.060       $0.120        53.1
$0.080       $0.300        27.4
$0.030       $0.050        26.5

Pricing from OpenRouter. Benchmarks from Artificial Analysis.


About AGIEval English

AGIEval English — human-level reasoning tasks from standardized exams like SAT, LSAT, and civil service exams.

This leaderboard shows all models with AGIEval English benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
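One way to compare performance against cost is to collapse each row's input and output prices into a single blended price, then keep only the rows that no cheaper row matches or beats on score. The Python sketch below is illustrative and not part of the site: the `Row` class, the `blended_price` and `pareto_front` helpers, and the 75% input-token share are all assumptions chosen for the example. The four sample rows are copied from the price and score columns of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    score: float         # AGIEval English score


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """USD per million tokens for an assumed input/output token mix."""
    return input_share * row.input_per_m + (1.0 - input_share) * row.output_per_m


def pareto_front(rows: list[Row], input_share: float = 0.75) -> list[Row]:
    """Keep rows that score strictly higher than every cheaper (or equally priced) row."""
    # Cheapest first; among equal prices, the higher score comes first.
    ordered = sorted(rows, key=lambda r: (blended_price(r, input_share), -r.score))
    front: list[Row] = []
    best_score = float("-inf")
    for row in ordered:
        if row.score > best_score:
            front.append(row)
            best_score = row.score
    return front


# Four rows taken from the leaderboard table (provider and model names omitted).
ROWS = [
    Row(2.000, 12.000, 94.0),
    Row(0.390, 0.900, 91.4),
    Row(0.065, 0.260, 89.5),
    Row(15.000, 75.000, 83.4),
]

for row in pareto_front(ROWS):
    print(f"{row.score:5.1f}  ${blended_price(row):.3f} per million blended tokens")
```

With this sample, the $15.000 / $75.000 row drops out because cheaper rows score higher. A different `input_share` shifts the blended prices, so the mix should reflect the actual workload.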

Frequently Asked Questions

As of April 18, 2026, Gemini 3.1 Pro Preview leads the AGIEval English leaderboard with a score of 94.0. Rankings change as new models are released and evaluated.
Currently, 121 models have been evaluated on AGIEval English, with an average score of 79.0 and a standard deviation of 11.9.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
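Summary figures such as the average and standard deviation quoted above can be recomputed from the score column. The sketch below uses only the first four scores in the table as a small sample, so its results will not match the full-leaderboard figures. The page does not say whether its standard deviation is the population or the sample form, so both are shown. The variable names are illustrative.

```python
import statistics

# First four scores from the leaderboard table, used only as a small sample.
sample_scores = [94.0, 93.2, 93.2, 91.4]

mean_score = statistics.fmean(sample_scores)
population_sd = statistics.pstdev(sample_scores)  # divides by n
sample_sd = statistics.stdev(sample_scores)       # divides by n - 1

print(f"mean={mean_score:.2f}  population sd={population_sd:.2f}  sample sd={sample_sd:.2f}")
```

Running the same calls over all 121 scores would show which of the two conventions reproduces the published 11.9.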