Price Per Token

GPQA Leaderboard

Graduate-level multiple-choice questions written by domain experts in biology, physics, and chemistry. Questions are Google-proof and extremely difficult.

Data from Artificial Analysis

As of May 20, 2026, the top-scoring model on GPQA is Gemini 3.1 Pro Preview at 94.1%, followed by GPT-5.4 at 92.0% and GPT-5.3 Codex at 91.5%. 277 models have been evaluated on this benchmark.

Last updated: May 20, 2026

Models: 277
Best Score: 94.1
Average: 66.6
Std Dev: 17.0

Categories: Reasoning and Logic

Input $/M | Output $/M | GPQA
$2.000 | $12.000 | 94.1
$2.500 | $15.000 | 92.0
$1.750 | $14.000 | 91.5
$5.000 | $25.000 | 91.4
$0.730 | $3.400 | 91.1
$2.000 | $12.000 | 90.8
$10.500 | $84.000 | 90.3
$1.750 | $14.000 | 89.9
$0.500 | $3.000 | 89.8
$5.000 | $25.000 | 89.6
$0.112 | $0.224 | 89.4
$0.390 | $0.900 | 89.3
$0.435 | $0.870 | 88.8
$2.000 | $12.000 | 88.7
$5.000 | $25.000 | 88.5
$0.400 | $1.900 | 87.9
$3.000 | $15.000 | 87.7
$3.000 | $15.000 | 87.5
$0.279 | $1.200 | 87.4
$0.625 | $5.000 | 87.3
$0.287 | $0.431 | 87.1
$2.500 | $15.000 | 87.1
$0.980 | $3.080 | 86.8
$5.000 | $25.000 | 86.6
$1.000 | $3.000 | 86.6
$0.875 | $7.000 | 86.4
$0.780 | $3.900 | 86.1
$0.390 | $0.900 | 86.1
$1.250 | $10.000 | 86.0
$0.400 | $1.750 | 85.9
$0.195 | $0.900 | 85.8
$0.260 | $0.900 | 85.7
$0.120 | $0.370 | 85.7
$1.250 | $10.000 | 85.4
$0.200 | $0.500 | 85.3
$0.150 | $1.150 | 84.8
$0.200 | $0.500 | 84.7
$1.200 | $4.000 | 84.7
$0.100 | $0.300 | 84.6
$20.000 | $80.000 | 84.5
$0.140 | $0.900 | 84.5
$1.000 | $10.000 | 84.4
$1.250 | $10.000 | 84.2
$0.195 | $0.900 | 84.2
$0.150 | $0.900 | 84.1
$0.252 | $0.378 | 84.0
$5.000 | $25.000 | 84.0
$0.980 | $3.080 | 83.9
$0.550 | $2.200 | 83.8
$1.250 | $10.000 | 83.7
$1.250 | $10.000 | 83.6
$0.100 | $0.300 | 83.5
$3.000 | $15.000 | 83.4
$0.100 | $0.300 | 83.1
$0.290 | $0.950 | 83.0
$0.250 | $2.000 | 82.8
$0.400 | $2.000 | 82.8
$2.000 | $8.000 | 82.7
$0.260 | $0.900 | 82.7
$0.100 | $0.300 | 82.6
$1.250 | $10.000 | 82.2
$0.250 | $1.500 | 82.2
$0.600 | $1.920 | 82.0
$0.140 | $0.900 | 81.9
$0.150 | $0.900 | 81.7
$0.500 | $2.150 | 81.3
$0.250 | $2.000 | 81.3
$0.500 | $3.000 | 81.2
$0.280 | $0.900 | 81.1
$5.000 | $25.000 | 81.0
$15.000 | $75.000 | 80.9
$1.200 | $4.000 | 80.9
$1.250 | $10.000 | 80.8
$0.040 | $0.150 | 80.6
$0.250 | $2.000 | 80.3
$3.000 | $15.000 | 79.9
$0.270 | $0.410 | 79.7
$3.000 | $15.000 | 79.7
$15.000 | $75.000 | 79.6
$0.270 | $0.950 | 79.2
$0.060 | $0.300 | 79.2
$0.250 | $0.500 | 79.1
$0.300 | $2.500 | 79.0
$0.149 | $0.900 | 79.0
$0.400 | $1.900 | 78.9
$0.730 | $3.400 | 78.8
$0.040 | $0.150 | 78.6
$1.100 | $4.400 | 78.4
$0.600 | $2.200 | 78.2
$0.039 | $0.180 | 78.2
$0.390 | $1.740 | 78.0
$0.210 | $0.790 | 77.9
$3.000 | $15.000 | 77.7
$0.255 | $1.000 | 77.7
$0.780 | $3.900 | 77.6
$1.100 | $4.400 | 77.3
$3.000 | $15.000 | 77.2
$0.260 | $0.900 | 77.2
$0.250 | $0.750 | 77.0
$0.900 | $0.900 | 76.8
$0.400 | $2.000 | 76.7
$0.550 | $2.200 | 76.6
$0.780 | $3.900 | 76.4
$0.780 | $3.900 | 76.4
$0.207 | $0.828 | 76.4
$1.000 | $3.000 | 76.2
$0.200 | $1.100 | 76.1
$0.098 | $0.300 | 75.9
$0.050 | $0.200 | 75.7
$0.071 | $0.100 | 75.3
$0.270 | $0.950 | 75.1
$0.252 | $0.378 | 75.1
$1.100 | $4.400 | 74.8
$0.100 | $0.400 | 74.8
$2.500 | $15.000 | 74.8
$15.000 | $60.000 | 74.7
$0.090 | $0.780 | 73.8
$0.270 | $0.410 | 73.8
$0.110 | $0.800 | 73.7
$0.210 | $0.790 | 73.5
$0.130 | $0.850 | 73.3
$0.104 | $0.416 | 73.3
$1.000 | $3.000 | 72.7
$0.200 | $1.500 | 72.7
$3.000 | $15.000 | 72.7
$0.130 | $0.900 | 72.0
$0.300 | $0.900 | 71.9
$0.435 | $0.870 | 71.7
$0.112 | $0.224 | 71.6
$0.060 | $0.300 | 71.4
$0.200 | $0.880 | 71.2
$0.875 | $7.000 | 71.2
$0.100 | $0.400 | 70.9
$0.550 | $2.000 | 70.8
$0.080 | $0.300 | 70.7
$15.000 | $75.000 | 70.1
$0.455 | $0.900 | 70.0
$0.130 | $0.400 | 69.9
$0.300 | $2.500 | 69.8
$0.400 | $2.200 | 69.7
$0.130 | $0.520 | 69.5
$3.000 | $15.000 | 69.3
$0.030 | $0.140 | 68.8
$1.250 | $10.000 | 68.6
$0.600 | $1.800 | 68.4
$3.000 | $15.000 | 68.3
$0.300 | $2.500 | 68.3
$0.400 | $2.200 | 68.2
$0.050 | $0.400 | 67.6
$1.250 | $10.000 | 67.3
$0.039 | $0.180 | 67.2
$1.000 | $5.000 | 67.2
$0.150 | $0.600 | 67.1
$0.104 | $0.416 | 67.1
$0.050 | $0.400 | 67.0
$0.080 | $0.280 | 66.8
$0.200 | $0.200 | 66.7
$2.000 | $8.000 | 66.6
$0.600 | $1.920 | 66.6
$0.200 | $0.800 | 66.4
$0.400 | $1.750 | 66.4
$0.090 | $0.300 | 65.9
$3.000 | $15.000 | 65.6
$0.100 | $0.300 | 65.6
$0.200 | $0.770 | 65.5
$0.100 | $0.400 | 65.1
$1.000 | $5.000 | 64.6
$0.100 | $0.400 | 64.3
$0.625 | $5.000 | 64.3
$0.200 | $0.500 | 63.7
$0.390 | $1.740 | 63.2
$0.100 | $0.400 | 62.5
$0.100 | $0.400 | 62.3
$0.220 | $0.900 | 61.8
$0.080 | $0.280 | 61.6
$0.290 | $0.290 | 61.5
$0.455 | $0.900 | 61.3
$0.030 | $0.140 | 61.1
$0.150 | $0.500 | 61.0
$0.200 | $0.500 | 60.6
$0.060 | $0.200 | 60.4
$0.300 | $2.500 | 60.3
$3.000 | $15.000 | 59.9
$0.300 | $2.500 | 59.4
$0.400 | $0.900 | 59.4
$0.900 | $0.900 | 59.3
$0.010 | $0.030 | 59.3
$0.150 | $0.500 | 59.1
$0.050 | $0.200 | 58.9
$0.400 | $2.000 | 58.8
$1.040 | $4.160 | 58.7
$0.080 | $0.300 | 58.7
$0.060 | $0.400 | 58.1
$0.117 | $1.365 | 57.9
$3.000 | $15.000 | 57.8
$0.400 | $2.000 | 57.8
$0.065 | $0.140 | 57.5
$0.600 | $1.800 | 57.3
$0.200 | $0.200 | 57.2
$0.200 | $0.200 | 57.2
$0.040 | $0.160 | 57.0
$2.500 | $12.500 | 56.9
$0.300 | $0.900 | 56.6
$0.900 | $0.900 | 55.7
$0.200 | $0.770 | 55.7
$0.040 | $0.160 | 55.7
$0.075 | $0.300 | 54.2
$0.200 | $0.600 | 53.9
$1.000 | $3.000 | 53.6
$0.075 | $0.300 | 53.5
$0.080 | $0.280 | 53.5
$2.500 | $10.000 | 52.7
$0.200 | $0.200 | 52.2
$0.200 | $0.200 | 51.7
$0.100 | $0.400 | 51.7
$0.070 | $0.270 | 51.6
$0.120 | $0.200 | 51.6
$0.900 | $0.900 | 51.5
$0.080 | $0.280 | 51.5
$0.050 | $0.200 | 51.2
$2.000 | $6.000 | 50.5
$0.075 | $0.200 | 50.5
$0.800 | $3.200 | 49.9
$0.100 | $0.320 | 49.8
$0.400 | $2.000 | 49.2
$0.360 | $0.400 | 49.1
$0.130 | $0.400 | 49.1
$2.000 | $6.000 | 48.6
$0.100 | $0.400 | 48.1
$0.100 | $0.400 | 47.4
$2.000 | $6.000 | 47.2
$1.000 | $1.000 | 47.1
$0.150 | $0.150 | 47.1
$0.060 | $0.200 | 47.0
$0.900 | $0.900 | 46.6
$0.900 | $0.900 | 46.5
$0.050 | $0.080 | 46.2
$0.100 | $0.300 | 45.4
$0.050 | $0.200 | 45.2
$0.060 | $0.400 | 45.2
$0.200 | $0.200 | 43.9
$0.060 | $0.240 | 43.3
$0.080 | $0.160 | 42.8
$0.050 | $0.400 | 42.8
$0.080 | $0.200 | 42.7
$0.150 | $0.600 | 42.6
$0.200 | $0.200 | 42.5
$0.200 | $0.600 | 42.4
$0.660 | $0.800 | 41.7
$0.070 | $0.280 | 41.4
$0.033 | $0.130 | 41.0
$0.340 | $0.390 | 40.9
$0.800 | $4.000 | 40.8
$0.700 | $0.800 | 40.2
$0.300 | $0.300 | 40.1
$0.100 | $0.200 | 40.0
$0.050 | $0.200 | 39.9
$0.200 | $0.200 | 39.8
$2.000 | $8.000 | 39.0
$0.510 | $0.740 | 37.9
$0.250 | $1.250 | 37.4
$0.035 | $0.140 | 35.8
$0.500 | $1.500 | 35.1
$0.040 | $0.130 | 34.9
$0.010 | $0.020 | 34.4
$0.200 | $0.200 | 33.9
$1.200 | $1.200 | 33.2
$0.050 | $0.200 | 32.8
$0.500 | $1.000 | 29.7
$0.040 | $0.040 | 29.6
$0.060 | $0.120 | 29.6
$0.140 | $0.420 | 29.2
$0.040 | $0.080 | 29.1
$0.020 | $0.050 | 25.9
$0.030 | $0.050 | 25.5
$0.060 | $0.060 | 22.1
$0.020 | $0.020 | 19.6

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

About GPQA

Graduate-level multiple-choice questions written by domain experts in biology, physics, and chemistry. Questions are Google-proof and extremely difficult.

This leaderboard shows all models with GPQA benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
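
One way to compare performance against cost is to divide each score by a blended price. The sketch below does this for the first three rows of the table. The 3:1 input-to-output token mix, the "points per dollar" metric, and the helper names are assumptions made for illustration; the leaderboard itself does not define them.

```python
# Illustrative only: each row is (input $/M, output $/M, GPQA score),
# copied from the first three rows of the leaderboard table.
ROWS = [
    (2.00, 12.00, 94.1),
    (2.50, 15.00, 92.0),
    (1.75, 14.00, 91.5),
]


def blended_price(input_per_m, output_per_m, input_share=0.75):
    """Weighted price per million tokens.

    The 75% input share is an assumed workload mix, not a site setting.
    """
    return input_share * input_per_m + (1.0 - input_share) * output_per_m


def points_per_dollar(row):
    """GPQA points per blended dollar per million tokens (hypothetical metric)."""
    input_per_m, output_per_m, score = row
    return score / blended_price(input_per_m, output_per_m)


# Sort from most to least score per blended dollar.
ranked = sorted(ROWS, key=points_per_dollar, reverse=True)

for row in ranked:
    print(row, round(points_per_dollar(row), 2))
```

Changing `input_share` to match a real workload can reorder the result, so a ranking like this is best read as a rough guide rather than a verdict.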

Frequently Asked Questions

As of May 20, 2026, Gemini 3.1 Pro Preview leads the GPQA leaderboard with a score of 94.1. Rankings change as new models are released and evaluated.
Currently 277 models have been evaluated on GPQA, with an average score of 66.6 and standard deviation of 17.0.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
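
The average and standard deviation quoted above can be used to place a single score relative to the field. A minimal sketch, taking the published figures (66.6 and 17.0) at face value; `z_score` is a hypothetical helper, and the page does not say whether it reports a population or a sample standard deviation, so the use of `statistics.pstdev` below is an assumption.

```python
import statistics

MEAN_GPQA = 66.6  # average reported by the leaderboard
STD_GPQA = 17.0   # standard deviation reported by the leaderboard


def z_score(score, mean=MEAN_GPQA, std=STD_GPQA):
    """Number of standard deviations a score sits above (or below) the mean."""
    return (score - mean) / std


# Top score on the leaderboard relative to the published summary statistics.
top_z = z_score(94.1)
print(round(top_z, 2))

# The same summary statistics computed from raw scores; only the top three
# scores are used here, so these values differ from the site-wide figures.
sample = [94.1, 92.0, 91.5]
sample_mean = statistics.mean(sample)
sample_std = statistics.pstdev(sample)  # population std dev (assumed convention)
print(round(sample_mean, 2), round(sample_std, 2))
```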