Price Per Token

SimpleQA Leaderboard

Simple question answering benchmark testing factual accuracy and knowledge retrieval.

Data from LayerLens

As of April 18, 2026, the top-scoring model on SimpleQA is Gemini 2.5 Pro at 53.0%, followed by Qwen3 235B A22B Instruct 2507 at 50.6% and Qwen3 VL 235B A22B Instruct at 46.7%. 45 models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 45
Best Score: 53.0
Average: 20.8
Std Dev: 14.0

Categories: Reasoning and Logic, General Knowledge
Input $/M    Output $/M    SimpleQA
$1.000       $10.000       53.0
$0.071       $0.100        50.6
$0.200       $0.880        46.7
$2.000       $8.000        40.4
$0.090       $0.780        40.1
$3.000       $15.000       38.3
$0.260       $0.900        37.9
$3.000       $15.000       37.4
$3.000       $15.000       37.4
$0.280       $0.900        36.9
$3.000       $15.000       32.8
$3.000       $15.000       32.8
$0.550       $2.000        29.1
$0.550       $2.200        26.5
$0.550       $2.200        26.5
$0.500       $2.150        25.1
$0.150       $0.750        23.3
$0.150       $0.750        23.3
$0.014       $0.028        23.0
$0.150       $0.600        22.1
$0.400       $2.000        20.5
$0.400       $2.000        19.7
$0.300       $0.500        18.4
$0.300       $0.300        17.1
$2.000       $6.000        16.7
$2.500       $10.000       15.8
$0.550       $2.200        14.0
$0.455       $0.900        12.7
$0.455       $0.900        12.7
$0.800       $3.200        12.6
$0.090       $0.400        11.9
$0.075       $0.200        9.7
$0.080       $0.160        8.5
$0.800       $4.000        8.0
$0.080       $0.300        7.3
$0.070       $0.280        6.7
$0.060       $0.240        6.6
$0.080       $0.280        5.6
$0.080       $0.280        5.6
$0.080       $0.240        5.5
$0.080       $0.240        5.5
$0.035       $0.140        4.7
$0.060       $0.120        4.0
$0.065       $0.140        2.3
$0.030       $0.050        0.5

Pricing from OpenRouter. Benchmarks from Artificial Analysis.


About SimpleQA

SimpleQA is a simple question-answering benchmark that tests factual accuracy and knowledge retrieval.

This leaderboard shows all models with SimpleQA benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
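One way to compare performance against cost is to divide each score by a blended token price. The sketch below is illustrative only: the `Row` type, the `blended_price` helper, and the 3:1 input-to-output weighting are assumptions made for this example, not something the leaderboard defines. The three rows are the top three entries of the table, identified here by score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    score: float         # SimpleQA score, in percent


def blended_price(row: Row, input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens.

    The default 3:1 input:output mix is an assumption for illustration;
    adjust the weights to match your own traffic.
    """
    total = input_weight + output_weight
    return (input_weight * row.input_price + output_weight * row.output_price) / total


def score_per_dollar(row: Row) -> float:
    """SimpleQA points per blended dollar (higher means cheaper accuracy)."""
    return row.score / blended_price(row)


# Top three rows of the leaderboard table: input $/M, output $/M, score.
ROWS = [
    Row(1.000, 10.000, 53.0),
    Row(0.071, 0.100, 50.6),
    Row(0.200, 0.880, 46.7),
]

ranked = sorted(ROWS, key=score_per_dollar, reverse=True)

for row in ranked:
    print(f"score={row.score:5.1f}  blended=${blended_price(row):.4f}/M  "
          f"points per $={score_per_dollar(row):.1f}")
```

Under this weighting the 50.6-point entry ranks first on value, because its price is far lower than that of the 53.0-point leader. A different weighting or a minimum-score cutoff can change the ordering, so treat the ratio as a screening aid rather than a verdict.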

Frequently Asked Questions

As of April 18, 2026, Gemini 2.5 Pro leads the SimpleQA leaderboard with a score of 53.0%. Rankings change as new models are released and evaluated.
Currently, 45 models have been evaluated on SimpleQA, with an average score of 20.8 and a standard deviation of 14.0.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
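
The average and standard deviation quoted above can be checked against the 45 scores in the table. The following is a minimal sketch using Python's standard `statistics` module. It assumes the population form of the standard deviation (`pstdev`), which is a guess about how the figure was computed; the source does not say.

```python
import statistics

# SimpleQA scores copied from the leaderboard table, highest to lowest.
SCORES = [
    53.0, 50.6, 46.7, 40.4, 40.1, 38.3, 37.9, 37.4, 37.4, 36.9,
    32.8, 32.8, 29.1, 26.5, 26.5, 25.1, 23.3, 23.3, 23.0, 22.1,
    20.5, 19.7, 18.4, 17.1, 16.7, 15.8, 14.0, 12.7, 12.7, 12.6,
    11.9, 9.7, 8.5, 8.0, 7.3, 6.7, 6.6, 5.6, 5.6, 5.5,
    5.5, 4.7, 4.0, 2.3, 0.5,
]

count = len(SCORES)
mean = statistics.fmean(SCORES)
# Population standard deviation (divides by N); an assumption about
# the page's method. statistics.stdev would divide by N - 1 instead.
population_sd = statistics.pstdev(SCORES)

print(f"models={count}  best={max(SCORES):.1f}  "
      f"mean={mean:.1f}  std dev={population_sd:.1f}")
```

The sample form (`statistics.stdev`) gives a slightly larger value on the same data, so the choice of divisor matters when comparing against the published figure.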