SimpleQA is a simple question-answering benchmark that tests factual accuracy and knowledge retrieval.
Data from LayerLens
As of April 18, 2026, the top-scoring model on SimpleQA is Gemini 2.5 Pro at 53.0%, followed by Qwen3 235B A22B Instruct 2507 at 50.6% and Qwen3 VL 235B A22B Instruct at 46.7%. 45 models have been evaluated on this benchmark.
Last updated: April 18, 2026
Models: 45
Best Score: 53.0
Average: 20.8
Std Dev: 14.0
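The summary figures above can be recomputed from the SimpleQA column of the leaderboard table. The following is a minimal sketch, not the site's own code: the `scores` list is transcribed by hand from the table, and the choice of population standard deviation (`pstdev`) is an assumption, since the page does not say which form it uses.

```python
from statistics import mean, pstdev

# SimpleQA scores transcribed from the leaderboard table, highest to lowest.
scores = [
    53.0, 50.6, 46.7, 40.4, 40.1, 38.3, 37.9, 37.4, 37.4, 36.9,
    32.8, 32.8, 29.1, 26.5, 26.5, 25.1, 23.3, 23.3, 23.0, 22.1,
    20.5, 19.7, 18.4, 17.1, 16.7, 15.8, 14.0, 12.7, 12.7, 12.6,
    11.9, 9.7, 8.5, 8.0, 7.3, 6.7, 6.6, 5.6, 5.6, 5.5,
    5.5, 4.7, 4.0, 2.3, 0.5,
]

summary = {
    "models": len(scores),
    "best": max(scores),
    "average": round(mean(scores), 1),
    # Assumption: population standard deviation; the sample form
    # (statistics.stdev) would give a slightly larger value.
    "std_dev": round(pstdev(scores), 1),
}

print(summary)
```

The count, best score, and average should line up with the figures reported above; the standard deviation depends on which form the page uses.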
| Provider | Model | Input $/M | Output $/M | SimpleQA |
|---|---|---|---|---|
| | Gemini 2.5 Pro | $1.000 | $10.000 | 53.0 |
| | Qwen3 235B A22B Instruct 2507 | $0.071 | $0.100 | 50.6 |
| | Qwen3 VL 235B A22B Instruct | $0.200 | $0.880 | 46.7 |
| | | $2.000 | $8.000 | 40.4 |
| | | $0.090 | $0.780 | 40.1 |
| | | $3.000 | $15.000 | 38.3 |
| | | $0.260 | $0.900 | 37.9 |
| | | $3.000 | $15.000 | 37.4 |
| | | $3.000 | $15.000 | 37.4 |
| | | $0.280 | $0.900 | 36.9 |
| | | $3.000 | $15.000 | 32.8 |
| | | $3.000 | $15.000 | 32.8 |
| | | $0.550 | $2.000 | 29.1 |
| | | $0.550 | $2.200 | 26.5 |
| | | $0.550 | $2.200 | 26.5 |
| | | $0.500 | $2.150 | 25.1 |
| | | $0.150 | $0.750 | 23.3 |
| | | $0.150 | $0.750 | 23.3 |
| | | $0.014 | $0.028 | 23.0 |
| | | $0.150 | $0.600 | 22.1 |
| | | $0.400 | $2.000 | 20.5 |
| | | $0.400 | $2.000 | 19.7 |
| | | $0.300 | $0.500 | 18.4 |
| | | $0.300 | $0.300 | 17.1 |
| | | $2.000 | $6.000 | 16.7 |
| | | $2.500 | $10.000 | 15.8 |
| | | $0.550 | $2.200 | 14.0 |
| | | $0.455 | $0.900 | 12.7 |
| | | $0.455 | $0.900 | 12.7 |
| | | $0.800 | $3.200 | 12.6 |
| | | $0.090 | $0.400 | 11.9 |
| | | $0.075 | $0.200 | 9.7 |
| | | $0.080 | $0.160 | 8.5 |
| | | $0.800 | $4.000 | 8.0 |
| | | $0.080 | $0.300 | 7.3 |
| | | $0.070 | $0.280 | 6.7 |
| | | $0.060 | $0.240 | 6.6 |
| | | $0.080 | $0.280 | 5.6 |
| | | $0.080 | $0.280 | 5.6 |
| | | $0.080 | $0.240 | 5.5 |
| | | $0.080 | $0.240 | 5.5 |
| | | $0.035 | $0.140 | 4.7 |
| | | $0.060 | $0.120 | 4.0 |
| | | $0.065 | $0.140 | 2.3 |
| | | $0.030 | $0.050 | 0.5 |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
This leaderboard shows all models with SimpleQA benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
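One way to compare performance against cost is to divide each SimpleQA score by a blended per-million-token price. The sketch below is illustrative only: the `Row` type, the `blended_price` and `points_per_dollar` helpers, and the 3:1 input-to-output token mix are all assumptions made for the example, not something the leaderboard defines. The three rows use the prices and scores of the top three entries.

```python
from dataclasses import dataclass


@dataclass
class Row:
    input_per_m: float   # Input $/M
    output_per_m: float  # Output $/M
    simpleqa: float      # SimpleQA score


# Hypothetical workload: 3 input tokens for every output token.
INPUT_SHARE = 0.75


def blended_price(row: Row) -> float:
    """Weighted $/M price under the assumed input/output mix."""
    return INPUT_SHARE * row.input_per_m + (1 - INPUT_SHARE) * row.output_per_m


def points_per_dollar(row: Row) -> float:
    """SimpleQA points per blended dollar per million tokens."""
    return row.simpleqa / blended_price(row)


# Prices and scores of the three highest-scoring entries.
rows = [
    Row(1.000, 10.000, 53.0),
    Row(0.071, 0.100, 50.6),
    Row(0.200, 0.880, 46.7),
]

# Best value first.
ranked = sorted(rows, key=points_per_dollar, reverse=True)

for row in ranked:
    print(f"{row.simpleqa:5.1f}  ${blended_price(row):.3f}/M  "
          f"{points_per_dollar(row):.1f} points per $")
```

A different token mix changes the blended price and can change the ordering, so the ratio is best treated as a rough screening figure rather than a ranking.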
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.