Price Per Token

SWE-bench Lite Leaderboard

A software engineering benchmark that tests a model's ability to resolve real GitHub issues.

Data from LayerLens

As of April 18, 2026, the top score on SWE-bench Lite is 62.7%, shared by two Claude Opus 4.6 entries, followed by MiniMax M2.5 at 56.3%. A total of 62 models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 62
Best Score: 62.7
Average: 26.8
Std Dev: 19.9

Categories: Multi-turn

Input $/M    Output $/M    SWE-bench Lite
$5.000       $25.000       62.7
$5.000       $25.000       62.7
$0.118       $0.950        56.3
$1.250       $10.000       54.3
$1.250       $10.000       54.3
$1.250       $10.000       54.3
$1.250       $10.000       54.3
$1.000       $5.000        54.3
$1.000       $5.000        54.3
$0.720       $2.300        53.3
$0.720       $2.300        53.3
$5.000       $25.000       49.3
$5.000       $25.000       49.3
$0.220       $0.900        44.7
$0.550       $2.200        42.0
$0.550       $2.200        42.0
$0.390       $1.740        42.0
$0.390       $1.740        42.0
$1.000       $10.000       40.0
$0.255       $1.000        39.0
$0.125       $1.000        38.3
$0.125       $1.000        38.3
$0.400       $2.000        36.5
$0.071       $0.100        36.3
$1.250       $10.000       36.3
$1.250       $10.000       36.3
$0.500       $1.500        33.3
$0.014       $0.028        29.1
$0.800       $4.000        27.7
$0.300       $2.500        26.1
$0.300       $2.500        26.1
$0.300       $2.500        26.1
$0.300       $2.500        26.1
$0.300       $0.900        26.0
$0.300       $0.900        26.0
$0.390       $0.900        20.0
$0.390       $0.900        20.0
$0.080       $0.240        16.3
$0.080       $0.240        16.3
$0.150       $0.750        14.3
$0.150       $0.750        14.3
$0.500       $3.000        12.7
$0.500       $3.000        12.7
$0.039       $0.100        9.0
$0.039       $0.100        9.0
$0.150       $0.600        8.0
$3.000       $15.000       7.7
$0.600       $2.200        7.7
$0.200       $0.500        7.0
$0.075       $0.200        5.7
$0.080       $0.300        4.0
$0.800       $3.200        2.7
$0.200       $0.500        0.7
$0.200       $0.500        0.7
$0.065       $0.140        -
$0.400       $1.760        -
$0.400       $1.760        -
$0.270       $0.410        -
$0.270       $0.410        -
$0.383       $1.720        -
$0.383       $1.720        -
$0.260       $1.560        -

Pricing from OpenRouter. Benchmarks from Artificial Analysis.


About SWE-bench Lite

This leaderboard shows all models with SWE-bench Lite benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
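One way to compare performance against cost is to keep only the rows that no cheaper row outscores. The sketch below does this for a handful of price and score rows copied from the listing above. The 75/25 input-to-output token blend and the names `Row`, `blended_price`, and `pareto_frontier` are illustrative assumptions, not something this site publishes.

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_usd_per_m: float   # Input $/M
    output_usd_per_m: float  # Output $/M
    score: float             # SWE-bench Lite score


# A few rows copied from the leaderboard listing (model names omitted).
ROWS = [
    Row(5.000, 25.000, 62.7),
    Row(0.118, 0.950, 56.3),
    Row(1.250, 10.000, 54.3),
    Row(0.720, 2.300, 53.3),
    Row(0.071, 0.100, 36.3),
    Row(0.014, 0.028, 29.1),
    Row(3.000, 15.000, 7.7),
]


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Single price per million tokens under an ASSUMED input/output mix."""
    return input_share * row.input_usd_per_m + (1 - input_share) * row.output_usd_per_m


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Keep rows that are not beaten on both price (lower) and score (higher)."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; a row survives only if it
    # scores higher than everything cheaper than it.
    for row in sorted(rows, key=lambda r: (blended_price(r), -r.score)):
        if row.score > best_score:
            frontier.append(row)
            best_score = row.score
    return frontier


for row in pareto_frontier(ROWS):
    print(f"${blended_price(row):.4f}/M blended -> {row.score}")
```

Changing `input_share` to match your own workload can change which rows survive, so treat the result as a starting point rather than a ranking.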

Frequently Asked Questions

SWE-bench Lite is a software engineering benchmark that tests a model's ability to resolve real GitHub issues.
As of April 18, 2026, Claude Opus 4.6 leads the SWE-bench Lite leaderboard with a score of 62.7. Rankings change as new models are released and evaluated.
Currently 62 models have been evaluated on SWE-bench Lite, with an average score of 26.8 and standard deviation of 19.9.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
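The average and standard deviation quoted above can be checked against the listed scores. The sketch below uses the 54 numeric scores from the listing and treats the 8 rows shown as "-" as zeros. That treatment, and the use of the population standard deviation, are assumptions: the site does not document its method, but this reading reproduces the published figures.

```python
from statistics import mean, pstdev

# The 54 numeric SWE-bench Lite scores from the listing, highest to lowest.
SCORES = [
    62.7, 62.7, 56.3, 54.3, 54.3, 54.3, 54.3, 54.3, 54.3,
    53.3, 53.3, 49.3, 49.3, 44.7, 42.0, 42.0, 42.0, 42.0,
    40.0, 39.0, 38.3, 38.3, 36.5, 36.3, 36.3, 36.3, 33.3,
    29.1, 27.7, 26.1, 26.1, 26.1, 26.1, 26.0, 26.0, 20.0,
    20.0, 16.3, 16.3, 14.3, 14.3, 12.7, 12.7, 9.0, 9.0,
    8.0, 7.7, 7.7, 7.0, 5.7, 4.0, 2.7, 0.7, 0.7,
]
UNSCORED = 8  # rows listed with "-" instead of a score

# ASSUMPTION: unscored rows are counted as 0 so that all 62 rows contribute.
padded = SCORES + [0.0] * UNSCORED

print("rows:", len(padded))
print("average:", round(mean(padded), 1))
print("std dev:", round(pstdev(padded), 1))  # population standard deviation
```

Under these assumptions the script reports 62 rows, an average of 26.8, and a standard deviation of 19.9. Averaging only the 54 scored rows gives a higher mean, so the two readings are not interchangeable.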