Price Per Token

IFBench Leaderboard

IFBench (Instruction Following Benchmark) measures how well LLMs adhere to nuanced writing constraints and formatting requirements.

Data from Artificial Analysis

As of March 15, 2026, the top-scoring model on IFBench is Qwen3.5 397B A17B at 78.8%, followed by Gemini 3 Flash Preview at 78.0% and GPT-5.2-Codex at 77.6%. In total, 210 models have been evaluated on this benchmark.

Last updated: March 15, 2026

Models: 210
Best Score: 78.8
Average: 45.8
Std Dev: 14.9

Categories: Instruction Following
Input $/M  Output $/M  IFBench
$0.390     $0.900      78.8
$0.500     $3.000      78.0
$1.750     $14.000     77.6
$0.260     $2.080      75.7
$0.195     $1.560      75.6
$0.250     $2.000      75.4
$10.500    $84.000     75.4
$1.750     $14.000     75.4
$1.250     $10.000     74.1
$2.500     $15.000     73.9
$1.250     $10.000     73.1
$1.250     $10.000     72.9
$0.163     $1.000      72.5
$0.255     $1.000      72.3
$0.720     $2.300      72.3
$0.090     $0.290      71.8
$0.250     $0.950      71.6
$2.000     $8.000      71.4
$0.050     $0.200      71.1
$0.780     $3.900      70.7
$1.250     $10.000     70.6
$2.000     $12.000     70.4
$15.000    $60.000     70.3
$0.450     $2.200      70.2
$1.250     $10.000     70.0
$0.270     $0.950      69.9
$0.250     $0.750      69.8
$0.039     $0.100      69.0
$1.100     $4.400      68.7
$0.207     $0.828      68.4
$0.550     $2.200      68.1
$0.250     $2.000      67.9
$0.380     $1.750      67.9
$0.050     $0.400      67.6
$1.100     $4.400      67.1
$0.050     $0.150      66.7
$1.250     $10.000     66.6
$0.150     $0.500      66.0
$0.050     $0.400      65.9
$0.030     $0.100      65.1
$0.090     $0.290      64.2
$0.400     $1.200      63.9
$0.060     $0.400      60.8
$0.260     $0.380      60.7
$0.039     $0.100      58.3
$5.000     $25.000     58.0
$3.000     $15.000     57.3
$0.210     $0.790      57.0
$3.000     $15.000     56.6
$15.000    $75.000     55.4
$0.720     $2.300      55.2
$0.500     $3.000      55.1
$3.000     $15.000     54.7
$0.380     $1.750      54.6
$1.000     $5.000      54.3
$0.270     $0.410      54.1
$0.780     $3.900      53.8
$15.000    $75.000     53.7
$3.000     $15.000     53.7
$5.000     $25.000     53.1
$0.200     $0.500      52.7
$0.100     $0.400      52.6
$0.390     $0.900      51.6
$0.110     $0.600      51.2
$0.260     $2.080      50.8
$0.200     $0.500      50.5
$0.300     $2.500      50.3
$0.100     $0.400      49.9
$0.200     $0.200      49.8
$0.150     $0.500      49.1
$0.260     $0.380      49.0
$1.000     $10.000     48.7
$3.000     $15.000     48.3
$1.200     $6.000      48.0
$0.875     $7.000      47.4
$0.100     $0.320      47.1
$3.000     $15.000     46.9
$0.195     $1.560      46.9
$0.060     $0.400      46.3
$0.071     $0.100      46.1
$0.250     $0.500      45.9
$1.250     $10.000     45.6
$3.000     $15.000     45.4
$5.000     $25.000     44.6
$0.163     $1.000      44.5
$0.600     $2.200      44.1
$1.200     $6.000      44.1
$3.000     $15.000     44.0
$0.450     $2.200      43.7
$0.390     $1.740      43.4
$15.000    $75.000     43.3
$1.250     $10.000     43.2
$0.270     $0.410      43.1
$0.150     $0.600      43.0
$2.000     $8.000      43.0
$5.000     $25.000     43.0
$0.800     $4.000      42.8
$0.200     $0.880      42.7
$3.000     $15.000     42.7
$3.000     $15.000     42.4
$1.000     $5.000      42.0
$0.100     $0.400      41.8
$0.400     $2.000      41.7
$0.080     $0.280      41.5
$0.550     $2.200      41.5
$0.150     $0.750      41.5
$0.120     $0.200      41.5
$0.200     $1.500      41.4
$0.400     $1.760      41.2
$0.210     $0.790      41.2
$3.000     $15.000     41.2
$0.200     $0.770      41.0
$0.060     $0.200      40.5
$0.220     $0.900      40.5
$0.300     $2.500      40.5
$0.100     $0.400      40.2
$0.090     $0.290      39.9
$0.400     $2.000      39.8
$0.090     $0.780      39.7
$0.450     $2.150      39.6
$0.080     $0.300      39.5
$0.400     $2.000      39.3
$0.104     $0.416      39.2
$0.200     $0.600      39.2
$0.280     $0.900      39.1
$0.900     $0.900      39.0
$0.550     $2.190      39.0
$0.300     $2.500      39.0
$0.150     $0.400      38.8
$0.400     $0.800      38.7
$0.400     $1.600      38.3
$0.800     $3.200      38.1
$0.050     $0.200      38.1
$0.400     $0.900      38.1
$0.150     $0.750      37.8
$0.050     $0.150      37.8
$0.200     $0.500      37.7
$0.130     $0.850      37.6
$0.050     $0.200      37.5
$0.510     $0.740      37.1
$0.100     $0.400      37.0
$0.120     $0.390      36.9
$0.040     $0.130      36.7
$0.390     $1.740      36.7
$0.400     $0.800      36.6
$2.500     $10.000     36.5
$0.200     $0.500      36.5
$0.080     $0.240      36.3
$2.500     $12.500     36.2
$0.250     $1.250      36.1
$2.000     $8.000      35.2
$0.120     $0.750      35.2
$0.200     $0.770      34.8
$1.000     $3.000      34.8
$0.070     $0.280      34.6
$2.000     $6.000      34.5
$0.340     $0.390      34.4
$0.600     $1.800      34.2
$0.060     $0.240      34.1
$0.200     $1.100      34.0
$0.050     $0.200      33.5
$0.200     $0.200      33.5
$0.060     $0.180      33.5
$0.090     $0.300      33.1
$0.130     $0.520      33.1
$0.100     $0.400      32.9
$0.100     $0.200      32.8
$0.070     $0.270      32.7
$1.000     $3.000      32.7
$0.200     $0.200      32.5
$0.050     $0.400      32.5
$0.080     $0.200      32.3
$0.100     $0.400      32.0
$0.200     $0.200      32.0
$0.080     $0.280      31.9
$0.200     $0.200      31.9
$0.030     $0.110      31.8
$2.000     $6.000      31.6
$0.080     $0.240      31.5
$0.100     $0.400      31.5
$0.130     $0.400      31.3
$2.000     $6.000      31.2
$0.150     $0.600      31.0
$0.900     $0.900      30.7
$0.049     $0.049      30.4
$0.300     $0.900      30.1
$0.100     $0.300      29.9
$0.400     $2.000      29.9
$0.035     $0.140      29.4
$0.150     $0.150      29.1
$0.130     $0.400      29.0
$0.020     $0.050      28.6
$0.050     $0.200      28.6
$0.600     $1.800      28.6
$0.040     $0.080      28.3
$0.020     $0.040      27.9
$0.300     $0.900      27.9
$0.700     $0.800      27.6
$0.040     $0.160      27.6
$0.040     $0.160      27.1
$0.200     $0.200      26.9
$0.050     $0.080      26.4
$0.010     $0.020      26.3
$0.030     $0.050      26.2
$0.200     $0.200      25.9
$0.030     $0.040      24.6
$0.060     $0.200      23.9
$0.060     $0.140      23.5
$0.290     $0.290      22.9
$0.020     $0.020      22.8

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

108 out of our 483 tracked models have had a price change in March.


About IFBench

IFBench (Instruction Following Benchmark) measures how well LLMs adhere to nuanced writing constraints and formatting requirements.

This leaderboard shows all models with IFBench benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
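One way to compare performance against cost is to collapse the two prices into a single blended figure and keep only the rows that no cheaper row beats on score (a Pareto frontier). The sketch below is illustrative and not the site's own method: the 3:1 input-to-output token mix, the `Row` type, and the function names are assumptions, and the sample rows are a small subset of the price and score triples listed on this page.

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    score: float         # IFBench score


# A few (input, output, score) triples taken from the leaderboard.
ROWS = [
    Row(0.390, 0.900, 78.8),
    Row(0.500, 3.000, 78.0),
    Row(1.750, 14.000, 77.6),
    Row(0.260, 2.080, 75.7),
    Row(0.050, 0.200, 71.1),
    Row(0.039, 0.100, 69.0),
    Row(15.000, 60.000, 70.3),
    Row(0.020, 0.020, 22.8),
]


def blended_price(row: Row, input_share: float = 0.75) -> float:
    """Weighted price per million tokens.

    The default 0.75 input share (a 3:1 input:output mix) is an assumption;
    adjust it to match your own workload.
    """
    return input_share * row.input_price + (1 - input_share) * row.output_price


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Return rows that are not beaten on score by any cheaper-or-equal row."""
    # Cheapest first; on a price tie, the higher score comes first.
    ordered = sorted(rows, key=lambda r: (blended_price(r), -r.score))
    frontier: list[Row] = []
    best_score = float("-inf")
    for row in ordered:
        if row.score > best_score:
            frontier.append(row)
            best_score = row.score
    return frontier


if __name__ == "__main__":
    for row in pareto_frontier(ROWS):
        print(f"${blended_price(row):.4f}/M blended -> IFBench {row.score}")
```

Rows that fall off the frontier cost more than some other row with an equal or higher score under the chosen token mix, so changing `input_share` can change which rows survive.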

Frequently Asked Questions

IFBench (Instruction Following Benchmark) measures how well LLMs adhere to nuanced writing constraints and formatting requirements.
As of March 15, 2026, Qwen3.5 397B A17B leads the IFBench leaderboard with a score of 78.8. Rankings change as new models are released and evaluated.
Currently, 210 models have been evaluated on IFBench, with an average score of 45.8 and a standard deviation of 14.9.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
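To put a single score in context against the average and standard deviation quoted above, a z-score is one simple option. This is a minimal sketch, not something the leaderboard itself reports; the function name `z_score` is hypothetical, and the inputs are the figures stated on this page.

```python
def z_score(score: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations `score` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (score - mean) / std_dev


# Figures quoted on this page: best score 78.8, average 45.8, std dev 14.9.
top = z_score(78.8, mean=45.8, std_dev=14.9)
print(f"Top score is {top:.2f} standard deviations above the average")
```

A z-score only summarizes distance from the mean; it does not assume or confirm that the scores are normally distributed.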