Accounting and audit benchmark testing financial reasoning capabilities.
Data from LayerLens
As of April 18, 2026, Claude 3.7 Sonnet and Gemini 2.5 Pro Preview 05-06 share the top score on Accounting Audit at 86.7%, followed by a second Claude 3.7 Sonnet entry at 83.3%. A total of 66 models have been evaluated on this benchmark.
Last updated: April 18, 2026
Models: 66
Best Score: 86.7
Average: 71.6
Std Dev: 18.9
| Provider | Model | Input $/M | Output $/M | Accounting Audit |
|---|---|---|---|---|
| | | $3.000 | $15.000 | 86.7 |
| | | $1.250 | $10.000 | 86.7 |
| | | $3.000 | $15.000 | 83.3 |
| | | $0.150 | $0.580 | 83.3 |
| | | $0.150 | $0.580 | 83.3 |
| | | $2.000 | $8.000 | 83.3 |
| | | $0.300 | $2.500 | 83.3 |
| | | $0.300 | $2.500 | 83.3 |
| | | $0.300 | $2.500 | 83.3 |
| | | $0.300 | $2.500 | 83.3 |
| | | $3.000 | $15.000 | 83.3 |
| | | $1.250 | $10.000 | 83.3 |
| | | $1.250 | $10.000 | 83.3 |
| | | $1.250 | $10.000 | 83.3 |
| | | $1.250 | $10.000 | 83.3 |
| | | $0.014 | $0.028 | 80.0 |
| | | $0.550 | $2.000 | 80.0 |
| | | $0.150 | $0.600 | 80.0 |
| | | $1.100 | $4.400 | 80.0 |
| | | $0.060 | $0.200 | 80.0 |
| | | $0.060 | $0.200 | 80.0 |
| | | $0.080 | $0.280 | 80.0 |
| | | $0.080 | $0.280 | 80.0 |
| | | $3.000 | $15.000 | 80.0 |
| | | $3.000 | $15.000 | 80.0 |
| | | $15.000 | $75.000 | 80.0 |
| | | $15.000 | $75.000 | 80.0 |
| | | $0.500 | $2.150 | 80.0 |
| | | $0.390 | $1.750 | 80.0 |
| | | $0.390 | $1.750 | 80.0 |
| | | $1.000 | $1.000 | 76.7 |
| | | $0.800 | $4.000 | 76.7 |
| | | $0.800 | $3.200 | 76.7 |
| | | $0.550 | $2.200 | 76.7 |
| | | $0.455 | $0.900 | 76.7 |
| | | $0.455 | $0.900 | 76.7 |
| | | $0.080 | $0.240 | 76.7 |
| | | $0.080 | $0.240 | 76.7 |
| | | $3.000 | $15.000 | 76.7 |
| | | $3.000 | $15.000 | 76.7 |
| | | $0.400 | $1.760 | 76.7 |
| | | $0.400 | $1.760 | 76.7 |
| | | $0.780 | $3.900 | 76.7 |
| | | $0.780 | $3.900 | 76.7 |
| | | $3.000 | $15.000 | 73.3 |
| | | $0.400 | $2.000 | 73.3 |
| | | $0.100 | $0.400 | 73.3 |
| | | $0.100 | $0.400 | 73.3 |
| | | $0.100 | $0.400 | 73.3 |
| | | $0.100 | $0.400 | 73.3 |
| | | $1.000 | $5.000 | 73.3 |
| | | $1.000 | $5.000 | 73.3 |
| | | $0.300 | $0.300 | 70.0 |
| | | $2.500 | $10.000 | 70.0 |
| | | $0.080 | $0.300 | 70.0 |
| | | $0.065 | $0.140 | 66.7 |
| | | $0.070 | $0.280 | 63.3 |
| | | $0.035 | $0.140 | 53.3 |
| | | $0.030 | $0.050 | 50.0 |
| | | $2.000 | $6.000 | 50.0 |
| | | $0.060 | $0.120 | 43.3 |
| | | $0.900 | $0.900 | 40.0 |
| | | $0.080 | $0.160 | 33.3 |
| | | $2.500 | $10.000 | - |
| | | $2.500 | $10.000 | - |
| | | $0.100 | $0.400 | - |
Pricing from OpenRouter. Benchmarks from Artificial Analysis.
This leaderboard shows all models with Accounting Audit benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
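One way to compare performance against cost is to divide each score by a blended token price. The sketch below is illustrative only: the rows are copied from the table above, while `blended_price`, `points_per_dollar`, and the 75% input share are hypothetical choices made for this example, not something the leaderboard defines.

```python
# Rows copied from the leaderboard table: (input $/M, output $/M, score).
# Model names are left out because the table rows do not carry them.
rows = [
    (3.000, 15.000, 86.7),
    (1.250, 10.000, 86.7),
    (0.150, 0.580, 83.3),
    (0.014, 0.028, 80.0),
    (15.000, 75.000, 80.0),
]


def blended_price(input_price, output_price, input_share=0.75):
    """Weighted price per million tokens.

    input_share is an assumed traffic mix (75% input tokens),
    not a figure published by the leaderboard.
    """
    return input_share * input_price + (1.0 - input_share) * output_price


def points_per_dollar(row, input_share=0.75):
    """Benchmark points per blended dollar per million tokens."""
    input_price, output_price, score = row
    return score / blended_price(input_price, output_price, input_share)


# Sort from most to least score per blended dollar.
ranked = sorted(rows, key=points_per_dollar, reverse=True)

for row in ranked:
    print(f"{points_per_dollar(row):8.1f} points per blended $/M (score {row[2]})")
```

Changing `input_share` to match a real workload can reorder the results, so the ratio is best read as a rough screen rather than a ranking.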
Built by @aellman
© 2026 68 Ventures, LLC. All rights reserved.