Price Per Token

Best LLM for Math

Compare mathematical reasoning performance across LLMs using competition-level benchmarks. Models are ranked by MATH (Hard) score with pricing information.


About This Leaderboard

This leaderboard ranks AI models by their MATH (Hard) benchmark score, with community votes from developers and other benchmark results shown alongside, helping you find the best LLM for math.

Pricing is shown per million tokens, sourced from OpenRouter.
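As an illustration, per-million-token pricing converts to a per-request cost as follows. This is a minimal sketch; the prices and token counts below are hypothetical examples, not figures from the leaderboard:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD for one request, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical pricing: $3.00/M input tokens, $15.00/M output tokens.
# A request with 2,000 input tokens and 500 output tokens:
cost = request_cost(2_000, 500, 3.00, 15.00)
print(f"${cost:.4f}")  # $0.0135
```

Note that output tokens are often several times more expensive than input tokens, so reasoning models that emit long chains of thought can cost substantially more per request than their listed input price suggests.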

Frequently Asked Questions

Which model is best for math?
Based on MATH (Hard) benchmark scores, the top-ranked model on this leaderboard currently leads. Reasoning models with chain-of-thought capabilities tend to significantly outperform standard models on mathematical tasks.

What is the MATH (Hard) benchmark?
MATH (Hard) consists of competition-level mathematics problems covering algebra, geometry, number theory, and calculus. It requires multi-step reasoning and is significantly more challenging than standard math benchmarks.

What is AIME?
AIME (American Invitational Mathematics Examination) tests olympiad-level mathematical reasoning. It is one of the hardest math benchmarks for LLMs and helps differentiate frontier reasoning models.

How often is the data updated?
Benchmark scores are updated when new evaluations are published. Community votes update in real time. Pricing data is refreshed daily.