Price Per Token

WMT 2014 Leaderboard

Workshop on Machine Translation 2014 — multilingual translation quality benchmark.

Data from LayerLens

As of April 18, 2026, the top-scoring model on WMT 2014 is Gemini 2.0 Flash at 38.9, followed by Llama 3.1 405B Instruct and Llama 4 Maverick, tied at 38.0. Twelve models have been evaluated on this benchmark.

Last updated: April 18, 2026

Models: 12
Best Score: 38.9
Average: 35.5
Std Dev: 3.8

Categories: Multilingual

Rank  Input $/M  Output $/M  WMT 2014
1     $0.100     $0.400      38.9
2     $0.900     $0.900      38.0
3     $0.150     $0.600      38.0
4     $2.000     $8.000      37.6
5     $3.000     $15.000     37.4
6     $0.080     $0.300      37.1
7     $0.800     $3.200      36.9
8     $0.014     $0.028      36.6
9     $0.800     $4.000      35.6
10    $0.070     $0.280      34.1
11    $2.500     $10.000     30.0
12    $0.030     $0.050      25.2

Pricing from OpenRouter. Benchmarks from Artificial Analysis.

About WMT 2014

This leaderboard shows all models with WMT 2014 benchmark scores, ranked from highest to lowest. Pricing data is included to help you compare performance against cost.
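
One way to compare performance against cost is to divide each score by a blended token price. The sketch below is only an illustration: the rows are copied from the leaderboard in rank order, while `blended_price`, `rank_by_value`, and the 50/50 weighting of input and output prices are assumptions introduced here, not something the site defines.

```python
# (input $/M, output $/M, WMT 2014 score), copied from the leaderboard in rank order.
ROWS = [
    (0.100, 0.400, 38.9),
    (0.900, 0.900, 38.0),
    (0.150, 0.600, 38.0),
    (2.000, 8.000, 37.6),
    (3.000, 15.000, 37.4),
    (0.080, 0.300, 37.1),
    (0.800, 3.200, 36.9),
    (0.014, 0.028, 36.6),
    (0.800, 4.000, 35.6),
    (0.070, 0.280, 34.1),
    (2.500, 10.000, 30.0),
    (0.030, 0.050, 25.2),
]


def blended_price(input_price, output_price, input_share=0.5):
    """Weighted average of input and output price per million tokens.

    input_share is a hypothetical workload mix (fraction of tokens that
    are input tokens); adjust it to match your own traffic.
    """
    return input_share * input_price + (1 - input_share) * output_price


def rank_by_value(rows, input_share=0.5):
    """Return (rank, score, blended price, score per dollar), best value first."""
    scored = []
    for rank, (inp, out, score) in enumerate(rows, start=1):
        price = blended_price(inp, out, input_share)
        scored.append((rank, score, price, score / price))
    return sorted(scored, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for rank, score, price, value in rank_by_value(ROWS):
        print(f"rank {rank:>2}  score {score:5.1f}  ${price:.3f}/M  {value:8.1f} points per $")
```

With the equal weighting assumed here, the row ranked 8th by score ($0.014 input, $0.028 output, score 36.6) comes out first on score per dollar. A simple ratio like this strongly favors very cheap models, so it is better read as a rough screen than as a ranking of translation quality.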

Frequently Asked Questions

As of April 18, 2026, Gemini 2.0 Flash leads the WMT 2014 leaderboard with a score of 38.9. Rankings change as new models are released and evaluated.
Currently, 12 models have been evaluated on WMT 2014, with an average score of 35.5 and a standard deviation of 3.8.
Benchmark scores are updated when new evaluations are published by our data sources (Artificial Analysis and LayerLens). Pricing data is refreshed daily from OpenRouter.
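
The summary figures quoted on this page (12 models, best score 38.9, average 35.5, standard deviation 3.8) can be checked against the listed scores. The sketch below is a minimal check using Python's standard `statistics` module. The page does not say which standard deviation estimator it uses; the population form (`pstdev`) is an assumption that happens to agree with the published 3.8, whereas the sample form gives roughly 4.0.

```python
from statistics import mean, pstdev, stdev

# WMT 2014 scores copied from the leaderboard, highest to lowest.
SCORES = [38.9, 38.0, 38.0, 37.6, 37.4, 37.1, 36.9, 36.6, 35.6, 34.1, 30.0, 25.2]


def summarize(scores):
    """Recompute the leaderboard's summary statistics from raw scores."""
    return {
        "models": len(scores),
        "best": max(scores),
        "average": mean(scores),
        # Assumption: population standard deviation (divide by N).
        "std_dev_population": pstdev(scores),
        # Shown for comparison: sample standard deviation (divide by N - 1).
        "std_dev_sample": stdev(scores),
    }


if __name__ == "__main__":
    for name, value in summarize(SCORES).items():
        print(f"{name}: {value}")
```

The exact mean of the listed scores is about 35.45, which the page displays as 35.5.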