Key Takeaways
Llama 3.3 70B Instruct wins:
- Cheaper input tokens
- Cheaper output tokens
- Faster output speed (tokens per second)
- Higher intelligence benchmark
Llama 3.1 Nemotron 70B Instruct wins:
- Marginally higher Coding Index (10.8 vs 10.7, scored as a tie)
- Higher Math Index (11.0 vs 7.7)
Price Advantage: Llama 3.3 70B Instruct
Benchmark Advantage: Llama 3.3 70B Instruct (6 of 8 benchmarks)
Context Window: Tie (131,072 tokens each)
Speed: Llama 3.3 70B Instruct
Pricing Comparison
Price Comparison
| Metric | Llama 3.3 70B Instruct | Llama 3.1 Nemotron 70B Instruct | Winner |
|---|---|---|---|
| Input (per 1M tokens) | $0.10 | $0.90 | Llama 3.3 70B Instruct |
| Output (per 1M tokens) | $0.32 | $0.90 | Llama 3.3 70B Instruct |
| Cache Read (per 1M tokens) | $0.13 | $0.45 | Llama 3.3 70B Instruct |
Using a 3:1 input/output ratio, Llama 3.3 70B Instruct is 83% cheaper overall.
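The 83% figure can be reproduced from the table prices. The page does not state its formula, so the sketch below assumes a simple weighted average of input and output prices at 3 parts input to 1 part output; `blended_price` is a hypothetical helper name, not something the page defines.

```python
def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens for a given input:output mix.

    Assumption: the page blends prices as a plain weighted average.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Prices per 1M tokens, taken from the pricing table above.
llama_33 = blended_price(0.10, 0.32)
nemotron = blended_price(0.90, 0.90)

# Fraction saved by choosing the cheaper model.
savings = 1 - llama_33 / nemotron

print(f"Llama 3.3 70B Instruct:          ${llama_33:.3f} per 1M blended tokens")
print(f"Llama 3.1 Nemotron 70B Instruct: ${nemotron:.3f} per 1M blended tokens")
print(f"Savings: {savings:.0%}")
```

Changing `input_parts` and `output_parts` shows how the gap shifts for workloads that are more output-heavy than 3:1.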
Benchmark Comparison
Benchmarks Compared: 8
Llama 3.3 70B Instruct Wins: 6
Llama 3.1 Nemotron 70B Instruct Wins: 1
Benchmark Scores
| Benchmark | Llama 3.3 70B Instruct | Llama 3.1 Nemotron 70B Instruct | Winner |
|---|---|---|---|
| Intelligence Index (overall intelligence score) | 14.5 | 13.4 | Llama 3.3 70B Instruct |
| Coding Index (code generation & understanding) | 10.7 | 10.8 | - |
| Math Index (mathematical reasoning) | 7.7 | 11.0 | Llama 3.1 Nemotron 70B Instruct |
| MMLU Pro (academic knowledge) | 71.3 | 69.0 | Llama 3.3 70B Instruct |
| GPQA (graduate-level science) | 49.8 | 46.5 | Llama 3.3 70B Instruct |
| LiveCodeBench (competitive programming) | 28.8 | 16.9 | Llama 3.3 70B Instruct |
| Aider (real-world code editing) | 59.4 | 54.9 | Llama 3.3 70B Instruct |
| AIME (competition math) | 30.0 | 24.7 | Llama 3.3 70B Instruct |
Llama 3.3 70B Instruct wins 6 out of 8 benchmarks.
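The win count can be checked by tallying the score table. This is a minimal sketch: the page does not say how it decides a tie, so `TIE_TOLERANCE` is a hypothetical threshold chosen only so that the 10.7 vs 10.8 Coding Index result counts as a tie, matching the "-" in the table.

```python
from collections import Counter

LLAMA = "Llama 3.3 70B Instruct"
NEMOTRON = "Llama 3.1 Nemotron 70B Instruct"

# (Llama 3.3 score, Nemotron score), copied from the benchmark table.
SCORES = {
    "Intelligence Index": (14.5, 13.4),
    "Coding Index": (10.7, 10.8),
    "Math Index": (7.7, 11.0),
    "MMLU Pro": (71.3, 69.0),
    "GPQA": (49.8, 46.5),
    "LiveCodeBench": (28.8, 16.9),
    "Aider": (59.4, 54.9),
    "AIME": (30.0, 24.7),
}

# Hypothetical: score gaps at or below this value are treated as a tie.
TIE_TOLERANCE = 0.15


def winner(llama_score, nemotron_score, tolerance=TIE_TOLERANCE):
    """Return the name of the higher-scoring model, or 'tie'."""
    if abs(llama_score - nemotron_score) <= tolerance:
        return "tie"
    return LLAMA if llama_score > nemotron_score else NEMOTRON


tally = Counter(winner(a, b) for a, b in SCORES.values())

for name, count in tally.items():
    print(f"{name}: {count}")
```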
Cost vs Quality
[Chart: Llama 3.3 70B Instruct plotted against other models]
Context & Performance
Context Window
Llama 3.3 70B Instruct: 131,072 tokens
Llama 3.1 Nemotron 70B Instruct: 131,072 tokens
Speed Performance
| Metric | Llama 3.3 70B Instruct | Llama 3.1 Nemotron 70B Instruct | Winner |
|---|---|---|---|
| Tokens/second (higher is better) | 99.5 tok/s | 35.5 tok/s | Llama 3.3 70B Instruct |
| Time to First Token (lower is better) | 0.54s | 0.51s | Llama 3.1 Nemotron 70B Instruct |
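Llama 3.3 70B Instruct generates output about 180% faster on average (99.5 vs 35.5 tok/s, roughly 2.8x the throughput), while time to first token is nearly identical (0.54s vs 0.51s).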
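The 180% figure follows from the throughput ratio in the speed table. The sketch below reproduces it and adds a rough end-to-end estimate. The latency model (time to first token plus output tokens divided by throughput) and the 500-token response length are assumptions for illustration, not something the page states.

```python
def percent_faster(fast_tps, slow_tps):
    """How much higher fast_tps is than slow_tps, as a percentage."""
    return (fast_tps / slow_tps - 1) * 100


def estimated_response_time(ttft_seconds, tokens_per_second, output_tokens):
    """Assumed model: wait for the first token, then stream at a steady rate."""
    return ttft_seconds + output_tokens / tokens_per_second


# Values from the speed table above.
LLAMA_TPS, LLAMA_TTFT = 99.5, 0.54
NEMOTRON_TPS, NEMOTRON_TTFT = 35.5, 0.51

speedup = percent_faster(LLAMA_TPS, NEMOTRON_TPS)
print(f"Throughput advantage: {speedup:.0f}%")

# Hypothetical response length, used only to compare total latency.
OUTPUT_TOKENS = 500
llama_time = estimated_response_time(LLAMA_TTFT, LLAMA_TPS, OUTPUT_TOKENS)
nemotron_time = estimated_response_time(NEMOTRON_TTFT, NEMOTRON_TPS, OUTPUT_TOKENS)
print(f"Llama 3.3 70B Instruct:          {llama_time:.2f}s for {OUTPUT_TOKENS} tokens")
print(f"Llama 3.1 Nemotron 70B Instruct: {nemotron_time:.2f}s for {OUTPUT_TOKENS} tokens")
```

For very short responses the time to first token dominates and the two models come out close; the throughput gap matters more as the output grows.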
Capabilities
Feature Comparison
| Feature | Llama 3.3 70B Instruct | Llama 3.1 Nemotron 70B Instruct |
|---|---|---|
| Vision (Image Input) | ||
| Tool/Function Calls | ||
| Reasoning Mode | ||
| Audio Input | ||
| Audio Output | ||
| PDF Input | ||
| Prompt Caching | ||
| Web Search | ||
License & Release
| Property | Llama 3.3 70B Instruct | Llama 3.1 Nemotron 70B Instruct |
|---|---|---|
| License | Open Source | Proprietary |
| Author | Meta | NVIDIA |
| Released | Dec 2024 | Oct 2024 |
Llama 3.3 70B Instruct Modalities
Input: text
Output: text
Llama 3.1 Nemotron 70B Instruct Modalities
Input: text
Output: text