Key Takeaways
Llama 3.1 405B Instruct wins:
- Higher intelligence benchmark
- Better at coding
Llama 3.3 Nemotron Super 49B V1.5 wins:
- Cheaper input tokens
- Cheaper output tokens
- Larger context window (131,072 vs 131,000 tokens, effectively a tie)
- Faster response time
- Better at math (Math Index; Llama 3.1 405B Instruct scores higher on AIME)
- Has reasoning mode
Price Advantage: Llama 3.3 Nemotron Super 49B V1.5
Benchmark Advantage: Llama 3.1 405B Instruct
Context Window: Llama 3.3 Nemotron Super 49B V1.5
Speed: Llama 3.3 Nemotron Super 49B V1.5
Pricing Comparison
Price Comparison
| Metric | Llama 3.1 405B Instruct | Llama 3.3 Nemotron Super 49B V1.5 | Winner |
|---|---|---|---|
| Input (per 1M tokens) | $0.90 | $0.10 | Llama 3.3 Nemotron Super 49B V1.5 |
| Output (per 1M tokens) | $0.90 | $0.40 | Llama 3.3 Nemotron Super 49B V1.5 |
| Cache Read (per 1M) | $0.45 | N/A | Llama 3.1 405B Instruct |
Using a 3:1 input/output ratio, Llama 3.3 Nemotron Super 49B V1.5 is 81% cheaper overall.
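The 81% figure can be reproduced from the table as a weighted average. The sketch below assumes the 3:1 ratio means three input tokens for every output token. The page does not spell out its formula, so the weighting and the helper name `blended_price` are illustrative assumptions.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens for an input:output token mix.

    Assumption: the 3:1 ratio is applied as a simple weighted mean.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Prices per 1M tokens, copied from the pricing table.
llama_405b = blended_price(0.90, 0.90)
nemotron_49b = blended_price(0.10, 0.40)

# Fraction saved by choosing the cheaper model.
savings = 1 - nemotron_49b / llama_405b

print(f"Llama 3.1 405B Instruct blended:           ${llama_405b:.3f} per 1M tokens")
print(f"Llama 3.3 Nemotron Super 49B V1.5 blended: ${nemotron_49b:.3f} per 1M tokens")
print(f"Savings: {savings:.0%}")
```

Under this weighting the blended prices come to $0.900 and $0.175 per 1M tokens, which rounds to the 81% stated above. A workload with a different input/output mix can be checked by changing `input_parts` and `output_parts`.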
Benchmark Comparison
Benchmarks Compared: 8
Llama 3.1 405B Instruct Wins: 6
Llama 3.3 Nemotron Super 49B V1.5 Wins: 1
Benchmark Scores
| Benchmark | Llama 3.1 405B Instruct | Llama 3.3 Nemotron Super 49B V1.5 | Winner |
|---|---|---|---|
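| Intelligence Index (overall intelligence score) | 17.4 | 14.6 | Llama 3.1 405B Instruct |
| Coding Index (code generation & understanding) | 14.5 | 10.5 | Llama 3.1 405B Instruct |
| Math Index (mathematical reasoning) | 3.0 | 8.0 | Llama 3.3 Nemotron Super 49B V1.5 |
| MMLU Pro (academic knowledge) | 73.2 | 69.2 | Llama 3.1 405B Instruct |
| GPQA (graduate-level science) | 51.5 | 48.1 | Llama 3.1 405B Instruct |
| LiveCodeBench (competitive programming) | 30.5 | 29.0 | Llama 3.1 405B Instruct |
| Aider (real-world code editing) | 66.2 | - | - |
| AIME (competition math) | 21.3 | 13.7 | Llama 3.1 405B Instruct |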
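Llama 3.1 405B Instruct wins 6 of the 8 benchmarks. Llama 3.3 Nemotron Super 49B V1.5 wins 1 (Math Index), and Aider has no score for it, so that row is not comparable.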
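The win counts can be re-derived from the scores in the table. The sketch below assumes a higher score is better on every benchmark and treats a missing score as not comparable. The dictionary keys and the `tally_wins` helper are illustrative names, not part of the source page.

```python
from typing import Dict, Optional, Tuple

# (Llama 3.1 405B Instruct, Llama 3.3 Nemotron Super 49B V1.5); None = no score listed.
SCORES: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "Intelligence Index": (17.4, 14.6),
    "Coding Index": (14.5, 10.5),
    "Math Index": (3.0, 8.0),
    "MMLU Pro": (73.2, 69.2),
    "GPQA": (51.5, 48.1),
    "LiveCodeBench": (30.5, 29.0),
    "Aider": (66.2, None),
    "AIME": (21.3, 13.7),
}


def tally_wins(scores: Dict[str, Tuple[Optional[float], Optional[float]]]) -> Dict[str, int]:
    """Count wins per model, assuming higher is better on every benchmark."""
    counts = {"llama_405b": 0, "nemotron_49b": 0, "ties": 0, "not_comparable": 0}
    for llama_score, nemotron_score in scores.values():
        if llama_score is None or nemotron_score is None:
            counts["not_comparable"] += 1
        elif llama_score > nemotron_score:
            counts["llama_405b"] += 1
        elif nemotron_score > llama_score:
            counts["nemotron_49b"] += 1
        else:
            counts["ties"] += 1
    return counts


result = tally_wins(SCORES)
print(result)
```

The tally matches the summary: six wins for Llama 3.1 405B Instruct, one for Llama 3.3 Nemotron Super 49B V1.5, and one row (Aider) with no score to compare.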
Cost vs Quality
Chart: Llama 3.1 405B Instruct plotted against other models.
Context & Performance
Context Window
Llama 3.1 405B Instruct: 131,000 tokens
Llama 3.3 Nemotron Super 49B V1.5: 131,072 tokens
The two context windows are effectively the same size: Llama 3.3 Nemotron Super 49B V1.5 lists 72 more tokens, a difference of about 0.05%.
Speed Performance
| Metric | Llama 3.1 405B Instruct | Llama 3.3 Nemotron Super 49B V1.5 | Winner |
|---|---|---|---|
| Tokens/second | 33.7 tok/s | 82.7 tok/s | Llama 3.3 Nemotron Super 49B V1.5 |
| Time to First Token | 0.71s | 0.24s | Llama 3.3 Nemotron Super 49B V1.5 |
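Llama 3.3 Nemotron Super 49B V1.5 responds 146% faster on average, which tracks its throughput advantage (82.7 vs 33.7 tok/s, about 2.5x). Its time to first token is also about 3x shorter (0.24s vs 0.71s).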
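To see what these figures mean for a whole response, a rough estimate is time to first token plus output length divided by throughput. This sketch assumes throughput stays constant after the first token and ignores network and queueing delays. The 500-token response length and the `estimate_seconds` helper are hypothetical.

```python
def estimate_seconds(ttft_s: float, tokens_per_second: float, output_tokens: int) -> float:
    """Rough end-to-end latency: time to first token plus steady-state generation time.

    Assumption: throughput is constant once generation starts.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_s + output_tokens / tokens_per_second


OUTPUT_TOKENS = 500  # hypothetical response length

# Figures copied from the speed table.
llama_405b_s = estimate_seconds(0.71, 33.7, OUTPUT_TOKENS)
nemotron_49b_s = estimate_seconds(0.24, 82.7, OUTPUT_TOKENS)

print(f"Llama 3.1 405B Instruct:           {llama_405b_s:.1f}s")
print(f"Llama 3.3 Nemotron Super 49B V1.5: {nemotron_49b_s:.1f}s")
print(f"Throughput ratio: {82.7 / 33.7:.2f}x")
```

For long responses the throughput term dominates, so the gap approaches the roughly 2.5x throughput ratio. For very short responses the time to first token matters more.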
Capabilities
Feature Comparison
| Feature | Llama 3.1 405B Instruct | Llama 3.3 Nemotron Super 49B V1.5 |
|---|---|---|
| Vision (Image Input) | No | No |
| Tool/Function Calls | - | - |
| Reasoning Mode | No | Yes |
| Audio Input | No | No |
| Audio Output | No | No |
| PDF Input | - | - |
| Prompt Caching | Yes | No |
| Web Search | - | - |
License & Release
| Property | Llama 3.1 405B Instruct | Llama 3.3 Nemotron Super 49B V1.5 |
|---|---|---|
| License | Open Source | Proprietary |
| Author | Meta-llama | Nvidia |
| Released | Jul 2024 | Oct 2025 |
Llama 3.1 405B Instruct Modalities
Input: text
Output: text
Llama 3.3 Nemotron Super 49B V1.5 Modalities
Input: text
Output: text