
Together AI Pricing
Compare Together AI pricing for 227 models, covering serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.
Last updated: Apr 10, 2026
Together AI Overview
Serverless: 74
Dedicated GPU: 156
Registry Only: 0
Free Tier: 98
Cheapest Input/1M: $0.02
Together AI Model Pricing
| Provider | Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|---|
|  |  | 33k | FP16 | $0.020 | $0.040 | — |
|  | QW | 33k | FP16 | $0.020 | $0.020 | $3.99/hr H100-80GB |
|  | E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 | — |
|  | LQ | 33k | FP16 | $0.030 | $0.120 | — |
|  |  | 131k | FP16 | $0.050 | $0.200 | $3.99/hr H100-80GB |
|  |  | 131k | FP16 | $0.060 | $0.060 | $3.99/hr H100-80GB |
|  | NV | 131k | FP16 | $0.060 | $0.250 | $3.99/hr H100-80GB |
|  | QW | 33k | FP16 | $0.100 | $0.100 | — |
|  |  | 8k | INT4 | $0.100 | $0.100 | — |
|  |  | 33k | FP16 | $0.100 | $0.300 | $7.98/hr 2x H100-80GB |
|  | QW | 262k | FP16 | $0.100 | $0.150 | $7.98/hr 2x H100-80GB |
|  | Salesforce Llama Rank V1 | 8k | FP16 | $0.100 | $0.100 | $3.99/hr H100-80GB |
|  |  | 33k | FP16 | $0.150 | $0.150 | — |
|  |  | 131k | FP16 | $0.150 | $0.600 | $3.99/hr H100-80GB |
|  | QW | 262k | FP16 | $0.150 | $1.500 | $15.96/hr 4x H100-80GB |
|  | QW | 262k | FP16 | $0.150 | $1.500 | $15.96/hr 4x H100-80GB |
|  | DS | 131k | FP16 | $0.180 | $0.180 | $3.99/hr H100-80GB |
|  |  | 1049k | FP16 | $0.180 | $0.590 | $31.92/hr 8x H100-80GB |
|  |  | 131k | FP8 | $0.180 | $0.180 | $3.99/hr H100-80GB |
|  | QW | 262k | FP16 | $0.180 | $0.680 | $3.99/hr H100-80GB |
|  |  | 262k | FP16 | $0.200 | $0.500 | $7.98/hr 2x H100-80GB |
|  |  | 8k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
|  |  | 8k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
|  |  | 262k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
|  |  | 33k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
|  |  | 33k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
|  | QW | 262k | FP16 | $0.200 | $0.600 | $15.96/hr 4x H100-80GB |
|  | Z | 131k | FP16 | $0.200 | $1.100 | $15.96/hr 4x H100-80GB |
|  |  | 16k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
|  | Llama Guard 4 12B | 1049k | FP16 | $0.200 | $0.200 | — |
|  | Rime Arcana v2 | — | FP16 | $0.270 | N/A | $3.99/hr H100-80GB |
|  |  | 1049k | FP16 | $0.270 | $0.850 | $31.92/hr 8x H100-80GB |
|  | Whisper Large V3 | — | FP16 | $0.270 | $0.850 | $3.99/hr H100-80GB |
|  | MM | 197k | FP16 | $0.300 | $1.200 | — |
|  | QW | 33k | FP8 | $0.300 | $0.300 | $3.99/hr H100-80GB |
|  | Z | 203k | FP16 | $0.450 | $2.000 | $31.92/hr 8x H100-80GB |
|  |  | 262k | FP16 | $0.500 | $2.800 | $43.92/hr 8x H200-140GB |
|  | QW | 262k | FP16 | $0.500 | $1.200 | $7.98/hr 2x H100-80GB |
|  | QW | 262k | FP16 | $0.500 | $1.500 | $7.98/hr 2x H100-80GB |
|  | DS | 131k | FP16 | $0.600 | $1.700 | $43.92/hr 8x H200-140GB |
|  |  | 33k | FP16 | $0.600 | $0.600 | $7.98/hr 2x H100-80GB |
|  |  | 33k | FP16 | $0.600 | $0.600 | $15.96/hr 4x H100-80GB |
|  | QW | 262k | FP16 | $0.600 | $3.600 | — |
|  | Z | 203k | FP16 | $0.600 | $2.200 | $31.92/hr 8x H100-80GB |
|  | QW | 262k | FP16 | $0.650 | $3.000 | $21.96/hr 4x H200-140GB |
|  | DS | 16k | FP16 | $0.800 | $0.800 | $15.96/hr 4x H100-80GB |
|  |  | 8k | FP16 | $0.800 | $0.800 | $7.98/hr 2x H100-80GB |
|  | QW | 33k | FP16 | $0.800 | $0.800 | $7.98/hr 2x H100-80GB |
|  | QW | 16k | FP16 | $0.800 | $0.800 | $15.96/hr 4x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $7.98/hr 2x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
|  |  | 8k | FP8 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
|  | NV | 33k | FP16 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
|  | Z | 203k | FP16 | $1.000 | $3.200 | — |
|  |  | 262k | FP16 | $1.200 | $4.000 | $43.92/hr 8x H200-140GB |
|  | QW | 33k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
|  | QW | 131k | FP8 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
|  | QW | 33k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
|  | QW | 131k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
|  |  | 164k | FP16 | $1.250 | $1.250 | — |
|  | DS | 164k | FP16 | $1.250 | $1.250 | $43.92/hr 8x H200-140GB |
|  | Z | 203k | FP16 | $1.400 | $4.400 | $39.80/hr 4x B200-180GB |
|  | DS | 131k | FP16 | $1.600 | $1.600 | $15.96/hr 4x H100-80GB |
|  | QW | 33k | FP16 | $1.950 | $8.000 | $15.96/hr 4x H100-80GB |
|  | DS | 131k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
|  | QW | 262k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
|  | DS | 164k | FP16 | $3.000 | $7.000 | — |
|  | DS | 164k | FP16 | $3.000 | $7.000 | $43.92/hr 8x H200-140GB |
|  |  | 4k | FP16 | $3.500 | $3.500 | $31.92/hr 8x H100-80GB |
|  | Kokoro 82M | — | FP16 | $4.000 | N/A | — |
|  | Orpheus 3B | — | FP16 | $15.000 | N/A | $3.99/hr H100-80GB |
|  | Cartesia Sonic | — | FP16 | $65.000 | N/A | — |
|  | Cartesia Sonic 2 | 0k | FP16 | $65.000 | N/A | — |
|  | Cartesia Sonic 3 | 0k | FP16 | $65.000 | N/A | — |
Pricing data is from Together AI. Serverless prices are per 1M tokens. Dedicated prices are per hour, with the cheapest GPU configuration shown. The Batch API is billed at 50% of the serverless price.
Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image and video pricing is not available via the API.
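The pricing rules in the notes above can be turned into a rough estimate. The following is a minimal sketch, not an official calculator: the function names `serverless_cost` and `breakeven_tokens_per_hour` are hypothetical, the 50% batch factor is taken from the note above, and the break-even figure assumes a dedicated endpoint can actually sustain that token volume, which this page does not state. The example numbers come from a table row priced at $0.180 input, $0.180 output, and $3.99/hr dedicated.

```python
def serverless_cost(input_tokens, output_tokens, input_per_m, output_per_m, batch=False):
    """Estimate serverless cost in USD from per-1M-token prices.

    batch=True applies the 50% Batch API rate quoted on this page.
    """
    cost = (input_tokens / 1_000_000) * input_per_m \
         + (output_tokens / 1_000_000) * output_per_m
    return cost * 0.5 if batch else cost


def breakeven_tokens_per_hour(dedicated_per_hour, input_per_m, output_per_m, output_share=0.5):
    """Tokens per hour at which serverless spend equals the dedicated hourly price.

    output_share is the assumed fraction of tokens that are output tokens;
    it is an illustrative parameter, not something the pricing page defines.
    """
    blended_per_m = input_per_m * (1 - output_share) + output_per_m * output_share
    return dedicated_per_hour / blended_per_m * 1_000_000


# Example: 2M input tokens and 0.5M output tokens at $0.180 / $0.180 per 1M.
standard = serverless_cost(2_000_000, 500_000, 0.180, 0.180)
batched = serverless_cost(2_000_000, 500_000, 0.180, 0.180, batch=True)
breakeven = breakeven_tokens_per_hour(3.99, 0.180, 0.180)

print(f"serverless: ${standard:.3f}, batch: ${batched:.3f}")
print(f"break-even vs $3.99/hr dedicated: {breakeven:,.0f} tokens/hour")
```

Below the break-even volume serverless is cheaper on price alone; above it the dedicated endpoint may be, provided the hardware can serve that throughput.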