
Together AI Pricing
Compare Together AI pricing for 257 models. Serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.
Last updated: May 26, 2026
Together AI Overview
Serverless: 77
Dedicated GPU: 171
Registry Only: 0
Free Tier: 111
Cheapest Input/1M: $0.01
Together AI Model Pricing
| Provider | Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|---|
| | BGE Base EN v1.5 | 1k | FP16 | $0.008 | $0.008 | $2.59/hr A100-80GB |
| QW | | 33k | FP16 | $0.020 | $0.020 | $6.49/hr H100-80GB |
| | E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 | — |
| LQ | | 33k | FP16 | $0.030 | $0.120 | — |
| AR | | 128k | FP16 | $0.045 | $0.150 | $6.49/hr H100-80GB |
| | | 131k | FP16 | $0.050 | $0.200 | $6.49/hr H100-80GB |
| | | 33k | FP16 | $0.060 | $0.120 | — |
| | | 131k | FP16 | $0.060 | $0.060 | $6.49/hr H100-80GB |
| | | 131k | FP16 | $0.060 | $0.060 | $25.96/hr 4x H100-80GB |
| NV | | 131k | FP16 | $0.060 | $0.250 | $6.49/hr H100-80GB |
| QW | | 33k | FP16 | $0.100 | $0.100 | — |
| | | 8k | INT4 | $0.100 | $0.100 | — |
| | | 33k | FP16 | $0.100 | $0.300 | $12.98/hr 2x H100-80GB |
| QW | | 262k | FP16 | $0.100 | $0.150 | $12.98/hr 2x H100-80GB |
| | Salesforce Llama Rank V1 | 8k | FP16 | $0.100 | $0.100 | $6.49/hr H100-80GB |
| | | 33k | FP16 | $0.150 | $0.150 | — |
| | | 131k | FP16 | $0.150 | $0.600 | $6.49/hr H100-80GB |
| QW | | 262k | FP16 | $0.150 | $1.500 | $25.96/hr 4x H100-80GB |
| QW | | 262k | FP16 | $0.150 | $1.500 | $25.96/hr 4x H100-80GB |
| DS | | 131k | FP16 | $0.180 | $0.180 | $6.49/hr H100-80GB |
| | | 1049k | FP16 | $0.180 | $0.590 | $51.92/hr 8x H100-80GB |
| | | 131k | FP8 | $0.180 | $0.180 | $6.49/hr H100-80GB |
| QW | | 262k | FP16 | $0.180 | $0.680 | $6.49/hr H100-80GB |
| | | 8k | FP16 | $0.200 | $0.200 | $6.49/hr H100-80GB |
| | | 8k | FP16 | $0.200 | $0.200 | $6.49/hr H100-80GB |
| | | 262k | FP16 | $0.200 | $0.200 | $6.49/hr H100-80GB |
| | | 33k | FP16 | $0.200 | $0.200 | $12.98/hr 2x H100-80GB |
| | | 33k | FP16 | $0.200 | $0.200 | $12.98/hr 2x H100-80GB |
| QW | | 262k | FP16 | $0.200 | $0.600 | $25.96/hr 4x H100-80GB |
| Z | | 131k | FP16 | $0.200 | $1.100 | $12.98/hr 2x H100-80GB |
| | | 16k | FP16 | $0.200 | $0.200 | $12.98/hr 2x H100-80GB |
| | Llama Guard 4 12B | 1049k | FP16 | $0.200 | $0.200 | — |
| | Rime Arcana v2 | — | FP16 | $0.270 | N/A | $6.49/hr H100-80GB |
| | | 1049k | FP16 | $0.270 | $0.850 | $51.92/hr 8x H100-80GB |
| | Voxtral Mini 3B 2507 | — | FP16 | $0.270 | $0.850 | — |
| | Whisper Large V3 | 0k | FP16 | $0.270 | $0.850 | $6.49/hr H100-80GB |
| | | 32k | FP16 | $0.280 | $0.860 | — |
| MM | | 197k | FP16 | $0.300 | $1.200 | $23.90/hr 2x B200-180GB |
| QW | | 33k | FP8 | $0.300 | $0.300 | $6.49/hr H100-80GB |
| | | 262k | FP16 | $0.390 | $0.970 | $12.98/hr 2x H100-80GB |
| Z | | 203k | FP16 | $0.450 | $2.000 | $51.92/hr 8x H100-80GB |
| QW | Qwen3.6 Plus | 1000k | FP16 | $0.500 | $3.000 | — |
| QW | | 262k | FP16 | $0.500 | $1.200 | $12.98/hr 2x H100-80GB |
| QW | | 262k | FP16 | $0.500 | $1.500 | $12.98/hr 2x H100-80GB |
| | | 33k | FP16 | $0.600 | $0.600 | $12.98/hr 2x H100-80GB |
| | | 33k | FP16 | $0.600 | $0.600 | $25.96/hr 4x H100-80GB |
| QW | | 262k | FP16 | $0.600 | $3.600 | — |
| Z | | 203k | FP16 | $0.600 | $2.200 | $51.92/hr 8x H100-80GB |
| DS | | 16k | FP16 | $0.800 | $0.800 | $25.96/hr 4x H100-80GB |
| | | 8k | FP16 | $0.800 | $0.800 | $12.98/hr 2x H100-80GB |
| QW | | 33k | FP16 | $0.800 | $0.800 | $12.98/hr 2x H100-80GB |
| QW | | 16k | FP16 | $0.800 | $0.800 | $25.96/hr 4x H100-80GB |
| | | 131k | FP8 | $0.880 | $0.880 | $12.98/hr 2x H100-80GB |
| | | 131k | FP8 | $0.880 | $0.880 | $25.96/hr 4x H100-80GB |
| | | 8k | FP8 | $0.880 | $0.880 | $25.96/hr 4x H100-80GB |
| NV | | 33k | FP16 | $0.880 | $0.880 | $25.96/hr 4x H100-80GB |
| QW | | 33k | FP16 | $0.900 | $0.900 | $25.96/hr 4x H100-80GB |
| Z | | 203k | FP16 | $1.000 | $3.200 | $47.80/hr 4x B200-180GB |
| | | 262k | FP16 | $1.200 | $4.500 | $95.60/hr 8x B200-180GB |
| QW | | 33k | FP16 | $1.200 | $1.200 | $25.96/hr 4x H100-80GB |
| QW | | 131k | FP8 | $1.200 | $1.200 | $25.96/hr 4x H100-80GB |
| QW | | 33k | FP16 | $1.200 | $1.200 | $25.96/hr 4x H100-80GB |
| QW | | 131k | FP16 | $1.200 | $1.200 | $25.96/hr 4x H100-80GB |
| | | 164k | FP16 | $1.250 | $1.250 | — |
| QW | Qwen3.7 Max | 1000k | FP16 | $1.250 | $3.750 | — |
| Z | | 203k | FP16 | $1.400 | $4.400 | $47.80/hr 4x B200-180GB |
| DS | | 131k | FP16 | $1.600 | $1.600 | $25.96/hr 4x H100-80GB |
| QW | | 33k | FP16 | $1.950 | $8.000 | $25.96/hr 4x H100-80GB |
| DS | | 131k | FP16 | $2.000 | $2.000 | $51.92/hr 8x H100-80GB |
| QW | | 262k | FP16 | $2.000 | $2.000 | $51.92/hr 8x H100-80GB |
| DS | | 512k | FP16 | $2.100 | $4.400 | $95.60/hr 8x B200-180GB |
| | | 4k | FP16 | $3.500 | $3.500 | $51.92/hr 8x H100-80GB |
| | Kokoro 82M | — | FP16 | $4.000 | N/A | — |
| | Orpheus 3B | — | FP16 | $15.000 | N/A | $6.49/hr H100-80GB |
| | Cartesia Sonic | — | FP16 | $65.000 | N/A | — |
| | Cartesia Sonic 2 | 0k | FP16 | $65.000 | N/A | — |
| | Cartesia Sonic 3 | 0k | FP16 | $65.000 | N/A | — |
Pricing is from Together AI. Serverless prices are per 1M tokens. Dedicated prices are per hour, with the cheapest GPU configuration shown. The Batch API is priced at 50% of the serverless rate.
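As a rough illustration of how the per-1M-token prices translate into a bill, here is a minimal sketch of the arithmetic. The function name `serverless_cost` and its parameters are hypothetical and not part of any Together AI SDK. It also assumes that the 50% batch rate applies equally to input and output tokens, which the note above does not spell out.

```python
from decimal import Decimal

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens
BATCH_MULTIPLIER = Decimal("0.5")             # "Batch API at 50% of serverless"


def serverless_cost(input_tokens, output_tokens,
                    input_price_per_1m, output_price_per_1m,
                    batch=False):
    """Estimate the cost in USD of a serverless workload.

    Prices are passed as strings (e.g. "0.180") so Decimal keeps them exact.
    Assumption: the batch discount applies to both input and output tokens.
    """
    cost = (
        Decimal(input_tokens) * Decimal(input_price_per_1m)
        + Decimal(output_tokens) * Decimal(output_price_per_1m)
    ) / TOKENS_PER_PRICING_UNIT
    return cost * BATCH_MULTIPLIER if batch else cost


if __name__ == "__main__":
    # Example using the $0.180 input / $0.590 output row from the table:
    # 2M input tokens and 500k output tokens.
    print(serverless_cost(2_000_000, 500_000, "0.180", "0.590"))
    print(serverless_cost(2_000_000, 500_000, "0.180", "0.590", batch=True))
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request costs are summed.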
Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image/video pricing not available via API.
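Because the table lists both a per-token serverless price and a per-hour dedicated price for many models, one natural question is the traffic level at which a dedicated endpoint becomes cheaper. The sketch below computes that break-even point. The helper `breakeven_tokens_per_hour` and the `output_share` parameter are hypothetical, and the calculation assumes the dedicated GPU can actually sustain the resulting throughput, which the pricing data alone cannot confirm.

```python
def breakeven_tokens_per_hour(dedicated_price_per_hour,
                              input_price_per_1m,
                              output_price_per_1m,
                              output_share=0.25):
    """Tokens per hour at which serverless spend equals the dedicated hourly price.

    output_share is the assumed fraction of total tokens that are output tokens.
    It is a workload assumption, not something stated in the pricing table.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    # Blended serverless price per 1M tokens for this input/output mix.
    blended_price_per_1m = (
        (1.0 - output_share) * input_price_per_1m
        + output_share * output_price_per_1m
    )
    if blended_price_per_1m <= 0:
        raise ValueError("blended serverless price must be positive")
    return dedicated_price_per_hour / blended_price_per_1m * 1_000_000


if __name__ == "__main__":
    # Example using the DS row priced at $0.180 / $0.180 per 1M tokens
    # with a $6.49/hr H100-80GB dedicated endpoint.
    tokens = breakeven_tokens_per_hour(6.49, 0.180, 0.180)
    print(f"Break-even at roughly {tokens / 1e6:.2f}M tokens per hour")
```

Below the break-even volume, serverless is the cheaper option under these assumptions; above it, the dedicated endpoint wins only if a single deployment can serve that many tokens per hour.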
Compare Together AI with Other Providers
Together AI Free Tier: free models, credits & limits
Compare pricing & models:
Together AI vs Groq
Together AI vs Fireworks AI
Together AI vs DeepInfra
Together AI vs Cerebras
Together AI vs SambaNova
Together AI vs Nebius AI
Together AI vs Cloudflare Workers AI
Together AI vs AWS Bedrock
Together AI vs Azure OpenAI
Together AI vs Google AI Studio
Together AI vs OpenRouter
Together AI vs Novita AI