
Together AI Pricing
Compare Together AI pricing for 261 models. Serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.
Last updated: Feb 21, 2026
Together AI Overview
Serverless: 66
Dedicated GPU: 175
Registry Only: 138
Free Tier: 4
Cheapest Input/1M: $0.00
Together AI Model Pricing
| Provider | Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|---|
| LQ | LFM2 24B A2B Preview | 128k | FP16 | $0.000 | $0.000 | — |
|  | BGE Base EN v1.5 | 1k | FP16 | $0.008 | $0.008 | — |
|  | BGE Large EN v1.5 | — | FP16 | $0.016 | $0.016 | — |
|  |  | 33k | FP16 | $0.020 | $0.040 | — |
|  | E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 | — |
| AR |  | 128k | FP16 | $0.045 | $0.150 | $6.72/hr 2x H100-80GB |
|  |  | 131k | FP16 | $0.050 | $0.200 | $3.36/hr H100-80GB |
|  |  | 131k | FP16 | $0.060 | $0.060 | $3.36/hr H100-80GB |
|  |  | 131k | FP8 | $0.060 | $0.060 | — |
| NV |  | 131k | FP16 | $0.060 | $0.250 | $3.36/hr H100-80GB |
| AL | GTE ModernBERT Base | 8k | FP16 | $0.080 | $0.080 | — |
| QW | Qwen2.5 1.5B Instruct | 33k | FP16 | $0.100 | $0.100 | — |
|  |  | 8k | INT4 | $0.100 | $0.100 | — |
|  |  | 33k | FP16 | $0.100 | $0.300 | — |
|  | Mxbai Rerank Large V2 | 33k | FP16 | $0.100 | $0.100 | — |
|  |  | 33k | FP16 | $0.150 | $0.150 | — |
|  |  | 131k | FP16 | $0.150 | $0.600 | $3.36/hr H100-80GB |
| QW | Qwen3 Next 80B A3B Instruct | 262k | FP16 | $0.150 | $1.500 | $13.44/hr 4x H100-80GB |
| QW |  | 262k | FP16 | $0.150 | $1.500 | $13.44/hr 4x H100-80GB |
|  | Marin 8B Instruct | 4k | FP16 | $0.180 | $0.180 | — |
|  |  | 1049k | FP16 | $0.180 | $0.590 | $26.88/hr 8x H100-80GB |
|  |  | 131k | FP8 | $0.180 | $0.180 | $3.36/hr H100-80GB |
| QW |  | 262k | FP16 | $0.180 | $0.680 | $3.36/hr H100-80GB |
|  |  | 131k | FP8 | $0.180 | $0.180 | — |
|  |  | 16k | FP16 | $0.200 | $0.200 | — |
|  |  | 8k | FP16 | $0.200 | $0.200 | $3.36/hr H100-80GB |
|  |  | 262k | FP16 | $0.200 | $0.200 | $3.36/hr H100-80GB |
|  |  | 33k | FP16 | $0.200 | $0.200 | $13.44/hr 4x H100-80GB |
|  |  | 33k | FP16 | $0.200 | $0.200 | $6.72/hr 2x H100-80GB |
| QW |  | 262k | FP16 | $0.200 | $0.600 | $13.44/hr 4x H100-80GB |
| Z |  | 131k | FP16 | $0.200 | $1.100 | $13.44/hr 4x H100-80GB |
|  | Llama Guard 2 8B | 8k | FP16 | $0.200 | $0.200 | — |
|  |  | 1049k | FP16 | $0.200 | $0.200 | — |
|  | VirtueGuard Text Lite | 33k | INT4 | $0.200 | $0.200 | — |
|  | Orpheus 3B | — | FP16 | $0.270 | $0.850 | $3.36/hr H100-80GB |
|  | Kokoro 82M | — | FP16 | $0.270 | $0.850 | — |
|  |  | 1049k | FP16 | $0.270 | $0.850 | $26.88/hr 8x H100-80GB |
|  | Whisper Large V3 | — | FP16 | $0.270 | $0.850 | $3.36/hr H100-80GB |
| MM |  | 197k | FP16 | $0.300 | $1.200 | — |
| QW |  | 33k | FP8 | $0.300 | $0.300 | $3.36/hr H100-80GB |
| Z |  | 203k | FP16 | $0.450 | $2.000 | $26.88/hr 8x H100-80GB |
|  |  | 262k | FP16 | $0.500 | $2.800 | $39.94/hr 8x H200-140GB |
| QW |  | 262k | FP16 | $0.500 | $1.200 | $6.72/hr 2x H100-80GB |
| QW |  | 262k | FP16 | $0.500 | $1.500 | $6.72/hr 2x H100-80GB |
| DS |  | 131k | FP16 | $0.600 | $1.700 | $39.94/hr 8x H200-140GB |
|  |  | 33k | FP16 | $0.600 | $0.600 | $6.72/hr 2x H100-80GB |
| QW | Qwen3.5 397B A17B | 262k | FP16 | $0.600 | $3.600 | — |
| Z |  | 203k | FP16 | $0.600 | $2.200 | $26.88/hr 8x H100-80GB |
| QW |  | 262k | FP16 | $0.650 | $3.000 | $19.97/hr 4x H200-140GB |
| QW | Qwen2.5 14B Instruct | 33k | FP16 | $0.800 | $0.800 | $6.72/hr 2x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $6.72/hr 2x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $13.44/hr 4x H100-80GB |
|  |  | 8k | FP16 | $0.900 | $0.900 | — |
|  |  | 8k | FP16 | $0.900 | $0.900 | — |
|  |  | 262k | FP16 | $1.000 | $3.000 | — |
| Z |  | 203k | FP16 | $1.000 | $3.200 | — |
|  |  | 262k | FP16 | $1.200 | $4.000 | $39.94/hr 8x H200-140GB |
| QW |  | 33k | FP16 | $1.200 | $1.200 | $13.44/hr 4x H100-80GB |
|  |  | 164k | FP16 | $1.250 | $1.250 | — |
| DS |  | 131k | FP16 | $2.000 | $2.000 | $26.88/hr 8x H100-80GB |
| QW |  | 262k | FP16 | $2.000 | $2.000 | $26.88/hr 8x H100-80GB |
| DS |  | 164k | FP16 | $3.000 | $7.000 | $39.94/hr 8x H200-140GB |
|  |  | 4k | FP16 | $3.500 | $3.500 | $26.88/hr 8x H100-80GB |
|  | Cartesia Sonic | — | FP16 | $65.000 | N/A | — |
|  | Cartesia Sonic 2 | — | FP16 | $65.000 | N/A | — |
|  | Cartesia Sonic 3 | — | FP16 | $65.000 | N/A | — |
Pricing from Together AI. Serverless prices per 1M tokens. Dedicated prices per hour (cheapest GPU config shown). Batch API at 50% of serverless.
Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image/video pricing not available via API.
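
As a rough illustration of how these units combine, the sketch below estimates the cost of a workload from per-1M-token prices, applies the 50% batch rate, and computes the hourly token volume at which a dedicated endpoint costs the same as serverless. The function names are hypothetical and not part of any Together AI SDK. The figures come from the Qwen2.5 14B Instruct row ($0.800 input, $0.800 output, $6.72/hr), and the break-even figure assumes a single blended per-token price. Whether a dedicated endpoint can actually sustain that throughput is not stated here.

```python
from decimal import Decimal

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # serverless prices are quoted per 1M tokens
BATCH_MULTIPLIER = Decimal("0.5")           # "Batch API at 50% of serverless"


def serverless_cost(input_tokens, output_tokens, input_price, output_price, batch=False):
    """Return the USD cost of a workload at per-1M-token prices (hypothetical helper)."""
    cost = (
        Decimal(input_tokens) * Decimal(input_price)
        + Decimal(output_tokens) * Decimal(output_price)
    ) / TOKENS_PER_PRICE_UNIT
    return cost * BATCH_MULTIPLIER if batch else cost


def breakeven_tokens_per_hour(hourly_price, blended_price_per_1m):
    """Tokens per hour at which dedicated and serverless cost the same.

    Assumes one blended price for input and output tokens, which only
    holds exactly when the two prices are equal.
    """
    return Decimal(hourly_price) / Decimal(blended_price_per_1m) * TOKENS_PER_PRICE_UNIT


# Example: 10M input tokens and 2M output tokens on the Qwen2.5 14B Instruct row.
standard = serverless_cost(10_000_000, 2_000_000, "0.800", "0.800")
batched = serverless_cost(10_000_000, 2_000_000, "0.800", "0.800", batch=True)
breakeven = breakeven_tokens_per_hour("6.72", "0.800")

print(f"serverless: ${standard:.2f}")
print(f"batch:      ${batched:.2f}")
print(f"break-even: {breakeven:,.0f} tokens/hour")
```

Prices are passed as strings and handled with `Decimal` so that three-decimal values such as `$0.045` are not distorted by binary floating point. Below the break-even volume, serverless is the cheaper option under these assumptions.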