Together AI Pricing

Compare Together AI pricing for 261 models. Serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.

Last updated: Feb 21, 2026

Together AI Overview

Serverless: 66
Dedicated GPU: 175
Registry Only: 138
Free Tier: 4
Cheapest Input/1M: $0.00

Together AI Model Pricing

| Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|
| LFM2 24B A2B Preview | 128k | FP16 | $0.000 | $0.000 | — |
| BGE Base EN v1.5 | 1k | FP16 | $0.008 | $0.008 | — |
| BGE Large EN v1.5 | — | FP16 | $0.016 | $0.016 | — |
| — | 33k | FP16 | $0.020 | $0.040 | — |
| E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 | — |
| — | 128k | FP16 | $0.045 | $0.150 | $6.72/hr 2x H100-80GB |
| — | 131k | FP16 | $0.050 | $0.200 | $3.36/hr H100-80GB |
| — | 131k | FP16 | $0.060 | $0.060 | $3.36/hr H100-80GB |
| — | 131k | FP8 | $0.060 | $0.060 | — |
| — | 131k | FP16 | $0.060 | $0.250 | $3.36/hr H100-80GB |
| GTE ModernBERT Base | 8k | FP16 | $0.080 | $0.080 | — |
| Qwen2.5 1.5B Instruct | 33k | FP16 | $0.100 | $0.100 | — |
| — | 8k | INT4 | $0.100 | $0.100 | — |
| — | 33k | FP16 | $0.100 | $0.300 | — |
| Mxbai Rerank Large V2 | 33k | FP16 | $0.100 | $0.100 | — |
| — | 33k | FP16 | $0.150 | $0.150 | — |
| — | 131k | FP16 | $0.150 | $0.600 | $3.36/hr H100-80GB |
| Qwen3 Next 80B A3B Instruct | 262k | FP16 | $0.150 | $1.500 | $13.44/hr 4x H100-80GB |
| — | 262k | FP16 | $0.150 | $1.500 | $13.44/hr 4x H100-80GB |
| Marin 8B Instruct | 4k | FP16 | $0.180 | $0.180 | — |
| — | 1049k | FP16 | $0.180 | $0.590 | $26.88/hr 8x H100-80GB |
| — | 131k | FP8 | $0.180 | $0.180 | $3.36/hr H100-80GB |
| — | 262k | FP16 | $0.180 | $0.680 | $3.36/hr H100-80GB |
| — | 131k | FP8 | $0.180 | $0.180 | — |
| — | 16k | FP16 | $0.200 | $0.200 | — |
| — | 8k | FP16 | $0.200 | $0.200 | $3.36/hr H100-80GB |
| — | 262k | FP16 | $0.200 | $0.200 | $3.36/hr H100-80GB |
| — | 33k | FP16 | $0.200 | $0.200 | $13.44/hr 4x H100-80GB |
| — | 33k | FP16 | $0.200 | $0.200 | $6.72/hr 2x H100-80GB |
| — | 262k | FP16 | $0.200 | $0.600 | $13.44/hr 4x H100-80GB |
| — | 131k | FP16 | $0.200 | $1.100 | $13.44/hr 4x H100-80GB |
| Llama Guard 2 8B | 8k | FP16 | $0.200 | $0.200 | — |
| — | 1049k | FP16 | $0.200 | $0.200 | — |
| VirtueGuard Text Lite | 33k | INT4 | $0.200 | $0.200 | — |
| Orpheus 3B | — | FP16 | $0.270 | $0.850 | $3.36/hr H100-80GB |
| Kokoro 82M | — | FP16 | $0.270 | $0.850 | — |
| — | 1049k | FP16 | $0.270 | $0.850 | $26.88/hr 8x H100-80GB |
| Whisper Large V3 | — | FP16 | $0.270 | $0.850 | $3.36/hr H100-80GB |
| — | 197k | FP16 | $0.300 | $1.200 | — |
| — | 33k | FP8 | $0.300 | $0.300 | $3.36/hr H100-80GB |
| — | 203k | FP16 | $0.450 | $2.000 | $26.88/hr 8x H100-80GB |
| — | 262k | FP16 | $0.500 | $2.800 | $39.94/hr 8x H200-140GB |
| — | 262k | FP16 | $0.500 | $1.200 | $6.72/hr 2x H100-80GB |
| — | 262k | FP16 | $0.500 | $1.500 | $6.72/hr 2x H100-80GB |
| — | 131k | FP16 | $0.600 | $1.700 | $39.94/hr 8x H200-140GB |
| — | 33k | FP16 | $0.600 | $0.600 | $6.72/hr 2x H100-80GB |
| Qwen3.5 397B A17B | 262k | FP16 | $0.600 | $3.600 | — |
| — | 203k | FP16 | $0.600 | $2.200 | $26.88/hr 8x H100-80GB |
| — | 262k | FP16 | $0.650 | $3.000 | $19.97/hr 4x H200-140GB |
| Qwen2.5 14B Instruct | 33k | FP16 | $0.800 | $0.800 | $6.72/hr 2x H100-80GB |
| — | 131k | FP8 | $0.880 | $0.880 | $6.72/hr 2x H100-80GB |
| — | 131k | FP8 | $0.880 | $0.880 | $13.44/hr 4x H100-80GB |
| — | 8k | FP16 | $0.900 | $0.900 | — |
| — | 8k | FP16 | $0.900 | $0.900 | — |
| — | 262k | FP16 | $1.000 | $3.000 | — |
| — | 203k | FP16 | $1.000 | $3.200 | — |
| — | 262k | FP16 | $1.200 | $4.000 | $39.94/hr 8x H200-140GB |
| — | 33k | FP16 | $1.200 | $1.200 | $13.44/hr 4x H100-80GB |
| — | 164k | FP16 | $1.250 | $1.250 | — |
| — | 131k | FP16 | $2.000 | $2.000 | $26.88/hr 8x H100-80GB |
| — | 262k | FP16 | $2.000 | $2.000 | $26.88/hr 8x H100-80GB |
| — | 164k | FP16 | $3.000 | $7.000 | $39.94/hr 8x H200-140GB |
| — | 4k | FP16 | $3.500 | $3.500 | $26.88/hr 8x H100-80GB |
| Cartesia Sonic | — | FP16 | $65.000 | N/A | — |
| Cartesia Sonic 2 | — | FP16 | $65.000 | N/A | — |
| Cartesia Sonic 3 | — | FP16 | $65.000 | N/A | — |

Pricing is sourced from Together AI. Serverless prices are per 1M tokens. Dedicated prices are per hour (cheapest GPU configuration shown). The Batch API is billed at 50% of the serverless rate.
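
As a rough illustration of how these units combine, the sketch below estimates a serverless bill from token counts and per-1M prices, applies the 50% Batch API rate, and computes the hourly token volume at which a dedicated endpoint would cost the same as serverless. The function names and the token volumes are hypothetical, chosen only for this example. The prices are taken from the Qwen3 Next 80B A3B Instruct row. The break-even figure assumes the dedicated endpoint can actually sustain that throughput, which this page does not state.

```python
def serverless_cost(input_tokens, output_tokens, input_per_m, output_per_m, batch=False):
    """Estimated serverless cost in USD, given per-1M-token prices.

    batch=True applies the 50% Batch API rate noted on this page.
    """
    cost = (input_tokens / 1_000_000) * input_per_m \
         + (output_tokens / 1_000_000) * output_per_m
    return cost * 0.5 if batch else cost


def breakeven_tokens_per_hour(dedicated_per_hour, price_per_m):
    """Tokens per hour at which serverless spend equals the dedicated hourly price.

    price_per_m is a single per-1M price (for example a blended input/output
    rate); choosing that blend is left to the caller and is an assumption.
    """
    return dedicated_per_hour / price_per_m * 1_000_000


# Hypothetical workload: 10M input tokens and 2M output tokens
# at $0.150 input and $1.500 output per 1M tokens.
standard = serverless_cost(10_000_000, 2_000_000, 0.150, 1.500)
batched = serverless_cost(10_000_000, 2_000_000, 0.150, 1.500, batch=True)
print(f"serverless: ${standard:.2f}, batch: ${batched:.2f}")

# Dedicated at $13.44/hr compared against the $1.500 output price alone.
print(f"break-even: {breakeven_tokens_per_hour(13.44, 1.500):,.0f} tokens/hr")
```

Under these assumptions the workload costs $4.50 on serverless and $2.25 through the Batch API, and the dedicated endpoint breaks even at roughly 8.96 million output-priced tokens per hour.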

Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image/video pricing not available via API.