Together AI Pricing

Compare Together AI pricing for 264 models. Serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.

Last updated: Mar 16, 2026

Together AI Overview

Serverless: 49
Dedicated GPU: 191
Registry Only: 159
Free Tier: 3
Cheapest Input/1M: $0.02

Together AI Model Pricing

| Provider | Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|---|
|  |  | 33k | FP16 | $0.020 | $0.040 |  |
|  | E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 |  |
|  |  | 33k | FP16 | $0.030 | $0.120 |  |
|  |  | 131k | FP16 | $0.050 | $0.200 | $3.99/hr H100-80GB |
|  |  | 131k | FP16 | $0.060 | $0.060 | $3.99/hr H100-80GB |
|  |  | 33k | FP16 | $0.100 | $0.100 |  |
|  |  | 8k | INT4 | $0.100 | $0.100 |  |
|  |  | 33k | FP16 | $0.100 | $0.300 |  |
|  |  | 262k | FP16 | $0.100 | $0.150 | $3.99/hr H100-80GB |
|  |  | 33k | FP16 | $0.150 | $0.150 |  |
|  |  | 131k | FP16 | $0.150 | $0.600 | $3.99/hr H100-80GB |
|  |  | 262k | FP16 | $0.150 | $1.500 | $15.96/hr 4x H100-80GB |
|  |  | 1049k | FP16 | $0.180 | $0.590 | $31.92/hr 8x H100-80GB |
|  |  | 131k | FP8 | $0.180 | $0.180 | $3.99/hr H100-80GB |
|  |  | 262k | FP16 | $0.180 | $0.680 | $3.99/hr H100-80GB |
|  |  | 16k | FP16 | $0.200 | $0.200 |  |
|  |  | 8k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
|  |  | 33k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
|  |  | 262k | FP16 | $0.200 | $0.600 | $15.96/hr 4x H100-80GB |
|  |  | 131k | FP16 | $0.200 | $1.100 | $15.96/hr 4x H100-80GB |
|  | Llama Guard 4 12B | 1049k | FP16 | $0.200 | $0.200 |  |
|  | VirtueGuard Text Lite | 33k | INT4 | $0.200 | $0.200 |  |
|  |  | 1049k | FP16 | $0.270 | $0.850 | $31.92/hr 8x H100-80GB |
|  | Whisper Large V3 |  | FP16 | $0.270 | $0.850 | $3.99/hr H100-80GB |
|  |  | 197k | FP16 | $0.300 | $1.200 |  |
|  |  | 33k | FP8 | $0.300 | $0.300 | $3.99/hr H100-80GB |
|  |  | 203k | FP16 | $0.450 | $2.000 | $31.92/hr 8x H100-80GB |
|  |  | 262k | FP16 | $0.500 | $2.800 | $43.92/hr 8x H200-140GB |
|  |  | 262k | FP16 | $0.500 | $1.200 | $7.98/hr 2x H100-80GB |
|  |  | 131k | FP16 | $0.600 | $1.700 | $43.92/hr 8x H200-140GB |
|  |  | 33k | FP16 | $0.600 | $0.600 | $7.98/hr 2x H100-80GB |
|  |  | 262k | FP16 | $0.600 | $3.600 |  |
|  |  | 203k | FP16 | $0.600 | $2.200 | $31.92/hr 8x H100-80GB |
|  |  | 262k | FP16 | $0.650 | $3.000 | $21.96/hr 4x H200-140GB |
|  |  | 33k | FP16 | $0.800 | $0.800 | $7.98/hr 2x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $7.98/hr 2x H100-80GB |
|  |  | 131k | FP8 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
|  |  | 203k | FP16 | $1.000 | $3.200 | $79.60/hr 8x B200-180GB |
|  |  | 33k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
|  |  | 164k | FP16 | $1.250 | $1.250 |  |
|  |  | 131k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
|  |  | 262k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
|  |  | 164k | FP16 | $3.000 | $7.000 | $43.92/hr 8x H200-140GB |
|  |  | 4k | FP16 | $3.500 | $3.500 | $31.92/hr 8x H100-80GB |
|  | Kokoro 82M |  | FP16 | $4.000 | N/A |  |
|  | Orpheus 3B |  | FP16 | $15.000 | N/A | $3.99/hr H100-80GB |
|  | Cartesia Sonic |  | FP16 | $65.000 | N/A |  |
|  | Cartesia Sonic 2 |  | FP16 | $65.000 | N/A |  |
|  | Cartesia Sonic 3 |  | FP16 | $65.000 | N/A |  |

Pricing data comes from Together AI. Serverless prices are per 1M tokens. Dedicated prices are per hour, and only the cheapest GPU configuration is shown. The Batch API is billed at 50% of the serverless price.
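
As a rough illustration of how these units combine, the sketch below estimates a serverless bill from token counts, applies the 50% Batch API rate, and computes the hourly token volume at which a dedicated endpoint costs the same as serverless. The function names and the `output_share` parameter are hypothetical and not part of any Together AI API. The break-even figure also assumes the dedicated endpoint can actually sustain that throughput, which this page does not state.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)
BATCH_RATE = Decimal("0.5")  # Batch API is listed at 50% of serverless


def serverless_cost(input_tokens, output_tokens, input_per_m, output_per_m, batch=False):
    """Estimated cost in USD for one workload at per-1M-token prices."""
    cost = (
        Decimal(input_tokens) * Decimal(input_per_m)
        + Decimal(output_tokens) * Decimal(output_per_m)
    ) / TOKENS_PER_MILLION
    return cost * BATCH_RATE if batch else cost


def break_even_tokens_per_hour(dedicated_per_hour, input_per_m, output_per_m, output_share):
    """Tokens per hour at which serverless spend equals the dedicated hourly price.

    output_share is the assumed fraction of all tokens that are output tokens.
    """
    share = Decimal(output_share)
    blended_per_m = Decimal(input_per_m) * (1 - share) + Decimal(output_per_m) * share
    return Decimal(dedicated_per_hour) / blended_per_m * TOKENS_PER_MILLION


if __name__ == "__main__":
    # Prices taken from the 131k / FP16 row listed at $0.150 in, $0.600 out, $3.99/hr.
    print(serverless_cost(2_000_000, 500_000, "0.150", "0.600"))
    print(serverless_cost(2_000_000, 500_000, "0.150", "0.600", batch=True))
    print(break_even_tokens_per_hour("3.99", "0.150", "0.600", "0.2"))
```

With those example prices, 2M input tokens plus 500k output tokens come to $0.60 on serverless and $0.30 through the Batch API. At an assumed 20% output share, serverless spend matches the $3.99/hr dedicated price at about 16.6M tokens per hour.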

Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image and video pricing is not available via the API.

Compare Together AI with Other Providers