Together AI Pricing

Compare Together AI pricing for 227 models. Serverless inference, dedicated GPU endpoints, and fine-tuning across LLMs, embeddings, image, video, and audio models.

Last updated: Apr 10, 2026

Together AI Overview

Serverless: 74
Dedicated GPU: 156
Registry Only: 0
Free Tier: 98
Cheapest Input/1M: $0.02

Together AI Model Pricing

| Model | Context | Quant | Input/1M | Output/1M | Dedicated/hr |
|---|---|---|---|---|---|
| - | 33k | FP16 | $0.020 | $0.040 | - |
| - | 33k | FP16 | $0.020 | $0.020 | $3.99/hr H100-80GB |
| E5 Large Instruct | 1k | FP16 | $0.020 | $0.020 | - |
| - | 33k | FP16 | $0.030 | $0.120 | - |
| - | 131k | FP16 | $0.050 | $0.200 | $3.99/hr H100-80GB |
| - | 131k | FP16 | $0.060 | $0.060 | $3.99/hr H100-80GB |
| - | 131k | FP16 | $0.060 | $0.250 | $3.99/hr H100-80GB |
| - | 33k | FP16 | $0.100 | $0.100 | - |
| - | 8k | INT4 | $0.100 | $0.100 | - |
| - | 33k | FP16 | $0.100 | $0.300 | $7.98/hr 2x H100-80GB |
| - | 262k | FP16 | $0.100 | $0.150 | $7.98/hr 2x H100-80GB |
| Salesforce Llama Rank V1 | 8k | FP16 | $0.100 | $0.100 | $3.99/hr H100-80GB |
| - | 33k | FP16 | $0.150 | $0.150 | - |
| - | 131k | FP16 | $0.150 | $0.600 | $3.99/hr H100-80GB |
| - | 262k | FP16 | $0.150 | $1.500 | $15.96/hr 4x H100-80GB |
| - | 262k | FP16 | $0.150 | $1.500 | $15.96/hr 4x H100-80GB |
| - | 131k | FP16 | $0.180 | $0.180 | $3.99/hr H100-80GB |
| - | 1049k | FP16 | $0.180 | $0.590 | $31.92/hr 8x H100-80GB |
| - | 131k | FP8 | $0.180 | $0.180 | $3.99/hr H100-80GB |
| - | 262k | FP16 | $0.180 | $0.680 | $3.99/hr H100-80GB |
| - | 262k | FP16 | $0.200 | $0.500 | $7.98/hr 2x H100-80GB |
| - | 8k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
| - | 8k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
| - | 262k | FP16 | $0.200 | $0.200 | $3.99/hr H100-80GB |
| - | 33k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
| - | 33k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
| - | 262k | FP16 | $0.200 | $0.600 | $15.96/hr 4x H100-80GB |
| - | 131k | FP16 | $0.200 | $1.100 | $15.96/hr 4x H100-80GB |
| - | 16k | FP16 | $0.200 | $0.200 | $7.98/hr 2x H100-80GB |
| Llama Guard 4 12B | 1049k | FP16 | $0.200 | $0.200 | - |
| Rime Arcana v2 | - | FP16 | $0.270 | N/A | $3.99/hr H100-80GB |
| - | 1049k | FP16 | $0.270 | $0.850 | $31.92/hr 8x H100-80GB |
| Whisper Large V3 | - | FP16 | $0.270 | $0.850 | $3.99/hr H100-80GB |
| - | 197k | FP16 | $0.300 | $1.200 | - |
| - | 33k | FP8 | $0.300 | $0.300 | $3.99/hr H100-80GB |
| - | 203k | FP16 | $0.450 | $2.000 | $31.92/hr 8x H100-80GB |
| - | 262k | FP16 | $0.500 | $2.800 | $43.92/hr 8x H200-140GB |
| - | 262k | FP16 | $0.500 | $1.200 | $7.98/hr 2x H100-80GB |
| - | 262k | FP16 | $0.500 | $1.500 | $7.98/hr 2x H100-80GB |
| - | 131k | FP16 | $0.600 | $1.700 | $43.92/hr 8x H200-140GB |
| - | 33k | FP16 | $0.600 | $0.600 | $7.98/hr 2x H100-80GB |
| - | 33k | FP16 | $0.600 | $0.600 | $15.96/hr 4x H100-80GB |
| - | 262k | FP16 | $0.600 | $3.600 | - |
| - | 203k | FP16 | $0.600 | $2.200 | $31.92/hr 8x H100-80GB |
| - | 262k | FP16 | $0.650 | $3.000 | $21.96/hr 4x H200-140GB |
| - | 16k | FP16 | $0.800 | $0.800 | $15.96/hr 4x H100-80GB |
| - | 8k | FP16 | $0.800 | $0.800 | $7.98/hr 2x H100-80GB |
| - | 33k | FP16 | $0.800 | $0.800 | $7.98/hr 2x H100-80GB |
| - | 16k | FP16 | $0.800 | $0.800 | $15.96/hr 4x H100-80GB |
| - | 131k | FP8 | $0.880 | $0.880 | $7.98/hr 2x H100-80GB |
| - | 131k | FP8 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
| - | 8k | FP8 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
| - | 33k | FP16 | $0.880 | $0.880 | $15.96/hr 4x H100-80GB |
| - | 203k | FP16 | $1.000 | $3.200 | - |
| - | 262k | FP16 | $1.200 | $4.000 | $43.92/hr 8x H200-140GB |
| - | 33k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
| - | 131k | FP8 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
| - | 33k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
| - | 131k | FP16 | $1.200 | $1.200 | $15.96/hr 4x H100-80GB |
| - | 164k | FP16 | $1.250 | $1.250 | - |
| - | 164k | FP16 | $1.250 | $1.250 | $43.92/hr 8x H200-140GB |
| - | 203k | FP16 | $1.400 | $4.400 | $39.80/hr 4x B200-180GB |
| - | 131k | FP16 | $1.600 | $1.600 | $15.96/hr 4x H100-80GB |
| - | 33k | FP16 | $1.950 | $8.000 | $15.96/hr 4x H100-80GB |
| - | 131k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
| - | 262k | FP16 | $2.000 | $2.000 | $31.92/hr 8x H100-80GB |
| - | 164k | FP16 | $3.000 | $7.000 | - |
| - | 164k | FP16 | $3.000 | $7.000 | $43.92/hr 8x H200-140GB |
| - | 4k | FP16 | $3.500 | $3.500 | $31.92/hr 8x H100-80GB |
| Kokoro 82M | - | FP16 | $4.000 | N/A | - |
| Orpheus 3B | - | FP16 | $15.000 | N/A | $3.99/hr H100-80GB |
| Cartesia Sonic | - | FP16 | $65.000 | N/A | - |
| Cartesia Sonic 2 | 0k | FP16 | $65.000 | N/A | - |
| Cartesia Sonic 3 | 0k | FP16 | $65.000 | N/A | - |

Pricing data comes from Together AI. Serverless prices are per 1M tokens. Dedicated prices are per hour, showing the cheapest GPU configuration. The Batch API is priced at 50% of serverless.
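
To make the pricing rules concrete, here is a minimal Python sketch of the arithmetic: serverless cost from token counts, the 50% batch rate, and a rough break-even throughput against an hourly dedicated endpoint. The names `ServerlessPrice`, `serverless_cost`, and `break_even_tokens_per_hour` are hypothetical and not part of any Together AI SDK. The break-even figure also assumes the dedicated endpoint can sustain that throughput, which this page does not state.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000
BATCH_MULTIPLIER = 0.5  # from the page: "Batch API at 50% of serverless"


@dataclass(frozen=True)
class ServerlessPrice:
    """Per-1M-token prices in USD, as listed in the table (hypothetical helper type)."""
    input_per_million: float
    output_per_million: float


def serverless_cost(price: ServerlessPrice, input_tokens: int,
                    output_tokens: int, batch: bool = False) -> float:
    """Return the USD cost of a workload at serverless (or batch) rates."""
    cost = (input_tokens * price.input_per_million
            + output_tokens * price.output_per_million) / TOKENS_PER_MILLION
    return cost * BATCH_MULTIPLIER if batch else cost


def break_even_tokens_per_hour(price: ServerlessPrice, dedicated_per_hour: float,
                               output_share: float) -> float:
    """Tokens per hour at which serverless spend equals the dedicated hourly rate.

    output_share is the assumed fraction of all tokens that are output tokens.
    """
    blended = (price.input_per_million * (1 - output_share)
               + price.output_per_million * output_share)
    return dedicated_per_hour / blended * TOKENS_PER_MILLION


# Example using one table row: $0.050 input, $0.200 output, $3.99/hr dedicated.
row = ServerlessPrice(input_per_million=0.050, output_per_million=0.200)
standard = serverless_cost(row, input_tokens=10_000_000, output_tokens=2_000_000)
batched = serverless_cost(row, input_tokens=10_000_000, output_tokens=2_000_000, batch=True)
threshold = break_even_tokens_per_hour(row, dedicated_per_hour=3.99, output_share=0.2)
print(f"serverless ${standard:.2f}, batch ${batched:.2f}, "
      f"break-even {threshold / TOKENS_PER_MILLION:.1f}M tokens/hr")
```

Under these assumptions, a workload below the break-even throughput is cheaper on serverless, and one above it may be cheaper on a dedicated endpoint if the hardware can actually serve that load.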

Quantization: FP16 = Reference, FP8 = Turbo, INT4 = Lite. Image/video pricing not available via API.

Compare Together AI with Other Providers