Price Per Token

AWS Bedrock vs DeepInfra

Compare pricing across 19 shared models. AWS Bedrock offers 71 models, DeepInfra offers 66.

Shared Models: 19
AWS Bedrock Cheaper: 0
DeepInfra Cheaper: 17
Same Price: 2

Price Comparison — Shared Models

| Model | AWS Bedrock Input | DeepInfra Input | AWS Bedrock Output | DeepInfra Output | Cheaper |
|---|---|---|---|---|---|
| DeepSeek V3.2 | $0.620 | $0.260 | $1.85 | $0.380 | DeepInfra |
| Gemma 3 12B | $0.090 | $0.040 | $0.290 | $0.130 | DeepInfra |
| Gemma 3 27B | $0.230 | $0.080 | $0.380 | $0.160 | DeepInfra |
| Gemma 3 4B | $0.040 | $0.040 | $0.080 | $0.080 | Same |
| GLM 4.7 | $0.600 | $0.400 | $2.20 | $1.75 | DeepInfra |
| GLM 5 | $1.00 | $0.600 | $3.20 | $2.08 | DeepInfra |
| GLM-4.7-Flash | $0.070 | $0.060 | $0.400 | $0.400 | DeepInfra |
| GPT-OSS-120b | $0.150 | $0.150 | $0.600 | $0.600 | Same |
| GPT-OSS-20b | $0.070 | $0.030 | $0.150 | $0.140 | DeepInfra |
| Kimi K2.5 | $0.600 | $0.450 | $3.00 | $2.25 | DeepInfra |
| Llama 3.1 70B Instruct | $0.720 | $0.400 | $0.720 | $0.400 | DeepInfra |
| MiniMax M2.5 | $0.300 | $0.150 | $1.20 | $1.15 | DeepInfra |
| Mistral Small 3.2 24B | $0.500 | $0.075 | $1.50 | $0.200 | DeepInfra |
| Nemotron 3 Nano 30B A3B | $0.060 | $0.050 | $0.240 | $0.200 | DeepInfra |
| Nemotron 3 Super 120B A12B | $0.150 | $0.100 | $0.650 | $0.500 | DeepInfra |
| Nemotron Nano 9B V2 | $0.060 | $0.040 | $0.230 | $0.160 | DeepInfra |
| Qwen3 32B | $0.200 | $0.080 | $0.780 | $0.280 | DeepInfra |
| Qwen3 Next 80B A3B Instruct | $0.150 | $0.090 | $1.20 | $1.10 | DeepInfra |
| Qwen3 VL 235B A22B Instruct | $0.530 | $0.200 | $2.66 | $0.880 | DeepInfra |
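
To turn the listed prices into a cost for a concrete workload, a minimal sketch along the following lines may help. It assumes the prices are quoted in USD per 1 million tokens, which this page does not state explicitly. The `Price` class, the `workload_cost` helper, and the example token volumes are hypothetical names and values chosen for illustration, and only three of the shared models are copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_usd: float   # price for input tokens, per pricing unit
    output_usd: float  # price for output tokens, per pricing unit


# ASSUMPTION: prices are per 1,000,000 tokens; the page does not state the unit.
TOKENS_PER_UNIT = 1_000_000

# A few rows copied from the shared-model comparison above.
PRICES = {
    "DeepSeek V3.2": {
        "AWS Bedrock": Price(0.620, 1.85),
        "DeepInfra": Price(0.260, 0.380),
    },
    "Llama 3.1 70B Instruct": {
        "AWS Bedrock": Price(0.720, 0.720),
        "DeepInfra": Price(0.400, 0.400),
    },
    "GPT-OSS-120b": {
        "AWS Bedrock": Price(0.150, 0.600),
        "DeepInfra": Price(0.150, 0.600),
    },
}


def workload_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the given input/output prices."""
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_UNIT


# Hypothetical monthly volume: 50M input tokens and 10M output tokens.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

for model, providers in PRICES.items():
    for provider, price in providers.items():
        cost = workload_cost(price, INPUT_TOKENS, OUTPUT_TOKENS)
        print(f"{model:24} {provider:12} ${cost:,.2f}")
```

Because input and output tokens are priced separately, the gap between providers depends on the input-to-output ratio of the workload, so it is worth running the estimate with realistic volumes rather than comparing a single column.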

Model Coverage

Shared: 19 models available on both
AWS Bedrock: 68 total
DeepInfra: 66 total

Frequently Asked Questions

DeepInfra is cheaper on 17 out of 19 shared models. AWS Bedrock is cheaper on 0 models. 2 models have the same price.
AWS Bedrock and DeepInfra share 19 models. AWS Bedrock has 49 exclusive models, while DeepInfra has 47 exclusive models.
AWS Bedrock offers 71 models compared to DeepInfra's 66. However, model count alone doesn't determine the better provider — consider pricing, latency, and which specific models you need.
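
The tally quoted above (17 DeepInfra, 0 AWS Bedrock, 2 same) can be reproduced from the shared-model prices with a short sketch like the one below. The page does not say how it decides which provider is "cheaper", so the rule here is an assumption: a provider wins if it is no more expensive on both input and output and strictly cheaper on at least one. The function name `cheaper` and the `Mixed` label are hypothetical.

```python
from collections import Counter

# (model, Bedrock input, DeepInfra input, Bedrock output, DeepInfra output), in USD,
# copied from the shared-model price comparison.
ROWS = [
    ("DeepSeek V3.2", 0.620, 0.260, 1.85, 0.380),
    ("Gemma 3 12B", 0.090, 0.040, 0.290, 0.130),
    ("Gemma 3 27B", 0.230, 0.080, 0.380, 0.160),
    ("Gemma 3 4B", 0.040, 0.040, 0.080, 0.080),
    ("GLM 4.7", 0.600, 0.400, 2.20, 1.75),
    ("GLM 5", 1.00, 0.600, 3.20, 2.08),
    ("GLM-4.7-Flash", 0.070, 0.060, 0.400, 0.400),
    ("GPT-OSS-120b", 0.150, 0.150, 0.600, 0.600),
    ("GPT-OSS-20b", 0.070, 0.030, 0.150, 0.140),
    ("Kimi K2.5", 0.600, 0.450, 3.00, 2.25),
    ("Llama 3.1 70B Instruct", 0.720, 0.400, 0.720, 0.400),
    ("MiniMax M2.5", 0.300, 0.150, 1.20, 1.15),
    ("Mistral Small 3.2 24B", 0.500, 0.075, 1.50, 0.200),
    ("Nemotron 3 Nano 30B A3B", 0.060, 0.050, 0.240, 0.200),
    ("Nemotron 3 Super 120B A12B", 0.150, 0.100, 0.650, 0.500),
    ("Nemotron Nano 9B V2", 0.060, 0.040, 0.230, 0.160),
    ("Qwen3 32B", 0.200, 0.080, 0.780, 0.280),
    ("Qwen3 Next 80B A3B Instruct", 0.150, 0.090, 1.20, 1.10),
    ("Qwen3 VL 235B A22B Instruct", 0.530, 0.200, 2.66, 0.880),
]


def cheaper(b_in: float, d_in: float, b_out: float, d_out: float) -> str:
    """Classify one row. ASSUMED rule: a provider is cheaper when it is not
    more expensive on either price and strictly cheaper on at least one."""
    if b_in == d_in and b_out == d_out:
        return "Same"
    if d_in <= b_in and d_out <= b_out:
        return "DeepInfra"
    if b_in <= d_in and b_out <= d_out:
        return "AWS Bedrock"
    return "Mixed"  # one provider wins on input, the other on output


tally = Counter(cheaper(*row[1:]) for row in ROWS)

for label in ("DeepInfra", "AWS Bedrock", "Same", "Mixed"):
    print(f"{label}: {tally[label]}")
```

Under this rule no row falls into the `Mixed` bucket, which is consistent with the counts reported on the page.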