Price Per Token

AI Embedding Model Pricing Comparison

Compare pricing for embedding models across AWS Bedrock and direct APIs. Find the cheapest embedding API for Titan, Cohere Embed, and other embedding models. All prices per 1M input tokens.

Embedding API Pricing Overview

  • 3 embedding models
  • 1 model author
  • 3 available on AWS Bedrock
  • $0.1000 lowest price per 1M tokens

All Embedding Model Prices

Author | Model | Dimensions | Max Input | Bedrock / 1M | Direct API / 1M | Cheapest
—      | —     | 1,024      | 512       | $0.100       | N/A             | Bedrock
—      | —     | 1,024      | 512       | $0.100       | N/A             | Bedrock
—      | —     | 1,024      | 128,000   | $0.120       | N/A             | Bedrock

About AI Embedding Model Pricing

Embedding models convert text into dense vector representations used for semantic search, retrieval-augmented generation (RAG), clustering, and classification. Unlike LLMs, embedding models only have input pricing — there are no output tokens.

  • AWS Bedrock provides managed access to embedding models from Amazon, Cohere, and others
  • Direct APIs are available from providers like OpenAI, Cohere, and Voyage AI

All prices shown are per 1 million input tokens. Key factors when choosing an embedding model include dimensions (vector size), max input tokens (context window), and price.
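Because embedding pricing is input-only and quoted per 1M tokens, estimating a job's cost is simple arithmetic. A minimal sketch (the function name and the 50,000-token corpus are illustrative, not from the table):

```python
def embedding_cost(total_tokens: int, price_per_million: float) -> float:
    """Cost in USD to embed `total_tokens` at a per-1M-input-token rate."""
    return total_tokens / 1_000_000 * price_per_million

# Embedding a 50,000-token corpus at $0.10 per 1M input tokens:
print(round(embedding_cost(50_000, 0.10), 4))  # 0.005
```

Since there are no output tokens to account for, this one number is the whole bill for the API call itself.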

Frequently Asked Questions

What are embedding models?

Embedding models convert text into numerical vectors that capture semantic meaning. These vectors enable similarity search, clustering, and retrieval-augmented generation (RAG). Unlike LLMs that generate text, embedding models produce fixed-size vectors as output.
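The similarity search those vectors enable is typically cosine similarity. A self-contained sketch with toy 2-dimensional vectors standing in for real 1,024-dimensional embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal (unrelated) vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In a RAG pipeline, the query is embedded once and compared against stored document vectors; the highest-scoring documents are retrieved as context.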

What is the cheapest embedding API?

The cheapest embedding API starts at $0.1000 per 1M input tokens. Prices vary by model and provider. AWS Bedrock often offers competitive pricing for embedding models. Use our comparison table to find the best deal.

What do embedding dimensions mean?

Dimensions refer to the size of the output vector (e.g., 1024 means each text input is converted to a 1024-number vector). Higher dimensions can capture more nuance but require more storage and compute for similarity searches. Most modern models use 1024 dimensions.
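The storage side of that trade-off is easy to quantify: raw index size is vectors × dimensions × bytes per value. A sketch assuming float32 storage and ignoring index overhead:

```python
def index_size_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for float32 embeddings, excluding any index data structures."""
    return num_vectors * dimensions * bytes_per_value

# One million 1,024-dimensional float32 vectors:
print(index_size_bytes(1_000_000, 1024) / 1024**3)  # ~3.8 GiB
```

Real vector databases add overhead for indexes and metadata, so treat this as a lower bound.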

What is max input tokens for embeddings?

Max input tokens is the maximum amount of text the model can process in a single embedding request. Models with larger context windows (like 128K tokens) can embed entire documents at once, while smaller windows (512 tokens) require chunking longer texts.
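Chunking for a small window can be sketched as a sliding window over the token sequence, with some overlap so sentences split at a boundary still share context. This operates on a pre-tokenized list; real tokenization is model-specific:

```python
def chunk_tokens(tokens: list[str], max_tokens: int = 512, overlap: int = 50) -> list[list[str]]:
    """Split a token sequence into overlapping chunks that fit a model's input window."""
    step = max_tokens - overlap
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), step)]

# A 1,200-token document against a 512-token window:
doc = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(doc)
print(len(chunks), len(chunks[0]))  # 3 512
```

Each chunk is then embedded separately, so a small window multiplies the number of API calls (and vectors stored) for long documents.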