Supported Models
alembica supports a variety of AI models from leading providers including OpenAI, GoogleAI, Cohere, Anthropic, DeepSeek, Perplexity, AWS Bedrock, Azure AI, Vertex AI, and self-hosted OpenAI-compatible endpoints.
The tables below list the supported models by provider. For each model, they give the maximum number of input tokens and the cost per million input tokens, so you can choose a model that fits your context length and budget.
AWS Bedrock, Azure AI, Vertex AI, and self-hosted providers use custom model IDs, and alembica does not compute costs for them.
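As a rough illustration of choosing a model by context length and budget, the Python sketch below filters a few rows copied from the tables in this section and picks the cheapest one whose input limit is large enough. `ModelInfo`, `MODELS`, `input_cost_usd`, and `cheapest_model` are hypothetical names for this example and are not part of alembica's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelInfo:
    name: str
    max_input_tokens: int
    usd_per_million_input: float


# A few rows copied from the tables in this section (not an exhaustive list).
MODELS = [
    ModelInfo("GPT-4o Mini", 128_000, 0.15),
    ModelInfo("GPT-4.1 Nano", 1_000_000, 0.10),
    ModelInfo("Claude 3 Haiku", 200_000, 0.25),
    ModelInfo("Gemini 2.5 Flash", 1_048_576, 0.30),
]


def input_cost_usd(model: ModelInfo, input_tokens: int) -> float:
    """Input cost in USD: tokens / 1,000,000 * price per million."""
    return input_tokens / 1_000_000 * model.usd_per_million_input


def cheapest_model(models: Sequence[ModelInfo], input_tokens: int) -> Optional[ModelInfo]:
    """Return the cheapest model that can accept `input_tokens`, or None."""
    candidates = [m for m in models if m.max_input_tokens >= input_tokens]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.usd_per_million_input)
```

For a 150,000-token prompt, `cheapest_model(MODELS, 150_000)` selects GPT-4.1 Nano, because GPT-4o Mini's 128,000-token limit is too small. The sketch considers input cost only; output tokens are priced separately and are not covered by these tables.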
Self-Hosted (OpenAI-Compatible)
Local endpoints such as Ollama, vLLM, LM Studio, or LocalAI are supported via the SelfHosted provider. Model IDs and limits are defined by your local runtime, and costs are not computed.
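Because model IDs come from the local runtime, it can help to ask the endpoint which IDs it exposes. The sketch below assumes the runtime implements the OpenAI-style `GET /models` route under its base URL and returns a JSON object with a `data` list of entries that each carry an `id`. The helper names and the sample payload are hypothetical, and this is not alembica code.

```python
import json
from urllib.request import urlopen


def parse_model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style model list response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(base_url: str, timeout: float = 5.0) -> list:
    """Query `<base_url>/models` on a running OpenAI-compatible server."""
    url = f"{base_url.rstrip('/')}/models"
    with urlopen(url, timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


# Hypothetical response body, used here only to show the expected shape.
sample_payload = {
    "object": "list",
    "data": [{"id": "llama3.2", "object": "model"}],
}
model_ids = parse_model_ids(sample_payload)
```

With a server running, a call such as `list_model_ids("http://localhost:11434/v1")` would query it directly; that address assumes Ollama's default port and will differ for other runtimes.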
OpenAI
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| GPT-5.2 | 400,000 | $1.75 |
| GPT-5.1 | 400,000 | $1.25 |
| GPT-5 | 400,000 | $1.25 |
| GPT-5 Mini | 128,000 | $0.25 |
| GPT-5 Nano | 128,000 | $0.05 |
| o4 Mini | 200,000 | $1.10 |
| o3 Mini | 200,000 | $1.10 |
| o3 | 200,000 | $2.00 |
| o1 Mini | 128,000 | $1.10 |
| o1 | 200,000 | $15.00 |
| GPT-4.1 Nano | 1,000,000 | $0.10 |
| GPT-4.1 Mini | 1,000,000 | $0.40 |
| GPT-4.1 | 1,000,000 | $2.00 |
| GPT-4o Mini | 128,000 | $0.15 |
| GPT-4o | 128,000 | $2.50 |
| GPT-4 Turbo | 128,000 | $10.00 |
| GPT-3.5 Turbo | 16,385 | $0.50 |
Anthropic
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| Claude 4.5 Opus | 200,000 | $15.00 |
| Claude 4.5 Sonnet | 200,000 | $3.00 |
| Claude 4.5 Haiku | 200,000 | $1.00 |
| Claude 4.0 Opus | 200,000 | $15.00 |
| Claude 4.0 Sonnet | 200,000 | $3.00 |
| Claude 3.7 Sonnet | 200,000 | $3.00 |
| Claude 3.5 Haiku | 200,000 | $0.80 |
| Claude 3 Opus | 200,000 | $15.00 |
| Claude 3.5 Sonnet | 200,000 | $3.00 |
| Claude 3 Haiku | 200,000 | $0.25 |
Cohere
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| Command A | 256,000 | $2.50 |
| Command A Reasoning | 256,000 | Free (preview) |
| Command R+ | 128,000 | $2.50 |
| Command R7B | 128,000 | $0.0375 |
| Command R, August 2024 | 128,000 | $0.15 |
| Command R | 128,000 | $0.15 |
| Command Light | 4,096 | $0.30 |
| Command | 4,096 | $1.00 |
Perplexity
Perplexity Sonar models provide real-time web search and are served through an OpenAI-compatible API.
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| Sonar | 128,000 | $1.00 |
| Sonar Pro | 128,000 | $3.00 |
| Sonar Reasoning Pro | 128,000 | $2.00 |
| Sonar Deep Research | 128,000 | $2.00 |
DeepSeek
DeepSeek V3.2 models feature automatic context caching. Input tokens served from the cache (cache hits) cost $0.028 per million, 90% less than the cache-miss price shown below. Maximum output is 8K tokens for chat and 64K tokens for reasoner.
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| DeepSeek-V3.2-Chat | 128,000 | $0.28 (cache miss) |
| DeepSeek-V3.2-Reasoner | 128,000 | $0.28 (cache miss) |
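To see how caching affects a bill, the sketch below blends the cache-hit and cache-miss prices quoted above. It assumes you already know how many input tokens were served from the cache (for example, from the provider's usage report). The function name is hypothetical, and this is an illustration of the arithmetic, not a description of how alembica computes cost.

```python
# Prices in USD per million input tokens, taken from this section.
CACHE_HIT_USD_PER_M = 0.028
CACHE_MISS_USD_PER_M = 0.28


def deepseek_input_cost_usd(input_tokens: int, cached_tokens: int = 0) -> float:
    """Blend hit and miss prices for one request's input tokens."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    miss_tokens = input_tokens - cached_tokens
    total = cached_tokens * CACHE_HIT_USD_PER_M + miss_tokens * CACHE_MISS_USD_PER_M
    return total / 1_000_000
```

For example, a 100,000-token prompt with 80,000 cached tokens costs about $0.00784 in input, compared with $0.028 when nothing is cached.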
GoogleAI
| Model | Maximum Input Tokens | Cost of 1M Input Tokens |
|---|---|---|
| Gemini 3 Pro Preview | 1,048,576 | $2.00 (≤200k), $4.00 (>200k) |
| Gemini 3 Flash Preview | 1,048,576 | $0.50 |
| Gemini 2.5 Pro | 1,048,576 | $1.25 (≤200k), $2.50 (>200k) |
| Gemini 2.5 Flash | 1,048,576 | $0.30 |
| Gemini 2.5 Flash-Lite | 1,048,576 | $0.10 |
| Gemini 2.0 Flash-Lite | 1,048,576 | $0.075 |
| Gemini 2.0 Flash | 1,048,576 | $0.10 |
| Gemini 1.5 Flash | 1,048,576 | $0.15 |
| Gemini 1.5 Pro | 2,097,152 | $2.50 |
| Gemini 1.0 Pro (deprecated) | 32,760 | $0.50 |
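Two rows in the table above list tiered prices that depend on prompt size. The sketch below works through that arithmetic under one assumption: the tier is chosen by the total prompt size, and the selected rate applies to the whole prompt. Check the provider's pricing page before relying on this. The function name is hypothetical and is not part of alembica.

```python
def tiered_input_cost_usd(
    input_tokens: int,
    low_rate: float,
    high_rate: float,
    threshold: int = 200_000,
) -> float:
    """Input cost in USD with a price tier selected by prompt size.

    Assumption: prompts at or below `threshold` tokens use `low_rate`,
    larger prompts use `high_rate` for every token in the prompt.
    Rates are in USD per million input tokens.
    """
    rate = low_rate if input_tokens <= threshold else high_rate
    return input_tokens / 1_000_000 * rate


# Gemini 2.5 Pro rates from the table: $1.25 (<=200k), $2.50 (>200k).
small_prompt_cost = tiered_input_cost_usd(100_000, 1.25, 2.50)
large_prompt_cost = tiered_input_cost_usd(300_000, 1.25, 2.50)
```

Under this assumption, a 100,000-token prompt costs $0.125 and a 300,000-token prompt costs $0.75, so crossing the 200k boundary raises the per-token price for the entire request.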
AWS Bedrock (Llama and other models)
AWS Bedrock models are addressed by their Bedrock model IDs. Costs and context limits vary by model and are not tracked by alembica.
Azure AI (Azure OpenAI / Model Catalog)
Azure deployments are addressed by your deployment name in the model field. Costs and limits vary by deployment and are not tracked by alembica.
Vertex AI (Model Garden)
Vertex AI models are addressed by their model IDs in the Model Garden. Costs and context limits vary by model and are not tracked by alembica.