Rate Limits
alembica supports management of rate limits imposed by various AI service providers. Understanding these limits is essential for efficient application development and deployment.
Understanding Rate Limits
Rate limits control how frequently you can access AI models and how much data you can process in a given timeframe. These limits vary by provider and subscription tier, and typically include:
- RPM: Requests per minute
- RPD: Requests per day
- TPM: Tokens per minute
- TPD: Tokens per day
Exceeding these limits may result in request throttling or errors. alembica helps manage these constraints by providing appropriate fallback mechanisms and retry strategies.
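To make the idea of a retry strategy concrete, here is a minimal, generic sketch of retrying with jittered exponential backoff. It is not alembica's implementation or API: `RateLimitError`, `backoff_delays`, and `call_with_retry` are hypothetical names chosen for illustration, and the base delay and cap are assumed values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error for a rejected request (for example, HTTP 429)."""


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_retry(
    send: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[float], float] = lambda d: random.uniform(0, d),
) -> T:
    """Call `send`, retrying on RateLimitError with jittered exponential backoff."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return send()
        except RateLimitError:
            # Sleep a random fraction of the delay so that parallel
            # clients do not all retry at the same instant.
            sleep(jitter(delay))
    # Final attempt: any RateLimitError now propagates to the caller.
    return send()
```

With the defaults, a request is attempted up to six times (one initial call plus five retries). The `sleep` and `jitter` parameters are injectable only so the behaviour can be tested without waiting.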
Disclaimer: Daily limits (RPD and TPD) are not currently supported by alembica. Users are responsible for implementing and respecting these constraints on their own within their applications.
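Since daily limits have to be enforced by the application, one possible approach is a small in-process counter, sketched below. The class name `DailyQuota` and its methods are hypothetical, and the sketch assumes the quota resets at midnight UTC, which may not match how a given provider defines its daily window.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


class DailyQuota:
    """Tracks requests and tokens used during the current UTC day."""

    def __init__(
        self,
        rpd: Optional[int] = None,
        tpd: Optional[int] = None,
        now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self.rpd = rpd          # requests per day, None means unlimited
        self.tpd = tpd          # tokens per day, None means unlimited
        self._now = now
        self._day = now().date()
        self.requests = 0
        self.tokens = 0

    def _roll_over(self) -> None:
        # Assumption: counters reset when the UTC date changes.
        today = self._now().date()
        if today != self._day:
            self._day = today
            self.requests = 0
            self.tokens = 0

    def try_consume(self, tokens: int) -> bool:
        """Reserve one request and `tokens` tokens; return False if over quota."""
        self._roll_over()
        if self.rpd is not None and self.requests + 1 > self.rpd:
            return False
        if self.tpd is not None and self.tokens + tokens > self.tpd:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

A caller would check `try_consume(estimated_tokens)` before sending a request and skip or defer the request when it returns `False`. The counter lives in memory, so a process restart or multiple workers would need shared storage instead.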
Anthropic
(May 2025, tier 1 users)
Model | RPM | TPM | TPD |
---|---|---|---|
Claude 4.0 Opus | 50 | 20,000 | 1,000,000 |
Claude 4.0 Sonnet | 50 | 20,000 | 1,000,000 |
Claude 3.7 Sonnet | 50 | 20,000 | 1,000,000 |
Claude 3.5 Sonnet | 50 | 40,000 | 1,000,000 |
Claude 3.5 Haiku | 50 | 50,000 | 5,000,000 |
Claude 3 Opus | 50 | 20,000 | 1,000,000 |
Claude 3 Sonnet | 50 | 40,000 | 1,000,000 |
Claude 3 Haiku | 50 | 50,000 | 5,000,000 |
Cohere
Cohere production keys have no rate limit; trial keys are limited to 20 API calls per minute.
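A per-minute cap such as the 20 calls per minute on trial keys can be respected client-side with a sliding-window limiter. The sketch below is illustrative only: `SlidingWindowLimiter` is a hypothetical helper, not part of alembica or of any Cohere SDK, and it assumes the provider counts calls over a rolling 60-second window.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allows at most `max_calls` calls in any `window`-second interval."""

    def __init__(
        self,
        max_calls: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window = window
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.window - (now - self._calls[0])

    def record(self) -> None:
        """Register that a call is being made now."""
        self._calls.append(self._clock())

    def acquire(self, sleep: Callable[[float], None] = time.sleep) -> None:
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

For a trial key, `limiter = SlidingWindowLimiter(max_calls=20)` followed by `limiter.acquire()` before each API call keeps the client at or below 20 calls in any 60-second span.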
DeepSeek
DeepSeek does not impose rate limits.
GoogleAI
(May 2025)
Tier 1:
Model | RPM | RPD | TPM |
---|---|---|---|
Gemini 2.0 Flash | 2,000 | - | 4,000,000 |
Gemini 2.0 Flash Lite | 4,000 | - | 4,000,000 |
Gemini 1.5 Flash | 2,000 | - | 4,000,000 |
Gemini 1.5 Pro | 1,000 | - | 4,000,000 |
OpenAI
(May 2025, tier 1 users)
Model | RPM | RPD | TPM | Batch Queue Limit |
---|---|---|---|---|
o4-mini | 500 | - | 200,000 | 2,000,000 |
o3-mini | 500 | - | 200,000 | 2,000,000 |
o3 | 500 | - | 30,000 | 90,000 |
o1-mini | 500 | - | 200,000 | 2,000,000 |
o1 | 500 | - | 30,000 | 90,000 |
gpt-4.1-nano | 500 | - | 200,000 | 2,000,000 |
gpt-4.1-mini | 500 | - | 200,000 | 2,000,000 |
gpt-4.1 | 500 | - | 30,000 | 900,000 |
gpt-4o | 500 | - | 30,000 | 90,000 |
gpt-4o-mini | 500 | 10,000 | 200,000 | 2,000,000 |
gpt-4-turbo | 500 | - | 30,000 | 90,000 |
gpt-3.5-turbo | 500 | 10,000 | 200,000 | 2,000,000 |
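When reading these tables, note that RPM and TPM apply together, so the tighter of the two determines throughput. The small helper below illustrates the arithmetic. The function name `effective_rpm` is hypothetical, and it assumes every request uses roughly the same number of tokens and that the provider counts input and output tokens against one TPM figure, which may not hold for every provider.

```python
def effective_rpm(rpm: int, tpm: int, tokens_per_request: int) -> int:
    """Sustainable requests per minute when both RPM and TPM limits apply."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # Whichever limit is hit first is the binding constraint.
    return min(rpm, tpm // tokens_per_request)


# Values taken from the tier 1 tables above, at an assumed 1,000 tokens per request.
print(effective_rpm(500, 30_000, 1_000))   # o3: the token limit binds
print(effective_rpm(500, 200_000, 1_000))  # o4-mini: the token limit binds
print(effective_rpm(500, 200_000, 100))    # o4-mini, small requests: the request limit binds
```

At 1,000 tokens per request, a model with 500 RPM and 30,000 TPM sustains only 30 requests per minute, well below its nominal request limit.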