Rate Limits
alembica helps you manage the rate limits imposed by AI service providers. Understanding these limits is essential for efficient application development and deployment.
Understanding Rate Limits
Rate limits control how frequently you can access AI models and how much data you can process in a given timeframe. These limits vary by provider and subscription tier, and typically include:
- RPM: Requests per minute
- RPD: Requests per day
- TPM: Tokens per minute
- TPD: Tokens per day
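To illustrate how the two per-minute limits interact, here is a minimal Python sketch of a rolling 60-second window that tracks requests and tokens together. It is not alembica's implementation: the class `MinuteWindowLimiter` and its methods are hypothetical names, and providers may measure their windows differently (for example, fixed rather than rolling minutes).

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Hypothetical helper: tracks RPM and TPM over a rolling 60-second window."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rpm < 1 or tpm < 1:
            raise ValueError("rpm and tpm must be positive")
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def wait_time(self, tokens: int) -> float:
        """Return 0.0 if a request of `tokens` fits now, else the seconds
        until the oldest recorded request leaves the window."""
        if tokens > self.tpm:
            raise ValueError("a single request exceeds the TPM limit")
        now = self.clock()
        self._evict(now)
        fits_rpm = len(self.events) < self.rpm
        fits_tpm = self.tokens_in_window + tokens <= self.tpm
        if fits_rpm and fits_tpm:
            return 0.0
        return max(0.0, 60.0 - (now - self.events[0][0]))

    def record(self, tokens: int) -> None:
        """Register a request that was just sent."""
        self.events.append((self.clock(), tokens))
        self.tokens_in_window += tokens
```

A caller would check `wait_time(...)`, sleep for the returned duration if it is nonzero, check again (one expiry may not free enough tokens), and call `record(...)` once the request is sent.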
Exceeding these limits may result in request throttling or errors. alembica helps manage these constraints by providing appropriate fallback mechanisms and retry strategies.
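For readers unfamiliar with retry strategies, the following Python sketch shows one common approach: exponential backoff with full jitter. It is a generic illustration, not alembica's actual retry logic. `RateLimitError` and `call_with_backoff` are hypothetical names, and the exception stands in for whatever error a provider returns when a limit is exceeded (typically HTTP 429).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call `fn`, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            # Upper bound doubles each attempt, capped at max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the bound.
            sleep(cap * rng())
    raise RuntimeError("unreachable")
```

The jitter spreads retries from concurrent clients over time, so they do not all hit the provider again at the same instant.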
Disclaimer: Daily limits (RPD and TPD) are not currently supported by alembica. Users are responsible for implementing and respecting these constraints on their own within their applications.
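One way an application might track daily limits itself is sketched below in Python. The class `DailyBudget` is a hypothetical helper, not part of alembica, and it assumes counters reset at midnight UTC. Check your provider's documentation for the actual reset time.

```python
from datetime import date, datetime, timezone
from typing import Callable, Optional


def _utc_today() -> date:
    return datetime.now(timezone.utc).date()


class DailyBudget:
    """Hypothetical helper: counts requests and tokens per UTC day."""

    def __init__(self, rpd: Optional[int] = None, tpd: Optional[int] = None,
                 today: Callable[[], date] = _utc_today) -> None:
        self.rpd = rpd      # None means no daily request limit
        self.tpd = tpd      # None means no daily token limit
        self.today = today
        self.day = today()
        self.requests = 0
        self.tokens = 0

    def _roll_over(self) -> None:
        # Reset the counters when the calendar day changes.
        current = self.today()
        if current != self.day:
            self.day = current
            self.requests = 0
            self.tokens = 0

    def try_consume(self, tokens: int) -> bool:
        """Reserve one request of `tokens` tokens; return False if it
        would exceed either daily limit (nothing is consumed then)."""
        self._roll_over()
        if self.rpd is not None and self.requests + 1 > self.rpd:
            return False
        if self.tpd is not None and self.tokens + tokens > self.tpd:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

An application would call `try_consume(...)` before each request and skip or defer the request when it returns `False`. In a long-running or multi-process deployment the counters would also need to be persisted and shared, which this sketch does not attempt.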
OpenAI
(August 2024, tier 1 users)
Model | RPM | RPD | TPM | Batch Queue Limit |
---|---|---|---|---|
gpt-4o | 500 | - | 30,000 | 90,000 |
gpt-4o-mini | 500 | 10,000 | 200,000 | 2,000,000 |
gpt-4-turbo | 500 | - | 30,000 | 90,000 |
gpt-3.5-turbo | 3,500 | 10,000 | 200,000 | 2,000,000 |
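RPM and TPM apply simultaneously, so the achievable request rate depends on request size. The short Python sketch below works through the arithmetic using figures from the table above. The function `effective_rpm` is a hypothetical helper, and it assumes every request consumes the same number of tokens counted against TPM.

```python
def effective_rpm(rpm: int, tpm: int, tokens_per_request: int) -> int:
    """Requests per minute achievable when both RPM and TPM apply."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # The token budget allows tpm // tokens_per_request whole requests.
    return min(rpm, tpm // tokens_per_request)


# gpt-4o at 1,000 tokens per request: TPM is the binding limit.
print(effective_rpm(500, 30_000, 1_000))    # 30
# gpt-4o-mini at 1,000 tokens per request: TPM still binds.
print(effective_rpm(500, 200_000, 1_000))   # 200
# gpt-4o-mini at 100 tokens per request: RPM binds instead.
print(effective_rpm(500, 200_000, 100))     # 500
```

With requests of around 1,000 tokens, a tier 1 user of gpt-4o would therefore reach the token limit long before the 500 RPM request limit.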
GoogleAI
(October 2024)
Free Tier:
Model | RPM | RPD | TPM |
---|---|---|---|
Gemini 1.5 Flash | 15 | 1,500 | 1,000,000 |
Gemini 1.5 Pro | 2 | 50 | 32,000 |
Gemini 1.0 Pro | 15 | 1,500 | 32,000 |
Pay-as-you-go:
Model | RPM | RPD | TPM |
---|---|---|---|
Gemini 1.5 Flash | 2,000 | - | 4,000,000 |
Gemini 1.5 Pro | 1,000 | - | 4,000,000 |
Gemini 1.0 Pro | 360 | 30,000 | 120,000 |
Cohere
Cohere production keys have no limit, but trial keys are limited to 20 API calls per minute.
Anthropic
(November 2024, tier 1 users)
Model | RPM | TPM | TPD |
---|---|---|---|
Claude 3.5 Sonnet | 50 | 40,000 | 1,000,000 |
Claude 3.5 Haiku | 50 | 50,000 | 5,000,000 |
Claude 3 Opus | 50 | 20,000 | 1,000,000 |
Claude 3 Sonnet | 50 | 40,000 | 1,000,000 |
Claude 3 Haiku | 50 | 50,000 | 5,000,000 |
DeepSeek
DeepSeek does not impose rate limits.