alembica

Rate Limits

alembica helps manage the rate limits imposed by AI service providers. Understanding these limits is essential for efficient application development and deployment.

Understanding Rate Limits

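Rate limits control how frequently you can access AI models and how much data you can process in a given timeframe. These limits vary by provider and subscription tier, and typically include requests per minute (RPM), tokens per minute (TPM), requests per day (RPD), and tokens per day (TPD).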

Exceeding these limits may result in request throttling or errors. alembica helps manage these constraints by providing appropriate fallback mechanisms and retry strategies.
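As an illustration of what a retry strategy can look like, the sketch below retries a throttled call with jittered exponential backoff. It is not alembica's implementation or API: `RateLimitError`, `backoff_delays`, `call_with_retry`, and the default values are hypothetical names and numbers chosen for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when a provider throttles a request (e.g. HTTP 429)."""


def backoff_delays(max_retries, base=1.0, cap=60.0):
    """Return the un-jittered wait before each retry: base * 2**attempt, capped at cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(send, max_retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send() and retry on RateLimitError, waiting longer after each failure."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return send()
        except RateLimitError:
            # Full jitter: wait a random time in [0, delay] so that many
            # clients do not retry at the same instant.
            sleep(random.uniform(0, delay))
    # Final attempt: any RateLimitError now propagates to the caller.
    return send()
```

With the defaults, a call is attempted at most six times (one initial attempt plus five retries), and the upper bound on each wait doubles from 1 second up to the 60 second cap.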

Disclaimer: Daily limits (RPD and TPD) are not currently supported by alembica. Users are responsible for implementing and respecting these constraints on their own within their applications.
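Since daily limits are left to the application, one possible approach is a small in-memory counter that is checked before each request. The following is a minimal sketch, not part of alembica: the `DailyBudget` class is hypothetical, and resetting at UTC midnight is an assumption, since providers may define the daily window differently.

```python
from datetime import datetime, timezone


class DailyBudget:
    """Track requests and tokens used per UTC day against RPD/TPD caps."""

    def __init__(self, max_requests=None, max_tokens=None):
        self.max_requests = max_requests  # RPD cap, or None for no cap
        self.max_tokens = max_tokens      # TPD cap, or None for no cap
        self.day = None
        self.requests = 0
        self.tokens = 0

    def try_spend(self, tokens, now=None):
        """Record one request of `tokens` tokens; return False if a cap would be exceeded."""
        today = (now or datetime.now(timezone.utc)).date()
        if today != self.day:
            # A new UTC day has started (assumed reset point): clear the counters.
            self.day, self.requests, self.tokens = today, 0, 0
        if self.max_requests is not None and self.requests + 1 > self.max_requests:
            return False
        if self.max_tokens is not None and self.tokens + tokens > self.max_tokens:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

An application would call `try_spend` with an estimated token count before sending a request and skip or defer the request when it returns `False`. The counters live in memory only, so a process restart resets them; a persistent store would be needed for a long-running deployment.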

Anthropic

(May 2025, tier 1 users)

Model              RPM  TPM     TPD
Claude 4.0 Opus    50   20,000  1,000,000
Claude 4.0 Sonnet  50   20,000  1,000,000
Claude 3.7 Sonnet  50   20,000  1,000,000
Claude 3.5 Sonnet  50   40,000  1,000,000
Claude 3.5 Haiku   50   50,000  5,000,000
Claude 3 Opus      50   20,000  1,000,000
Claude 3 Sonnet    50   40,000  1,000,000
Claude 3 Haiku     50   50,000  5,000,000

Cohere

Cohere production keys have no limit, but trial keys are limited to 20 API calls per minute.
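A per-minute cap such as the 20 calls per minute on trial keys can be respected on the client side with a sliding-window counter. The sketch below is one way to do this and is not taken from alembica or the Cohere SDK: `SlidingWindowLimiter` and its parameters are hypothetical names for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls within any window of `window` seconds."""

    def __init__(self, max_calls=20, window=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # The window is full: wait until the oldest call expires.
        return self.window - (now - self.calls[0])

    def record(self):
        """Register that a call is being made now."""
        self.calls.append(self.clock())
```

Before each request, the caller sleeps for `wait_time()` seconds and then calls `record()`. The `clock` parameter exists only so the limiter can be exercised with a fake clock.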

DeepSeek

DeepSeek does not impose rate limits.

GoogleAI

(May 2025)

Tier 1:

Model                  RPM    RPD  TPM
Gemini 2.0 Flash       2,000  -    4,000,000
Gemini 2.0 Flash Lite  4,000  -    4,000,000
Gemini 1.5 Flash       2,000  -    4,000,000
Gemini 1.5 Pro         1,000  -    4,000,000

OpenAI

(May 2025, tier 1 users)

Model          RPM  RPD     TPM      Batch Queue Limit
o4-mini        500  -       200,000  2,000,000
o3-mini        500  -       200,000  2,000,000
o3             500  -       30,000   90,000
o1-mini        500  -       200,000  2,000,000
o1             500  -       30,000   90,000
gpt-4.1-nano   500  -       200,000  2,000,000
gpt-4.1-mini   500  -       200,000  2,000,000
gpt-4.1        500  -       30,000   900,000
gpt-4o         500  -       30,000   90,000
gpt-4o-mini    500  10,000  200,000  2,000,000
gpt-4-turbo    500  -       30,000   90,000
gpt-3.5-turbo  500  10,000  200,000  2,000,000
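Because RPM and TPM apply at the same time, the token limit is often the tighter of the two. The small calculation below shows how the sustainable request rate follows from both caps. The helper `effective_rpm` is a hypothetical name, and the 1,500 tokens per request is an assumed workload, not a figure from any provider.

```python
def effective_rpm(rpm_limit, tpm_limit, tokens_per_request):
    """Requests per minute sustainable under both the RPM and TPM caps."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm_limit, tpm_limit // tokens_per_request)


# Tier 1 figures from the table above; 1,500 tokens per request is assumed.
print(effective_rpm(500, 30_000, 1_500))   # gpt-4o: 20
print(effective_rpm(500, 200_000, 1_500))  # gpt-4o-mini: 133
```

Under this assumed workload, gpt-4o is limited to about 20 requests per minute by its TPM cap, far below the nominal 500 RPM.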