alembica

Rate Limits

alembica helps manage the rate limits imposed by AI service providers. Understanding these limits is essential for developing and deploying applications efficiently.

Understanding Rate Limits

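Rate limits control how frequently you can access AI models and how much data you can process in a given timeframe. These limits vary by provider and subscription tier, and typically include requests per minute (RPM), requests per day (RPD), tokens per minute (TPM), and tokens per day (TPD).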

Exceeding these limits may result in request throttling or errors. alembica helps manage these constraints by providing appropriate fallback mechanisms and retry strategies.
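As an illustration of what a retry strategy can look like, here is a minimal sketch of exponential backoff in Python. It is not alembica's implementation or API: `RateLimitError`, `call_with_backoff`, and all parameter names are hypothetical and stand in for whatever error your client raises when a provider throttles a request.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a client when the provider throttles a request."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(); on RateLimitError, wait and retry with a doubling delay.

    The delay before retry k (0-based) is base_delay * 2**k plus up to 10% jitter.
    The last failure is re-raised once max_attempts calls have been made.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from clients that were throttled together.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists only so the waiting behavior can be replaced in tests. In an application you would pass a function that performs the actual API call as `request`.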

Disclaimer: alembica does not currently support daily limits (RPD and TPD). Users are responsible for enforcing these limits in their own applications.
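One way an application might track daily limits itself is a small counter that resets when the date changes. The sketch below is an assumption-laden example, not part of alembica: the `DailyBudget` class and its methods are hypothetical, and it assumes the daily window resets at midnight UTC, which you should check against your provider's documentation.

```python
import datetime


def _utc_today():
    return datetime.datetime.now(datetime.timezone.utc).date()


class DailyBudget:
    """Counts requests and tokens used during the current day against RPD/TPD caps."""

    def __init__(self, rpd=None, tpd=None, today=_utc_today):
        self.rpd = rpd          # max requests per day, or None for no cap
        self.tpd = tpd          # max tokens per day, or None for no cap
        self._today = today     # injectable date source (assumed UTC reset)
        self._day = today()
        self.requests = 0
        self.tokens = 0

    def _roll_over(self):
        # Reset the counters when the date has changed.
        day = self._today()
        if day != self._day:
            self._day = day
            self.requests = 0
            self.tokens = 0

    def try_consume(self, tokens):
        """Record one request using `tokens` tokens; return False if a cap would be exceeded."""
        self._roll_over()
        if self.rpd is not None and self.requests + 1 > self.rpd:
            return False
        if self.tpd is not None and self.tokens + tokens > self.tpd:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

An application would call `try_consume` with an estimated token count before each request and skip or defer the request when it returns `False`. The counters live in memory only, so a long-running or multi-process deployment would need to persist or share them.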

OpenAI

(August 2024, tier 1 users)

Model          RPM    RPD     TPM      Batch Queue Limit
gpt-4o         500    -       30,000   90,000
gpt-4o-mini    500    10,000  200,000  2,000,000
gpt-4-turbo    500    -       30,000   90,000
gpt-3.5-turbo  3,500  10,000  200,000  2,000,000

GoogleAI

(October 2024)

Free Tier:

Model             RPM  RPD    TPM
Gemini 1.5 Flash  15   1,500  1,000,000
Gemini 1.5 Pro    2    50     32,000
Gemini 1.0 Pro    15   1,500  32,000

Pay-as-you-go:

Model             RPM    RPD     TPM
Gemini 1.5 Flash  2,000  -       4,000,000
Gemini 1.5 Pro    1,000  -       4,000,000
Gemini 1.0 Pro    360    30,000  120,000

Cohere

Cohere production keys have no limit, but trial keys are limited to 20 API calls per minute.
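To stay under a per-minute cap such as the 20 calls per minute on trial keys, an application can pace its own requests. The following is a minimal sketch of a sliding-window limiter; `SlidingWindowLimiter` and its parameters are hypothetical names chosen for this example and are not part of alembica or of any provider SDK.

```python
import collections
import time


class SlidingWindowLimiter:
    """Allows at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls = collections.deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self._period - (now - self._calls[0]))
```

With `limiter = SlidingWindowLimiter(max_calls=20)`, calling `limiter.acquire()` before each request keeps a single-threaded client at or below 20 calls in any 60-second window. The class is not thread-safe; concurrent callers would need a lock around `acquire`.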

Anthropic

(November 2024, tier 1 users)

Model              RPM  TPM     TPD
Claude 3.5 Sonnet  50   40,000  1,000,000
Claude 3.5 Haiku   50   50,000  5,000,000
Claude 3 Opus      50   20,000  1,000,000
Claude 3 Sonnet    50   40,000  1,000,000
Claude 3 Haiku     50   50,000  5,000,000

DeepSeek

DeepSeek does not impose rate limits.