Rate limits

Limits apply per key, are measured in requests per minute (RPM), and are set by your tier. Scoped keys can carry a stricter local cap.

Tiers

Tier	RPM	Notes
Free	`10`	No card; covers prototyping
Starter	`120`	Auto-tier on first top-up
Pro	`600`	Standard PAYG ceiling
Business	`3,000`	Email for activation
Agent	`6,000`	Designed for swarms
Enterprise	`30,000+`	SLA + per-region routing — enterprise@aigateway.sh

The Kimi K2.6 trial is capped at 100 req/day per account through Apr 30, 2026.

Response headers

Every successful response carries:

x-ratelimit-limit-requests — your current RPM ceiling
x-ratelimit-remaining-requests — requests left this minute
x-ratelimit-reset-requests — seconds until the window resets
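
As a minimal sketch of how a client might read these three headers: the helper below takes any header mapping and returns the parsed values. The names `RateLimitState` and `parse_rate_limit_headers` are hypothetical, and the code assumes each header value is a plain decimal number, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: int            # current RPM ceiling
    remaining: int        # requests left this minute
    reset_seconds: float  # seconds until the window resets


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the parsed rate-limit state, or None if a header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        # Assumption: values are plain decimal numbers (format not documented here).
        return RateLimitState(
            limit=int(lowered["x-ratelimit-limit-requests"]),
            remaining=int(lowered["x-ratelimit-remaining-requests"]),
            reset_seconds=float(lowered["x-ratelimit-reset-requests"]),
        )
    except (KeyError, ValueError):
        return None
```

A caller could check `remaining` after each response and pause for `reset_seconds` when it reaches zero, rather than waiting for a 429.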

When you exceed your limit, the gateway returns 429 rate_limited with a Retry-After header; back off for that many seconds. The SDKs honor this automatically and add exponential backoff with jitter on top.
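
If you are not using an SDK, the same behavior can be approximated by hand. The sketch below is not the SDKs' actual implementation: `call_with_retry`, the `send` callable, the 25% jitter, and the exponential fallback when Retry-After is missing are all assumptions made for illustration. `send` stands in for whatever function performs one HTTP request and returns `(status, headers, body)`.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        response = send()
        status, headers, _body = response
        attempt += 1
        # Return on success, on any non-429 error, or once attempts are exhausted.
        if status != 429 or attempt >= max_attempts:
            return response
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            wait = float(lowered["retry-after"])
        except (KeyError, ValueError):
            # Assumption: fall back to exponential backoff (1s, 2s, 4s, ...)
            # if the header is absent or not a number of seconds.
            wait = float(2 ** (attempt - 1))
        # Add up to 25% random jitter so many clients do not retry in lockstep.
        sleep(wait + random.uniform(0, 0.25 * wait))
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.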

Bursts

RPM is measured over a rolling 60-second window with a small burst allowance (2× RPM for 10 seconds). If you need steadier high throughput, for example training-data generation or large eval runs, prefer the Batch API, which doesn't count against RPM.
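
To stay under the ceiling without relying on 429s, a client can keep its own rolling window. The class below is a hypothetical client-side throttle, not part of any SDK: it tracks send times and reports how long to wait before the next request fits. It deliberately ignores the burst allowance, so it is conservative relative to what the gateway would accept.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowLimiter:
    """Client-side throttle: at most `rpm` requests in any `window` seconds."""

    def __init__(
        self,
        rpm: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rpm < 1:
            raise ValueError("rpm must be at least 1")
        self.rpm = rpm
        self.window = window
        self.clock = clock
        self.sent: Deque[float] = deque()  # timestamps of recent requests

    def delay_needed(self) -> float:
        """Seconds to wait before another request fits in the window (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            return 0.0
        # The window is full; wait until the oldest request ages out.
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Typical use would be to sleep for `delay_needed()` seconds, send the request, then call `record()`.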

Need more?

Enterprise limits (30k+ RPM, dedicated capacity, per-region routing) are set in contracts — email enterprise@aigateway.sh.
