Rate limits
Limits are per-key, measured in requests per minute, and bound to your tier. Scoped keys can carry a stricter local cap.
Tiers
| Tier | RPM | Notes |
|---|---|---|
| Free | 10 | No card; covers prototyping |
| Starter | 120 | Auto-tier on first top-up |
| Pro | 600 | Standard PAYG ceiling |
| Business | 3,000 | Email for activation |
| Agent | 6,000 | Designed for swarms |
| Enterprise | 30,000+ | SLA + per-region routing — enterprise@aigateway.sh |
The Kimi K2.6 trial is capped at 100 req/day per account through Apr 30, 2026.
Response headers
Every successful response carries:
- x-ratelimit-limit-requests: your current RPM ceiling
- x-ratelimit-remaining-requests: requests left this minute
- x-ratelimit-reset-requests: seconds until the window resets
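As a rough illustration, the three headers can be read into a small structure before deciding whether to send the next request. This is a minimal sketch: `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, not part of any SDK, and it assumes the header values are plain numbers.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int            # x-ratelimit-limit-requests: current RPM ceiling
    remaining: int        # x-ratelimit-remaining-requests: requests left this minute
    reset_seconds: float  # x-ratelimit-reset-requests: seconds until the window resets


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the rate-limit status, or None if a header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit-requests"]),
            remaining=int(lowered["x-ratelimit-remaining-requests"]),
            # Assumption: the reset value is a number of seconds, possibly fractional.
            reset_seconds=float(lowered["x-ratelimit-reset-requests"]),
        )
    except (KeyError, ValueError):
        return None
```

A caller might pause for `reset_seconds` whenever `remaining` reaches zero, rather than waiting for a 429.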
When you exceed your limit, the gateway returns 429 rate_limited with a Retry-After header; back off for that many seconds before retrying. The SDKs honor this automatically and add jittered exponential backoff on top.
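For clients that do not use an SDK, the same behavior can be approximated by hand. The sketch below is one possible shape, not the SDKs' actual implementation: `call_with_backoff` and the `send` callable are hypothetical, `send` is assumed to return a `(status, headers, body)` tuple, and Retry-After is assumed to be a number of seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        # Return on success, on any non-429 error, or on the final attempt.
        if status != 429 or attempt == max_attempts - 1:
            return response
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            # Assumption: Retry-After is given in seconds, not as an HTTP date.
            retry_after = float(lowered.get("retry-after", "0"))
        except ValueError:
            retry_after = 0.0
        # Random jitter whose upper bound doubles each attempt, added on top
        # of the server-provided wait.
        jitter = random.uniform(0, base_delay * (2 ** attempt))
        sleep(retry_after + jitter)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the wait can be replaced in tests; production callers can leave the default.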
Bursts
RPM is measured over a rolling 60-second window with a small burst allowance (2× RPM for 10 seconds). If you need steadier high throughput, such as training-data generation or large eval runs, prefer the Batch API, which doesn't count against RPM.
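One way to stay under a rolling-window limit is to pace requests on the client. The following is a conservative sketch that ignores the burst allowance entirely and never sends more than `rpm` requests in any 60-second window; `RollingWindowPacer` is a hypothetical helper, and the gateway's own accounting may differ in detail.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowPacer:
    """Client-side pacer: at most `rpm` sends in any rolling window."""

    def __init__(
        self,
        rpm: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rpm = rpm
        self.window = window
        self.clock = clock
        self.sent: Deque[float] = deque()  # timestamps of recent sends

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            return 0.0
        # The window is full; wait until the oldest send expires.
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Before each request, a caller would sleep for `wait_time()` if it is non-zero, then call `record()` and send. Sustained workloads that would keep such a pacer saturated are the ones better suited to the Batch API.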
Need more?
Enterprise limits (30k+ RPM, dedicated capacity, per-region routing) are set in contracts — email enterprise@aigateway.sh.