Rate limits
Limits are applied per key, measured in requests per minute (RPM), and determined by your tier. Scoped keys can carry a stricter local cap.
Tiers
| Tier | RPM | Notes |
|---|---|---|
| Free (signup credit) | 10 | $5 signup credit on a curated 7-model edge tier; covers prototyping |
| Paid (PAYG) | 600 | Auto-promoted on first top-up. Cost + 5% on every call. |
| Enterprise | 30,000+ | Custom SLA + per-region routing + dedicated endpoint — enterprise@aigateway.sh |
Sub-accounts inherit their parent's tier ceiling and can carry a stricter local rate_limit_rpm set at creation.
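As a rough illustration of how a local cap interacts with the inherited ceiling, the sketch below takes the smaller of the two values. The function name `effective_rpm` is hypothetical, and clamping a local value that exceeds the parent's ceiling is an assumption; the gateway may instead reject such a value at creation.

```python
from typing import Optional


def effective_rpm(parent_tier_rpm: int, local_rate_limit_rpm: Optional[int] = None) -> int:
    """Return the RPM a sub-account can actually use.

    Assumption: a sub-account without a local cap gets the parent's tier
    ceiling, and a local cap can only lower that ceiling, never raise it.
    """
    if local_rate_limit_rpm is None:
        return parent_tier_rpm
    return min(parent_tier_rpm, local_rate_limit_rpm)


# A sub-account under a Paid parent (600 RPM) created with rate_limit_rpm=100.
paid_sub_account_rpm = effective_rpm(600, 100)
```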
Response headers
Every successful response carries:
- x-ratelimit-limit-requests: your current RPM ceiling
- x-ratelimit-remaining-requests: requests left this minute
- x-ratelimit-reset-requests: seconds until the window resets
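The three headers can be read off any successful response to decide whether to slow down before hitting the limit. The sketch below is a minimal parser that works on a plain mapping of header names to values. `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, and it assumes the header values are plain numbers, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int            # current RPM ceiling
    remaining: int        # requests left this minute
    reset_seconds: float  # seconds until the window resets


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Extract the rate-limit headers from a response's header mapping.

    Assumption: values are plain numeric strings. HTTP header names are
    case-insensitive, so keys are lowercased before lookup.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit-requests"]),
        remaining=int(lowered["x-ratelimit-remaining-requests"]),
        reset_seconds=float(lowered["x-ratelimit-reset-requests"]),
    )


# Illustrative header values, not real gateway output.
status = parse_rate_limit_headers({
    "X-RateLimit-Limit-Requests": "600",
    "X-RateLimit-Remaining-Requests": "42",
    "X-RateLimit-Reset-Requests": "17",
})
```

A client could, for example, pause for `status.reset_seconds` once `status.remaining` reaches zero instead of waiting for a 429.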
When you exceed your limit, the gateway returns 429 rate_limited with a Retry-After header; wait that many seconds before retrying. The SDKs honor this automatically and add exponential backoff with jitter on top.
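If you are not using an SDK, the same behavior can be approximated by hand. The sketch below is one possible retry loop, not the SDKs' actual implementation: `call_with_retry` and its `send` callback are hypothetical, `send` is assumed to return a `(status, headers, body)` tuple, and Retry-After is assumed to be a number of seconds as described above.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry on 429, waiting Retry-After plus jitter.

    The jitter is drawn uniformly from [0, base_delay * 2**attempt], so the
    extra wait grows exponentially with each consecutive 429.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        # Return on anything other than 429, or when attempts are used up.
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = float(lowered.get("retry-after", 0))
        jitter = random.uniform(0, base_delay * 2 ** attempt)
        sleep(retry_after + jitter)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the loop can be exercised in tests without real delays. After the final attempt the last 429 response is returned to the caller rather than raised, which leaves the error-handling policy up to the calling code.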
Bursts
RPM is measured over a rolling 60-second window with a small burst allowance (2× RPM for 10 seconds). If you need steadier high throughput, for example training-data generation or large eval runs, prefer the Batch API, which doesn't count against RPM.
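To stay under the limit without relying on 429 responses, a client can keep its own rolling window. The sketch below is a conservative client-side approximation, not the gateway's algorithm: `RollingWindowLimiter` is a hypothetical class, and it deliberately ignores the burst allowance so that it never sends more than the plain RPM ceiling in any 60-second span.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowLimiter:
    """Client-side limiter: at most `rpm` requests in any rolling window."""

    def __init__(
        self,
        rpm: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rpm = rpm
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self) -> bool:
        """Record a request and return True if one may be sent now."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.rpm:
            self._sent.append(now)
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next request would be allowed (0.0 if now)."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.rpm:
            return 0.0
        return self._sent[0] + self.window_seconds - now
```

A typical loop would call `try_acquire()` before each request and sleep for `wait_time()` when it returns False. The injectable `clock` makes the limiter testable without waiting in real time.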
Need more?
Enterprise limits (30k+ RPM, dedicated capacity, per-region routing) are set by contract. Email enterprise@aigateway.sh.