
Rate limits

Limits are per-key, measured in requests per minute, and bound to your tier. Scoped keys can carry a stricter local cap.

Tiers

| Tier | RPM | Notes |
| --- | --- | --- |
| Free (signup credit) | 10 | $5 signup credit on a curated 7-model edge tier; covers prototyping |
| Paid (PAYG) | 600 | Auto-promoted on first top-up. Cost + 5% on every call. |
| Enterprise | 30,000+ | Custom SLA + per-region routing + dedicated endpoint; contact enterprise@aigateway.sh |

Sub-accounts inherit their parent's tier ceiling and can carry a stricter local rate_limit_rpm set at creation.
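
As a rough illustration of how a tier ceiling and a local cap combine, the sketch below computes the limit a key would effectively see. The function name `effective_rpm` and the `TIER_RPM` table are hypothetical helpers, not part of any SDK. What the gateway does when a local cap is set above the tier ceiling is not stated here; this sketch simply assumes the lower of the two values applies.

```python
from typing import Optional

# Tier ceilings from the table above. Enterprise is contract-specific, so it is omitted.
TIER_RPM = {"free": 10, "paid": 600}


def effective_rpm(tier_rpm: int, local_rpm: Optional[int] = None) -> int:
    """Return the RPM a key is effectively held to (hypothetical helper).

    Assumes the stricter of the tier ceiling and the local cap applies.
    """
    if local_rpm is None:
        # No local rate_limit_rpm was set: the tier ceiling applies as-is.
        return tier_rpm
    return min(tier_rpm, local_rpm)
```

Under this assumption, a sub-account on the Paid tier created with `rate_limit_rpm` of 100 would be held to `effective_rpm(TIER_RPM["paid"], 100)`, which is 100.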

Response headers

Every successful response carries:

When you exceed your limit, the gateway returns 429 rate_limited with a Retry-After header; wait that many seconds before retrying. The SDKs honor this automatically and add exponential backoff with jitter on top.
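
If you are calling the gateway without an SDK, the retry behavior described above can be approximated by hand. The following is a minimal sketch, not the SDKs' actual implementation: `call_with_retry`, its parameters, and the `send` callable (which is assumed to return a `(status, headers, body)` tuple) are all hypothetical names introduced for illustration. It also assumes `Retry-After` holds a number of seconds and that the header key is spelled exactly `Retry-After` in the mapping you pass back.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        last_attempt = attempt == max_attempts - 1
        if status != 429 or last_attempt:
            # Success, a non-rate-limit error, or out of attempts: hand it back.
            return response
        # Assumption: Retry-After is in seconds. Fall back to 1s if it is
        # missing or not a number.
        try:
            retry_after = float(headers.get("Retry-After", "1"))
        except ValueError:
            retry_after = 1.0
        # Add random jitter whose upper bound doubles on each attempt.
        jitter = random.uniform(0, base_delay * (2 ** attempt))
        sleep(retry_after + jitter)
    raise AssertionError("unreachable: the loop always returns")
```

Passing `sleep` as a parameter keeps the helper testable without real delays. The jitter spreads out retries from clients that were rate limited at the same moment, so they do not all come back together when the `Retry-After` period ends.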

Bursts

RPMs are measured in a 60s rolling window with a small burst allowance (2× RPM for 10 seconds). If you need steadier high throughput — training-data generation, large eval runs — prefer the Batch API, which doesn't count against RPM.
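
To stay under the limit instead of reacting to 429s, a client can pace itself against the same kind of rolling window. The sketch below is a client-side approximation only: the class `RollingWindowPacer` is a hypothetical helper, it deliberately ignores the burst allowance, and it does not claim to mirror how the gateway accounts for requests internally.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowPacer:
    """Client-side throttle: at most `rpm` sends in any `window` seconds."""

    def __init__(
        self,
        rpm: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm < 1:
            raise ValueError("rpm must be at least 1")
        self.rpm = rpm
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent sends, oldest first

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()

    def acquire(self) -> None:
        """Block until one more request fits in the window, then record it."""
        now = self._clock()
        self._evict(now)
        while len(self._sent) >= self.rpm:
            # Wait until the oldest recorded send leaves the window.
            self._sleep(self.window - (now - self._sent[0]))
            now = self._clock()
            self._evict(now)
        self._sent.append(now)
```

Call `acquire()` immediately before each request, for example with `RollingWindowPacer(rpm=600)` on the Paid tier. Because the pacer never uses the burst allowance, it is conservative; it is still worth keeping 429 handling in place, since other processes sharing the same key count against the same limit.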

Need more?

Enterprise limits (30k+ RPM, dedicated capacity, per-region routing) are set in contracts — email enterprise@aigateway.sh.