Pricing

Cost + 5%. Nothing else.

Pass-through on every model, no monthly fee, no minimum, no seat count. You pay when the request succeeds — cache hits bill at 10% of the uncached rate.

No card for the Kimi trialCredits never expireRefund within 90 days

Kimi K2.6 trial

$0/through Apr 30
Kick the tires on the frontier open model.
  • 100 requests / day on moonshot/kimi-k2.6
  • Full 262K context, vision + tool calling
  • Cache, analytics, playground
  • No card required · expires Apr 30, 2026
Claim trial

Enterprise

Custom/from $10k/mo committed
Everything in PAYG, plus the primitives only an aggregator can ship.
  • Evals — replay real traffic against alternate models, score against your rubric
  • Guardrails — content safety + PII redaction, one policy every provider
  • Replay + shadow A/B — deterministic re-run, canary new models safely
  • Prompt IDs — versioned prompts server-side with auto prefix caching
  • SSO · SCIM · dedicated endpoint · security posture
  • 99.95% SLA · DPA · SOC 2 · private audit export
  • Direct-provider agreements · named engineer on Slack Connect
Talk to sales
Calculator

What you'd actually pay us.

Plug in your monthly volume. We show you side-by-side cost against the aggregators we replace. Cache savings are modeled separately.

$2.50 in / $10 out
120M
40M
30%
$516.95
5.5% platform fee
$524.70
~3% + platform min
$514.50
− $2.45 vs A
BREAKDOWN · AIGATEWAY
Provider token cost$700.005% platform fee$24.50Cache savings− $220.50Net$514.50
How we compare

Cheaper than OpenRouter.
Primitives none of them have.

Numbers pulled from each provider's public pricing page in April 2026. We update this monthly.

FeatureAIgatewayOpenRouterPortkeyHeliconeRequesty
Platform fee5% on credits5.5% on credits$49/mo flat$79/mo+ flatfree
Markup on model cost0% · pass-through0% · pass-through0% · pass-through0% · pass-through0%
Cache-hit discount10% of uncached costnononono
Every modalitytext · image · video · voice · audio · embedtext onlytext onlytext + obsvaries
Sub-account / per-user keysyes, APInoworkspace onlyworkspace onlyno
Cost-attribution tags + hard capsyes, APIbasicyesyesno
BYOKfree · no per-request fee1M free/mo then 5%includedincludedyes
Evals on your trafficEnterprisenorules onlynono
Guardrails (content + PII)Enterprise · one policy, every providernoyesnono
Replay + shadow A/B across modelsEnterprisenonoprompt-level onlyno
Prompt IDs with auto-cachingEnterprisenoworkspace onlynono
FAQ

Questions we actually get.

What's the catch on 5%?
There isn't one. 5% is added when you top up credits, the underlying model rate is pass-through, and there are no monthly fees, seat fees, or per-model surcharges. You only pay for successful runs.
Is there a free tier?
Kimi K2.6 is free through Apr 30, 2026 at 100 requests/day per account — no card required. After that, and for every other model, it's PAYG: top up credits once, pay pass-through + 5%.
What counts as a 'successful run'?
A request that returned a usable model response. Fallbacks that succeed bill once, at the model that actually answered. Failed runs and timeouts don't charge.
How do cache hits price?
Exact-match and semantic hits return in under 10ms and bill at 10% of the uncached cost — a 90% discount. Cache is on by default; override per-request with the x-cache header.
Can I bring my own provider keys?
Yes, free on every paid account. Route through your Anthropic / OpenAI / Google / etc. key and pay zero per-request fees on that traffic — your existing volume discounts apply.
Sub-accounts and per-user keys?
Yes — one API call mints a scoped key with its own spend cap, rate limit, default tag, and analytics. Ideal for marketplaces or multi-tenant apps that want to meter end users.
How do STT, TTS, embeddings, video, image price?
Every modality uses the same math: underlying model rate, pass-through, with the 5% applied at credit top-up. Whisper, ElevenLabs, BGE, Flux, Veo — all one simple line item.
Do credits expire?
Paid credits never expire. Refunds on unused balances within 90 days, no questions.
What's in Enterprise?
Evals, guardrails, replay + shadow A/B, and versioned prompt IDs — the gateway primitives. Plus SSO, 99.95% SLA, DPA, SOC 2, private audit export, dedicated endpoint, and a named engineer on Slack Connect. Details on /enterprise.