Question 1

What's the catch with the 5% fee?

Accepted Answer

There isn't one. The 5% per-call platform fee is the only margin we make. We add it to the underlying model cost on every successful API call. There are no monthly fees, seat fees, per-model surcharges, or minimums, and no markup on tokens beyond that 5%. You only pay when a request succeeds. (Adding credit carries a separate 5% top-up fee that covers payment processing. It is disclosed before you pay, like any online card checkout.)

Question 2

So what does it actually cost me?

Accepted Answer

Per call: provider cost × 1.05, debited from your wallet when the request succeeds. That 5% is the only fee we earn. Adding credit: a 5% top-up fee, charged once per top-up, covering card processing. Worked example: buy $100 of credit → charged $105 → wallet credited $100 → each call debits provider cost plus the 5% platform fee from the wallet.
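A minimal sketch of the arithmetic in this answer, in Python. The function names and the use of Decimal are illustrative assumptions, not part of any published API. The answer does not say how amounts are rounded, so no rounding is applied here.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")  # per-call platform fee stated in the answer
TOPUP_FEE = Decimal("0.05")     # card top-up fee stated in the answer


def call_debit(provider_cost: Decimal) -> Decimal:
    """Wallet debit for one successful call: provider cost x 1.05."""
    return provider_cost * (1 + PLATFORM_FEE)


def card_charge(credit: Decimal) -> Decimal:
    """Amount charged to the card when buying `credit` dollars of wallet credit."""
    return credit * (1 + TOPUP_FEE)


if __name__ == "__main__":
    # Worked example from the answer: $100 of credit costs $105 on the card.
    print(card_charge(Decimal("100")))
    # A call whose provider cost is $0.20 debits $0.21 from the wallet.
    print(call_debit(Decimal("0.20")))
```

Decimal is used instead of float so that small per-call amounts do not pick up binary rounding error.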

Question 3

Is there free credit?

Accepted Answer

Adding a payment method (no charge) gets you $1 in credit, spendable across a curated trial catalog: GLM-5.2, Kimi K2.7 Code, gpt-oss-120b, FLUX, Whisper Turbo, Aura 2, BGE-M3, and more. The $1 expires 30 days after you add the card. Top up any amount and the full 1000+ model catalog opens at pass-through cost plus the 5% platform fee. Top-up credit never expires.

Question 4

What counts as a 'successful run'?

Accepted Answer

A request that returned a usable model response. A fallback that succeeds is billed once, at the rate of the model that actually answered. Failed runs and timeouts are not charged.
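The billing rule above can be sketched as follows. This is an illustration under stated assumptions: the Attempt record, its field names, and the model labels are hypothetical, and the 5% fee is applied as described elsewhere in this FAQ.

```python
from dataclasses import dataclass
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")  # per-call platform fee from this FAQ


@dataclass
class Attempt:
    """One try in a fallback chain (hypothetical record, for illustration)."""
    model: str               # model tried at this step
    provider_cost: Decimal   # provider cost that applies if this attempt answers
    succeeded: bool          # True only if a usable response came back


def billable_amount(attempts: list[Attempt]) -> Decimal:
    """Bill once, for the first attempt that answered; failures cost nothing."""
    for attempt in attempts:
        if attempt.succeeded:
            return attempt.provider_cost * (1 + PLATFORM_FEE)
    return Decimal("0")  # every attempt failed or timed out


if __name__ == "__main__":
    chain = [
        Attempt("primary-model", Decimal("0.10"), succeeded=False),
        Attempt("fallback-model", Decimal("0.04"), succeeded=True),
    ]
    # Only the fallback that answered is billed: 0.04 x 1.05.
    print(billable_amount(chain))
```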

Question 5

How are cache hits priced?

Accepted Answer

Exact-match and semantic cache hits return in under 10 ms and get a flat 50% discount on the uncached price. Caching is on by default; override it per request with the x-cache header.
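A small sketch of the cache-hit discount, in Python. It assumes that "the uncached price" means provider cost plus the 5% platform fee, which the answer does not state explicitly. The function name is hypothetical. Values for the x-cache header are not given here, so the sketch covers pricing only.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")    # per-call platform fee from this FAQ
CACHE_DISCOUNT = Decimal("0.50")  # flat discount on cache hits


def request_price(provider_cost: Decimal, cache_hit: bool) -> Decimal:
    """Price of one successful request, with or without a cache hit."""
    # Assumption: the uncached price already includes the platform fee.
    uncached = provider_cost * (1 + PLATFORM_FEE)
    if cache_hit:
        return uncached * (1 - CACHE_DISCOUNT)
    return uncached


if __name__ == "__main__":
    print(request_price(Decimal("0.20"), cache_hit=False))  # full price
    print(request_price(Decimal("0.20"), cache_hit=True))   # half of the above
```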

Question 6

Can I bring my own provider keys?

Accepted Answer

Yes, free on every paid account. Route traffic through your own Anthropic, OpenAI, Google, or other provider key and pay zero per-request fees on that traffic. Your existing volume discounts still apply.

Question 7

Sub-accounts and per-user keys?

Accepted Answer

Yes — one API call mints a scoped key with its own spend cap, rate limit, default tag, and analytics. Ideal for marketplaces or multi-tenant apps that want to meter end users.

Question 8

How are STT, TTS, embeddings, video, and images priced?

Accepted Answer

Every modality uses the same math: underlying model rate plus 5%. Whisper, ElevenLabs, BGE, FLUX, and Veo are each billed as one simple line item.

Question 9

Do credits expire?

Accepted Answer

Top-up credit never expires: once you pay, the balance stays. The $1 card credit is the exception; it expires 30 days after you add a card. Unused top-up balances are refundable within 90 days, no questions asked.

Question 10

What's in Enterprise?

Accepted Answer

Evals, guardrails, replay + shadow A/B, and versioned prompt IDs — the gateway primitives. Plus SSO, 99.95% SLA, DPA, SOC 2, private audit export, dedicated endpoint, and a named engineer on Slack Connect. Details on /enterprise.

Feature	AIgateway	OpenRouter	Portkey	Helicone	Requesty
Platform fee (our only margin)	5% per call	5.5% on credits	$49/mo flat	$79/mo+ flat	free
Card top-up fee	5% · payment processing	in the 5.5%	varies	varies	—
Cache-hit discount	50% off cached requests	no	no	no	no
Every modality	text · image · video · voice · audio · embed	text only	text only	text + obs	varies
Sub-account / per-user keys	yes, API	no	workspace only	workspace only	no
Cost-attribution tags + hard caps	yes, API	basic	yes	yes	no
BYOK	free · no per-request fee	1M free/mo then 5%	included	included	yes
Evals on your traffic	Enterprise	no	rules only	no	no
Guardrails (content + PII)	Enterprise · one policy, every provider	no	yes	no	no
Replay + shadow A/B across models	Enterprise	no	no	prompt-level only	no
Prompt IDs with auto-caching	Enterprise	no	workspace only	no	no
