AIgateway is a universal AI API that aggregates 1000+ frontier and open-weight models (text, image, audio, video, embeddings, moderation) behind a single OpenAI-compatible endpoint. Change base_url on the OpenAI SDK and every existing integration works.

How is AIgateway priced?

Pay the underlying model cost plus a 5% platform fee on every API call — that's our entire revenue model. No monthly fee, no seat fee, no per-model surcharge. Cached requests get a 50% discount. Adding a payment method (no charge) gets you $1 in credit spendable across a curated trial catalog — GLM-5.2, Kimi K2.7 Code, gpt-oss-120b, FLUX, Whisper Turbo, Aura 2, BGE-M3 and more. The $1 expires 30 days after you add the card. Top up any amount and the full 1000+ catalog opens at pass-through + a 5% platform fee. Topups never expire. Payment-processor fees (~5%) apply once at top-up — that's Stripe's standard rate, passed through.
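The pricing rules above can be worked through as a small calculation. This is a sketch rather than AIgateway's billing logic: `call_cost_cents` is a hypothetical helper, and it assumes the 50% cache discount is applied to the model cost first, with the 5% platform fee charged on the discounted amount. The text above does not state the order.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")    # 5% platform fee, from the answer above
CACHE_DISCOUNT = Decimal("0.50")  # 50% discount on cached requests


def call_cost_cents(model_cost_cents: Decimal, cached: bool = False) -> Decimal:
    """Estimate what a single API call is billed, in cents.

    Assumption: discount first, then the platform fee on what remains.
    """
    base = model_cost_cents
    if cached:
        base = base * (1 - CACHE_DISCOUNT)
    return base * (1 + PLATFORM_FEE)


# A call whose underlying model cost is 100 cents, uncached and cached.
print(call_cost_cents(Decimal("100")))
print(call_cost_cents(Decimal("100"), cached=True))
```

Under these assumptions a 100-cent call is billed at 105 cents, or 52.5 cents when served from cache. Payment-processor fees at top-up are separate and not modeled here.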

Is AIgateway OpenAI-compatible?

Yes. The endpoint paths, request bodies, and streaming SSE format match OpenAI's API. Point the official OpenAI SDK (Python, Node, etc.) at https://api.aigateway.sh/v1 with an AIgateway key and everything works unchanged.

Which models are supported?

Claude Opus 4.7, GPT-5.4, Gemini 3.1 Pro, Kimi K2.7 Code, Grok 4, Llama 4, FLUX 2, Imagen 4, Veo 3.1, Deepgram Nova 3, BGE-M3, and 1000+ more across every modality. The full live catalog is at /models and /v1/models.

Do I need separate keys for each provider?

No. One AIgateway key routes to every provider. Billing is unified on a single line. Bring-your-own-keys is also supported for providers where you have direct contracts.

Does AIgateway support tool calling, vision, streaming, and JSON mode?

Yes, across every model that the underlying provider supports. Capabilities are published per-model in /v1/models and rendered on each /models/ page.
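Since capabilities are published per model, a client can filter the catalog before choosing a model. The sketch below is illustrative only: the `capabilities` field name, the flag strings, and the sample entries are assumptions, because the actual /v1/models schema is not shown here.

```python
def models_with(models, *required):
    """Return ids of models whose capability list includes every required flag."""
    needed = set(required)
    return [m["id"] for m in models if needed <= set(m.get("capabilities", []))]


# Made-up entries for illustration; real ids and flags come from /v1/models.
sample = [
    {"id": "example/model-a", "capabilities": ["streaming", "tools"]},
    {"id": "example/model-b", "capabilities": ["streaming", "tools", "vision"]},
    {"id": "example/model-c", "capabilities": ["streaming"]},
]

print(models_with(sample, "tools", "vision"))
```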

How do I integrate with agent frameworks like Claude Code, Cursor, or LangChain?

Any OpenAI-compatible SDK or base_url field works — LangChain, LlamaIndex, Vercel AI SDK, Cursor, Claude Code, Continue, Cline. Autoconfigurable via /llms.txt and MCP at https://api.aigateway.sh/mcp.

Products in production build on AIgateway — FlareCode (hosted coding agents), AudioPod (AI audio), NameMyApp (AI branding), Go2 (agent link infra), MailMolt (agent email), ClawOcean (agent platform), and ChargeFind — plus Cline, Aider, and a handful of agent startups.

Universal inference API · 85+ labs, one schema

One API.
Every model.
Every modality.

Point the OpenAI SDK at one base_url and reach 1000+ models across every modality. Agents autoconfigure from /llms.txt.

Get your key → · Read the docs

● 99.99% uptime ● 47ms p50 overhead ● SOC 2 in progress

quickstart.py

# one-line swap: point your SDK at us.
from openai import OpenAI

client = OpenAI(
    # was: base_url="https://api.openai.com/v1"
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

# placeholder conversation; replace with your own messages
msgs = [{"role": "user", "content": "Hello"}]

stream = client.chat.completions.create(
    model="zai-org/glm-5.2",
    messages=msgs, stream=True,
)
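The quickstart opens a stream but does not read it. A minimal sketch of consuming it follows, assuming the chunks have the shape the OpenAI Python SDK yields (`chunk.choices[0].delta.content`). `collect_text` and `_fake_chunk` are hypothetical helpers, and the fake chunks only stand in for a live stream so the example runs offline.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_text(stream: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None on role-only or final chunks
            parts.append(delta)
    return "".join(parts)


def _fake_chunk(text):
    # Stand-in with the same attribute shape as an SDK chunk.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [_fake_chunk("Hel"), _fake_chunk(None), _fake_chunk("lo")]
print(collect_text(demo))
```

With a real client, passing the `stream` object returned by `client.chat.completions.create(..., stream=True)` to `collect_text` should work the same way.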

Text · Image · Video · Audio · Voice · Music · Vision · Embeddings

MODELS

1000+

from 85+ labs

LABS

85+

normalized behind one schema

ADDED LATENCY

47ms

p50 overhead added per request

CACHE DISCOUNT

50%

off list price on cached requests

NEW · model: "auto"

Routes every request to the cheapest model that clears your quality bar.

Set model:"auto" and you never pay more than the premium model you'd have called yourself. Works across text, image, video, speech, transcription, music, and embeddings — and every response shows what ran and what you saved.
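The routing rule described above, cheapest model that clears a quality bar, can be sketched in a few lines. This is not the Auto Router's implementation: the `Candidate` type, the quality scores, and the prices are all hypothetical, and how AIgateway measures quality is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    quality: float        # hypothetical score in [0, 1]
    cost_per_mtok: float  # hypothetical price, USD per million tokens


def pick_auto(candidates, quality_bar):
    """Return the cheapest candidate whose quality meets the bar."""
    eligible = [c for c in candidates if c.quality >= quality_bar]
    if not eligible:
        raise ValueError("no candidate clears the quality bar")
    return min(eligible, key=lambda c: c.cost_per_mtok)


# Made-up catalog for illustration.
catalog = [
    Candidate("premium-model", quality=0.95, cost_per_mtok=15.0),
    Candidate("mid-model", quality=0.88, cost_per_mtok=3.0),
    Candidate("small-model", quality=0.70, cost_per_mtok=0.4),
]

choice = pick_auto(catalog, quality_bar=0.85)
print(choice.name)
```

Because the premium model is always among the eligible candidates when it clears the bar, the selected model never costs more than it, which matches the guarantee stated above.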

See the Auto Router →

Capabilities

Everything behind
one key.

The whole surface of a modern inference stack — modalities, routing, keys, caching — normalized behind a single OpenAI-compatible API.

Every modality

Text, image, video, audio, voice, music, embeddings, vision — one schema across all of them.

Drop-in for the OpenAI SDK

Change one base_url and every existing integration reaches 1000+ models unchanged.

Auto model routing

model:"auto" picks the cheapest model that clears your quality bar, every request.

Sub-account keys

Mint scoped keys per end user with their own spend caps, rate limits, and analytics.

Semantic caching

Exact and near-duplicate requests return cached — 50% off list price, near-zero latency.

Agent-ready

A capable agent reads /llms.txt or the MCP server once and configures itself.

$1 FREE CREDIT* · ADD A CARD, NO CHARGE

Start on GLM-5.2.
$1 free, no charge.*

A payment method is required to call the API — adding one is free and never charges you. Add a card (no charge) for $1 in credit across a curated trial catalog, led by GLM-5.2. The $1 expires 30 days after you add the card. Top up any amount to open the full 1000+ model catalog at pass-through + a 5% platform fee. Topups never expire.

free_tier · $1 free · no charge · GLM-5.2

Add a card, get $1 free*

Adding a card never charges you. The $1 covers a curated catalog of 15 models led by GLM-5.2 — chat, code, image, voice, transcription, vision, embeddings.

GLM-5.2

~340K tok

$1 · with card

KIMI K2.7 CODE

~400K tok

$1 · with card

FLUX KLEIN 9B

~500 images

$1 · with card

WHISPER TURBO

~2.6 hours

$1 · with card

Claim $1 on GLM-5.2 → · See the trial catalog · topups never expire

first_call.py

from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

r = client.chat.completions.create(
    model="zai-org/glm-5.2",
    messages=[{"role": "user",
               "content": "Plan a"
                          " research agent."}],
    stream=True,
)

# Add a card (no charge) for $1 across the trial catalog.
# Top up to unlock the full 1000+-model catalog.

* Add a card (no charge) for $1 in credit on the selected models above. The $1 expires 30 days after you add the card. Topups never expire.

Catalog

1000+ models.
One key.

Drop-in for the OpenAI SDK. Rotate models mid-conversation, one string.
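Rotating models mid-conversation amounts to reusing the same message history with a different `model` string. The sketch below assumes the OpenAI SDK call shape (`client.chat.completions.create`). `ask` and `EchoClient` are hypothetical, and `EchoClient` is only an offline stand-in so the example runs without a key.

```python
from types import SimpleNamespace


def ask(client, model, history, user_text):
    """Append a user turn, call the given model, and record its reply."""
    history.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(model=model, messages=history)
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply


class EchoClient:
    """Offline stand-in with the same call shape as the OpenAI SDK client."""

    def __init__(self):
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, model, messages):
        text = f"[{model}] saw {len(messages)} message(s)"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


client = EchoClient()  # swap for an OpenAI(...) client pointed at the gateway
history = []
first = ask(client, "zai-org/glm-5.2", history, "Draft a plan.")
second = ask(client, "anthropic/claude-opus-4.7", history, "Critique it.")
print(first)
print(second)
```

The second call carries the first model's reply in `history`, so the new model continues the same conversation.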

Anthropic

OpenAI

Google

xAI

Why swap in
one line of code.

Feature comparison drawn from each alternative's public pricing pages and docs, with competitor names anonymized.

Feature	AIgateway	Competitor A (breadth aggregator)	Competitor B (platform-native gateway)	Competitor C (enterprise governance)
Models / modalities	1000+ · text · image · video · music · voice · audio · embeddings · vision	~300 · text only	~40 · text + image	~80 · image, video, audio
Open-weight, served first-party	yes	no · pass-through	no · pass-through	no · queue
Latency added (p50)	47ms	~180ms · single region	~60ms · one platform	~190ms · single region
Eval-driven routing (SLO on your data)	yes	—	—	—
Sub-account / per-user key API	yes · programmatic	—	workspace only	—
Replay + shadow A/B across models	yes	—	—	—
OpenAI-compatible	drop-in, zero changes	drop-in	drop-in	—

Sourced from public pricing pages and docs, April 2026. Spot an error? hello@aigateway.sh.

Primitives

Four things a single-provider
SDK can't do.

Eval-routing, scoped sub-account keys, cross-model replay, and per-feature cost tags — each needs a view across every provider, not one.

EVAL-DRIVEN ROUTING

Let your own data pick the model.

Grade every model on your own data. The alias always routes to the current winner — new model lands, rerun, prod code unchanged.

curl -X POST https://api.aigateway.sh/v1/evals \
  -H "Authorization: Bearer $KEY" \
  -d '{
    "name": "prod-summarize",
    "candidate_models": [
      "anthropic/claude-opus-4.7",
      "openai/gpt-5.4",
      "moonshot/kimi-k2.7-code"
    ],
    "dataset": [...],
    "metric": "quality"
  }'

# then just use it
model = "eval:prod-summarize"

SUB-ACCOUNT API

Scoped keys for your end users.

One call mints a per-customer key with its own spend cap, rate limit, and analytics — so you can bill your own users.
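For callers not using curl, the POST /v1/sub-accounts request in this section can be assembled with Python's standard library. This is a sketch: `sub_account_request` is a hypothetical helper, the `Content-Type` header is an assumption, and the request is built but not sent so the example runs offline.

```python
import json
import urllib.request

BASE_URL = "https://api.aigateway.sh/v1"


def sub_account_request(api_key, name, spend_cap_usd, rate_limit_rpm, default_tag):
    """Build (but do not send) a POST /v1/sub-accounts request."""
    body = {
        "name": name,
        "spend_cap_cents": round(spend_cap_usd * 100),  # API field is in cents
        "rate_limit_rpm": rate_limit_rpm,
        "default_tag": default_tag,
    }
    return urllib.request.Request(
        f"{BASE_URL}/sub-accounts",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated in the docs
        },
        method="POST",
    )


req = sub_account_request("sk-aig-...", "acme-corp", 500, 300, "acme")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req)
```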

POST /v1/sub-accounts
{
  "name": "acme-corp",
  "spend_cap_cents": 50000,     // $500 / mo hard cap
  "rate_limit_rpm": 300,
  "default_tag": "acme"
}

=> { "key": "sk-aig-...",       // hand to customer
     "spend_cap_cents": 50000,
     "id": "sa_9f3k..." }

REPLAY + SHADOW A/B

Test a new model on real traffic.

Replay any past request on a new model — output, cost, latency side by side. Shadow mode tests it on live traffic without touching the user.
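A replay result is easier to act on once the cost difference is computed. The sketch below uses the field names and sample values from the response example in this section. `compare_replay` is a hypothetical helper, and it passes `score_delta` through untouched because its exact meaning is not defined here.

```python
def compare_replay(replay):
    """Summarize the cost difference between the source and target model."""
    src = replay["cost_source_cents"]
    tgt = replay["cost_target_cents"]
    return {
        "extra_cost_cents": round(tgt - src, 4),
        "cost_multiplier": round(tgt / src, 2) if src else None,
        "score_delta": replay["score_delta"],  # semantics assumed, passed through
    }


# Sample values copied from the response example in this section.
replay = {
    "source_output": "...",
    "target_output": "...",
    "cost_source_cents": 1.2,
    "cost_target_cents": 4.7,
    "score_delta": 0.82,
}

summary = compare_replay(replay)
print(summary)
```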

POST /v1/replays
{
  "source_request_id": "req_abc123",
  "target_model": "anthropic/claude-opus-4.7",
  "shadow": true
}

=> { "source_output": "...",
     "target_output": "...",
     "cost_source_cents": 1.2,
     "cost_target_cents": 4.7,
     "score_delta": 0.82 }

COST-ATTRIBUTION TAGS

Know what every feature costs.

Tag any request (feature, user, tenant) and query spend by tag. Hard per-tag budget caps stop runaway costs.

# tag each request with the feature it powers
curl https://api.aigateway.sh/v1/chat/completions \
  -H "x-aig-tag: summarize" \
  ...

GET /v1/usage/by-tag?month=2026-04
=> [{ "tag": "summarize", "cost_cents": 4210 },
    { "tag": "chat",      "cost_cents": 9830 },
    { "tag": "rerank",    "cost_cents":  118 }]
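The by-tag response above lends itself to a quick spend report. This sketch totals the sample rows and flags tags over a budget. `spend_report` and the budget figures are hypothetical, and the check runs client-side; it is not the gateway's own per-tag cap.

```python
def spend_report(rows, budgets_cents):
    """Return total spend and, per tag, its share and whether it exceeds budget."""
    total = sum(r["cost_cents"] for r in rows)
    report = {}
    for r in rows:
        cap = budgets_cents.get(r["tag"])
        report[r["tag"]] = {
            "share": r["cost_cents"] / total if total else 0.0,
            "over_budget": cap is not None and r["cost_cents"] > cap,
        }
    return total, report


# Rows copied from the sample response above.
usage = [
    {"tag": "summarize", "cost_cents": 4210},
    {"tag": "chat", "cost_cents": 9830},
    {"tag": "rerank", "cost_cents": 118},
]

# Hypothetical monthly budgets, in cents.
budgets = {"summarize": 5000, "chat": 9000}

total, report = spend_report(usage, budgets)
print(total)
for tag, info in report.items():
    print(tag, f"{info['share']:.1%}", "OVER" if info["over_budget"] else "ok")
```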

For coding agents

Point Claude Code at one URL.
It configures itself.

A capable coding agent reads /llms.txt, /openapi.json, or /agents.md once. After that it can call any model (from text to video), mint sub-accounts, tag costs, and replay requests, with no human editing config.

You can call any AI model through AIgateway, a universal AI API.

- Base URL: https://api.aigateway.sh/v1
- Auth: Authorization: Bearer sk-aig-...
- SDK: drop-in for the OpenAI SDK — only change base_url
- OpenAPI spec: https://api.aigateway.sh/openapi.json
- Capability map: https://aigateway.sh/llms.txt
- Live models: https://api.aigateway.sh/v1/models
- MCP server: https://api.aigateway.sh/mcp

Install (only when the OpenAI SDK isn't enough):
  pip install aigateway-py        # Python — async jobs, sub-accounts, evals
  pnpm add aigateway-js           # Node   — same surface in TypeScript
  npm i -g aigateway-cli          # CLI    — `aig init` walks through everything

Primitives nobody else has:
  POST /v1/sub-accounts — scoped keys + spend caps per end user
  POST /v1/evals        — grade candidate models on your data
  POST /v1/replays      — re-run a past request on a new model
  GET  /v1/usage/by-tag — per-feature cost via x-aig-tag header

LLMS.TXT

Agent-readable capability map

/llms.txt

OPENAPI.JSON

Typed 3.1 spec for code-gen

https://api.aigateway.sh/openapi.json

AGENTS.MD

Integration patterns + error remediation

/agents.md

INSTALL

pip install aigateway-py · pnpm add aigateway-js · npm i -g aigateway-cli

/integrations