Getting started

Introduction

AIgateway is one OpenAI-compatible API for 100+ frontier and open-weight models. You write your code once, against one base URL, and swap providers by changing a string in the model field. Billing, routing, fallbacks, caching, and observability are handled for you at the edge.
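
To make the "change a string in the model field" idea concrete, here is a minimal sketch of a request builder. It assumes the OpenAI-style chat-completions body used in the snippets on this page. `buildChatRequest` and the second model ID are hypothetical names chosen for illustration, not part of AIgateway.

```javascript
// Hypothetical helper: builds an OpenAI-style chat-completions body.
// Only the `model` string differs between providers.
function buildChatRequest(model, prompt) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
  };
}

const prompt = "Write a haiku about edge inference.";
const a = buildChatRequest("moonshot/kimi-k2.6", prompt);
// "example/other-model" is a placeholder, not a real model ID.
const b = buildChatRequest("example/other-model", prompt);

// The two bodies share the same messages and differ only in `model`.
console.log(a.model, b.model);
```

Either body could then be passed to `client.chat.completions.create(...)` unchanged.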

Kimi K2.6 is free through Apr 30, 2026. The model moonshot/kimi-k2.6 is metered at $0 until then, capped at 100 requests per day per account. Every other model is pay-as-you-go (PAYG): add credits to unlock them.
Fastest integration path

Agent setup wizard

Choose your stack once. Copy the generated blocks and get to your first successful request in under 2 minutes.

1) Install
npm i openai
2) Environment variable
# .env.local
AIG_KEY=sk-...
3) First request snippet
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.AIG_KEY,
  baseURL: "https://api.aigateway.sh/v1",
});

const res = await client.chat.completions.create({
  model: "moonshot/kimi-k2.6",
  messages: [{ role: "user", content: "Write a haiku about edge inference." }],
});

console.log(res.choices[0]?.message?.content);
4) Claude Code rule/instruction
# CLAUDE.md snippet
Use AIgateway for all AI calls.
- Base URL: https://api.aigateway.sh/v1
- Auth env: AIG_KEY
- Default model: moonshot/kimi-k2.6
- Docs page: https://aigateway.sh/docs
- MCP endpoint: https://api.aigateway.sh/mcp
5) MCP setup command
claude mcp add --transport http aigateway https://api.aigateway.sh/mcp --header "Authorization: Bearer $AIG_KEY"
6) Smoke test command (terminal)
curl https://api.aigateway.sh/v1/models \
  -H "Authorization: Bearer $AIG_KEY" \
  -H "Content-Type: application/json"
Run the smoke test in the browser: paste a test key to validate authentication and reachability.
Verify the full integration: run the first request snippet, then confirm the response and usage in the Dashboard logs. If it fails, see the Error codes and Your first request pages.
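
The same smoke test can be scripted from Node. The sketch below mirrors the curl command against `/v1/models`. The function name `smokeTest` and the injectable `fetchImpl` parameter are illustrative choices, and treating any 2xx status as "key accepted" is an assumption rather than documented behavior.

```javascript
// Sketch of the curl smoke test as a function. `fetchImpl` is injectable so
// the check can be exercised without network access (defaults to global fetch).
async function smokeTest(apiKey, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.aigateway.sh/v1/models", {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  // Assumption: a 2xx status means the key was accepted and the API is reachable.
  return { ok: res.ok, status: res.status };
}

// Usage (requires Node 18+ for the global fetch):
// smokeTest(process.env.AIG_KEY).then(console.log);
```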

The learn path

These four pages take you from zero to a streaming response in under three minutes. Go in order — every other topic assumes you've done this first.

1. Authentication
Mint a key, pin it on the client, scope it by tier.
2. Your first request
OpenAI-compatible chat completion in Python, Node, or curl.
3. Swap models
One body shape, 100+ model IDs — how routing works.
4. Streaming
SSE chunks with first-token latency under 500ms.
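
For a sense of what "SSE chunks" look like on the wire, here is a minimal parsing sketch. It assumes the stream follows the OpenAI convention of `data: <json>` lines terminated by `data: [DONE]`. The sample chunks are hand-written, not captured from the API, and `collectStreamText` is a hypothetical helper name.

```javascript
// Collects the text from an OpenAI-style SSE stream body.
// Assumes each event is a line "data: <json>" and the stream ends with "data: [DONE]".
function collectStreamText(sseText) {
  let out = "";
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") break;
    const chunk = JSON.parse(payload);
    out += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return out;
}

// Hand-written sample chunks, not real API output.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(collectStreamText(sample)); // prints "Hello"
```

In practice the `openai` SDK does this parsing for you when `stream: true` is set; the sketch only illustrates the underlying format.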

Explore by topic

Once you've shipped your first request, the rest of the docs are reference material — read in any order.

Inference

Reasoning models
Thinking traces, normalized across providers.
Tool calling
Function calling + parallel tools.
Image generation
Flux, Gemini Imagen, Stable Diffusion, DALL-E.
Audio
TTS + STT from ElevenLabs, OpenAI, Deepgram.
Embeddings
Dense + matryoshka vectors from 20 providers.
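
As background for the Embeddings entry: matryoshka vectors are trained so that a prefix of the vector is itself a usable embedding. A common way to consume them, sketched below, is to keep the first few dimensions and re-normalize. `truncateEmbedding` is a hypothetical helper and the input is a toy vector, not model output.

```javascript
// Truncate a matryoshka embedding to its first `dims` values and re-normalize
// to unit length so cosine similarity still behaves as expected.
function truncateEmbedding(vec, dims) {
  const head = vec.slice(0, dims);
  const norm = Math.hypot(...head);
  if (norm === 0) return head; // avoid dividing by zero on an all-zero prefix
  return head.map((x) => x / norm);
}

// Toy 4-dimensional vector, not real model output.
const short = truncateEmbedding([3, 4, 12, 84], 2);
console.log(short.length); // 2
```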

Return modes

Batch API
50% off. JSONL in, JSONL out, 24h SLA.
Async jobs
Video, music, 3D. Poll or webhook.
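
"JSONL in, JSONL out" means one JSON object per line. The sketch below serializes a list of prompts into that shape and parses results back. The per-line fields (`custom_id`, `method`, `url`, `body`) mirror OpenAI's Batch API and are an assumption here; the Batch API page is the authority on the actual schema.

```javascript
// Serialize prompts into JSONL: one JSON object per line.
// Assumption: the line shape mirrors OpenAI's Batch API
// (custom_id, method, url, body); check the Batch API page for the real schema.
function toBatchJsonl(model, prompts) {
  return prompts
    .map((prompt, i) =>
      JSON.stringify({
        custom_id: `req-${i}`,
        method: "POST",
        url: "/v1/chat/completions",
        body: { model, messages: [{ role: "user", content: prompt }] },
      })
    )
    .join("\n");
}

// Parse JSONL text back into objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

const jsonl = toBatchJsonl("moonshot/kimi-k2.6", ["First prompt", "Second prompt"]);
console.log(jsonl.split("\n").length); // 2
```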

Platform

Smart routing
Price / latency / quality routers.
Caching
Prompt, semantic, and exact-hash caches.
BYOK providers
Bring your own OpenAI / Anthropic / Google keys.
Webhooks
Signed event delivery with retries.

Reference

Rate limits
Per-tier RPM, headers, retry-after.
Error codes
Every status and what to do about it.
SDKs + CLI
aigateway-py, aigateway-js, aig.
OpenAPI spec
Code-gen a client in any language.
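
The Rate limits entry above mentions retry-after. A client-side sketch of honoring that header might look like the following. It relies on standard HTTP semantics, where `Retry-After` is either a number of seconds or an HTTP date. The exponential-backoff fallback and its constants are assumptions, and `retryDelayMs` is a hypothetical helper.

```javascript
// Decide how long to wait before retrying a rate-limited request.
// Retry-After may be a number of seconds or an HTTP date (standard HTTP semantics).
function retryDelayMs(retryAfter, attempt, now = Date.now()) {
  if (retryAfter != null && retryAfter !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) return Math.max(0, date - now);
  }
  // Fallback (assumption): exponential backoff starting at 500 ms, capped at 30 s.
  return Math.min(30000, 500 * 2 ** attempt);
}

console.log(retryDelayMs("2", 0)); // 2000
console.log(retryDelayMs(null, 3)); // 4000
```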
