Ship one MCP server on the AIgateway MCP surface and every agent — Claude Code, Cursor, Codex, Cline, anything that speaks MCP — can call its tools. No infra to host, no auth to wire, no schema-sync headaches.
MCP — the Model Context Protocol — is how agents find tools. It's what lets Claude Code search your GitHub, Cursor edit your Linear tickets, Codex query your Postgres. In the last nine months MCP went from a design doc to 97M monthly SDK downloads and adoption by every major lab.
The catch: shipping an MCP server is still more plumbing than it should be. Pick a transport, run a long-lived process, expose a URL, wire auth, keep schemas in sync, and do it again for the next environment. Most teams don't ship one because the wiring is worse than the feature.
This example replaces all of it with a hosted endpoint and a single tool declaration. You publish one JSON schema to AIgateway, point any MCP agent at `https://mcp.aigateway.sh`, and your tool is live across every agent surface at once.
A tool is a name, a description, an input schema, and a handler URL. Describe it once in JSON — the gateway publishes it to every MCP client that connects with your key.
{
  "name": "search_wiki",
  "description": "Search the internal company wiki and return the top 5 matches.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Natural-language search query" },
      "k": { "type": "integer", "default": 5, "minimum": 1, "maximum": 20 }
    },
    "required": ["query"]
  },
  "handler_url": "https://wiki.acme.internal/api/mcp-search",
  "handler_auth": "Bearer wiki-secret-token"
}

POST the spec to `/v1/mcp/tools`. The gateway returns a stable tool ID and makes it discoverable to any agent that authenticates with your key.
curl -X POST https://api.aigateway.sh/v1/mcp/tools \
  -H "Authorization: Bearer sk-aig-..." \
  -H "Content-Type: application/json" \
  -d @wiki-tool.json
# => { "id": "tool_9f3k...",
#      "name": "search_wiki",
#      "mcp_url": "https://mcp.aigateway.sh" }

Add one MCP server block to `~/.claude/settings.json` pointing at `https://mcp.aigateway.sh` with your key. Restart Claude Code and your tool shows up in the tool list.
{
  "mcpServers": {
    "aigateway": {
      "transport": "streamable-http",
      "url": "https://mcp.aigateway.sh",
      "headers": { "Authorization": "Bearer sk-aig-..." }
    }
  }
}

Cursor reads `.cursor/mcp.json`. Codex reads `~/.codex/config.toml`. Cline reads its settings pane. All three accept the same URL + header. Ship the tool once, reach every agent.
// .cursor/mcp.json
{
  "mcpServers": {
    "aigateway": {
      "url": "https://mcp.aigateway.sh",
      "headers": { "Authorization": "Bearer sk-aig-..." }
    }
  }
}

Transport. Streamable HTTP is the preferred MCP transport in 2026; we also serve legacy SSE for older clients. Your tool spec doesn't change; the gateway handles version negotiation with each client.
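Concretely, Streamable HTTP carries JSON-RPC 2.0 messages over POST. A sketch of the three payloads a client sends, assuming the 2025-03-26 protocol revision; in practice your client library builds and sequences these for you:

```python
import json

# The three core JSON-RPC 2.0 requests in an MCP session: handshake,
# tool discovery, then a tool invocation.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",  # assumed revision; clients negotiate this
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1"},
    },
}

list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

call_tool = {
    "jsonrpc": "2.0",
    "id": 3,
    "method": "tools/call",
    "params": {"name": "search_wiki", "arguments": {"query": "onboarding docs", "k": 5}},
}

for msg in (initialize, list_tools, call_tool):
    print(json.dumps(msg))
```

The gateway answers `tools/list` with every tool registered under your key, which is why one registration reaches every client.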
Auth. The agent authenticates to the gateway with your AIgateway key, and the gateway authenticates to your handler with the `handler_auth` secret from the tool spec. Rotating the agent-facing key is a one-line config change that never touches your handler.
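On the handler side this reduces to checking the `handler_auth` value on each incoming request. A minimal sketch, assuming the handler sees its request headers as a plain dict:

```python
import hmac

# The handler_auth value from the tool spec; the gateway sends it verbatim.
EXPECTED = "Bearer wiki-secret-token"

def authorized(headers: dict) -> bool:
    # Constant-time comparison avoids leaking the secret via timing.
    supplied = headers.get("Authorization", "")
    return hmac.compare_digest(supplied, EXPECTED)

print(authorized({"Authorization": "Bearer wiki-secret-token"}))  # True
print(authorized({"Authorization": "Bearer wrong"}))              # False
```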
Observability. Every tool call is logged against your AIgateway account with per-tool latency, error rate, and cost if the handler itself calls LLMs. The `x-aig-tag` cost-attribution header works inside tool handlers.
Because your handler can call back through the same AIgateway, a tool is just an LLM call dressed as a function. A `search_wiki` tool can be "run a Kimi K2.6 agent with a 3-step plan against our wiki corpus."
This is how the aggregator layer turns into an agent layer — tools fan out to models, models call tools, all metered on one key.
# Your MCP tool handler: itself an LLM call through the gateway.
from openai import OpenAI

client = OpenAI(base_url="https://api.aigateway.sh/v1", api_key="sk-aig-...")

def handle_search_wiki(args: dict) -> dict:
    resp = client.chat.completions.create(
        model="moonshot/kimi-k2.6",
        messages=[
            {
                "role": "system",
                "content": "Search the wiki corpus below. Return a JSON list of top matches.",
            },
            {
                "role": "user",
                "content": f"Query: {args['query']}\nCorpus: {open('corpus.txt').read()}",
            },
        ],
        response_format={"type": "json_object"},
        extra_headers={"x-aig-tag": "mcp.search_wiki"},
    )
    return {"results": resp.choices[0].message.content}

Yes: the gateway proxies to your handler URL. If you'd rather we host the handler too, stand the tool up as a Python or TypeScript function and drop it into the MCP surface with the `aig mcp deploy` CLI. That option landed on the free and Pro tiers this week.
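A hosted handler is then just the function body, with the gateway owning transport and auth. A hypothetical sketch; the exact contract `aig mcp deploy` expects (entrypoint name, signature) is an assumption here:

```python
# handler.py: hypothetical shape of a gateway-hosted handler.
# The entrypoint name `handle` and its dict-in/dict-out signature are
# assumptions for illustration, not the documented deploy contract.
def handle(args: dict) -> dict:
    """Receives the schema-validated args, returns the tool result."""
    query = args["query"]
    k = args.get("k", 5)
    # ...search the real corpus here; stubbed results for the sketch...
    return {"results": [f"match {i + 1} for {query!r}" for i in range(k)]}

print(handle({"query": "vacation policy", "k": 2}))
```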
Any MCP client — Claude Code, Cursor, Codex, Cline, Continue, Windsurf, the OpenAI Agents SDK, the Claude Agent SDK, plus anything in the MCP registry. The gateway speaks both Streamable HTTP and legacy SSE, so old and new clients both work.
Mint a sub-account key with `/v1/sub-accounts`. Each sub-account sees its own set of registered tools. Give your engineers one key, your customers another — tools, usage, and caps are isolated.
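A sketch of minting a sub-account with the stdlib; the payload field names (`name`, `monthly_cap_usd`) are illustrative assumptions, not the documented schema:

```python
import json
import urllib.request

def sub_account_request(name: str, monthly_cap_usd: int) -> urllib.request.Request:
    # Field names are assumptions; check the /v1/sub-accounts reference.
    body = json.dumps({"name": name, "monthly_cap_usd": monthly_cap_usd}).encode()
    return urllib.request.Request(
        "https://api.aigateway.sh/v1/sub-accounts",
        data=body,
        headers={
            "Authorization": "Bearer sk-aig-...",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = sub_account_request("acme-customer-42", 100)
print(req.full_url, req.get_method())
# Send with urllib.request.urlopen(req) once the key is real.
```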
Yes. Inside a handler you can POST to `/v1/mcp/tools/<id>/invoke` with the same key. Useful for composition — a `schedule_meeting` tool can call `find_free_slot` before writing to the calendar.
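A sketch of that composition with the invoke call injected as a plain callable, so the hypothetical tool ID `tool_find_free_slot` and the HTTP plumbing stay out of the core logic:

```python
from typing import Callable

def schedule_meeting(args: dict, invoke: Callable[[str, dict], dict]) -> dict:
    # `invoke(tool_id, args)` stands in for POSTing to
    # /v1/mcp/tools/<id>/invoke with the same gateway key.
    slot = invoke("tool_find_free_slot", {"attendees": args["attendees"]})
    # ...write the calendar event at slot["start"] here...
    return {"scheduled": True, "start": slot["start"]}

# Exercise the composition with a fake invoke:
fake_invoke = lambda tool_id, a: {"start": "2026-03-02T10:00:00Z"}
print(schedule_meeting({"attendees": ["ana", "raj"]}, fake_invoke))
```

Injecting the callable also makes the composition testable without a live gateway.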
Tools declare their own input schema, which the gateway validates before calling your handler. For sensitive actions, mark a tool `require_human_approval: true` and the agent will pause for a yes/no before dispatch.
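What that pre-dispatch validation amounts to, sketched as a hand-rolled checker for the two schema keywords used above; the gateway presumably runs a full JSON Schema validator:

```python
def validate(schema: dict, args: dict) -> list[str]:
    """Tiny sketch of the required/type checks run before dispatch."""
    errors = []
    types = {"string": str, "integer": int, "object": dict}
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        if field in args and not isinstance(args[field], types[rules["type"]]):
            errors.append(f"{field}: expected {rules['type']}")
    return errors

schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "k": {"type": "integer"}},
    "required": ["query"],
}
print(validate(schema, {"query": "onboarding"}))  # []
print(validate(schema, {"k": "five"}))
```

Invalid calls never reach your handler, so the handler can trust the shape of `args`.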
Every POST to `/v1/mcp/tools` creates a new immutable version. Clients pin the version ID they trust; new connections default to latest. Breaking the schema is a deliberate act, not an accident.
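One way to picture immutable versions is a content hash over the canonicalized spec: same spec, same ID; any edit, a new ID. Purely illustrative; the gateway's real version IDs are opaque:

```python
import hashlib
import json

def version_id(spec: dict) -> str:
    # Canonicalize, then hash: identical specs map to identical versions,
    # and any schema change yields a fresh immutable ID.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return "v_" + hashlib.sha256(canonical.encode()).hexdigest()[:12]

v1 = version_id({"name": "search_wiki", "input_schema": {"required": ["query"]}})
v2 = version_id({"name": "search_wiki", "input_schema": {"required": ["query", "k"]}})
print(v1, v2, v1 != v2)
```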
The MCP plumbing is free on every tier. You pay only for LLM calls your handler itself makes — priced pass-through with a 5% platform fee like the rest of the catalog.
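The pass-through math, for concreteness:

```python
def billed(llm_cost_usd: float, platform_fee: float = 0.05) -> float:
    # Pass-through price plus the 5% platform fee; MCP plumbing itself is free.
    return round(llm_cost_usd * (1 + platform_fee), 6)

# A handler run that consumed $0.40 of Kimi K2.6 tokens bills $0.42:
print(billed(0.40))
```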