google/gemini-3.1-pro at $2/M input (up to 200K tokens) and $12/M output is the cheapest model on AIgateway that will reliably use a full 1M-token context window. Beyond 200K tokens it's $4/$18. Cached reads drop to $0.20/M — feeding a whole codebase once and querying it repeatedly works out to cents per question.
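Concretely, here is the arithmetic behind "cents per question". The rates are the ones quoted above; the codebase and answer sizes are assumptions chosen for illustration:

```python
# Rates quoted above for google/gemini-3.1-pro (<=200K-token tier).
INPUT_PER_M = 2.00    # $ per million fresh input tokens
CACHED_PER_M = 0.20   # $ per million cached input tokens
OUTPUT_PER_M = 12.00  # $ per million output tokens

codebase_tokens = 150_000  # assumed codebase size, fits the <=200K tier
answer_tokens = 500        # assumed answer length per question

# The first question pays the full input rate to populate the cache.
first = (codebase_tokens * INPUT_PER_M + answer_tokens * OUTPUT_PER_M) / 1e6
# Every later question re-reads the codebase at the cached rate.
later = (codebase_tokens * CACHED_PER_M + answer_tokens * OUTPUT_PER_M) / 1e6

print(f"first question: ${first:.3f}, each follow-up: ${later:.3f}")
```

At those assumptions the first question costs about $0.31 and every follow-up about $0.036, which is the "cents per question" claim above.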
This page ranks models on AIgateway that actually expose long-context capability through a stable, OpenAI-compatible interface, not just models that support it on paper. Pricing is compared on blended token cost (2:1 input:output) at list rate, using the published pass-through price; the flat 5% platform fee applies equally to every model, so it's neutral to the ranking.
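For reference, the blend works out like this. A minimal sketch: the 2:1 weighting is the one stated above, and the rates are the published list prices for the ≤200K tier:

```python
def blended_per_m(input_per_m: float, output_per_m: float) -> float:
    """Blended $/M tokens at a 2:1 input:output mix."""
    return (2 * input_per_m + output_per_m) / 3

gemini = blended_per_m(2.00, 12.00)  # google/gemini-3.1-pro, <=200K tier
print(f"${gemini:.2f}/M blended")

# The flat 5% platform fee multiplies every model's cost by the same
# factor, so it never changes which model ranks cheapest.
with_fee = gemini * 1.05
```

That comes to about $5.33/M blended for google/gemini-3.1-pro at list rate.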
Install the OpenAI SDK, set base_url to https://api.aigateway.sh/v1, and set the model string. No provider-specific client libraries, no separate billing setup — just a model swap. If the winning model ever gets undercut, AIgateway's eval+routing layer can automatically shift traffic to a cheaper-or-equal alternative.
POST /v1/evals with the top 2-3 candidate models, a dataset of 20-50 production prompts, and a grader. AIgateway returns a quality score per model; if the cheapest one passes, pin your alias there. Re-run the eval monthly — frontier prices drop by 40-60% a year and the winner changes fast.
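A sketch of that eval workflow, using only the standard library. The endpoint is the one named above; the payload field names (models, dataset, grader) and the response shape are assumptions, so check them against the API reference before relying on this:

```python
import json
import urllib.request

def run_eval(api_key: str, models: list, prompts: list, grader: str) -> dict:
    # POST candidates + dataset to the evals endpoint (field names assumed).
    req = urllib.request.Request(
        "https://api.aigateway.sh/v1/evals",
        data=json.dumps(
            {"models": models, "dataset": prompts, "grader": grader}
        ).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # assumed shape: {"scores": {model: score}}

def cheapest_passing(scores: dict, blended: dict, threshold: float):
    # Lowest blended-cost model whose quality score clears the bar.
    passing = [m for m, s in scores.items() if s >= threshold]
    return min(passing, key=blended.__getitem__, default=None)
```

For example, `cheapest_passing({"a": 0.91, "b": 0.95}, {"a": 3.0, "b": 5.3}, 0.9)` returns `"a"`: the cheaper model wins as long as it passes the bar.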
# pip install aigateway-py openai
# aigateway-py: sub-accounts, evals, replays, jobs, webhook verify.
# openai SDK: chat/embeddings/images/audio — drop-in compat per our SDK's own guidance.
from openai import OpenAI
client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)
r = client.chat.completions.create(
    model="google/gemini-3.1-pro",
    messages=[{"role": "user", "content": "Explain vector databases in two sentences."}],
)
print(r.choices[0].message.content)

// npm i aigateway-js openai
// aigateway-js: sub-accounts, evals, replays, jobs, webhook verify.
// openai SDK: chat/embeddings/images/audio — drop-in compat.
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.aigateway.sh/v1",
  apiKey: process.env.AIGATEWAY_KEY,
});
const r = await client.chat.completions.create({
  model: "google/gemini-3.1-pro",
  messages: [{ role: "user", content: "Explain vector databases in two sentences." }],
});
console.log(r.choices[0].message.content);

Yes — the shortlisted winner on this page is the model our team and customers use in production today. Run a 20-prompt eval on your own workload to confirm; if it fails, the second-cheapest model is one model-string change away.
Frontier model prices drop 40-60% per year. Expect the ranking to change every 2-4 months as new models ship. Pin an alias in AIgateway's router and re-run your eval monthly to stay on the current winner.
Yes. Every model on AIgateway speaks OpenAI's request/response format, so picking a cheaper model is a one-line change in your application code.
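In practice that one-line change looks like this. A sketch: the constant and helper names are ours, and the only line that matters is the model string:

```python
MODEL = "google/gemini-3.1-pro"  # swap this string to switch models; nothing else moves

def ask(client, prompt: str) -> str:
    # `client` is any OpenAI-compatible client pointed at AIgateway.
    r = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content
```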
No. Pass-through pricing plus a flat 5% platform fee applied at credit top-up. The fee doesn't change by model; the per-token rate is what each provider publishes.