How much does Gemini 3 Flash cost?

Gemini 3 Flash (google/gemini-3-flash) costs $0.50/M input tokens and $3.00/M output tokens through AIgateway. Cached input reads are $0.05/M tokens, and the model has a 1,000,000-token context window. AIgateway passes through the underlying Google rate and adds a 5% platform fee at credit top-up; there is no per-call charge and no monthly minimum. A free tier is available, though it does not include Gemini 3 Flash.

How it works

Pass-through on the underlying provider

AIgateway bills at the same per-token rate Google publishes. There's no inflated sticker price: the model vendor's invoice and your gateway invoice agree on the token count and unit cost. Usage is aggregated to the nearest fraction of a cent in D1 and surfaced live in your dashboard.

5% platform fee at top-up

The only margin on top is a 5% platform fee, applied at credit top-up rather than per call. A $100 top-up funds $95 of provider calls. Cache hits (from prompt caching or the semantic cache) bill at 10% of the uncached cost, so long-context agent workloads often run net cheaper than calling Google directly.
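
The top-up arithmetic can be sketched in a few lines. This is a minimal illustration that assumes the 5% fee is deducted from the top-up amount, as the $100 to $95 example implies; `usable_credit` is a hypothetical helper, not part of any AIgateway SDK.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")  # 5% fee taken once, at top-up (per the text above)

def usable_credit(top_up_usd: str) -> Decimal:
    """Credit left for provider calls after the platform fee (hypothetical helper)."""
    amount = Decimal(top_up_usd)
    return (amount * (Decimal("1") - PLATFORM_FEE)).quantize(Decimal("0.01"))

print(usable_credit("100"))  # 95.00
print(usable_credit("5"))    # 4.75
```

Decimal arithmetic is used here only to keep the cent values exact; how AIgateway rounds internally is not stated on this page.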

Free tier + pay-as-you-go

No subscription and no monthly minimum. Free tier gives you 100 requests/day on Kimi K2.6. Paid tier starts at a $5 top-up with no auto-renew. BYOK (bring your own Google key) is supported on Enterprise if you already have a direct contract you want to preserve.

Code example

Python
# pip install aigateway-py openai
# aigateway-py: sub-accounts, evals, replays, jobs, webhook verify.
# openai SDK: chat/embeddings/images/audio — drop-in compat per our SDK's own guidance.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

r = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Explain vector databases in two sentences."}],
)
print(r.choices[0].message.content)

Node / TypeScript
// npm i aigateway-js openai
// aigateway-js: sub-accounts, evals, replays, jobs, webhook verify.
// openai SDK: chat/embeddings/images/audio — drop-in compat.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.aigateway.sh/v1",
  apiKey: process.env.AIGATEWAY_KEY,
});

const r = await client.chat.completions.create({
  model: "google/gemini-3-flash",
  messages: [{ role: "user", content: "Explain vector databases in two sentences." }],
});
console.log(r.choices[0].message.content);

FAQ

How much does Gemini 3 Flash cost per token?

Input is $0.50/M tokens and output is $3.00/M tokens. A typical 1K-in / 500-out chat turn costs about $0.002: $0.0005 for the input plus $0.0015 for the output.
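
To estimate other turn sizes, the same calculation can be written as a small function. This is a sketch using only the rates quoted above; `turn_cost` is an illustrative name and the figures exclude any cache discount.

```python
INPUT_USD_PER_M = 0.50   # uncached input rate, USD per million tokens
OUTPUT_USD_PER_M = 3.00  # output rate, USD per million tokens

def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one uncached request (illustrative helper)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000

print(f"${turn_cost(1_000, 500):.5f}")  # $0.00200
```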

Is there a free tier for Gemini 3 Flash?

Gemini 3 Flash is not on the free tier; the 100 free requests/day go to Kimi K2.6. Paid tier starts at a $5 top-up.

Does Gemini 3 Flash support prompt caching?

Yes. Cached input reads bill at $0.05/M, which is 10% of the uncached input rate. For long-context agent workloads where most input tokens are cache hits, this can cut bills by 70% or more.
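
How much caching saves depends on the share of input tokens served from cache. The sketch below works through one made-up request (100,000 input tokens, 90% of them cached, 1,000 output tokens); the token counts and the helper name are assumptions chosen for illustration, not measured data.

```python
INPUT_RATE = 0.50    # USD per million uncached input tokens
CACHED_RATE = 0.05   # USD per million cached input tokens
OUTPUT_RATE = 3.00   # USD per million output tokens

def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """USD cost of a request where `cached_tokens` of the input are cache reads."""
    uncached = input_tokens - cached_tokens
    total = uncached * INPUT_RATE + cached_tokens * CACHED_RATE + output_tokens * OUTPUT_RATE
    return total / 1_000_000

without_cache = request_cost(100_000, 0, 1_000)    # no cache hits
with_cache = request_cost(100_000, 90_000, 1_000)  # 90% of input cached (assumed)
savings = 1 - with_cache / without_cache
print(f"{savings:.0%} cheaper")  # about 76% for this hypothetical request
```

Output tokens are never discounted, so the saving shrinks as responses get longer relative to the prompt.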

Are there volume discounts on Gemini 3 Flash?

Pass-through pricing means you automatically get whatever tier discount Google publishes. AIgateway doesn't mark up; the 5% platform fee applies flat regardless of volume. Enterprise customers can negotiate committed-use discounts on top.

How is Gemini 3 Flash usage metered?

Every call returns usage.prompt_tokens and usage.completion_tokens in the response body, same as OpenAI. We also write the exact cost into the x-aigateway-cost response header. A full audit log is available in the dashboard.
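
If you want to cross-check the header against the token counts, something like the following may help. It is a sketch under two assumptions that this page does not confirm: that x-aigateway-cost carries a plain decimal USD value, and that no cache discount applied to the call. Both helper names are hypothetical.

```python
def cost_from_usage(usage: dict) -> float:
    """Recompute USD cost from the usage block, ignoring cache discounts."""
    total = usage["prompt_tokens"] * 0.50 + usage["completion_tokens"] * 3.00
    return total / 1_000_000

def matches_header(usage: dict, header_value: str, tolerance: float = 1e-6) -> bool:
    """Compare the recomputed cost with the header (assumed to be decimal USD)."""
    return abs(cost_from_usage(usage) - float(header_value)) <= tolerance

# With the openai Python SDK, headers are reachable through the raw-response wrapper:
#   raw = client.chat.completions.with_raw_response.create(model=..., messages=...)
#   header_value = raw.headers.get("x-aigateway-cost")
#   usage = raw.parse().usage.model_dump()
sample_usage = {"prompt_tokens": 1000, "completion_tokens": 500}
print(matches_header(sample_usage, "0.002"))  # True
```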
