How much does Gemini 3 Flash cost?

Gemini 3 Flash (google/gemini-3-flash) is priced at $0.50/M input tokens and $3.00/M output tokens through AIgateway. Cached input reads are $0.05/M tokens. The model has a 1,000,000-token context window. AIgateway passes through the underlying Google rate and adds a 5% platform fee at credit top-up, with no per-call fee and no monthly minimum. A free tier is available, but it covers Kimi K2.6 rather than Gemini 3 Flash.

How it works

Pass-through on the underlying provider

AIgateway bills at the same per-token rate Google publishes. There's no inflated sticker price: the model vendor's invoice and your gateway invoice agree on the token count and unit cost. Usage is tracked at sub-cent precision in D1 and surfaced live in your dashboard.

5% platform fee at top-up

The only margin on top is a 5% platform fee, applied at credit top-up rather than per call. A $100 top-up funds $95 of provider calls. Cache hits (when you use prompt caching or semantic cache) bill at 10% of the uncached cost, so long-context agent workloads with a high cache-hit rate can run net cheaper than calling Google directly.
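
The top-up arithmetic above can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, and it assumes the fee is deducted from the top-up amount, which is what the $100 to $95 example implies.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.05")      # 5% fee, taken once at top-up (from the text above)
CACHE_HIT_FACTOR = Decimal("0.10")  # cache hits bill at 10% of the uncached cost


def usable_credit(top_up_usd: str) -> Decimal:
    """Credit left for provider calls after the platform fee (hypothetical helper)."""
    return Decimal(top_up_usd) * (1 - PLATFORM_FEE)


def cache_hit_cost(uncached_cost_usd: str) -> Decimal:
    """Cost of a call served from cache, given what it would cost uncached."""
    return Decimal(uncached_cost_usd) * CACHE_HIT_FACTOR


print(usable_credit("100"))    # 95.00
print(cache_hit_cost("0.50"))  # 0.0500
```

Decimal is used instead of float so that small currency amounts do not pick up binary rounding noise.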

Free tier + pay-as-you-go

No subscription and no monthly minimum. Free tier gives you 100 requests/day on Kimi K2.6. Paid tier starts at a $5 top-up with no auto-renew. BYOK (bring your own Google key) is supported on Enterprise if you already have a direct contract you want to preserve.

Code example

Python

# pip install aigateway-py openai
# aigateway-py: sub-accounts, evals, replays, jobs, webhook verify.
# openai SDK: chat/embeddings/images/audio — drop-in compat per our SDK's own guidance.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

r = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Explain vector databases in two sentences."}],
)
print(r.choices[0].message.content)

Node / TypeScript

// npm i aigateway-js openai
// aigateway-js: sub-accounts, evals, replays, jobs, webhook verify.
// openai SDK: chat/embeddings/images/audio — drop-in compat.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.aigateway.sh/v1",
  apiKey: process.env.AIGATEWAY_KEY,
});

const r = await client.chat.completions.create({
  model: "google/gemini-3-flash",
  messages: [{ role: "user", content: "Explain vector databases in two sentences." }],
});
console.log(r.choices[0].message.content);

FAQ

How much does Gemini 3 Flash cost per token?

Input is $0.50/M tokens and output is $3.00/M tokens. A typical 1K-in / 500-out chat turn costs about $0.002 ($0.0005 for input plus $0.0015 for output).
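
To estimate other turn sizes, the same arithmetic can be wrapped in a small function. This is a rough sketch using the rates quoted above; the function name is hypothetical and the result ignores caching and the top-up fee.

```python
INPUT_USD_PER_M = 0.50   # uncached input rate quoted above
OUTPUT_USD_PER_M = 3.00  # output rate quoted above


def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one uncached request (hypothetical helper)."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


print(f"${turn_cost(1_000, 500):.5f}")  # $0.00200
```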

Is there a free tier for Gemini 3 Flash?

Gemini 3 Flash is not on the free tier; the 100 free requests/day go to Kimi K2.6. Paid tier starts at a $5 top-up.

Does Gemini 3 Flash support prompt caching?

Yes. Cached input reads bill at $0.05/M, 90% below the uncached input rate. For long-context agent workloads where most of the prompt is served from cache, this can cut bills by 70% or more.
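
How much caching saves depends on how much of the prompt is cached and how large the output is. The sketch below works through one made-up workload (100K input tokens, 90K of them cached, 1K output tokens) at the rates quoted above; the function name and the workload numbers are assumptions for illustration, not measurements.

```python
INPUT_USD_PER_M = 0.50         # uncached input
CACHED_INPUT_USD_PER_M = 0.05  # cached input reads
OUTPUT_USD_PER_M = 3.00        # output


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost when `cached_tokens` of the input are read from cache."""
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_USD_PER_M
        + cached_tokens * CACHED_INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000


baseline = request_cost(100_000, 1_000)                           # nothing cached
with_cache = request_cost(100_000, 1_000, cached_tokens=90_000)   # 90% of input cached
savings = 1 - with_cache / baseline
print(f"{savings:.1%}")  # 76.4%
```

With a smaller cached share or a larger output, the saving shrinks, since output tokens are never discounted by input caching.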

Are there volume discounts on Gemini 3 Flash?

Pass-through pricing means you automatically get whatever tier discount Google publishes. AIgateway doesn't mark up; the 5% platform fee applies flat regardless of volume. Enterprise customers can negotiate committed-use discounts on top.

How is Gemini 3 Flash usage metered?

Every call returns usage.prompt_tokens and usage.completion_tokens in the response body, same as OpenAI. We also write the exact cost into the x-aigateway-cost response header. A full audit log is in the dashboard.
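
A minimal sketch of collecting those metering fields on the client side. It assumes the x-aigateway-cost header carries a plain decimal USD amount, which the text above does not specify, and the helper names are hypothetical.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional


def reported_cost(headers: Mapping[str, str]) -> Optional[Decimal]:
    """Return x-aigateway-cost as a Decimal, or None if missing or unparseable.

    Assumption: the header value is a plain decimal USD amount such as "0.002".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-aigateway-cost")
    if raw is None:
        return None
    try:
        return Decimal(raw.strip())
    except InvalidOperation:
        return None


def summarize(usage: Mapping[str, int], headers: Mapping[str, str]) -> dict:
    """Combine the usage block from the body with the cost header."""
    return {
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "cost_usd": reported_cost(headers),
    }


# Made-up values for illustration only
print(summarize({"prompt_tokens": 1000, "completion_tokens": 500},
                {"X-AIGateway-Cost": "0.002"}))
```

With the openai Python SDK, response headers should be reachable through client.chat.completions.with_raw_response.create(...), whose result exposes .headers and a .parse() method returning the usual completion object; check this against the SDK version you have installed.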

TRY IT NOW

One key, every model. Free tier, no card.
