Claude Opus 4.8 vs Gemini 3.1 Pro

Pricing per million tokens, context window, and capabilities, pulled from each provider's public docs. Both models are available through the same AIgateway OpenAI-compatible endpoint; change the model string to switch.

                  Claude Opus 4.8              Gemini 3.1 Pro
Model ID          anthropic/claude-opus-4.8    google/gemini-3.1-pro
Provider          Anthropic                    Google
Family            Claude 4                     Gemini 3
Modality          text                         text
Context window    1,000,000 tok                1,000,000 tok
Max output        128,000 tok                  65,536 tok
Released          2026-05-28                   2026-05-22
License           Proprietary                  Proprietary
Input price       $5.00 /1M                    $2.00 /1M
Output price      $25.00 /1M                   $12.00 /1M
Cache read        $0.500 /1M                   $0.200 /1M
Cache write       $6.25 /1M (one value listed)
Tools             yes                          yes
Streaming         yes                          yes
Vision            yes                          yes
JSON mode         yes                          yes
Reasoning         yes                          yes
Prompt caching    yes                          yes
Batch API         yes (one value listed)
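The listed prices can be turned into a rough per-request cost estimate. The following is a minimal sketch, not the gateway's billing logic: the function name `estimate_cost` and the `PRICES` table are hypothetical, the figures are copied from the table above, and it assumes cached input tokens are billed at the cache-read rate in place of the input rate. Cache-write charges and any tiered or long-context pricing are ignored.

```python
# Prices in USD per 1M tokens, copied from the comparison table above.
PRICES = {
    "anthropic/claude-opus-4.8": {"input": 5.00, "output": 25.00, "cache_read": 0.50},
    "google/gemini-3.1-pro": {"input": 2.00, "output": 12.00, "cache_read": 0.20},
}


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the portion of input_tokens served from
    the prompt cache and is billed at the cache-read rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * price["cache_read"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


# Example: 100,000 input tokens and 2,000 output tokens, nothing cached.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 100_000, 2_000):.4f}")
# anthropic/claude-opus-4.8: $0.5500
# google/gemini-3.1-pro: $0.2240
```

Under these assumptions the same request costs roughly 2.5 times as much on Claude Opus 4.8 as on Gemini 3.1 Pro; actual invoices may differ.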
Claude Opus 4.8
anthropic/claude-opus-4.8

Claude Opus 4.8 is Anthropic's most capable generally available model, with a step-change improvement in agentic coding over Claude Opus 4.7. It uses adaptive thinking to calibrate reasoning per task and supports a one million token context window at standard pricing.

Strengths
  • Anthropic's most capable model — #1 on the Artificial Analysis Intelligence Index
  • Best computer-use / browser agent tested (84% on Online-Mind2Web)
  • Adaptive thinking — calibrates reasoning depth per task
Use cases
  • Autonomous coding agents
  • Codebase-scale migrations
  • Computer use / browser agents
  • High-stakes reasoning + analysis
  • Long-document work (1M context)
Gemini 3.1 Pro
google/gemini-3.1-pro

Google's most intelligent Gemini model with improved reasoning, a medium thinking level, and a 1M token context window.

Strengths
  • 1M-token context
  • Native video + audio input
  • Cheapest long-context reasoning
Use cases
  • Video understanding
  • Very long-document RAG
  • Codebase-scale reasoning

Benchmarks

Claude Opus 4.8
Gemini 3.1 Pro
AA Intelligence Index
61.0
Long-Context
94.2
MMLU
89.6
Online-Mind2Web (computer use)
84.0

Source: each provider's published benchmarks. Higher is better. Run an eval to compare on your own data.
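Running an eval on your own data can be as simple as sending the same prompts to both model strings and scoring the answers. The following is a minimal sketch, not the gateway's eval feature: `run_eval`, `gateway_complete`, and the `AIGATEWAY_API_KEY` environment variable are hypothetical names, and exact-match scoring is a deliberately crude stand-in for a real metric.

```python
import os
from typing import Callable, Dict, Sequence, Tuple

MODELS = ("anthropic/claude-opus-4.8", "google/gemini-3.1-pro")


def run_eval(
    complete: Callable[[str, str], str],
    cases: Sequence[Tuple[str, str]],
    models: Sequence[str] = MODELS,
) -> Dict[str, float]:
    """Return the exact-match accuracy per model over (prompt, expected) cases.

    `complete(model, prompt)` is any function that returns the model's answer
    as a string, so the scoring loop can be exercised without network access.
    """
    if not cases:
        raise ValueError("cases must not be empty")
    scores = {}
    for model in models:
        correct = 0
        for prompt, expected in cases:
            answer = complete(model, prompt)
            # Case-insensitive exact match after trimming whitespace.
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
        scores[model] = correct / len(cases)
    return scores


def gateway_complete(model: str, prompt: str) -> str:
    """Hypothetical adapter that sends one prompt through the gateway."""
    from openai import OpenAI  # imported lazily so run_eval works without the SDK

    client = OpenAI(
        base_url="https://api.aigateway.sh/v1",
        api_key=os.environ["AIGATEWAY_API_KEY"],  # assumed variable name
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""
```

Calling `run_eval(gateway_complete, cases)` with your own list of `(prompt, expected)` pairs returns a dictionary mapping each model string to its accuracy.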

SWITCH BETWEEN THEM

One key, both models, one line different.

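from openai import OpenAI

# One client for both models; only the model string changes per request.
client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

# Claude Opus 4.8
response = client.chat.completions.create(
    model="anthropic/claude-opus-4.8",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)

# Gemini 3.1 Pro
response = client.chat.completions.create(
    model="google/gemini-3.1-pro",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)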
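Because the two models list different maximum output sizes (128,000 vs 65,536 tokens), code that switches between them may want to clamp the requested output length per model. The following is a minimal sketch: `build_request` and `MAX_OUTPUT_TOKENS` are hypothetical names, the limits are copied from the table on this page, and it assumes the gateway honors the standard `max_tokens` chat-completions parameter.

```python
# Max output tokens per model, copied from the comparison table on this page.
MAX_OUTPUT_TOKENS = {
    "anthropic/claude-opus-4.8": 128_000,
    "google/gemini-3.1-pro": 65_536,
}


def build_request(model, prompt, max_tokens=1024):
    """Build chat-completions keyword arguments for one model (hypothetical helper).

    The requested max_tokens is clamped to the model's listed output limit so
    the same calling code can target either model string.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    limit = MAX_OUTPUT_TOKENS[model]
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": min(max_tokens, limit),
    }


# Usage with an OpenAI-compatible client object:
#   client.chat.completions.create(**build_request("google/gemini-3.1-pro", "hello"))
```

Keeping the limits in one dictionary means adding another model is a one-line change, at the cost of having to update the numbers by hand when the providers revise them.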