Pricing, context windows, capabilities, and release dates below are pulled from each provider's public docs. Both models are available through the same AIgateway OpenAI-compatible endpoint: swap the model string to switch. Stream them in parallel with your own AIgateway key and watch tokens, latency, and cost update as they arrive.
| | Claude Opus 4.7 `anthropic/claude-opus-4.7` | Llama-Guard-3-8b `meta/llama-guard-3-8b` |
|---|---|---|
| Provider | Anthropic | Meta |
| Family | Claude 4 | Llama Guard |
| Modality | text | moderation |
| Context window | 1,000,000 tok | 131,072 tok |
| Max output | 128,000 tok | 4,096 tok |
| Released | 2026-04-16 | 2025-01-22 |
| Input price | $5.00 /1M | $0.480 /1M |
| Output price | $25.00 /1M | $0.030 /1M |
| Cache read | $0.500 /1M | — |
| Tools | yes | — |
| Streaming | yes | yes |
| Vision | yes | — |
| JSON mode | yes | — |
| Reasoning | yes | — |
| Prompt caching | yes | — |
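The pricing rows above translate directly into per-request cost arithmetic. A quick sketch with the table's per-million-token rates hard-coded (verify against the providers' current pricing before relying on it):

```python
# USD per 1M tokens, taken from the pricing rows above.
PRICES = {
    "anthropic/claude-opus-4.7": {"input": 5.00, "output": 25.00},
    "meta/llama-guard-3-8b": {"input": 0.48, "output": 0.03},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost in USD (ignores cache-read discounts)."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

For example, a 10,000-token prompt with a 1,000-token reply on Opus 4.7 comes to about $0.075, while the same traffic through Llama Guard costs well under a cent.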
Claude Opus 4.7 is Anthropic's most capable generally available model to date. It is highly autonomous and performs exceptionally well on long-horizon agentic work, knowledge work, vision tasks, and memory tasks.
Llama Guard 3 is built on the pretrained Llama-3.1-8B model and fine-tuned for content safety classification. Like previous versions, it can classify both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM: it generates text indicating whether a given prompt or response is safe or unsafe and, if unsafe, which content categories were violated.
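Because the verdict arrives as generated text, you parse it yourself. A minimal sketch, assuming the common output shape (`safe`, or `unsafe` followed by a comma-separated list of category codes on the next line); check the model card for the exact format your deployment emits:

```python
def parse_guard_verdict(text: str) -> tuple[bool, list[str]]:
    """Parse Llama Guard's generated text into (is_safe, violated_categories)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        return False, []  # no verdict at all: fail closed
    if lines[0].lower() == "safe":
        return True, []
    # "unsafe" is followed by the violated category codes, e.g. "S1,S10".
    categories = lines[1].split(",") if len(lines) > 1 else []
    return False, [c.strip() for c in categories]
```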
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

# Try Claude Opus 4.7
client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[{"role": "user", "content": "hello"}],
)

# Try Llama-Guard-3-8b (same client, same key)
client.chat.completions.create(
    model="meta/llama-guard-3-8b",
    messages=[{"role": "user", "content": "hello"}],
)
```