Compare

Llama-3.2-1b-Instruct vs Llama-3.3-70b-Instruct-Fp8-Fast

Pricing per million tokens, context window, and capabilities, pulled from each provider's public docs. Both models are available through the same AIgateway OpenAI-compatible endpoint; change the model string to switch.

                 Llama-3.2-1b-Instruct        Llama-3.3-70b-Instruct-Fp8-Fast
Model ID         meta/llama-3.2-1b-instruct   meta/llama-3.3-70b-instruct-fp8-fast
Provider         Meta                         Meta
Family           Llama 3                      Llama 3
Modality         text                         text
Context window   128,000 tokens               131,072 tokens
Max output       4,096 tokens                 8,192 tokens
Released         2024-09-25                   2024-12-06
License          Open-weight                  Open-weight
Input price      $0.015 /1M tokens            $0.290 /1M tokens
Output price     $0.030 /1M tokens            $2.25 /1M tokens
Tools
yes
Streaming
yes
yes
Vision
JSON mode
yes
Reasoning
Prompt caching
Batch API
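
The prices and limits listed above can be turned into a quick cost and size check. The following is a minimal sketch, not part of AIgateway's SDK: the names `ModelSpec`, `SPECS`, `estimate_cost`, and `fits_limits` are hypothetical, and it assumes billing is linear per token and that the context window covers prompt and completion together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    context_window: int   # tokens
    max_output: int       # tokens
    input_price: float    # USD per 1M input tokens
    output_price: float   # USD per 1M output tokens


# Values copied from the comparison on this page.
SPECS = {
    "meta/llama-3.2-1b-instruct": ModelSpec(128_000, 4_096, 0.015, 0.030),
    "meta/llama-3.3-70b-instruct-fp8-fast": ModelSpec(131_072, 8_192, 0.290, 2.25),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request (assumes simple per-token billing)."""
    spec = SPECS[model]
    total = input_tokens * spec.input_price + output_tokens * spec.output_price
    return total / 1_000_000


def fits_limits(model: str, input_tokens: int, output_tokens: int) -> bool:
    """True if the request respects the max-output and context-window limits.

    Assumption: prompt and completion share the context window.
    """
    spec = SPECS[model]
    return (
        output_tokens <= spec.max_output
        and input_tokens + output_tokens <= spec.context_window
    )
```

For example, `estimate_cost("meta/llama-3.3-70b-instruct-fp8-fast", 10_000, 500)` gives a rough per-request figure to compare against the same call on the 1B model.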
Llama-3.2-1b-Instruct
meta/llama-3.2-1b-instruct

The Llama 3.2 instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

Strengths
  • General-purpose chat
  • Long context
  • Tool use
Use cases
  • Chatbots
  • Content generation
  • Agentic workflows
Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.3-70b-instruct-fp8-fast

Llama 3.3 70B quantized to FP8 precision and optimized for faster inference.

Strengths
  • Strong general-purpose open model
  • FP8-fast variant
  • Open-weight license
Use cases
  • Chat
  • RAG
  • Classification

Switch between them

One key, both models; only the model string changes.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

# Llama-3.2-1b-Instruct
client.chat.completions.create(
    model="meta/llama-3.2-1b-instruct",
    messages=[{"role":"user","content":"hello"}],
)

# Llama-3.3-70b-Instruct-Fp8-Fast
client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct-fp8-fast",
    messages=[{"role":"user","content":"hello"}],
)
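
To compare replies side by side, the two calls above can be wrapped in a small helper. This is a sketch under assumptions rather than an AIgateway feature: `ask` and `ask_all` are hypothetical names, and `client` is expected to be an OpenAI-compatible client like the one constructed above.

```python
from typing import Any, Iterable

MODELS = (
    "meta/llama-3.2-1b-instruct",
    "meta/llama-3.3-70b-instruct-fp8-fast",
)


def ask(client: Any, model: str, prompt: str) -> str:
    """Send one user message to `model` and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def ask_all(client: Any, prompt: str, models: Iterable[str] = MODELS) -> dict[str, str]:
    """Send the same prompt to each model; only the model string differs."""
    return {model: ask(client, model, prompt) for model in models}
```

With the client defined earlier, `ask_all(client, "hello")` returns a dictionary keyed by model ID, which makes it easy to print or diff the two answers.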