Compare

Llama-3.3-70b-Instruct-Fp8-Fast — and what?

Pricing per million tokens, context window, capabilities — pulled from each provider's public docs. All 1 are available via the same AIgateway OpenAI-compatible endpoint; flip the model string to switch.

Search1/4
Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.3-70b-instruct-fp8-fast
Provider
Meta
Family
Llama 3
Modality
text
Context window
131,072 tok
Max output
8,192 tok
Released
2024-12-06
License
Open-weight
Input price
$0.290 /1M
Output price
$2.25 /1M
Tools
yes
Streaming
yes
Vision
JSON mode
yes
Reasoning
Prompt caching
Batch API
Try it
Open in playground →
Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.3-70b-instruct-fp8-fast
Full spec →

Llama 3.3 70B quantized to fp8 precision, optimized to be faster.

Strengths
  • Strong general-purpose open model
  • FP8-fast variant
  • Open-weight license
Use cases
ChatRAGClassification

Compare with another

Llama-3.2-3b-Instruct vs Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.2-3b-instruct · meta/llama-3.3-70b-instruct-fp8-fast
Llama-3.1-8b-Instruct-Fp8 vs Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.1-8b-instruct-fp8 · meta/llama-3.3-70b-instruct-fp8-fast
Llama-3.2-1b-Instruct vs Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.2-1b-instruct · meta/llama-3.3-70b-instruct-fp8-fast