providers/Meta
Meta · Menlo Park, CA

Meta models on AIgateway — pricing, context, capabilities

Meta ships 22 models on AIgateway spanning image, moderation, object-detection, text, and translation. Call any of them via the OpenAI-compatible endpoint at api.aigateway.sh/v1 with one key. Pass-through inference pricing plus a 5% platform fee at credit top-up. No per-call markups, no seat fees, no minimum.

Get your key → · See pricing · Visit Meta
models · 22
modalities · image, moderation, object-detection, text, translation
location · Menlo Park, CA
image

Meta image models

1 image model from Meta.

DETR ResNet-50
facebook/detr-resnet-50
Meta's transformer-based object detector. Returns bounding boxes + labels.
$0.000 / image
moderation

Meta moderation models

1 moderation model from Meta.

Llama-Guard-3-8b
meta/llama-guard-3-8b
Llama Guard 3 is a Llama-3.1-8B pretrained model fine-tuned for content safety classification. Like previous versions, it can classify both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM itself: its output text indicates whether a given prompt or response is safe or unsafe, and if unsafe, lists the content categories violated.
$0.480 in · $0.030 out / 1M · 131,072 ctx
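The safe/unsafe verdict format described above lends itself to a small parser. A minimal sketch, assuming the response text follows the Llama Guard 3 convention of "safe", or "unsafe" followed by category codes (the helper name and the comma handling are our assumptions; the exact format is defined by the model card):

```python
def parse_guard_verdict(text: str) -> tuple[bool, list[str]]:
    """Parse a Llama Guard 3 completion into (is_safe, violated_categories).

    Assumes the model replies "safe", or "unsafe" followed by category
    codes (e.g. "S1", "S10") on subsequent lines, possibly comma-separated.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if lines and lines[0].lower() == "safe":
        return True, []
    # First line is "unsafe"; remaining lines list the violated categories.
    categories: list[str] = []
    for line in lines[1:]:
        categories.extend(c.strip() for c in line.split(",") if c.strip())
    return False, categories
```

For example, `parse_guard_verdict("unsafe\nS1,S10")` yields `(False, ["S1", "S10"])`, which you can branch on before forwarding a prompt to a downstream model.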
object-detection

Meta object-detection models

1 object-detection model from Meta.

DETR ResNet-50
facebook/detr-resnet-50
$0.000 / image
text

Meta text models

18 text models from Meta.

Llama-4-Scout-17b-16e-Instruct
★ featured
meta/llama-4-scout-17b-16e-instruct
Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.
$0.270 in · $0.850 out / 1M · 131,000 ctx
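Because Scout is natively multimodal, a single request can mix text and image content parts in the OpenAI message format. A minimal sketch of the payload shape, assuming the gateway accepts OpenAI-style `image_url` content parts (the function name and example URL are illustrative):

```python
def vision_request(prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload mixing text and image parts."""
    return {
        "model": "meta/llama-4-scout-17b-16e-instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

POST this JSON to /chat/completions exactly as you would a text-only request; only the `content` field changes shape.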
Llama-3.3-70b-Instruct-Fp8-Fast
meta/llama-3.3-70b-instruct-fp8-fast
Llama 3.3 70B quantized to fp8 precision, optimized to be faster.
$0.290 in · $2.25 out / 1M · 24,000 ctx
Llama-3.2-11b-Vision-Instruct
meta/llama-3.2-11b-vision-instruct
The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.
$0.049 in · $0.680 out / 1M · 128,000 ctx
Llama-3.2-1b-Instruct
meta/llama-3.2-1b-instruct
The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
$0.027 in · $0.200 out / 1M · 60,000 ctx
Llama-3.2-3b-Instruct
meta/llama-3.2-3b-instruct
The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
$0.051 in · $0.340 out / 1M · 80,000 ctx
Llama-3.1-8b-Instruct-Awq
meta/llama-3.1-8b-instruct-awq
Quantized (int4) generative text model with 8 billion parameters from Meta.
$0.120 in · $0.270 out / 1M · 8,192 ctx
Llama-3.1-8b-Instruct-Fp8
meta/llama-3.1-8b-instruct-fp8
Llama 3.1 8B quantized to FP8 precision.
$0.150 in · $0.290 out / 1M · 32,000 ctx
Llama-3.1-70b-Instruct
meta/llama-3.1-70b-instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
$0.290 in · $0.600 out / 1M · 131,072 ctx
Llama-3.1-8b-Instruct
meta/llama-3.1-8b-instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
$0.050 in · $0.100 out / 1M · 131,072 ctx
Llama-3.1-8b-Instruct-Fast
meta/llama-3.1-8b-instruct-fast
[Fast version] The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
$0.050 in · $0.100 out / 1M · 131,072 ctx
Llama-3-8b-Instruct-Awq
meta/llama-3-8b-instruct-awq
Quantized (int4) generative text model with 8 billion parameters from Meta.
$0.120 in · $0.270 out / 1M · 8,192 ctx
Llama-3-8b-Instruct
meta/llama-3-8b-instruct
Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.
$0.280 in · $0.830 out / 1M · 7,968 ctx
Llama-2-7b-Chat-HF-Lora
meta-llama/llama-2-7b-chat-hf-lora
This is a Llama 2 base model that Cloudflare has dedicated to inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.
$0.040 in · $0.080 out / 1M · 8,192 ctx
Bart-Large-CNN
facebook/bart-large-cnn
BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. You can use this model for text summarization.
see pricing
Llama-2-7b-Chat-Fp16
meta/llama-2-7b-chat-fp16
Full precision (fp16) generative text model with 7 billion parameters from Meta.
$0.560 in · $6.67 out / 1M · 4,096 ctx
Llama-2-7b-Chat-Int8
meta/llama-2-7b-chat-int8
Quantized (int8) generative text model with 7 billion parameters from Meta.
$0.040 in · $0.080 out / 1M · 8,192 ctx
BART Large CNN
facebook/bart-large-cnn
Classic BART fine-tuned on CNN/DailyMail. Cheap summarization workhorse.
$0.050 in · $0.100 out / 1M
M2M100 1.2B
meta/m2m100-1.2b
Many-to-many translation across 100 languages without pivoting through English.
$0.021 in · $0.042 out / 1M
translation

Meta translation models

1 translation model from Meta.

M2m100-1.2b
meta/m2m100-1.2b
Multilingual encoder-decoder (seq-to-seq) model trained for many-to-many multilingual translation.
$0.340 in · $0.340 out / 1M
About Meta

Who they are, what they focus on

Meta AI ships the Llama family — the most-used open-weight LLMs in production. Llama 4 Scout (MoE, 17B active) is the current flagship; Llama 3.3 70B is the reliable workhorse. Llama Guard handles moderation.

Headquartered in Menlo Park, CA. Homepage: ai.meta.com.

FAQ

Common questions about Meta on AIgateway

Which Meta models does AIgateway support?
AIgateway routes 22 Meta models including Llama-4-Scout-17b-16e-Instruct. Full catalog with pricing and context windows is in the sections above.
How do I call a Meta model from my code?
Point the OpenAI SDK at https://api.aigateway.sh/v1 with your AIgateway key and set model to the Meta slug (e.g. "meta/llama-4-scout-17b-16e-instruct"). Request and response shapes are identical to OpenAI's.
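Since the endpoint speaks plain OpenAI-shaped HTTP, the call can also be made with nothing but the standard library. A minimal sketch (with the OpenAI SDK, the equivalent is constructing the client with `base_url="https://api.aigateway.sh/v1"`; the function name here is ours):

```python
import json
import urllib.request

API_BASE = "https://api.aigateway.sh/v1"

def chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for AIgateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send, with a real key:
# req = chat_request(key, "meta/llama-4-scout-17b-16e-instruct", "Hello")
# with urllib.request.urlopen(req) as r:
#     print(json.load(r)["choices"][0]["message"]["content"])
```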
How much do Meta models cost on AIgateway?
Pass-through Meta pricing plus a 5% platform fee applied at credit top-up, not per call. No seat fees, no minimum beyond the $5 top-up floor.
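Reading the fee as charged on top of the credited amount at top-up (our assumption; the dashboard is authoritative), the effective per-call cost is simply the pass-through token price grossed up by 5%:

```python
def effective_cost_usd(input_tokens: int, output_tokens: int,
                       in_per_1m: float, out_per_1m: float,
                       platform_fee: float = 0.05) -> float:
    """Cash cost of one call: pass-through token pricing, grossed up by the
    platform fee paid when the credits were purchased (assumed 5% on top)."""
    passthrough = (input_tokens / 1e6) * in_per_1m + (output_tokens / 1e6) * out_per_1m
    return passthrough * (1 + platform_fee)

# e.g. 1M in + 1M out on Llama-3.1-8b-Instruct ($0.05 in / $0.10 out per 1M):
# effective_cost_usd(1_000_000, 1_000_000, 0.05, 0.10) is about $0.1575
```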
Can I bring my own Meta API key (BYOK)?
Yes. Attach your Meta key in the AIgateway dashboard. Calls to Meta models flip to pass-through and AIgateway waives the 5% platform fee on those calls.
Where is Meta based?
Meta is headquartered in Menlo Park, CA.
Is there a free tier?
AIgateway's free tier is 100 requests/day on Kimi K2.6 — any account can test without a card. Paid Meta models require a $5 minimum credit top-up.
Other providers

Browse other labs

Anthropic · OpenAI · Google · xAI · Moonshot · DeepSeek · Mistral · Alibaba · Deepgram · Black Forest Labs · BAAI · All providers →