# AIgateway — full model catalog

> Every model routed through AIgateway, with price, context, capabilities, and release date.
> Same endpoint (https://api.aigateway.sh/v1) for all of them. Same auth. OpenAI-compatible request shape.
> Generated 2026-04-24T10:23:38.501Z · catalog size 169

## Comparison matrix — frontier models

| Model | Context | Input / 1M | Output / 1M | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|
| anthropic/claude-opus-4.7 | 1M | $5 | $25 | yes | yes | yes |
| anthropic/claude-sonnet-4.6 | 200K | $3 | $15 | yes | yes | yes |
| anthropic/claude-haiku-4.5 | 200K | $1 | $5 | yes | yes | no |
| openai/gpt-5.4 | 400K | $2.5 | $15 | yes | yes | yes |
| openai/gpt-5.4-mini | 256K | $0.75 | $4.5 | yes | yes | no |
| google/gemini-3.1-pro | 2M | $2 | $12 | yes | no | no |
| google/gemini-3-flash | 1M | $0.5 | $3 | yes | no | no |
| moonshot/kimi-k2.6 | 256K | $0.95 | $4 | yes | yes | yes |

## Pricing table (all catalog models)

| Slug | Modality | Input / 1M | Output / 1M |
|---|---|---|---|
| anthropic/claude-opus-4.7 | text | $5.00 | $25.00 |
| openai/gpt-5.4 | text | $2.50 | $15.00 |
| google/gemini-3.1-pro | text | $2.00 | $12.00 |
| moonshot/kimi-k2.6 | text | $0.950 | $4.00 |
| anthropic/claude-sonnet-4.6 | text | $3.00 | $15.00 |
| minimax/m2.7 | text | $0.300 | $1.20 |
| openai/gpt-5.4-mini | text | $0.750 | $4.50 |
| xai/grok-4 | text | $5.00 | $15.00 |
| google/gemini-3-flash | text | $0.500 | $3.00 |
| anthropic/claude-haiku-4.5 | text | $1.00 | $5.00 |
| black-forest-labs/flux-2-klein-9b | image | $0.0400/img | — |
| google/imagen-4 | image | $0.0400/img | — |
| google/veo-3.1 | video | $0.400/sec | — |
| deepgram/nova-3 | audio-stt | $0.0052/min | — |
| deepgram/aura-2-en | audio-tts | $0.0300/1K-chars | — |
| baai/bge-m3 | embedding | $0.012 | — |
| meta/llama-4-scout-17b-16e-instruct | text | $0.270 | $0.850 |
| openai/gpt-oss-120b | text | $0.350 | $0.750 |
| qwen/qwen2.5-coder-32b-instruct | text | $0.660 | $1.00 |
| alibaba/qwen3-max | text | $1.20 | $6.00 |
| alibaba/qwen3.5-397b-a17b | text | $0.600 | $3.60 |
| pixverse/v5.6 | video | $0.080/sec | — |
| pixverse/v6 | video | $0.025/sec | — |
| vidu/q3-pro | video | $0.050/sec | — |
| vidu/q3-turbo | video | $0.040/sec | — |
| alibaba/wan-2.6-image | image | $0.0300/img | — |
| openai/gpt-image-1.5 | image | $5.00 | $10.00 |
| runwayml/gen-4.5 | video | $0.120/sec | — |
| minimax/music-2.6 | music | $0.150/sec | — |
| assemblyai/universal-3-pro | audio-stt | $0.0035/min | — |
| anthropic/claude-opus-4.6 | text | $5.00 | $25.00 |
| anthropic/claude-sonnet-4 | text | $3.00 | $15.00 |
| anthropic/claude-sonnet-4.5 | text | $3.00 | $15.00 |
| google/gemini-3.1-flash-lite | text | $0.250 | $1.50 |
| openai/gpt-4.1 | text | $2.00 | $8.00 |
| openai/gpt-4.1-mini | text | $0.400 | $1.60 |
| openai/gpt-5 | text | $1.25 | $10.00 |
| openai/gpt-5.4-nano | text | $0.200 | $1.25 |
| recraft/recraftv4 | image | $0.0400/img | — |
| recraft/recraftv4-pro | image | $0.2500/img | — |
| recraft/recraftv4-pro-vector | image | $0.3000/img | — |
| recraft/recraftv4-vector | image | $0.0800/img | — |
| inworld/tts-1.5-max | audio-tts | $0.0500/1K-chars | — |
| inworld/tts-1.5-mini | audio-tts | $0.0250/1K-chars | — |
| minimax/speech-2.8-hd | audio-tts | $0.1000/1K-chars | — |
| minimax/speech-2.8-turbo | audio-tts | $0.0600/1K-chars | — |
| openai/tts-1 | audio-tts | $0.0150/1K-chars | — |
| minimax/hailuo-2.3 | video | $0.047/sec | — |
| openai/gpt-4o-transcribe | stt | $0.0060/min | — |
| openai/o4-mini | text | $1.10 | $4.40 |
| openai/tts-1-hd | tts | $0.0300/1K-chars | — |
| minimax/hailuo-2.3-fast | video | $0.032/sec | — |
| moonshot/kimi-k2.5 | text | $0.600 | $3.00 |
| bytedance/seedream-4.0 | image | $0.0300/img | — |
| bytedance/seedream-4.5 | image | $0.0400/img | — |
| bytedance/seedream-5-lite | image | $0.0350/img | — |
| google/nano-banana | image | $0.300 | $30.00 |
| google/nano-banana-2 | image | $0.500 | $60.00 |
| google/nano-banana-pro | image | $2.00 | $120.00 |
| google/veo-3 | video | $0.200/sec | — |
| google/veo-3-fast | video | $0.080/sec | — |
| google/veo-3.1-fast | video | $0.080/sec | — |
| google/gemma-4-26b-a4b-it | text | $0.100 | $0.300 |
| cohere/cohere-transcribe-03-2026 | audio-stt | $0.0060/min | — |
| mistral/voxtral-tts-26-03 | audio-tts | $12.0000/1K-chars | — |
| mistral/mistral-small-4-0-26-03 | text | $0.200 | $0.600 |
| nvidia/nemotron-3-120b-a12b | text | $0.500 | $1.50 |
| mistral/voxtral-mini-transcribe-realtime-26-02 | audio-stt | $0.0080/min | — |
| mistral/voxtral-mini-transcribe-26-02 | audio-stt | $0.0060/min | — |
| zai-org/glm-4.7-flash | text | $0.060 | $0.400 |
| black-forest-labs/flux-2-klein-4b | image | $0.0000/img | — |
| black-forest-labs/flux-2-dev | image | $0.0000/img | — |
| cartesia/sonic-3 | audio-tts | $30.0000/1K-chars | — |
| deepgram/aura-2-es | audio-tts | $0.0300/1K-chars | — |
| ibm-granite/granite-4.0-h-micro | text | $0.017 | $0.110 |
| deepgram/flux | audio-stt | $0.0077/min | — |
| pfnet/plamo-embedding-1b | embedding | $0.019 | — |
| aisingapore/gemma-sea-lion-v4-27b-it | text | $0.350 | $0.560 |
| ai4bharat/indictrans2-en-indic-1B | translation | $0.340 | $0.340 |
| xai/grok-4-fast | text | $0.500 | $2.00 |
| google/embeddinggemma-300m | embedding | $0.020 | — |
| deepgram/aura-1 | audio-tts | $0.0150/1K-chars | — |
| leonardo/lucid-origin | image | $0.0039/img | — |
| leonardo/phoenix-1.0 | image | $0.0033/img | — |
| openai/gpt-oss-20b | text | $0.200 | $0.300 |
| pipecat-ai/smart-turn-v2 | classification | $0.0003/min | — |
| qwen/qwen3-embedding-0.6b | embedding | $0.012 | — |
| qwen/qwen3-30b-a3b-fp8 | text | $0.051 | $0.340 |
| google/gemma-3-12b-it | text | $0.350 | $0.560 |
| mistralai/mistral-small-3.1-24b-instruct | text | $0.350 | $0.550 |
| perplexity/sonar-reasoning-pro | text | $2.00 | $8.00 |
| qwen/qwq-32b | reasoning | $0.200 | $0.400 |
| perplexity/sonar-deep-research | text | $2.00 | $8.00 |
| baai/bge-reranker-base | reranking | $0.003 | — |
| deepseek/deepseek-r1-distill-qwen-32b | reasoning | $0.500 | $4.88 |
| meta/llama-guard-3-8b | moderation | $0.480 | $0.030 |
| meta/llama-3.3-70b-instruct-fp8-fast | text | $0.290 | $2.25 |
| meta/llama-3.2-11b-vision-instruct | text | $0.049 | $0.680 |
| meta/llama-3.2-1b-instruct | text | $0.027 | $0.200 |
| meta/llama-3.2-3b-instruct | text | $0.051 | $0.340 |
| black-forest-labs/flux-1-schnell | image | $0.0004/img | — |
| meta/llama-3.1-8b-instruct-awq | text | $0.120 | $0.270 |
| meta/llama-3.1-8b-instruct-fp8 | text | $0.150 | $0.290 |
| meta/llama-3.1-70b-instruct | text | $0.290 | $0.600 |
| meta/llama-3.1-8b-instruct | text | $0.050 | $0.100 |
| meta/llama-3.1-8b-instruct-fast | text | $0.050 | $0.100 |
| myshell-ai/melotts | audio-tts | $0.0002/min | — |
| openai/whisper-large-v3-turbo | audio-stt | $0.0005/min | — |
| meta/llama-3-8b-instruct-awq | text | $0.120 | $0.270 |
| openai/whisper-tiny-en | audio-stt | $0.0005/min | — |
| meta/llama-3-8b-instruct | text | $0.280 | $0.830 |
| hf/meta-llama/meta-llama-3-8b-instruct | text | $0.050 | $0.100 |
| google/gemma-2b-it-lora | text | $0.030 | $0.060 |
| google/gemma-7b-it-lora | text | $0.080 | $0.160 |
| meta-llama/llama-2-7b-chat-hf-lora | text | $0.040 | $0.080 |
| hf/mistral/mistral-7b-instruct-v0.2 | text | $0.050 | $0.100 |
| mistral/mistral-7b-instruct-v0.2-lora | text | $0.050 | $0.100 |
| hf/google/gemma-7b-it | text | $0.080 | $0.160 |
| hf/nousresearch/hermes-2-pro-mistral-7b | text | $0.050 | $0.100 |
| hf/nexusflow/starling-lm-7b-beta | text | $0.050 | $0.100 |
| unum/uform-gen2-qwen-500m | vision | $0.0000/img | — |
| deepseek/deepseek-math-7b-instruct | text | — | — |
| defog/sqlcoder-7b-2 | text | $0.050 | $0.100 |
| microsoft/phi-2 | text | $0.020 | $0.040 |
| bytedance/stable-diffusion-xl-lightning | image | $0.0000/img | — |
| lykon/dreamshaper-8-lcm | image | $0.0008/img | — |
| facebook/bart-large-cnn | text | — | — |
| runwayml/stable-diffusion-v1-5-img2img | image | $0.0000/img | — |
| runwayml/stable-diffusion-v1-5-inpainting | image | $0.0000/img | — |
| qwen/qwen1.5-0.5b-chat | text | $0.010 | $0.020 |
| qwen/qwen1.5-1.8b-chat | text | $0.020 | $0.040 |
| qwen/qwen1.5-14b-chat-awq | text | $0.120 | $0.240 |
| qwen/qwen1.5-7b-chat-awq | text | $0.060 | $0.120 |
| thebloke/discolm-german-7b-v1-awq | text | $0.050 | $0.100 |
| openchat/openchat-3.5-0106 | text | $0.050 | $0.100 |
| tinyllama/tinyllama-1.1b-chat-v1.0 | text | $0.008 | $0.016 |
| hf/thebloke/llamaguard-7b-awq | text | $0.040 | $0.080 |
| fblgit/una-cybertron-7b-v2-bf16 | text | $0.050 | $0.100 |
| hf/thebloke/neural-chat-7b-v3-1-awq | text | $0.050 | $0.100 |
| stabilityai/stable-diffusion-xl-base-1.0 | image | $0.0000/img | — |
| baai/bge-large-en-v1.5 | embedding | $0.204 | — |
| baai/bge-small-en-v1.5 | embedding | $0.020 | — |
| meta/llama-2-7b-chat-fp16 | text | $0.560 | $6.67 |
| mistral/mistral-7b-instruct-v0.1 | text | $0.110 | $0.190 |
| hf/thebloke/deepseek-coder-6.7b-base-awq | text | $0.050 | $0.100 |
| hf/thebloke/deepseek-coder-6.7b-instruct-awq | text | $0.050 | $0.100 |
| hf/thebloke/openhermes-2.5-mistral-7b-awq | text | $0.050 | $0.100 |
| hf/thebloke/zephyr-7b-beta-awq | text | $0.050 | $0.100 |
| llava-hf/llava-1.5-7b-hf | vision | $0.500 | $0.000 |
| hf/thebloke/mistral-7b-instruct-v0.1-awq | text | $0.050 | $0.100 |
| openai/whisper | audio-stt | $0.0005/min | — |
| baai/bge-base-en-v1.5 | embedding | $0.067 | — |
| meta/llama-2-7b-chat-int8 | text | $0.040 | $0.080 |
| microsoft/resnet-50 | classification | $0.0000/img | — |
| huggingface/distilbert-sst-2-int8 | classification | $0.026 | — |
| meta/m2m100-1.2b | translation | $0.340 | $0.340 |
| hf/thebloke/llama-2-13b-chat-awq | text | $0.070 | $0.140 |
| tiiuae/falcon-7b-instruct | text | $0.050 | $0.100 |
| facebook/detr-resnet-50 | object-detection | $0.0000/img | — |
| ai4bharat/indictrans2-en-indic-1B | text | $0.021 | $0.042 |
| baai/bge-reranker-base | rerank | — | — |
| facebook/bart-large-cnn | text | $0.050 | $0.100 |
| facebook/detr-resnet-50 | image | $0.0002/img | — |
| huggingface/distilbert-sst-2-int8 | text | — | — |
| meta/m2m100-1.2b | text | $0.021 | $0.042 |
| microsoft/resnet-50 | image | $0.0002/img | — |
| minimax/music-2.6 | audio-tts | $0.052/sec | — |
| mistral/mistral-moderation-26-03 | moderation | — | — |
| pipecat-ai/smart-turn-v2 | audio-stt | $0.0021/min | — |

## Start here (top picks across modalities)

If you're an agent picking a default, use this ordered list. Top-1 is the model we'd pick for a general-purpose, quality-first task; further down trades quality for cost or speed.

- anthropic/claude-opus-4.7 [text] Claude Opus 4.7 · Anthropic released=2026-04-16 ctx=1000000 max_out=128000 caps=[tools,streaming,vision,json,caching,reasoning] in=$5.00/Mtok out=$25.00/Mtok cache-read=$0.500/Mtok
- openai/gpt-5.4 [text] GPT-5.4 · OpenAI released=2026-04-08 ctx=128000 max_out=16384 caps=[tools,streaming,vision,json,caching,reasoning,batch] in=$2.50/Mtok out=$15.00/Mtok cache-read=$0.250/Mtok
- google/gemini-3.1-pro [text] Gemini 3.1 Pro · Google released=2026-04-13 ctx=1000000 max_out=65536 caps=[tools,streaming,vision,json,caching,reasoning] in=$2.00/Mtok out=$12.00/Mtok cache-read=$0.200/Mtok
- moonshot/kimi-k2.6 [text] Kimi K2.6 · Moonshot released=2026-04-20 ctx=262144 max_out=16384 caps=[tools,streaming,vision,json,caching,reasoning] in=$0.950/Mtok out=$4.00/Mtok cache-read=$0.160/Mtok
- anthropic/claude-sonnet-4.6 [text] Claude Sonnet 4.6 · Anthropic released=2026-04-13 ctx=200000 max_out=128000 caps=[tools,streaming,vision,json,caching,reasoning] in=$3.00/Mtok out=$15.00/Mtok cache-read=$0.300/Mtok
- minimax/m2.7 [text] M2.7 · MiniMax released=2026-04-13 ctx=128000 max_out=4096 caps=[streaming] in=$0.300/Mtok out=$1.20/Mtok
- openai/gpt-5.4-mini [text] GPT-5.4 Mini · OpenAI released=2026-04-13 ctx=128000 max_out=16384
caps=[tools,streaming,vision,json,caching,reasoning,batch] in=$0.750/Mtok out=$4.50/Mtok cache-read=$0.075/Mtok
- xai/grok-4 [text] Grok 4 · xAI released=2025-07-09 ctx=256000 max_out=16384 caps=[tools,streaming,vision,json,reasoning] in=$5.00/Mtok out=$15.00/Mtok
- google/gemini-3-flash [text] Gemini 3 Flash · Google released=2026-04-13 ctx=1000000 max_out=8192 caps=[tools,streaming,vision,json,caching,reasoning] in=$0.500/Mtok out=$3.00/Mtok cache-read=$0.050/Mtok
- anthropic/claude-haiku-4.5 [text] Claude Haiku 4.5 · Anthropic released=2026-04-13 ctx=200000 max_out=8192 caps=[tools,streaming,vision,json,caching,reasoning] in=$1.00/Mtok out=$5.00/Mtok cache-read=$0.100/Mtok

## Full catalog — by modality

### text (86)

- anthropic/claude-opus-4.7 [text] Claude Opus 4.7 · Anthropic released=2026-04-16 ctx=1000000 max_out=128000 caps=[tools,streaming,vision,json,caching,reasoning] in=$5.00/Mtok out=$25.00/Mtok cache-read=$0.500/Mtok
- openai/gpt-5.4 [text] GPT-5.4 · OpenAI released=2026-04-08 ctx=128000 max_out=16384 caps=[tools,streaming,vision,json,caching,reasoning,batch] in=$2.50/Mtok out=$15.00/Mtok cache-read=$0.250/Mtok
- google/gemini-3.1-pro [text] Gemini 3.1 Pro · Google released=2026-04-13 ctx=1000000 max_out=65536 caps=[tools,streaming,vision,json,caching,reasoning] in=$2.00/Mtok out=$12.00/Mtok cache-read=$0.200/Mtok
- moonshot/kimi-k2.6 [text] Kimi K2.6 · Moonshot released=2026-04-20 ctx=262144 max_out=16384 caps=[tools,streaming,vision,json,caching,reasoning] in=$0.950/Mtok out=$4.00/Mtok cache-read=$0.160/Mtok
- anthropic/claude-sonnet-4.6 [text] Claude Sonnet 4.6 · Anthropic released=2026-04-13 ctx=200000 max_out=128000 caps=[tools,streaming,vision,json,caching,reasoning] in=$3.00/Mtok out=$15.00/Mtok cache-read=$0.300/Mtok
- minimax/m2.7 [text] M2.7 · MiniMax released=2026-04-13 ctx=128000 max_out=4096 caps=[streaming] in=$0.300/Mtok out=$1.20/Mtok
- openai/gpt-5.4-mini [text] GPT-5.4 Mini · OpenAI released=2026-04-13 ctx=128000 max_out=16384
caps=[tools,streaming,vision,json,caching,reasoning,batch] in=$0.750/Mtok out=$4.50/Mtok cache-read=$0.075/Mtok
- xai/grok-4 [text] Grok 4 · xAI released=2025-07-09 ctx=256000 max_out=16384 caps=[tools,streaming,vision,json,reasoning] in=$5.00/Mtok out=$15.00/Mtok
- google/gemini-3-flash [text] Gemini 3 Flash · Google released=2026-04-13 ctx=1000000 max_out=8192 caps=[tools,streaming,vision,json,caching,reasoning] in=$0.500/Mtok out=$3.00/Mtok cache-read=$0.050/Mtok
- anthropic/claude-haiku-4.5 [text] Claude Haiku 4.5 · Anthropic released=2026-04-13 ctx=200000 max_out=8192 caps=[tools,streaming,vision,json,caching,reasoning] in=$1.00/Mtok out=$5.00/Mtok cache-read=$0.100/Mtok
- meta/llama-4-scout-17b-16e-instruct [text] Llama-4-Scout-17b-16e-Instruct · Meta released=2025-04-05 ctx=131000 max_out=4096 caps=[tools,streaming,vision,json] in=$0.270/Mtok out=$0.850/Mtok
- openai/gpt-oss-120b [text] Gpt-Oss-120b · OpenAI released=2025-08-05 ctx=128000 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.350/Mtok out=$0.750/Mtok
- qwen/qwen2.5-coder-32b-instruct [text] Qwen2.5-Coder-32b-Instruct · Alibaba Qwen released=2025-02-27 ctx=32768 max_out=4096 caps=[tools,streaming,json] in=$0.660/Mtok out=$1.00/Mtok
- alibaba/qwen3-max [text] Qwen 3 Max · Alibaba released=2026-04-15 ctx=262144 max_out=4096 caps=[streaming,reasoning] in=$1.20/Mtok out=$6.00/Mtok
- alibaba/qwen3.5-397b-a17b [text] Qwen 3.5 397B A17B · Alibaba released=2026-04-15 ctx=262144 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.600/Mtok out=$3.60/Mtok
- anthropic/claude-opus-4.6 [text] Claude Opus 4.6 · Anthropic released=2026-04-13 ctx=1000000 max_out=128000 caps=[tools,streaming,vision,json,caching,reasoning] in=$5.00/Mtok out=$25.00/Mtok cache-read=$0.500/Mtok
- anthropic/claude-sonnet-4 [text] Claude Sonnet 4 · Anthropic released=2026-04-13 ctx=200000 max_out=16000 caps=[tools,streaming,json,reasoning] in=$3.00/Mtok out=$15.00/Mtok
- anthropic/claude-sonnet-4.5 [text] Claude Sonnet 4.5 ·
Anthropic released=2026-04-13 ctx=200000 max_out=8192 caps=[tools,streaming,json,reasoning] in=$3.00/Mtok out=$15.00/Mtok
- google/gemini-3.1-flash-lite [text] Gemini 3.1 Flash Lite · Google released=2026-04-13 ctx=1000000 max_out=8192 caps=[tools,streaming,vision,json,caching] in=$0.250/Mtok out=$1.50/Mtok cache-read=$0.025/Mtok
- openai/gpt-4.1 [text] GPT-4.1 · OpenAI released=2026-04-13 ctx=1047576 max_out=32768 caps=[tools,streaming,json,reasoning] in=$2.00/Mtok out=$8.00/Mtok
- openai/gpt-4.1-mini [text] GPT-4.1 Mini · OpenAI released=2026-04-13 ctx=1047576 max_out=32768 caps=[tools,streaming,json] in=$0.400/Mtok out=$1.60/Mtok
- openai/gpt-5 [text] GPT-5 · OpenAI released=2026-04-13 ctx=128000 max_out=16384 caps=[tools,streaming,json,reasoning] in=$1.25/Mtok out=$10.00/Mtok
- openai/gpt-5.4-nano [text] GPT-5.4 Nano · OpenAI released=2026-04-13 ctx=128000 max_out=16384 caps=[tools,streaming,json,reasoning,batch] in=$0.200/Mtok out=$1.25/Mtok cache-read=$0.020/Mtok
- openai/o4-mini [text] o4-mini · openai released=2026-04-13 ctx=200000 max_out=100000 caps=[streaming,reasoning] in=$1.10/Mtok out=$4.40/Mtok
- moonshot/kimi-k2.5 [text] Kimi-K2.5 · Moonshot released=2026-04-08 ctx=128000 max_out=8192 caps=[tools,streaming,vision,json,reasoning] in=$0.600/Mtok out=$3.00/Mtok
- google/gemma-4-26b-a4b-it [text] Gemma-4-26b-A4b-IT · Google released=2026-04-02 ctx=256000 max_out=4096 caps=[tools,streaming,vision,json,reasoning] in=$0.100/Mtok out=$0.300/Mtok
- mistral/mistral-small-4-0-26-03 [text] Mistral Small 4 · Mistral released=2026-03-01 ctx=131072 max_out=16384 caps=[tools,streaming,vision,json] in=$0.200/Mtok out=$0.600/Mtok
- nvidia/nemotron-3-120b-a12b [text] Nemotron-3-120b-A12b · NVIDIA released=2026-02-24 ctx=256000 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.500/Mtok out=$1.50/Mtok
- zai-org/glm-4.7-flash [text] Glm-4.7-Flash · Zhipu AI released=2026-01-28 ctx=131072 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.060/Mtok
out=$0.400/Mtok
- ibm-granite/granite-4.0-h-micro [text] Granite-4.0-H-Micro · IBM released=2025-10-07 ctx=131000 max_out=4096 caps=[tools,streaming,json] in=$0.017/Mtok out=$0.110/Mtok
- aisingapore/gemma-sea-lion-v4-27b-it [text] Gemma-Sea-Lion-V4-27b-IT · AI Singapore released=2025-09-23 ctx=128000 max_out=4096 caps=[streaming] in=$0.350/Mtok out=$0.560/Mtok
- xai/grok-4-fast [text] Grok 4 Fast · xAI released=2025-09-19 ctx=256000 max_out=16384 caps=[tools,streaming,json] in=$0.500/Mtok out=$2.00/Mtok
- openai/gpt-oss-20b [text] Gpt-Oss-20b · OpenAI released=2025-08-05 ctx=128000 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.200/Mtok out=$0.300/Mtok
- qwen/qwen3-30b-a3b-fp8 [text] Qwen3-30b-A3b-Fp8 · Alibaba Qwen released=2025-04-30 ctx=32768 max_out=4096 caps=[tools,streaming,json,reasoning] in=$0.051/Mtok out=$0.340/Mtok
- google/gemma-3-12b-it [text] Gemma-3-12b-IT · Google released=2025-03-18 ctx=80000 max_out=4096 caps=[streaming,vision] in=$0.350/Mtok out=$0.560/Mtok
- mistralai/mistral-small-3.1-24b-instruct [text] Mistral-Small-3.1-24b-Instruct · Mistral released=2025-03-18 ctx=128000 max_out=4096 caps=[tools,streaming,vision,json] in=$0.350/Mtok out=$0.550/Mtok
- perplexity/sonar-reasoning-pro [text] Sonar Reasoning Pro · Perplexity released=2025-03-07 ctx=127000 max_out=8192 caps=[streaming,reasoning] in=$2.00/Mtok out=$8.00/Mtok
- perplexity/sonar-deep-research [text] Sonar Deep Research · Perplexity released=2025-02-14 ctx=127000 max_out=16384 caps=[streaming,reasoning] in=$2.00/Mtok out=$8.00/Mtok
- meta/llama-3.3-70b-instruct-fp8-fast [text] Llama-3.3-70b-Instruct-Fp8-Fast · Meta released=2024-12-06 ctx=24000 max_out=8192 caps=[tools,streaming,json] in=$0.290/Mtok out=$2.25/Mtok
- meta/llama-3.2-11b-vision-instruct [text] Llama-3.2-11b-Vision-Instruct · Meta released=2024-09-25 ctx=128000 max_out=4096 caps=[streaming,vision] in=$0.049/Mtok out=$0.680/Mtok
- meta/llama-3.2-1b-instruct [text] Llama-3.2-1b-Instruct · Meta
released=2024-09-25 ctx=60000 max_out=4096 caps=[streaming] in=$0.027/Mtok out=$0.200/Mtok
- meta/llama-3.2-3b-instruct [text] Llama-3.2-3b-Instruct · Meta released=2024-09-25 ctx=80000 max_out=4096 caps=[streaming] in=$0.051/Mtok out=$0.340/Mtok
- meta/llama-3.1-8b-instruct-awq [text] Llama-3.1-8b-Instruct-Awq · Meta released=2024-07-25 ctx=8192 max_out=4096 caps=[streaming] in=$0.120/Mtok out=$0.270/Mtok
- meta/llama-3.1-8b-instruct-fp8 [text] Llama-3.1-8b-Instruct-Fp8 · Meta released=2024-07-25 ctx=32000 max_out=4096 caps=[streaming] in=$0.150/Mtok out=$0.290/Mtok
- meta/llama-3.1-70b-instruct [text] Llama-3.1-70b-Instruct · Meta released=2024-07-23 ctx=131072 max_out=8192 caps=[streaming] in=$0.290/Mtok out=$0.600/Mtok
- meta/llama-3.1-8b-instruct [text] Llama-3.1-8b-Instruct · Meta released=2024-07-23 ctx=131072 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- meta/llama-3.1-8b-instruct-fast [text] Llama-3.1-8b-Instruct-Fast · Meta released=2024-07-23 ctx=131072 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- meta/llama-3-8b-instruct-awq [text] Llama-3-8b-Instruct-Awq · Meta released=2024-05-09 ctx=8192 max_out=4096 caps=[streaming] in=$0.120/Mtok out=$0.270/Mtok
- meta/llama-3-8b-instruct [text] Llama-3-8b-Instruct · Meta released=2024-04-18 ctx=7968 max_out=4096 caps=[streaming] in=$0.280/Mtok out=$0.830/Mtok
- hf/meta-llama/meta-llama-3-8b-instruct [text] Meta-Llama-3-8b-Instruct · Hugging Face released=2024-04-18 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- google/gemma-2b-it-lora [text] Gemma-2b-IT-Lora · Google released=2024-04-02 ctx=8192 max_out=4096 caps=[streaming] in=$0.030/Mtok out=$0.060/Mtok
- google/gemma-7b-it-lora [text] Gemma-7b-IT-Lora · Google released=2024-04-02 ctx=3500 max_out=4096 caps=[streaming] in=$0.080/Mtok out=$0.160/Mtok
- meta-llama/llama-2-7b-chat-hf-lora [text] Llama-2-7b-Chat-HF-Lora · Meta released=2024-04-02 ctx=8192 max_out=4096 caps=[streaming] in=$0.040/Mtok
out=$0.080/Mtok
- hf/mistral/mistral-7b-instruct-v0.2 [text] Mistral-7b-Instruct-V0.2 · Hugging Face released=2024-04-02 ctx=3072 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- mistral/mistral-7b-instruct-v0.2-lora [text] Mistral-7b-Instruct-V0.2-Lora · Mistral released=2024-04-01 ctx=15000 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/google/gemma-7b-it [text] Gemma-7b-IT · Hugging Face released=2024-04-01 ctx=8192 max_out=4096 caps=[streaming] in=$0.080/Mtok out=$0.160/Mtok
- hf/nousresearch/hermes-2-pro-mistral-7b [text] Hermes-2-Pro-Mistral-7b · Hugging Face released=2024-04-01 ctx=24000 max_out=4096 caps=[tools,streaming,json] in=$0.050/Mtok out=$0.100/Mtok
- hf/nexusflow/starling-lm-7b-beta [text] Starling-LM-7b-Beta · Hugging Face released=2024-03-19 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- deepseek/deepseek-math-7b-instruct [text] Deepseek-Math-7b-Instruct · DeepSeek released=2024-02-27 ctx=4096 max_out=4096 caps=[streaming]
- defog/sqlcoder-7b-2 [text] Sqlcoder-7b-2 · Defog released=2024-02-27 ctx=10000 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- microsoft/phi-2 [text] Phi-2 · Microsoft released=2024-02-27 ctx=2048 max_out=4096 caps=[streaming] in=$0.020/Mtok out=$0.040/Mtok
- facebook/bart-large-cnn [text] Bart-Large-CNN · Meta released=2024-02-27
- qwen/qwen1.5-0.5b-chat [text] Qwen1.5-0.5b-Chat · Alibaba Qwen released=2024-02-05 ctx=4096 max_out=4096 caps=[streaming] in=$0.010/Mtok out=$0.020/Mtok
- qwen/qwen1.5-1.8b-chat [text] Qwen1.5-1.8b-Chat · Alibaba Qwen released=2024-02-05 ctx=4096 max_out=4096 caps=[streaming] in=$0.020/Mtok out=$0.040/Mtok
- qwen/qwen1.5-14b-chat-awq [text] Qwen1.5-14b-Chat-Awq · Alibaba Qwen released=2024-02-05 ctx=4096 max_out=4096 caps=[streaming] in=$0.120/Mtok out=$0.240/Mtok
- qwen/qwen1.5-7b-chat-awq [text] Qwen1.5-7b-Chat-Awq · Alibaba Qwen released=2024-02-05 ctx=4096 max_out=4096 caps=[streaming] in=$0.060/Mtok out=$0.120/Mtok
- thebloke/discolm-german-7b-v1-awq [text] Discolm-German-7b-V1-Awq · TheBloke released=2024-01-24 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- openchat/openchat-3.5-0106 [text] Openchat-3.5-0106 · OpenChat released=2024-01-06 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- tinyllama/tinyllama-1.1b-chat-v1.0 [text] Tinyllama-1.1b-Chat-V1.0 · TinyLlama released=2024-01-04 ctx=2048 max_out=4096 caps=[streaming] in=$0.008/Mtok out=$0.016/Mtok
- hf/thebloke/llamaguard-7b-awq [text] Llamaguard-7b-Awq · Hugging Face released=2023-12-11 ctx=4096 max_out=4096 caps=[streaming] in=$0.040/Mtok out=$0.080/Mtok
- fblgit/una-cybertron-7b-v2-bf16 [text] Una-Cybertron-7b-V2-Bf16 · FBL released=2023-12-01 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/thebloke/neural-chat-7b-v3-1-awq [text] Neural-Chat-7b-V3-1-Awq · Hugging Face released=2023-11-14 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- meta/llama-2-7b-chat-fp16 [text] Llama-2-7b-Chat-Fp16 · Meta released=2023-11-07 ctx=4096 max_out=4096 caps=[streaming] in=$0.560/Mtok out=$6.67/Mtok
- mistral/mistral-7b-instruct-v0.1 [text] Mistral-7b-Instruct-V0.1 · Mistral released=2023-11-07 ctx=2824 max_out=4096 caps=[streaming] in=$0.110/Mtok out=$0.190/Mtok
- hf/thebloke/deepseek-coder-6.7b-base-awq [text] Deepseek-Coder-6.7b-Base-Awq · Hugging Face released=2023-11-03 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/thebloke/deepseek-coder-6.7b-instruct-awq [text] Deepseek-Coder-6.7b-Instruct-Awq · Hugging Face released=2023-11-03 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/thebloke/openhermes-2.5-mistral-7b-awq [text] Openhermes-2.5-Mistral-7b-Awq · Hugging Face released=2023-11-02 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/thebloke/zephyr-7b-beta-awq [text] Zephyr-7b-Beta-Awq · Hugging Face released=2023-10-27 ctx=4096 max_out=4096
caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- hf/thebloke/mistral-7b-instruct-v0.1-awq [text] Mistral-7b-Instruct-V0.1-Awq · Hugging Face released=2023-09-27 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- meta/llama-2-7b-chat-int8 [text] Llama-2-7b-Chat-Int8 · Meta released=2023-09-25 ctx=8192 max_out=4096 caps=[streaming] in=$0.040/Mtok out=$0.080/Mtok
- hf/thebloke/llama-2-13b-chat-awq [text] Llama-2-13b-Chat-Awq · Hugging Face released=2023-07-18 ctx=4096 max_out=4096 caps=[streaming] in=$0.070/Mtok out=$0.140/Mtok
- tiiuae/falcon-7b-instruct [text] Falcon-7b-Instruct · TII released=2023-05-25 ctx=4096 max_out=4096 caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- ai4bharat/indictrans2-en-indic-1B [text] IndicTrans2 EN→Indic 1B · AI4Bharat in=$0.021/Mtok out=$0.042/Mtok
- facebook/bart-large-cnn [text] BART Large CNN · Meta caps=[streaming] in=$0.050/Mtok out=$0.100/Mtok
- huggingface/distilbert-sst-2-int8 [text] DistilBERT SST-2 · Hugging Face $0.00000/req
- meta/m2m100-1.2b [text] M2M100 1.2B · Meta in=$0.021/Mtok out=$0.042/Mtok

### reasoning (2)

- qwen/qwq-32b [reasoning] Qwq-32b · Alibaba Qwen released=2025-03-05 ctx=24000 max_out=4096 caps=[streaming,json,reasoning] in=$0.200/Mtok out=$0.400/Mtok
- deepseek/deepseek-r1-distill-qwen-32b [reasoning] Deepseek-R1-Distill-Qwen-32b · DeepSeek released=2025-01-22 ctx=80000 max_out=4096 caps=[streaming,json,reasoning] in=$0.500/Mtok out=$4.88/Mtok

### vision (2)

- unum/uform-gen2-qwen-500m [vision] Uform-Gen2-Qwen-500m · Unum released=2024-02-27 ctx=4096 max_out=4096 caps=[streaming,vision] $0.0000/img
- llava-hf/llava-1.5-7b-hf [vision] Llava-1.5-7b-HF · LLaVA released=2023-10-05 ctx=4096 max_out=4096 caps=[streaming,vision] in=$0.500/Mtok out=$0.000/Mtok

### image (26)

- black-forest-labs/flux-2-klein-9b [image] Flux-2-Klein-9b · Black Forest Labs released=2026-01-14 $0.0400/img
- google/imagen-4 [image] Imagen 4 · Google released=2026-04-14 $0.0400/img
- alibaba/wan-2.6-image
[image] Wan 2.6 Image · Alibaba released=2026-04-14 $0.0300/img
- openai/gpt-image-1.5 [image] GPT Image 1.5 · OpenAI released=2026-04-14 in=$5.00/Mtok out=$10.00/Mtok
- recraft/recraftv4 [image] Recraft V4 · Recraft released=2026-04-13 $0.0400/img
- recraft/recraftv4-pro [image] Recraft V4 Pro · Recraft released=2026-04-13 $0.2500/img
- recraft/recraftv4-pro-vector [image] Recraft V4 Pro Vector · Recraft released=2026-04-13 $0.3000/img
- recraft/recraftv4-vector [image] Recraft V4 Vector · Recraft released=2026-04-13 $0.0800/img
- bytedance/seedream-4.0 [image] Seedream 4.0 · ByteDance released=2026-04-08 $0.0300/img
- bytedance/seedream-4.5 [image] Seedream 4.5 · ByteDance released=2026-04-08 $0.0400/img
- bytedance/seedream-5-lite [image] Seedream 5 Lite · ByteDance released=2026-04-08 $0.0350/img
- google/nano-banana [image] Nano Banana · Google released=2026-04-08 in=$0.300/Mtok out=$30.00/Mtok
- google/nano-banana-2 [image] Nano Banana 2 · Google released=2026-04-08 in=$0.500/Mtok out=$60.00/Mtok
- google/nano-banana-pro [image] Nano Banana Pro · Google released=2026-04-08 in=$2.00/Mtok out=$120.00/Mtok $0.1340/img
- black-forest-labs/flux-2-klein-4b [image] Flux-2-Klein-4b · Black Forest Labs released=2026-01-14 $0.0000/img
- black-forest-labs/flux-2-dev [image] Flux-2-DEV · Black Forest Labs released=2025-11-24 $0.0000/img
- leonardo/lucid-origin [image] Lucid-Origin · Leonardo.AI released=2025-08-25 $0.0039/img
- leonardo/phoenix-1.0 [image] Phoenix-1.0 · Leonardo.AI released=2025-08-25 $0.0033/img
- black-forest-labs/flux-1-schnell [image] Flux-1-Schnell · Black Forest Labs released=2024-08-29 $0.0004/img
- bytedance/stable-diffusion-xl-lightning [image] Stable-Diffusion-XL-Lightning · ByteDance released=2024-02-27 $0.0000/img
- lykon/dreamshaper-8-lcm [image] Dreamshaper-8-Lcm · Lykon released=2024-02-27 $0.0008/img
- runwayml/stable-diffusion-v1-5-img2img [image] Stable-Diffusion-V1-5-Img2img · Runway released=2024-02-27 $0.0000/img
- runwayml/stable-diffusion-v1-5-inpainting [image] Stable-Diffusion-V1-5-Inpainting · Runway released=2024-02-27 $0.0000/img
- stabilityai/stable-diffusion-xl-base-1.0 [image] Stable-Diffusion-XL-Base-1.0 · Stability AI released=2023-11-10 $0.0000/img
- facebook/detr-resnet-50 [image] DETR ResNet-50 · Meta $0.0002/img
- microsoft/resnet-50 [image] ResNet-50 · Microsoft $0.0002/img

### video (11)

- google/veo-3.1 [video] Veo 3.1 · Google released=2026-04-08 $0.400/sec
- pixverse/v5.6 [video] Pixverse V5.6 · PixVerse released=2026-04-15 $0.080/sec
- pixverse/v6 [video] Pixverse V6 · PixVerse released=2026-04-15 $0.025/sec
- vidu/q3-pro [video] Vidu Q3 Pro · Vidu released=2026-04-15 $0.050/sec
- vidu/q3-turbo [video] Vidu Q3 Turbo · Vidu released=2026-04-15 $0.040/sec
- runwayml/gen-4.5 [video] RunwayML Gen-4.5 · Runway released=2026-04-14 $0.120/sec
- minimax/hailuo-2.3 [video] Hailuo 2.3 · MiniMax released=2026-04-13 $0.047/sec
- minimax/hailuo-2.3-fast [video] hailuo-2.3-fast · minimax released=2026-04-13 $0.032/sec
- google/veo-3 [video] Veo 3 · Google released=2026-04-08 $0.200/sec
- google/veo-3-fast [video] Veo 3 Fast · Google released=2026-04-08 $0.080/sec
- google/veo-3.1-fast [video] Veo 3.1 Fast · Google released=2026-04-08 $0.080/sec

### audio-stt (10)

- deepgram/nova-3 [audio-stt] Nova-3 · Deepgram released=2025-06-05 caps=[streaming] $0.0052/min
- assemblyai/universal-3-pro [audio-stt] Universal 3 Pro · AssemblyAI released=2026-04-13 caps=[streaming] $0.0035/min
- cohere/cohere-transcribe-03-2026 [audio-stt] Cohere Transcribe · Cohere released=2026-03-26 $0.0060/min
- mistral/voxtral-mini-transcribe-realtime-26-02 [audio-stt] Voxtral Mini Transcribe Realtime · Mistral released=2026-02-01 $0.0080/min
- mistral/voxtral-mini-transcribe-26-02 [audio-stt] Voxtral Mini Transcribe · Mistral released=2026-02-01 $0.0060/min
- deepgram/flux [audio-stt] Flux · Deepgram released=2025-09-29 caps=[streaming] $0.0077/min
- openai/whisper-large-v3-turbo [audio-stt]
Whisper-Large-V3-Turbo · OpenAI released=2024-05-22 caps=[streaming] $0.0005/min
- openai/whisper-tiny-en [audio-stt] Whisper-Tiny-EN · OpenAI released=2024-04-22 caps=[streaming] $0.0005/min
- openai/whisper [audio-stt] Whisper · OpenAI released=2023-09-25 caps=[streaming] $0.0005/min
- pipecat-ai/smart-turn-v2 [audio-stt] Smart Turn v2 · Pipecat $0.0021/min

### audio-tts (12)

- deepgram/aura-2-en [audio-tts] Aura-2-EN · Deepgram released=2025-10-09 caps=[streaming] $0.0300/1K-chars
- inworld/tts-1.5-max [audio-tts] TTS 1.5 Max · Inworld released=2026-04-13 $0.0500/1K-chars
- inworld/tts-1.5-mini [audio-tts] TTS 1.5 Mini · Inworld released=2026-04-13 $0.0250/1K-chars
- minimax/speech-2.8-hd [audio-tts] Speech 2.8 HD · MiniMax released=2026-04-13 $0.1000/1K-chars
- minimax/speech-2.8-turbo [audio-tts] Speech 2.8 Turbo · MiniMax released=2026-04-13 $0.0600/1K-chars
- openai/tts-1 [audio-tts] TTS-1 · OpenAI released=2026-04-13 $0.0150/1K-chars
- mistral/voxtral-tts-26-03 [audio-tts] Voxtral TTS · Mistral released=2026-03-26 $12.0000/1K-chars
- cartesia/sonic-3 [audio-tts] Sonic 3 · Cartesia released=2025-10-29 $30.0000/1K-chars
- deepgram/aura-2-es [audio-tts] Aura-2-ES · Deepgram released=2025-10-09 $0.0300/1K-chars
- deepgram/aura-1 [audio-tts] Aura-1 · Deepgram released=2025-08-27 $0.0150/1K-chars
- myshell-ai/melotts [audio-tts] Melotts · MyShell released=2024-07-19 $0.0002/min
- minimax/music-2.6 [audio-tts] MiniMax Music 2.6 · MiniMax $0.052/sec

### embedding (7)

- baai/bge-m3 [embedding] Bge-M3 · BAAI released=2024-05-22 ctx=60000 in=$0.012/Mtok
- pfnet/plamo-embedding-1b [embedding] Plamo-Embedding-1b · PFN released=2025-09-24 in=$0.019/Mtok
- google/embeddinggemma-300m [embedding] Embeddinggemma-300m · Google released=2025-09-04 caps=[vision] in=$0.020/Mtok
- qwen/qwen3-embedding-0.6b [embedding] Qwen3-Embedding-0.6b · Alibaba Qwen released=2025-06-18 ctx=8192 in=$0.012/Mtok
- baai/bge-large-en-v1.5 [embedding] Bge-Large-EN-V1.5 · BAAI released=2023-11-07
in=$0.204/Mtok
- baai/bge-small-en-v1.5 [embedding] Bge-Small-EN-V1.5 · BAAI released=2023-11-07 in=$0.020/Mtok
- baai/bge-base-en-v1.5 [embedding] Bge-Base-EN-V1.5 · BAAI released=2023-09-25 ctx=153600 in=$0.067/Mtok

### rerank (1)

- baai/bge-reranker-base [rerank] BGE Reranker Base · BAAI $0.00000/req

### moderation (2)

- meta/llama-guard-3-8b [moderation] Llama-Guard-3-8b · Meta released=2025-01-22 ctx=131072 max_out=4096 caps=[streaming] in=$0.480/Mtok out=$0.030/Mtok
- mistral/mistral-moderation-26-03 [moderation] Mistral Moderation 2 · Mistral $0.00010/req

### music (1)

- minimax/music-2.6 [music] music-2.6 · minimax released=2026-04-14 $0.150/sec

### stt (1)

- openai/gpt-4o-transcribe [stt] gpt-4o-transcribe · openai released=2026-04-13 caps=[streaming] $0.0060/min

### tts (1)

- openai/tts-1-hd [tts] tts-1-hd · openai released=2026-04-13 $0.0300/1K-chars

### translation (2)

- ai4bharat/indictrans2-en-indic-1B [translation] Indictrans2-EN-Indic-1B · AI4Bharat released=2025-09-23 in=$0.340/Mtok out=$0.340/Mtok
- meta/m2m100-1.2b [translation] M2m100-1.2b · Meta released=2023-09-25 in=$0.340/Mtok out=$0.340/Mtok

### classification (3)

- pipecat-ai/smart-turn-v2 [classification] Smart-Turn-V2 · Pipecat released=2025-08-04 $0.0003/min
- microsoft/resnet-50 [classification] Resnet-50 · Microsoft released=2023-09-25 $0.0000/img
- huggingface/distilbert-sst-2-int8 [classification] Distilbert-SST-2-Int8 · Hugging Face released=2023-09-25 in=$0.026/Mtok

### reranking (1)

- baai/bge-reranker-base [reranking] BGE-Reranker-Base · BAAI released=2025-02-14 in=$0.003/Mtok

### object-detection (1)

- facebook/detr-resnet-50 [object-detection] detr-resnet-50 · facebook $0.0000/img

## How to read a line

Format: `- <provider>/<model> [<modality>] <Name> · <Vendor> [released=<date>] [ctx=<tokens>] [max_out=<tokens>] [caps=[...]] <pricing>`

Pricing units:

- `in=$X/Mtok` / `out=$X/Mtok`: per 1M tokens
- `cache-read=$X/Mtok`: cache hits bill at ~10% of uncached
- `$X/img`: per image (image models)
- `$X/sec`: per video-second (video + music models)
- `$X/min`: per audio minute (STT)
- `$X/1K-chars`: per 1,000 characters (TTS)

## Agent decision rules (recommended)

1. Hard reasoning / code agents / complex tool chains → `anthropic/claude-opus-4.7`
2. General chat, RAG, production default → `anthropic/claude-sonnet-4.6` or `openai/gpt-5.4`
3. Very long context (>500K tokens) → `google/gemini-3.1-pro` (2M) or `openai/gpt-5.4` (1M)
4. Cheap/bulk work → `openai/gpt-5.4-mini`, `google/gemini-3.1-flash-lite`, or `moonshot/kimi-k2.6` (free tier)
5. Image gen → `black-forest-labs/flux-2-klein-9b` (SOTA open) or `google/imagen-4`
6. Video → `google/veo-3.1`
7. STT → `deepgram/nova-3` · TTS → `deepgram/aura-2-en`
8. Embeddings → `baai/bge-m3` · Rerank → `baai/bge-reranker-base`
9. Moderation → `meta/llama-guard-3-8b`

See /llms.txt for the editorial overview and /reference for full API docs.
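
## Example: parsing a catalog line (sketch)

The line format in "How to read a line" is regular enough to parse with a few regexes. The block below is a minimal sketch, not an official client: `parse_catalog_line` and `estimate_token_cost` are hypothetical helper names, the field layout is inferred from the entries in this file, and it assumes `ctx` and `max_out` are plain integers as they appear here.

```python
import re

# Optional key=value fields and price suffixes, as listed in "How to read a line".
_FIELD = re.compile(r"\b(released|ctx|max_out)=(\S+)")
_CAPS = re.compile(r"caps=\[([^\]]*)\]")
_TOKEN_PRICE = re.compile(r"\b(in|out|cache-read)=\$([\d.]+)/Mtok")
_FLAT_PRICE = re.compile(r"\$([\d.]+)/(img|sec|min|1K-chars|req)")


def parse_catalog_line(line: str) -> dict:
    """Parse one catalog bullet into a dict (hypothetical helper)."""
    head = re.match(r"-\s+(\S+)\s+\[([^\]]+)\]\s+(.*)", line.strip())
    if head is None:
        raise ValueError(f"not a catalog line: {line!r}")
    slug, modality, rest = head.groups()

    # Display name sits before the middle dot; the vendor runs from the dot
    # up to the first key=value field or flat price.
    name, _, tail = rest.partition(" · ")
    vendor_words = []
    for word in tail.split():
        if "=" in word or word.startswith("$"):
            break
        vendor_words.append(word)

    entry = {
        "slug": slug,
        "modality": modality,
        "name": name.strip(),
        "vendor": " ".join(vendor_words),
        "prices": {},
    }
    for key, value in _FIELD.findall(rest):
        # Numeric fields become ints; dates stay as strings.
        entry[key] = int(value) if value.isdigit() else value
    caps = _CAPS.search(rest)
    entry["caps"] = [c.strip() for c in caps.group(1).split(",") if c.strip()] if caps else []
    for kind, amount in _TOKEN_PRICE.findall(rest):
        entry["prices"][f"{kind}/Mtok"] = float(amount)
    flat = _FLAT_PRICE.search(rest)
    if flat:
        entry["prices"][flat.group(2)] = float(flat.group(1))
    return entry


def estimate_token_cost(entry: dict, input_tokens: int, output_tokens: int = 0) -> float:
    """Dollar cost for a token-priced model; prices are per 1M tokens."""
    prices = entry["prices"]
    total = input_tokens * prices.get("in/Mtok", 0.0)
    total += output_tokens * prices.get("out/Mtok", 0.0)
    return total / 1_000_000


if __name__ == "__main__":
    line = (
        "- meta/llama-guard-3-8b [moderation] Llama-Guard-3-8b · Meta "
        "released=2025-01-22 ctx=131072 max_out=4096 caps=[streaming] "
        "in=$0.480/Mtok out=$0.030/Mtok"
    )
    entry = parse_catalog_line(line)
    print(entry)
    print(round(estimate_token_cost(entry, 2_000_000, 1_000_000), 4))
```

Flat-priced entries (`/img`, `/sec`, `/min`, `/1K-chars`, `/req`) end up in `prices` keyed by unit, so a caller can multiply by images, seconds, minutes, or characters as appropriate. Cache-read pricing is captured as `cache-read/Mtok` when present but is not applied by `estimate_token_cost`.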