Aggregate token volume routed through AIgateway over the last day, week, and month. Pick a model the market trusts — or pick the underdog it hasn't noticed yet.
| # | Model | Provider | Modality | Tokens | p50 latency | Δ |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 anthropic/claude-opus-4.7 | Anthropic | text | 4.92B | 1299ms | -37% |
| 2 | Gemini 3.1 Pro google/gemini-3.1-pro | Google | text | 2.50B | 1111ms | +55% |
| 3 | GPT-5.4 Mini openai/gpt-5.4-mini | OpenAI | text | 2.49B | 1374ms | -16% |
| 4 | Claude Sonnet 4.6 anthropic/claude-sonnet-4.6 | Anthropic | text | 2.46B | 1035ms | +0% |
| 5 | Kimi K2.6 moonshot/kimi-k2.6 | Moonshot | text | 2.34B | 1035ms | -47% |
| 6 | Claude Haiku 4.5 anthropic/claude-haiku-4.5 | Anthropic | text | 2.25B | 235ms | +46% |
| 7 | Imagen 4 google/imagen-4 | Google | image | 2.13B | 1349ms | +62% |
| 8 | GPT-5.4 openai/gpt-5.4 | OpenAI | text | 2.05B | 828ms | -7% |
| 9 | Gpt-Oss-120b openai/gpt-oss-120b | OpenAI | text | 1.42B | 1026ms | -25% |
| 10 | Llama-4-Scout-17b-16e-Instruct meta/llama-4-scout-17b-16e-instruct | Meta | text | 1.28B | 674ms | +73% |
| 11 | Qwen2.5-Coder-32b-Instruct qwen/qwen2.5-coder-32b-instruct | Alibaba Qwen | text | 1.19B | 829ms | -9% |
| 12 | Veo 3.1 google/veo-3.1 | Google | video | 1.18B | 985ms | -26% |
| 13 | Sonar Deep Research perplexity/sonar-deep-research | Perplexity | text | 1.08B | 1209ms | +39% |
| 14 | Resnet-50 microsoft/resnet-50 | Microsoft | classification | 1.07B | 201ms | -43% |
| 15 | ResNet-50 microsoft/resnet-50 | Microsoft | image | 1.07B | 201ms | -43% |
| 16 | Gemma-3-12b-IT google/gemma-3-12b-it | Google | text | 1.06B | 198ms | +37% |
| 17 | Nano Banana Pro google/nano-banana-pro | Google | image | 1.05B | 1379ms | +47% |
| 18 | Glm-4.7-Flash zai-org/glm-4.7-flash | Zhipu AI | text | 1.04B | 1038ms | +16% |
| 19 | Whisper openai/whisper | OpenAI | audio-stt | 1.03B | 1021ms | +63% |
| 20 | Bge-M3 baai/bge-m3 | BAAI | embedding | 1.02B | 922ms | -34% |
| 21 | Tinyllama-1.1b-Chat-V1.0 tinyllama/tinyllama-1.1b-chat-v1.0 | TinyLlama | text | 1.01B | 1133ms | +42% |
| 22 | Stable-Diffusion-V1-5-Img2img runwayml/stable-diffusion-v1-5-img2img | Runway | image | 1.01B | 400ms | -2% |
| 23 | Uform-Gen2-Qwen-500m unum/uform-gen2-qwen-500m | Unum | vision | 1.00B | 328ms | -51% |
| 24 | Stable-Diffusion-XL-Base-1.0 stabilityai/stable-diffusion-xl-base-1.0 | Stability AI | image | 1.00B | 190ms | -22% |
| 25 | Veo 3 Fast google/veo-3-fast | Google | video | 998.6M | 389ms | -3% |
| 26 | Granite-4.0-H-Micro ibm-granite/granite-4.0-h-micro | IBM | text | 997.7M | 187ms | -3% |
| 27 | Flux-1-Schnell black-forest-labs/flux-1-schnell | Black Forest Labs | image | 993.3M | 182ms | -34% |
| 28 | Mistral Small 4 mistral/mistral-small-4-0-26-03 | Mistral | text | 992.0M | 181ms | -34% |
| 29 | Qwen 3 Max alibaba/qwen3-max | Alibaba | text | 991.4M | 1183ms | +9% |
| 30 | GPT-5.4 Nano openai/gpt-5.4-nano | OpenAI | text | 973.7M | 963ms | +56% |
| 31 | Pixverse V5.6 pixverse/v5.6 | PixVerse | video | 967.7M | 1153ms | -26% |
| 32 | Llama-3.1-8b-Instruct-Fp8 meta/llama-3.1-8b-instruct-fp8 | Meta | text | 950.1M | 1066ms | +23% |
| 33 | Claude Sonnet 4 anthropic/claude-sonnet-4 | Anthropic | text | 945.9M | 1333ms | -10% |
| 34 | TTS-1 openai/tts-1 | OpenAI | audio-tts | 939.4M | 193ms | +51% |
| 35 | Deepseek-Coder-6.7b-Base-Awq hf/thebloke/deepseek-coder-6.7b-base-awq | Hugging Face | text | 932.3M | 1247ms | +2% |
| 36 | Meta-Llama-3-8b-Instruct hf/meta-llama/meta-llama-3-8b-instruct | Hugging Face | text | 928.2M | 1180ms | -12% |
| 37 | Llama-3.1-8b-Instruct meta/llama-3.1-8b-instruct | Meta | text | 925.0M | 909ms | -31% |
| 38 | o4-mini openai/o4-mini | OpenAI | text | 924.5M | 975ms | +50% |
| 39 | Vidu Q3 Turbo vidu/q3-turbo | Vidu | video | 921.9M | 236ms | +78% |
| 40 | Mistral-7b-Instruct-V0.2 hf/mistral/mistral-7b-instruct-v0.2 | Hugging Face | text | 921.4M | 1306ms | +49% |

| Provider | Models | Tokens |
|---|---|---|
| Google | 16 | 15.46B |
| OpenAI | 13 | 12.61B |
| Anthropic | 6 | 11.41B |
| Meta | 21 | 9.92B |
| Hugging Face | 15 | 7.69B |
| Mistral | 8 | 5.48B |
| Alibaba Qwen | 8 | 5.09B |
| Recraft | 4 | 2.89B |
| Moonshot | 2 | 2.73B |
| BAAI | 6 | 2.52B |
| Alibaba | 3 | 2.40B |
| MiniMax | 5 | 2.35B |
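
The Tokens column mixes `B` and `M` suffixes, so the figures need normalizing before they can be compared or summed. The sketch below parses those display strings and rolls a few rows from the model table up by provider. It assumes provider totals are plain sums over each provider's models, which the page does not state; `parse_tokens` and `rollup_by_provider` are hypothetical helper names, not part of any AIgateway API.

```python
from collections import defaultdict
from decimal import Decimal

# Multipliers for the suffixes used in the Tokens column.
SUFFIXES = {"M": 1_000_000, "B": 1_000_000_000}


def parse_tokens(text):
    """Convert a display string such as '4.92B' or '998.6M' to an integer token count."""
    text = text.strip()
    number, suffix = text[:-1], text[-1]
    # Decimal avoids binary floating-point rounding on values like 4.92.
    return int(Decimal(number) * SUFFIXES[suffix])


def rollup_by_provider(rows):
    """Map each provider to (model_count, total_tokens) for (provider, tokens_text) rows."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for provider, tokens_text in rows:
        counts[provider] += 1
        totals[provider] += parse_tokens(tokens_text)
    return {provider: (counts[provider], totals[provider]) for provider in totals}


# A few rows copied from the model table above (provider, Tokens).
sample_rows = [
    ("Anthropic", "4.92B"),
    ("Anthropic", "2.46B"),
    ("OpenAI", "2.49B"),
    ("Mistral", "992.0M"),
]
rollup = rollup_by_provider(sample_rows)
for provider, (models, tokens) in rollup.items():
    print(provider, models, tokens)
```

The provider table counts more models per provider than appear in the top 40, so sums computed from the model table alone will not match its totals.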

Every request through the gateway increments a counter keyed on the routed model. We anonymize before aggregation, publish daily, and never expose a single customer's workload. If you want your app attributed on the leaderboard, opt in from Dashboard → Settings.
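
As a rough illustration of the counting scheme described above, the sketch below keeps a token counter keyed on the routed model and leaves the customer identifier out of the aggregate. The `Request` record, its field names, and `aggregate_by_model` are hypothetical and chosen for illustration; this is not AIgateway's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical request record; the field names are assumptions.
    customer_id: str
    model: str  # routed model slug, e.g. "anthropic/claude-opus-4.7"
    tokens: int


def aggregate_by_model(requests):
    """Sum tokens per routed model, discarding the customer identifier."""
    totals = Counter()
    for req in requests:
        # Only the model slug is used as the key; customer_id never enters the aggregate.
        totals[req.model] += req.tokens
    return totals


requests = [
    Request("cust-a", "anthropic/claude-opus-4.7", 1200),
    Request("cust-b", "anthropic/claude-opus-4.7", 800),
    Request("cust-a", "openai/gpt-5.4", 500),
]
totals = aggregate_by_model(requests)
for model, tokens in totals.most_common():
    print(model, tokens)
```

Dropping the identifier is only the simplest possible form of anonymization. A production pipeline would likely need more, such as minimum-volume thresholds, and the page does not describe which measures are used.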