Pricing per million tokens, context window, capabilities — pulled from each provider's public docs. All 1 are available via the same AIgateway OpenAI-compatible endpoint; flip the model string to switch.
ByteDance's next-generation video model with a unified multimodal architecture. Generates high-quality video with synchronized audio from text, images, video clips, and audio inputs. Supports multimodal references (up to 9 images, 3 videos, 3 audio files), native audio generation, video editing, video extension, intelligent duration, and adaptive aspect ratio.