1 audio-tts model from Google.
1 embedding model from Google.
13 image models from Google.
9 text models from Google.
19 video models from Google.
Google DeepMind ships the Gemini family (2M-token context on Gemini 3.1 Pro), Imagen (image generation), Veo (video generation), and Gemma (open-weight models). Gemini is the go-to choice when you need a very large context window or native multimodal input.
Headquartered in Mountain View, CA. Homepage: ai.google.dev.