Embeddings
Dense vector embeddings through the OpenAI-shaped /v1/embeddings endpoint. Swap the `model` parameter to target OpenAI text-embedding-3-large, Cohere embed-v4, Voyage voyage-3-large, BGE M3, Jina, Mixedbread, or Snowflake Arctic. Matryoshka models let you truncate dimensions after the fact without re-embedding.
Embed text
```json
POST /v1/embeddings
{
  "model": "openai/text-embedding-3-large",
  "input": ["cat", "dog", "airplane"],
  "dimensions": 512
}

// → { "data": [{ "embedding": [0.012, -0.4, ...], "index": 0 }, ...], "usage": {...} }
```
Picking a model
| Model | Dims | Best for |
|---|---|---|
| openai/text-embedding-3-small | 1536 | Cheap, fast, solid English baseline. |
| openai/text-embedding-3-large | 3072 (matryoshka → 256–3072) | Higher recall when accuracy matters. |
| cohere/embed-v4 | 1536 | Best multilingual + RAG-tuned. |
| voyage/voyage-3-large | 2048 | Top-of-leaderboard code + law + finance. |
| baai/bge-m3 | 1024 | Open-weight, multilingual, free self-host path. |
Batching
`input` accepts a string or an array (up to 2,048 items / 300k tokens per call). We auto-batch across provider limits — one request in, one response out, regardless of provider batch size. For tens of millions of embeddings, use the Batch API at 50% off.
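Client-side, the only limit to mind is the per-call item cap. A simple chunker, sketched with the 2,048-item limit from above, keeps oversized corpora within it:

```python
from typing import Iterator

MAX_ITEMS = 2048  # per-call item cap noted above

def batches(texts: list[str], size: int = MAX_ITEMS) -> Iterator[list[str]]:
    """Yield consecutive slices no larger than `size`, preserving order."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]

corpus = [f"doc {i}" for i in range(5000)]
sizes = [len(b) for b in batches(corpus)]
# → [2048, 2048, 904]: three calls cover all 5,000 texts in order
```

Because order is preserved within and across slices, the returned `index` fields can be offset by the running count to map vectors back to the original corpus.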
Matryoshka truncation
Models trained with matryoshka representation learning (OpenAI 3-large, Nomic v1.5, Jina v3) expose a `dimensions` parameter. Requesting fewer dimensions keeps the highest-information prefix of the full vector — smaller storage, comparable recall. Not all models support this; unsupported values return a 422 unsupported_parameter error.
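The same truncation can be reproduced client-side on a full-dimension vector: keep the prefix, then re-normalize so similarity math still holds. A sketch with a toy vector standing in for a real embedding:

```python
import math

def truncate(vec: list[float], dims: int) -> list[float]:
    """Keep the highest-information prefix and L2-renormalize it."""
    prefix = vec[:dims]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]

full = [0.6, 0.8, 0.05, -0.02]  # toy stand-in for a 3072-dim embedding
small = truncate(full, 2)       # unit-length again after renormalizing
```

Truncating client-side is useful when you store the full vectors once but serve multiple index sizes from them.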
Are vectors normalized?
Yes — all embedding responses are L2-normalized, so cosine similarity ≡ dot product. If a provider returns unnormalized vectors, we normalize before returning them to you. Override with `normalize: false` if you need the raw output.
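Because the vectors come back unit-length, you can skip the cosine denominator entirely. A small sketch with toy vectors showing the two scores coincide:

```python
import math

def l2_normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = l2_normalize([1.0, 2.0, 3.0])
b = l2_normalize([2.0, 1.0, 0.5])
assert abs(cosine(a, b) - dot(a, b)) < 1e-9  # identical on unit vectors
```

In practice this means a plain inner-product index (e.g. a dot-product metric in your vector store) gives the same rankings as a cosine index.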