Image generation

Text-to-image generation through the OpenAI-compatible /v1/images/generations endpoint, and image-to-image editing through /v1/images/edits. Switch models by changing the model ID: Black Forest Labs Flux, DALL-E 3, Google Imagen 3, Stable Diffusion XL, Ideogram, Recraft. Cheap edge models return the result inline in the response; the heaviest frontier models run as async jobs.

Generate an image

POST /v1/images/generations
{
  "model": "bfl/flux-2-klein-9b",
  "prompt": "a neon ramen shop in the rain, cinematic, 35mm",
  "size": "1024x1024",
  "n": 1,
  "response_format": "url"
}
// → { "created": ..., "data": [{ "url": "https://media.aigateway.sh/..." }] }
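
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the base URL and the bearer-token header are assumptions, since this page states neither, and `build_generation_request` and `generate_image` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the gateway's real API base URL.
BASE_URL = "https://gateway.example.com"


def build_generation_request(api_key, prompt, model="bfl/flux-2-klein-9b",
                             size="1024x1024", n=1, response_format="url"):
    """Build (but do not send) a POST /v1/images/generations request."""
    payload = {
        "model": model,
        "prompt": prompt,
        "size": size,
        "n": n,
        "response_format": response_format,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth, as in OpenAI-style APIs.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_image(api_key, prompt, **options):
    """Send the request and return the parsed JSON response body."""
    request = build_generation_request(api_key, prompt, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload easy to inspect or log before any network call is made.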

Picking a model

Model	Strength
`bfl/flux-2-klein-9b`	Fast, photorealistic, cheap — great default.
`bfl/flux-2-pro`	Top-quality Flux, slower, higher cost.
`openai/dall-e-3`	Strong prompt adherence, good typography.
`google/imagen-3`	Best at hands and anatomy; photorealistic.
`stability-ai/sdxl`	Open-weight; tune with custom LoRAs via BYOK.
`ideogram/ideogram-v3`	Best-in-class text rendering in images.

Response formats

response_format accepts "url" (the default: a signed URL on media.aigateway.sh, valid for 24 hours) or "b64_json" (the image inline as base64, which suits serverless environments).
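
A client usually needs to handle both formats. The sketch below assumes that a "b64_json" response carries the base64 payload in a `b64_json` field on each `data` item, following the OpenAI convention; this page only shows the `url` form, so treat that field name as an assumption. `extract_images` is a hypothetical helper.

```python
import base64


def extract_images(response):
    """Return a list of (kind, value) pairs from an images response.

    kind is "url" (value is the signed URL string) or
    "bytes" (value is the decoded image content).
    """
    results = []
    for item in response.get("data", []):
        if "b64_json" in item:  # assumed field name for inline base64
            results.append(("bytes", base64.b64decode(item["b64_json"])))
        elif "url" in item:
            results.append(("url", item["url"]))
    return results
```

Because signed URLs expire after 24 hours, a caller that needs the image later would download and store it rather than keep the URL.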

Image-to-image + edits

/v1/images/edits takes a source image, sent either as a file in the image field (multipart) or as image_url (JSON), plus an optional mask and a prompt. The default model is bria/fibo-edit/edit, a fast, general-purpose editor. Override it with any model whose capabilities include image-to-image (the Bria Fibo family, Bytedance Seedream edit, Flux dev).

POST /v1/images/edits  // multipart/form-data
image=@cat.png
prompt="make the cat wear sunglasses"
model="bria/fibo-edit/edit"

// JSON variant (no file upload)
{ "image_url": "https://…/cat.png", "prompt": "sunglasses" }
// → { "created": ..., "data": [{ "url": "https://media.aigateway.sh/..." }] }

Uploads are capped at 25 MB per file. The mask field is honored on Bria Fibo and eraser endpoints; other edit models silently ignore it.
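
As an illustration of the multipart variant, the following sketch builds the form body by hand with the standard library and rejects oversized files before upload. It reads "25 MB" as 25 × 1024 × 1024 bytes, which is an assumption; `build_edit_multipart` is a hypothetical helper, and the field names `image`, `mask`, `prompt`, and `model` are taken from the text above.

```python
import mimetypes
import os
import uuid

# Assumption: the 25 MB cap is interpreted as 25 MiB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def build_edit_multipart(image_path, prompt, model="bria/fibo-edit/edit",
                         mask_path=None):
    """Return (body_bytes, content_type) for POST /v1/images/edits."""
    boundary = uuid.uuid4().hex
    parts = []

    def add_field(name, value):
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))

    def add_file(name, path):
        if os.path.getsize(path) > MAX_UPLOAD_BYTES:
            raise ValueError(f"{path} exceeds the 25 MB upload cap")
        with open(path, "rb") as handle:
            content = handle.read()
        mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{os.path.basename(path)}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")

    add_file("image", image_path)
    if mask_path is not None:
        add_file("mask", mask_path)  # ignored by models that do not support masks
    add_field("prompt", prompt)
    add_field("model", model)
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned content type, including the boundary, goes in the request's Content-Type header. Checking the size locally avoids sending a large file only to have it rejected.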

Async / long-running models

High-quality text-to-video models and the heaviest Flux variants use the async-job pattern: the submit call returns 202, and you either poll for the result or receive a webhook on completion. The request body is the same for both patterns; the gateway detects which one a given model needs.
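
Since the caller does not choose the pattern, client code can branch on the status code of the submit response. The sketch below shows one way to do that with polling. This page does not specify the job-status endpoint or its fields, so the `id` and `status` names, the "completed" and "failed" values, and the `fetch_job` callable are all hypothetical stand-ins to be replaced with the gateway's actual job API.

```python
import time


def resolve_result(status_code, body, fetch_job, poll_interval=2.0,
                   timeout=300.0, sleep=time.sleep):
    """Return the final result for either the inline or the async pattern.

    status_code, body: status and parsed JSON of the initial submit call.
    fetch_job: callable(job_id) -> dict; a hypothetical job-status lookup.
    """
    if status_code != 202:
        return body  # inline pattern: the result is already in the response

    job_id = body["id"]  # assumed field name for the job identifier
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        status = job.get("status")  # assumed field name and values
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {job}")
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
```

Passing the status lookup in as a callable keeps the polling loop independent of the transport. Where a webhook is configured, the loop is unnecessary.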
