Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.
Whisper-Large-V3-Turbo (openai/whisper-large-v3-turbo) is an audio-stt (speech-to-text) model from OpenAI, released 2024-05-22. Pricing via AIgateway: $0.0060 per minute. Capabilities: async. Call it via https://api.aigateway.sh/v1/audio/transcriptions and set model="openai/whisper-large-v3-turbo". Best for: meeting transcripts, captions, voice agents.
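As a rough illustration of the per-minute price, the arithmetic can be sketched as below. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is pro-rated by the second; the page does not say how partial minutes are rounded, so treat the result as an estimate only.

```python
# Price quoted on this page for openai/whisper-large-v3-turbo via AIgateway.
PRICE_PER_MINUTE_USD = 0.0060


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate transcription cost for one audio file.

    Assumption: cost scales linearly with duration (no rounding up to
    whole minutes, no minimum charge). Actual billing rules may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 60.0 * PRICE_PER_MINUTE_USD
```

Under that assumption, a one-hour recording would be `estimate_cost_usd(3600)`, that is 60 minutes at $0.0060 each.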
curl https://api.aigateway.sh/v1/audio/transcriptions \
-H "Authorization: Bearer $AIGATEWAY_API_KEY" \
-F model="openai/whisper-large-v3-turbo" \
-F file="@audio.mp3"
Async mode: submit a job, then poll /v1/jobs/<id>, or have the result pushed to your webhook_url. Best for long files and batch pipelines.
# Submit (returns immediately with a job id)
curl -X POST https://api.aigateway.sh/v1/audio/transcriptions \
-H "Authorization: Bearer $AIGATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openai/whisper-large-v3-turbo","audio_url":"https://example.com/audio.wav","async":true}'
# -> {"id":"<job_id>","status":"processing"}
# Poll for the transcript
curl https://api.aigateway.sh/v1/jobs/<job_id> \
-H "Authorization: Bearer $AIGATEWAY_API_KEY"
# ...or skip polling: pass "webhook_url" and we POST the signed result when ready
# {"model":"openai/whisper-large-v3-turbo","audio_url":"...","webhook_url":"https://you.example.com/hook"}
# multipart/form-data: use curl -F or SDK file upload
model="openai/whisper-large-v3-turbo"
file=@audio.mp3
response_format=json   # or "verbose_json", "text", "srt", "vtt"
language=en            # optional
{
"text": "Hello from AIgateway.",
"language": "en",
"duration": 1.82
}
from openai import OpenAI
client = OpenAI(base_url="https://api.aigateway.sh/v1", api_key="sk-aig-...")
with open("audio.mp3", "rb") as f:
    r = client.audio.transcriptions.create(model="openai/whisper-large-v3-turbo", file=f)
print(r.text)
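The async flow (submit, then poll /v1/jobs/<job_id>) can also be driven from Python. The following is a minimal sketch using only the standard library, not an official client: `fetch_job` and `wait_for_job` are hypothetical helper names, and the loop assumes that any status other than "processing" is terminal, since "processing" is the only status value shown on this page.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.aigateway.sh/v1"


def fetch_job(job_id, api_key):
    """GET /v1/jobs/<job_id> and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Call fetch(job_id) repeatedly until the job leaves "processing".

    Assumption: every status other than "processing" is final. The names
    of the success and failure statuses are not documented here, so the
    caller should inspect the returned dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job.get("status") != "processing":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still processing after {timeout}s")
        sleep(interval)
```

A call would look like `wait_for_job(job_id, lambda jid: fetch_job(jid, api_key))`, with `job_id` taken from the submit response. Passing the fetch function in as an argument keeps the polling loop separate from the HTTP call, so it can be exercised with a stub.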