Deepseek-R1-Distill-Qwen-32b

reasoning

DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

MODALITIES: text
INPUT: $0.500 / 1M tokens
OUTPUT: $4.88 / 1M tokens
CONTEXT: 131K tok
MAX OUTPUT: 4K tok
USAGE: 198.9M tokens (0% market share)
RELEASED: 2025-01-22

Deepseek-R1-Distill-Qwen-32b (deepseek/deepseek-r1-distill-qwen-32b) is a reasoning model from Deepseek, released 2025-01-22. Context window: 131,072 tokens; max output 4,096. Pricing via AIgateway: input $0.500/M tokens, output $4.88/M tokens. Capabilities: streaming, json, reasoning. Call it via https://api.aigateway.sh/v1/chat/completions with the OpenAI SDK — set model="deepseek/deepseek-r1-distill-qwen-32b". Best for: Math solvers, Code review, Step-by-step reasoning.

model · deepseek/deepseek-r1-distill-qwen-32b
family · Qwen

Use this model

model: deepseek/deepseek-r1-distill-qwen-32b
curl https://api.aigateway.sh/v1/chat/completions \
  -H "Authorization: Bearer $AIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek/deepseek-r1-distill-qwen-32b","messages":[{"role":"user","content":"hello"}],"stream":true}'

Capabilities

Streaming, JSON mode, Reasoning
CONTEXT: 131,072 tok
MAX OUTPUT: 4,096 tok

Strengths

  • Strong on math + code
  • R1 reasoning in a 32B Qwen shell
  • Open-weight

Use cases

  • Math solvers
  • Code review
  • Step-by-step reasoning

Adoption

198.9M tokens
151.0K requests · 0% of tracked market volume
Aggregate usage across the open model ecosystem (as of 2026-05-30).

Pricing

Input: $0.500 / 1M tokens
Output: $4.88 / 1M tokens
Open-weight
You pay pass-through pricing; a 5% platform fee is applied at credit top-up, not per call.
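
The per-call arithmetic behind the listed rates can be sketched as follows. This is a minimal illustration, not an official billing formula: `estimate_cost_usd` is a hypothetical helper, it uses only the two per-million-token prices shown above, and it leaves out the 5% fee because that is applied at top-up rather than per call.

```python
# Per-million-token rates as listed on this page.
INPUT_USD_PER_MTOK = 0.500
OUTPUT_USD_PER_MTOK = 4.88


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the pass-through cost of one call in USD (hypothetical helper)."""
    input_cost = prompt_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = completion_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


# Example: 10,000 prompt tokens and 2,000 completion tokens.
# 10,000 * 0.500 / 1M = 0.00500 and 2,000 * 4.88 / 1M = 0.00976.
print(f"${estimate_cost_usd(10_000, 2_000):.5f}")
```

The token counts can be taken from the `usage` object of a response (`prompt_tokens` and `completion_tokens`).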

API schema

Call Deepseek-R1-Distill-Qwen-32b from any OpenAI SDK

POST https://api.aigateway.sh/v1/chat/completions
Content-Type: application/json
Auth: Bearer sk-aig-...

Request body

{
  "model": "deepseek/deepseek-r1-distill-qwen-32b",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user",   "content": "Hello!" }
  ],
  "temperature": 0.7,
  "top_p": 0.95,
  "max_tokens": 1024,
  "stream": false,
  "response_format": { "type": "json_object" }

}
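
As a rough sketch, a request body like the one above could be assembled in Python before sending it. `build_request` is a hypothetical helper, not part of any SDK. It caps `max_tokens` at the 4,096-token maximum output listed on this page; how the gateway itself treats larger values is not documented here, so the cap is a defensive assumption.

```python
import json

MODEL = "deepseek/deepseek-r1-distill-qwen-32b"
MAX_OUTPUT_TOKENS = 4096  # maximum output listed on this page


def build_request(user_prompt: str, max_tokens: int = 1024,
                  json_mode: bool = False) -> dict:
    """Build a chat completions request body (hypothetical helper)."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_prompt}],
        # Assumption: cap locally rather than rely on server-side behaviour.
        "max_tokens": min(max_tokens, MAX_OUTPUT_TOKENS),
        "stream": False,
    }
    if json_mode:
        # Only request JSON mode when structured output is actually wanted.
        body["response_format"] = {"type": "json_object"}
    return body


print(json.dumps(build_request("Hello!", max_tokens=8000), indent=2))
```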

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1776947082,
  "model": "deepseek/deepseek-r1-distill-qwen-32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 12,
    "total_tokens": 36
  }
}
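
A response of this shape can be reduced to the fields most callers need. The sketch below uses only the standard library; `summarize_response` is a hypothetical helper. Treating `finish_reason == "length"` as a truncation signal follows the OpenAI convention and is an assumption here, since this page only shows `"stop"`.

```python
import json


def summarize_response(raw: str) -> dict:
    """Extract text, truncation flag, and token total (hypothetical helper)."""
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" marks output cut off by max_tokens.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


sample = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 12, "total_tokens": 36}
}
"""
print(summarize_response(sample))
```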

Streaming (SSE) — set "stream": true

// 1. Role announcement (first chunk):
data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

// 2. Content chunks (final answer):
data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

// Finish chunk:
data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

// Terminator:
data: [DONE]
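
For clients that read the stream without an SDK, the chunk sequence above could be assembled into the final text roughly as follows. `collect_stream_text` is a hypothetical helper that assumes each event arrives as one complete `data:` line, as in the sample; a production client would also need to handle network buffering and errors.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the delta.content pieces of an SSE stream (hypothetical helper)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # terminator: no more chunks follow
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:  # role-only and finish chunks carry no content
                parts.append(content)
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
print(collect_stream_text(sample))  # prints "Hello!"
```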

Quickstart

# pip install aigateway-py openai
# aigateway-py adds sub-accounts, evals, replays, jobs, webhook verify.
# openai SDK covers chat — drop-in per our SDK's own guidance.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aigateway.sh/v1",
    api_key="sk-aig-...",
)

stream = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-qwen-32b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Errors

401  authentication_error  Invalid or missing API key
402  insufficient_credits  Wallet empty (PAYG only)
404  not_found             Unknown model or endpoint
429  rate_limit_error      Over per-minute limit; see Retry-After header
500  server_error          Upstream provider failed (retryable)
503  service_unavailable   Upstream saturated (retryable)
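
One way to act on the retryable statuses above is a small delay calculator. This is a sketch under stated assumptions: `retry_delay` is a hypothetical helper, the backoff constants are illustrative, and the header lookup assumes a plain dict keyed exactly as `Retry-After`. Real HTTP clients usually expose case-insensitive headers.

```python
import random

RETRYABLE = {429, 500, 503}  # statuses this page marks as retryable or rate-limited


def retry_delay(status: int, headers: dict, attempt: int,
                base: float = 1.0, cap: float = 30.0):
    """Return seconds to wait before retrying, or None if not retryable."""
    if status not in RETRYABLE:
        return None  # 401, 402, 404: retrying will not help
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # Retry-After may be an HTTP date; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    backoff = min(cap, base * (2 ** attempt))
    return random.uniform(0, backoff)


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))
print(retry_delay(401, {}, attempt=0))
```

A caller would sleep for the returned number of seconds and resend the request, stopping after a fixed number of attempts or when the function returns `None`.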

Frequently asked questions

What is Deepseek-R1-Distill-Qwen-32b?
DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. It is a reasoning model from Deepseek, accessible via AIgateway's OpenAI-compatible API at slug deepseek/deepseek-r1-distill-qwen-32b.
How much does Deepseek-R1-Distill-Qwen-32b cost via AIgateway?
Input costs $0.500 per 1M tokens; output costs $4.88 per 1M tokens. Pricing is pass-through, plus a 5% platform fee applied at credit top-up rather than per call.
What is the context window of Deepseek-R1-Distill-Qwen-32b?
131,072 tokens. Maximum output is 4,096 tokens.
How do I call Deepseek-R1-Distill-Qwen-32b from my code?
Point the OpenAI SDK at https://api.aigateway.sh/v1 with your AIgateway key and set model to "deepseek/deepseek-r1-distill-qwen-32b". The request and response shapes match OpenAI exactly.
Does Deepseek-R1-Distill-Qwen-32b support streaming, tool calling, vision, and JSON mode?
Streaming — yes. Tool calling — no. Vision — no. JSON mode — yes. Prompt caching — no.
What are the best use cases for Deepseek-R1-Distill-Qwen-32b?
Math solvers, Code review, Step-by-step reasoning. Key strengths: Strong on math + code; R1 reasoning in a 32B Qwen shell; Open-weight.
Can I bring my own Deepseek API key (BYOK)?
Yes. Attach a Deepseek key in your AIgateway dashboard and this model flips to pass-through — you pay Deepseek directly and AIgateway waives the 5% platform fee on those calls.