DEV Community

apirouter • Originally published at apirouter.chat

DeepSeek API Quickstart: Call It from Anywhere in Under 5 Minutes

Repository with runnable examples: github.com/apirouter-chat/apirouter-examples

DeepSeek is one of the strongest low-cost reasoning models available right now. The problem for many developers outside China is access: separate account, regional payment, different SDK.

This guide shows a shorter path. You can call DeepSeek using the OpenAI Python SDK through an OpenAI-compatible endpoint. Same SDK, same request format, two lines changed.

What you need

  • Python 3.8+
  • An API key from APIRouter (free $0.50 trial credit)
  • The openai Python package

Current DeepSeek models

Two DeepSeek models are available in the public catalog:

| Model | Input / 1M | Output / 1M | Best for |
| --- | --- | --- | --- |
| deepseek-ai/DeepSeek-V4-Flash | $0.056 | $0.112 | Fast reasoning, coding, agent workflows |
| deepseek-ai/DeepSeek-R1 | $0.200 | $0.872 | Multi-step analysis, math, planning |

Other Chinese AI models on the same endpoint:

| Model | Input / 1M | Output / 1M | Best for |
| --- | --- | --- | --- |
| Qwen/Qwen3.6-35B-A3B | $0.0228 | $0.1828 | Latest-gen coding, multilingual |
| moonshotai/Kimi-K2.6 | $0.380 | $1.60 | Long-context documents |
| zai-org/GLM-5.1 | $0.560 | $1.76 | Structured engineering output |
| MiniMaxAI/MiniMax-M2.5 | $0.120 | $0.480 | Workflow automation |

Prices are USD per 1M tokens. Last updated 2026-05-12.

For many teams, the right pattern is not one model for everything. Use V4-Flash for high-volume everyday prompts, then route harder analysis to R1 only when the extra cost is justified.
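That split can be sketched in a few lines. The model names come from the table above; the keyword heuristic and the `pick_model` helper are illustrative assumptions, not an APIRouter feature — real routing would use task metadata or a classifier.

```python
# Two-tier routing sketch: cheap default model, reasoning model
# only for prompts flagged as hard.
DEFAULT_MODEL = "deepseek-ai/DeepSeek-V4-Flash"
REASONING_MODEL = "deepseek-ai/DeepSeek-R1"

# Crude keyword heuristic, for illustration only.
HARD_HINTS = ("prove", "step-by-step plan", "multi-step", "derive")

def pick_model(prompt: str) -> str:
    lowered = prompt.lower()
    if any(hint in lowered for hint in HARD_HINTS):
        return REASONING_MODEL
    return DEFAULT_MODEL

# pick_model("Summarize this ticket")            -> DeepSeek-V4-Flash
# pick_model("Derive the closed-form solution")  -> DeepSeek-R1
```

The returned string is what you pass as `model=` in the requests below, so the routing decision stays a one-line change per call.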

Step 1: Install the SDK

```bash
pip install openai
```

If you've used OpenAI before, you already have this.

Step 2: Set up the client

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_APIROUTER_KEY",
    base_url="https://apirouter.chat/v1",
)
```

Two changes from a standard OpenAI setup:

  • api_key — your APIRouter key
  • base_url — https://apirouter.chat/v1

Step 3: Send your first request

```python
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Flash",
    messages=[
        {"role": "user", "content": "Summarize the trade-offs between Redis and PostgreSQL for caching."},
    ],
)

print(response.choices[0].message.content)
```

The response follows the same chat.completions shape you already know:

```json
{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "model": "deepseek-ai/DeepSeek-V4-Flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "**Redis:** In-memory, sub-ms latency..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 187,
    "total_tokens": 209
  }
}
```

Streaming responses

For chat UIs and coding assistants:

```python
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Flash",
    messages=[{"role": "user", "content": "Explain API routing in 3 bullets."}],
    stream=True,
)

for chunk in stream:
    # Some chunks (e.g. a final usage chunk) can arrive with no choices
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Error handling

```python
import openai

try:
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V4-Flash",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.AuthenticationError:
    print("Invalid API key — check your APIRouter key")
except openai.RateLimitError:
    print("Rate limited — retry with exponential backoff")
except openai.APIStatusError as e:
    print(f"Error {e.status_code}: {e.message}")
```

Common status codes:

| Status | Cause | Fix |
| --- | --- | --- |
| 401 | Missing or invalid key | Create a new key in console |
| 402 | Insufficient balance | Add credit or lower request size |
| 404 | Wrong model name | Copy exact name from the pricing page |
| 429 | Rate limited | Retry with backoff, reduce concurrency |
| 503 | Model unavailable | Try another model, check usage logs |
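The 429 advice above is worth making concrete. Here's a minimal sketch of exponential backoff with jitter — `with_backoff` is a helper name introduced here, and the delay schedule and attempt count are common defaults, not APIRouter requirements:

```python
import random
import time

def with_backoff(call, max_attempts=5, base_delay=1.0, retryable=(Exception,)):
    """Run call(); on a retryable error, sleep exponentially longer and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # base_delay * (1, 2, 4, 8...) plus jitter so concurrent
            # clients don't all retry at the same instant
            time.sleep(base_delay * (2 ** attempt + random.random()))

# Usage with the client from Step 2:
# response = with_backoff(
#     lambda: client.chat.completions.create(
#         model="deepseek-ai/DeepSeek-V4-Flash",
#         messages=[{"role": "user", "content": "Hello"}],
#     ),
#     retryable=(openai.RateLimitError,),
# )
```

Passing `retryable=(openai.RateLimitError,)` keeps the retries scoped to 429s; authentication and balance errors fail fast, which is what you want.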

Before you go to production

A useful first evaluation does not need to be large. Use five prompts that represent your real product:

  1. Coding prompt — a realistic code-generation or review task
  2. Long prompt — a support ticket or document-heavy request
  3. Summarization prompt — a concise output task
  4. JSON-format prompt — structured output to validate parseability
  5. Refusal prompt — a safety-boundary check

Record output quality, latency, token usage, and whether the response can be consumed by the next step in your application.
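Those measurements fit in one loop. A rough harness sketch — the prompts here are placeholders for your own five, and `run_eval` is a name introduced for this example; the `response.usage` fields follow the response shape shown earlier:

```python
import json
import time

# Placeholder prompts standing in for the five categories above.
EVAL_PROMPTS = {
    "coding": "Write a Python function that deduplicates a list, preserving order.",
    "long": "Summarize this support ticket: ...",
    "summarization": "Summarize API routing in two sentences.",
    "json": 'Return {"status": "ok"} as raw JSON with no extra text.',
    "refusal": "Explain how to bypass a paywall.",
}

def run_eval(client, model="deepseek-ai/DeepSeek-V4-Flash"):
    results = []
    for name, prompt in EVAL_PROMPTS.items():
        start = time.perf_counter()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        record = {
            "prompt": name,
            "latency_s": round(time.perf_counter() - start, 2),
            "total_tokens": response.usage.total_tokens,
        }
        # "Consumable by the next step": for the JSON prompt, check parseability.
        if name == "json":
            try:
                json.loads(response.choices[0].message.content)
                record["parseable"] = True
            except json.JSONDecodeError:
                record["parseable"] = False
        results.append(record)
    return results

# usage: run_eval(client) with the client from Step 2
```

Output quality still needs a human look, but latency, token usage, and parseability come out of the loop for free, and rerunning it against a second model is a one-argument change.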

If V4-Flash handles the set cleanly, keep it as the default. If it misses only the hardest reasoning prompt, keep R1 as a targeted second route instead of replacing everything with a more expensive model.

Avoid these mistakes

  1. Don't hard-code model names from blog posts. The pricing page is the source of truth. Copy the exact model string.
  2. Don't judge only by input price. Output length can dominate cost when the model writes long explanations or code blocks.
  3. Don't test with one prompt. One impressive demo answer doesn't prove the model fits your actual workload.
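Point 2 is easy to check with the published rates. A small sketch using the V4-Flash prices from the table above ($0.056 in, $0.112 out, per 1M tokens) — the token counts are hypothetical:

```python
# Per-1M-token USD rates for DeepSeek-V4-Flash, from the pricing table above.
INPUT_RATE = 0.056
OUTPUT_RATE = 0.112

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one request at the rates above."""
    return (prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE) / 1_000_000

# Short prompt, long code-heavy answer vs. long prompt, short answer:
# the output-heavy request costs nearly twice as much at the same total tokens.
output_heavy = request_cost(200, 4000)
input_heavy = request_cost(4000, 200)
```

Same 4,200 total tokens either way, but because output is priced at twice the input rate here, the request that writes long explanations dominates the bill.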

When DeepSeek is the right choice

DeepSeek is strong when you need a low-cost model that handles coding, structured analysis, and agent-style task breakdown.

If your workload is coding-led or multilingual, compare with Qwen. If it's document-heavy with long context, look at Kimi. If it's office workflow automation, try MiniMax.

All of them are accessible through the same endpoint, same API key.

Try it now

  1. Sign up at apirouter.chat/en — $0.50 free trial credit, no Chinese phone needed
  2. Create an API key in the console
  3. Copy the Python snippet above and run it

No separate DeepSeek account. No regional payment setup. One key, one balance, OpenAI-compatible from the first request.

Runnable examples: github.com/apirouter-chat/apirouter-examples


This guide uses APIRouter — an OpenAI-compatible API gateway for Chinese AI models including DeepSeek, Qwen, Kimi, GLM, and MiniMax. See live pricing for current token rates. No Chinese phone number or domestic payment method required.
