Build a Production-Ready Chatbot with DeepSeek and Python in 10 Minutes

What We're Building

A production-ready chatbot with:

Streaming responses (tokens appear in real-time)
Conversation memory
Error handling and retries
Cost tracking

Prerequisites

Python 3.10+
An API key (get one at Token China - free 100K tokens)

Step 1: Install Dependencies

pip install openai

Step 2: Basic Chatbot

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",  # placeholder; in production, load the key from an environment variable
    base_url="https://api.token-china.cc/v1"
)

class ChatBot:
    def __init__(self, system_prompt="You are a helpful assistant."):
        self.client = client
        self.model = "deepseek-v4-pro"
        self.messages = [{"role": "system", "content": system_prompt}]
        self.total_tokens = 0
        self.total_cost = 0.0

    def chat(self, user_input: str) -> str:
        self.messages.append({"role": "user", "content": user_input})

        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=self.messages,
                temperature=0.7,
                max_tokens=2000
            )

            assistant_msg = response.choices[0].message.content
            self.messages.append({"role": "assistant", "content": assistant_msg})

            usage = response.usage
            self.total_tokens += usage.total_tokens
            # Estimated cost at $2.00 per 1M tokens for both input and output
            self.total_cost += (usage.prompt_tokens / 1_000_000 * 2.0 +
                                usage.completion_tokens / 1_000_000 * 2.0)

            return assistant_msg

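        except Exception as e:
            # Drop the unanswered user message so a failed call does not pollute the history
            if self.messages[-1]["role"] == "user":
                self.messages.pop()
            return f"Error: {str(e)}"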

# Usage
bot = ChatBot("You are a Python expert who gives concise answers.")

while True:
    user_input = input("\nYou: ")
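    if user_input.lower() in ["quit", "exit", "q"]:
        # Report the usage tracked by ChatBot before exiting
        print(f"Total: {bot.total_tokens} tokens, ${bot.total_cost:.4f}")
        break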

    response = bot.chat(user_input)
    print(f"\nBot: {response}")
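The ChatBot above appends every turn to self.messages and never removes anything, so a long session keeps growing the prompt and the per-request cost. One possible way to bound it is to keep the system prompt plus only the most recent turns. This is a minimal sketch, not part of the original tutorial: the helper name trim_history and the max_messages limit are hypothetical, and it counts messages rather than tokens.

```python
def trim_history(messages, max_messages=20):
    """Return the system prompt (if any) plus the most recent messages.

    `max_messages` is an illustrative cap on non-system messages,
    not a value from the tutorial.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    # Start on a user turn so the model never sees an orphaned reply
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent


# Build a long fake conversation: 30 question/answer pairs
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(30):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=5)
print(len(trimmed), trimmed[1]["content"])
```

Inside the class, this could be applied as messages=trim_history(self.messages) when building the request, leaving the full history untouched for logging.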

Step 3: Add Streaming

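Streaming prints each token as it arrives, so the user sees output immediately instead of waiting for the full reply. Add this method to the ChatBot class: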

def chat_stream(self, user_input: str):
    self.messages.append({"role": "user", "content": user_input})

    try:
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
            temperature=0.7,
            max_tokens=2000,
            stream=True
        )

        full_response = ""
        for chunk in stream:
            # Some chunks carry no choices or no text; skip them
            if chunk.choices and chunk.choices[0].delta.content:
                content = chunk.choices[0].delta.content
                full_response += content
                print(content, end="", flush=True)
        print()  # end the line once the stream finishes

        self.messages.append({"role": "assistant", "content": full_response})
        return full_response

    except Exception as e:
        # Remove the unanswered user message so the history stays consistent
        self.messages.pop()
        print(f"\nError: {str(e)}")
        return None

Step 4: Production Considerations

Error Handling

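import time
from openai import RateLimitError, APITimeoutError

# ChatBot method. It calls the API directly because chat() catches every
# exception, so retrying through chat() would never see these errors.
def chat_with_retry(self, user_input: str, max_retries=3):
    self.messages.append({"role": "user", "content": user_input})
    for attempt in range(max_retries):
        try:
            response = self.client.chat.completions.create(
                model=self.model, messages=self.messages,
                temperature=0.7, max_tokens=2000
            )
            assistant_msg = response.choices[0].message.content
            self.messages.append({"role": "assistant", "content": assistant_msg})
            usage = response.usage
            self.total_tokens += usage.total_tokens
            self.total_cost += usage.total_tokens / 1_000_000 * 2.0
            return assistant_msg
        except RateLimitError:
            wait_time = 2 ** attempt  # exponential backoff: 1s, 2s, 4s
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        except APITimeoutError:
            print(f"Timeout on attempt {attempt + 1}. Retrying...")
            time.sleep(1)
    self.messages.pop()  # all attempts failed: drop the unanswered message
    return "Sorry, I'm having trouble connecting."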
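The retry loop above uses a fixed 2 ** attempt delay, so many clients that hit a rate limit at the same moment will all retry at the same moment too. A common refinement, not used in the original code, is to cap the delay and add random jitter. The sketch below is a generic version under stated assumptions: retry_with_backoff, base_delay, max_delay, and the FakeRateLimit exception are hypothetical names for illustration, and the sleep function is injectable so the demo runs instantly.

```python
import random
import time


def retry_with_backoff(fn, retryable, max_retries=3, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call fn() until it succeeds or max_retries attempts have failed.

    Only exceptions listed in `retryable` trigger a retry; anything else
    propagates immediately. The last retryable error is re-raised.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff capped at max_delay, with full jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


class FakeRateLimit(Exception):
    """Stand-in for a rate-limit error in this demo."""


calls = {"n": 0}


def flaky():
    """Fail twice, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeRateLimit("slow down")
    return "ok"


result = retry_with_backoff(flaky, retryable=(FakeRateLimit,),
                            sleep=lambda seconds: None)
print(result, calls["n"])
```

With the openai package, retryable would be (RateLimitError, APITimeoutError), and fn must let those exceptions propagate rather than catch them.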

Why DeepSeek for Chatbots?

Cost effective - $2/1M tokens vs $15/1M for GPT-4o
Fast - DeepSeek V4 Flash is optimized for speed
128K context - handle long conversations
OpenAI compatible - use existing tools and libraries
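To make the cost comparison above concrete, the arithmetic can be sketched as a small function. The $2 and $15 rates are the figures quoted in this post, treated here as flat per-token rates. That is an assumption, since providers often price input and output tokens differently. The workload numbers and the estimate_cost helper are hypothetical.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate, output_rate=None):
    """Estimate cost in USD from token counts and per-1M-token rates."""
    if output_rate is None:
        output_rate = input_rate  # flat pricing: same rate both ways
    return (prompt_tokens * input_rate
            + completion_tokens * output_rate) / 1_000_000


# Hypothetical workload: 10,000 chats, each about 1,500 prompt tokens
# and 500 completion tokens
chats = 10_000
prompt, completion = 1_500 * chats, 500 * chats

deepseek = estimate_cost(prompt, completion, input_rate=2.0)
gpt4o = estimate_cost(prompt, completion, input_rate=15.0)
print(f"${deepseek:.2f} vs ${gpt4o:.2f}")
```

Passing a separate output_rate lets the same function handle providers that charge more for completion tokens than for prompt tokens.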

Try It Yourself

Get a free API key at Token China and start building. You get 100K free tokens to test with.

Built something cool with DeepSeek? Share it in the comments!

DEV Community