DEV Community

Nika Kudukhashvili


I built DeepWrap: a Python SDK and CLI for DeepSeek Chat

DeepSeek Chat was free in the browser.
But the moment I wanted to use it like a developer, it became a different story.

That annoyed me more than it probably should have.

If something is usable in the browser, why should it suddenly feel blocked, awkward, or artificially expensive the moment you want to call it from Python, from your terminal, or from your own local tools?

That mismatch was the whole reason I built DeepWrap.

What started as “let me inspect a few requests and see how hard this is” turned into a full project: a Python SDK, a terminal CLI, and a local HTTP API wrapper for DeepSeek Chat.

Why I built it

The real motivation was simple:

DeepSeek was free in the browser, but not free in the way developers actually want to use it.

And that felt absurd.

I wanted to use it in real workflows:

  • Python scripts
  • terminal sessions
  • reusable chat sessions
  • streaming responses
  • local tools
  • local HTTP integrations

I did not want to babysit browser tabs forever just to use a model in a developer-friendly way.

So the idea behind DeepWrap was basically:

take the browser experience and make it usable from code.

How I built it

At first, I genuinely thought this was going to be easy.

I opened DevTools, went to the Network tab, started looking at the XHR requests, and my first thought was:

“Okay, this is probably just a matter of replaying the right requests with the right headers.”

That optimism did not survive very long.

1. Inspect the browser requests

The first step was just observing the flow:

  • how a chat session gets created
  • what headers are sent
  • what the prompt payload looks like
  • how the response stream comes back

At a glance, the flow looked simple:

  1. create a session
  2. send a prompt
  3. read the stream

That was enough to sketch the basic client structure.
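
To make that three-step flow concrete, here is a minimal, hypothetical skeleton. `Transport`, `FakeTransport`, and `ChatSession` are names invented for this illustration; they are not DeepWrap's actual classes, and no real endpoint is called.

```python
from typing import Iterator, Protocol


class Transport(Protocol):
    """Anything that can perform the two calls observed in the browser."""

    def create_session(self) -> str: ...

    def send_prompt(self, session_id: str, prompt: str) -> Iterator[str]: ...


class FakeTransport:
    """Hypothetical stand-in for real HTTP calls so the sketch runs offline."""

    def create_session(self) -> str:
        return "session-1"

    def send_prompt(self, session_id: str, prompt: str) -> Iterator[str]:
        # A real transport would yield chunks from the response stream.
        for word in ("echo:", prompt):
            yield word + " "


class ChatSession:
    def __init__(self, transport: Transport) -> None:
        self.transport = transport
        # Step 1: create a session.
        self.session_id = transport.create_session()

    def respond(self, prompt: str) -> str:
        # Step 2: send a prompt. Step 3: read the stream to the end.
        chunks = self.transport.send_prompt(self.session_id, prompt)
        return "".join(chunks).strip()


session = ChatSession(FakeTransport())
print(session.respond("hello"))
```

Keeping the transport behind a small interface is one way to test the session logic without touching the network.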

2. Hit the first wall: PoW + WASM

Then came the first real problem: proof-of-work.

Some requests were not just normal authenticated calls. The browser had to solve a PoW challenge, and that logic was implemented with WASM.

So the project instantly got more interesting.

Instead of only replaying requests, I had to:

  • inspect how the challenge is fetched
  • understand the challenge payload
  • find where the browser solves it
  • reverse the WASM input/output behavior
  • replicate the solve step from Python

That was the point where this stopped being a “quick wrapper.”
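
If proof-of-work is unfamiliar, a generic hashcash-style sketch shows the shape of the "solve step". This is not DeepSeek's algorithm, which lives in the site's WASM and is not described in this post. The challenge string, the difficulty value, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def _digest_value(challenge: str, nonce: int) -> int:
    """Interpret SHA-256(challenge:nonce) as a big-endian integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Brute-force the smallest nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while _digest_value(challenge, nonce) >= target:
        nonce += 1
    return nonce


def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


# Hypothetical challenge; a real one would be fetched from the server.
nonce = solve_pow("example-challenge", difficulty_bits=12)
print(nonce, verify_pow("example-challenge", nonce, 12))
```

The asymmetry is the point: solving takes many hashes, verifying takes one.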

3. Add auth that feels usable

I also didn’t want the whole thing to rely only on pasting bearer tokens manually.

Yes, that works.
No, that doesn’t feel great.

So I added browser-based auth too.

That meant building a flow that could:

  • launch a browser
  • connect through DevTools / remote debugging
  • observe authenticated traffic
  • extract the bearer token
  • normalize it
  • reuse it for the SDK and CLI

That part was messy, but necessary if I wanted the project to feel real.
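
Of those steps, extracting and normalizing the token is the easiest to show in isolation. A minimal sketch, assuming the observed request headers arrive as a plain mapping; the function name and input shape are hypothetical, not DeepWrap's internals.

```python
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the bare token from an Authorization header, or None if absent."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() != "authorization":
            continue
        scheme, _, token = value.strip().partition(" ")
        token = token.strip()
        if scheme.lower() == "bearer" and token:
            return token
    return None


print(extract_bearer_token({"Authorization": "Bearer  abc123 "}))
```

Storing only the bare token means the SDK and CLI can add the `Bearer ` prefix themselves, whichever way the token was obtained.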

4. Parse the stream properly

Once auth and PoW worked, I still had to handle the actual chat stream.

And it was not just “text in, text out.”

The stream included:

  • thinking fragments
  • response fragments
  • partial updates
  • metadata events
  • close events

So I built a parser that could:

  • preserve multi-turn state
  • keep track of parent_message_id
  • support both streaming and non-streaming
  • optionally separate thinking from final output

That let the public API stay simple even if the internals were not.
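
Here is roughly what such a parser can look like. The wire format below (SSE-style `data:` lines carrying JSON with a `type` field) is an assumption made up for this sketch, as are the names `ParsedReply` and `parse_stream`; the real stream may be shaped differently.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ParsedReply:
    thinking: str = ""
    text: str = ""
    message_id: Optional[str] = None
    closed: bool = False


def parse_stream(lines: Iterable[str]) -> ParsedReply:
    """Fold SSE-style `data: {...}` lines into one reply object."""
    reply = ParsedReply()
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything unrecognized
        event = json.loads(line[len("data:"):])
        kind = event.get("type")
        if kind == "thinking":
            reply.thinking += event["content"]
        elif kind == "response":
            reply.text += event["content"]
        elif kind == "metadata":
            reply.message_id = event.get("message_id")
        elif kind == "close":
            reply.closed = True
            break
    return reply


# Hypothetical sample stream.
sample = [
    'data: {"type": "thinking", "content": "Let me see. "}',
    "",
    'data: {"type": "response", "content": "Hel"}',
    'data: {"type": "response", "content": "lo"}',
    'data: {"type": "metadata", "message_id": "m-2"}',
    'data: {"type": "close"}',
]
reply = parse_stream(sample)
print(reply.text, reply.message_id)
```

Keeping thinking and response text in separate fields is what makes "optionally separate thinking from final output" a one-line decision for the caller.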

5. Turn it into a real tool

Once the Python API felt stable, I pushed it further:

  • an interactive CLI
  • a local FastAPI server

So DeepWrap gradually became three things:

  1. a Python SDK
  2. a terminal chat interface
  3. a local HTTP API wrapper

That was the point where it stopped feeling like a hack and started feeling like a product.

What using it looks like

If you just want to try it, the flow is simple.

Install

pip install deepwrap

Create a client

from deepwrap import Client

client = Client()

If you already have a token configured, that’s enough.

Create a chat session

chat = client.chats.create_session(model="expert")

Get a normal (non-streaming) response

response = chat.respond(
    "Explain quantum computing in one short sentence.",
    thinking=True,
    search=True,
    stream=False,
)

print(response)

Stream the output

for chunk in chat.respond(
    "Write a short explanation of black holes.",
    thinking=True,
    search=True,
    stream=True,
):
    print(chunk, end="", flush=True)

print()

Keep context across turns

print(chat.respond("My name is Nika.", stream=False))
print(chat.respond("What is my name?", stream=False))

That works because the session keeps track of the conversation state internally.
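
One plausible way to keep that state, given that the parser tracks `parent_message_id`: remember the id of the last reply and attach it to the next request. This is a hypothetical sketch; `ThreadedSession`, the payload keys, and `fake_send` are assumptions, not DeepWrap's real internals.

```python
from typing import Callable, Dict, List, Optional, Tuple

# payload -> (reply_text, reply_message_id); a real sender would do HTTP.
Sender = Callable[[Dict[str, Optional[str]]], Tuple[str, str]]


class ThreadedSession:
    def __init__(self, send: Sender) -> None:
        self._send = send
        self.parent_message_id: Optional[str] = None  # None on the first turn

    def respond(self, prompt: str) -> str:
        payload = {"prompt": prompt, "parent_message_id": self.parent_message_id}
        text, message_id = self._send(payload)
        # The next turn will point at this reply.
        self.parent_message_id = message_id
        return text


seen: List[Optional[str]] = []


def fake_send(payload: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Hypothetical sender that records which parent id each turn carried."""
    seen.append(payload["parent_message_id"])
    return f"ack {len(seen)}", f"m-{len(seen)}"


demo = ThreadedSession(fake_send)
demo.respond("My name is Nika.")
demo.respond("What is my name?")
print(seen)
```

The second turn carries the id of the first reply, which is what lets the server place it in the same conversation.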

CLI usage

I also wanted it to feel good in the terminal.

You can start the interactive interface with:

deepwrap

Or use one-shot terminal calls:

deepwrap chat "Explain recursion in one sentence."

That was important to me because sometimes you don’t want a script; you just want a fast terminal workflow.

God Mode

I also added an experimental feature called God Mode.

This is not some magical hidden model. It is a session-level behavior override implemented through prompt injection on the first user turn.

In practice, that means:

  • it changes how the model behaves for that session
  • it is intentionally intrusive
  • it can diverge from normal behavior in unpredictable ways

I added it mostly as a developer/testing feature, not as a normal mode for everyday use.

So I treat it as exactly what it is:
an experimental override for controlled testing, not something meant for general use.
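
Mechanically, "injection on the first user turn" can be as small as the sketch below. It is a hypothetical illustration of the general pattern, not DeepWrap's implementation: the class name is invented, and the preamble is a harmless placeholder rather than the text the project actually uses.

```python
class FirstTurnOverride:
    """Prepend a preamble to the first user turn only."""

    def __init__(self, preamble: str) -> None:
        self.preamble = preamble
        self._first_turn_sent = False

    def wrap(self, prompt: str) -> str:
        if self._first_turn_sent:
            return prompt  # later turns pass through untouched
        self._first_turn_sent = True
        return f"{self.preamble}\n\n{prompt}"


# Placeholder preamble, chosen only to make the effect visible.
override = FirstTurnOverride("Answer only in haiku.")
first = override.wrap("Explain recursion.")
second = override.wrap("And iteration?")
print(first)
print(second)
```

Because the preamble rides along in the conversation history, it colors every later turn, which is why the behavior is session-level and hard to undo mid-session.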

Security notes

A few important notes:

  • Do not commit your bearer token
  • Prefer environment variables or local saved config over hardcoding secrets
  • Tokens saved by DeepWrap are stored locally in the user config directory
  • Browser auth should only be used in environments you trust
  • Experimental features like God Mode should be treated as development-only behavior modifiers
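
As a sketch of the "environment first, saved config second" advice: the variable name `DEEPWRAP_TOKEN`, the config path, and the JSON layout below are all assumptions for illustration, so check the project's docs for the real ones.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def load_token(env: Mapping[str, str], config_path: Path) -> Optional[str]:
    """Prefer the environment, then a local config file; never a hardcoded value."""
    token = env.get("DEEPWRAP_TOKEN", "").strip()  # hypothetical variable name
    if token:
        return token
    if config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        return data.get("token") or None  # hypothetical key
    return None


# Hypothetical location; the real config directory may differ per platform.
config = Path.home() / ".config" / "deepwrap" / "config.json"
print("token found" if load_token(os.environ, config) else "no token configured")
```

Note that the script reports only whether a token exists and never prints the token itself.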

DeepWrap is an unofficial wrapper, so use it responsibly.

GitHub

The project is open source here:

GitHub: https://github.com/Kuduxaaa/deepwrap

Feedback

This whole project started from a very simple frustration:

“Why is it free in the browser, but awkward the moment I want to use it like a developer?”

Then it turned into a much bigger project than I expected:
reverse-engineering requests, dealing with PoW WASM, browser auth, session handling, streaming, CLI UX, and local API design.

If you check it out, I’d genuinely love feedback on:

  • API design
  • CLI UX
  • auth flow
  • architecture
  • docs
  • feature ideas

If something feels awkward, overbuilt, underbuilt, or just weird, tell me.

That kind of feedback is the most useful.
