Nika Kudukhashvili

Posted on May 27

I built DeepWrap: a Python SDK and CLI for DeepSeek Chat

#ai #python #cli #showdev

DeepSeek Chat was free in the browser.
But the moment I wanted to use it like a developer, it became a different story.

That annoyed me more than it probably should have.

If something is usable in the browser, why should it suddenly feel blocked, awkward, or artificially expensive the moment you want to call it from Python, from your terminal, or from your own local tools?

That mismatch was the whole reason I built DeepWrap.

What started as “let me inspect a few requests and see how hard this is” turned into a full project: a Python SDK, a terminal CLI, and a local HTTP API wrapper for DeepSeek Chat.

Why I built it

The real motivation was simple:

DeepSeek was free in the browser, but not free in the way developers actually want to use it.

And that felt absurd.

I wanted to use it in real workflows:

Python scripts
terminal sessions
reusable chat sessions
streaming responses
local tools
local HTTP integrations

I did not want to babysit browser tabs forever just to use a model in a developer-friendly way.

So the idea behind DeepWrap was basically:

take the browser experience and make it usable from code.

How I built it

At first, I genuinely thought this was going to be easy.

I opened DevTools, went to the Network tab, started looking at the XHR requests, and my first thought was:

“Okay, this is probably just a matter of replaying the right requests with the right headers.”

That optimism did not survive very long.

1. Inspect the browser requests

The first step was just observing the flow:

how a chat session gets created
what headers are sent
what the prompt payload looks like
how the response stream comes back

At a glance, the flow looked simple:

create a session
send a prompt
read the stream

That was enough to sketch the basic client structure.

2. Hit the first wall: PoW + WASM

Then came the first real problem: proof-of-work.

Some requests were not just normal authenticated calls. Before sending them, the browser had to solve a proof-of-work (PoW) challenge, and the solver was implemented in WebAssembly (WASM).

So the project instantly got more interesting.

Instead of only replaying requests, I had to:

inspect how the challenge is fetched
understand the challenge payload
find where the browser solves it
reverse the WASM input/output behavior
replicate the solve step from Python

That was the point where this stopped being a “quick wrapper.”
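To make the solve step concrete, here is a toy hash-based proof-of-work loop. It is not DeepSeek's algorithm (that lives in the WASM module and is not reproduced here). The challenge fields `salt` and `difficulty`, and the function names `solve_challenge` and `verify`, are assumptions made up for illustration.

```python
import hashlib


def _digest(salt: str, nonce: int) -> str:
    # Hash the salt followed by the candidate nonce (toy scheme).
    return hashlib.sha256(f"{salt}{nonce}".encode()).hexdigest()


def verify(salt: str, difficulty: int, nonce: int) -> bool:
    """Cheap check: the digest must start with `difficulty` zero hex digits."""
    return _digest(salt, nonce).startswith("0" * difficulty)


def solve_challenge(salt: str, difficulty: int, max_nonce: int = 10_000_000) -> int:
    """Brute-force the smallest nonce that passes `verify`."""
    for nonce in range(max_nonce):
        if verify(salt, difficulty, nonce):
            return nonce
    raise RuntimeError("no solution found within max_nonce attempts")


if __name__ == "__main__":
    nonce = solve_challenge("example-salt", difficulty=3)
    print(nonce, verify("example-salt", 3, nonce))
```

The general shape is the same for any PoW scheme: solving is expensive, verifying is cheap. Only the hash function and the challenge format change.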

3. Add auth that feels usable

I also didn’t want the whole thing to rely only on pasting bearer tokens manually.

Yes, that works.
No, that doesn’t feel great.

So I added browser-based auth too.

That meant building a flow that could:

launch a browser
connect through DevTools / remote debugging
observe authenticated traffic
extract the bearer token
normalize it
reuse it for the SDK and CLI

That part was messy, but necessary if I wanted the project to feel real.
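The "normalize it" step is the easiest one to show in isolation. Below is a minimal sketch, assuming the captured value may arrive as a full `Authorization` header value, wrapped in quotes, or padded with whitespace. The helper names `normalize_token` and `auth_header` are hypothetical and not taken from DeepWrap's source.

```python
from typing import Dict


def normalize_token(raw: str) -> str:
    """Strip whitespace, surrounding quotes, and an optional 'Bearer ' prefix."""
    token = raw.strip().strip('"').strip("'").strip()
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    # A usable token is non-empty and contains no internal whitespace.
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("captured value does not look like a bearer token")
    return token


def auth_header(raw: str) -> Dict[str, str]:
    """Build the header dict that later requests would reuse."""
    return {"Authorization": f"Bearer {normalize_token(raw)}"}
```

Normalizing once at capture time means the SDK and the CLI can both treat the stored value as a bare token.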

4. Parse the stream properly

Once auth and PoW worked, I still had to handle the actual chat stream.

And it was not just “text in, text out.”

The stream included:

thinking fragments
response fragments
partial updates
metadata events
close events

So I built a parser that could:

preserve multi-turn state
keep track of parent_message_id
support both streaming and non-streaming responses
optionally separate thinking from final output

That let the public API stay simple even if the internals were not.
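Here is a rough sketch of what such a parser can look like. The wire format is an assumption: I am pretending each event is a `data: {json}` line with a `type` of `thinking`, `response`, `meta`, or `close`. The real DeepSeek stream is shaped differently, and the names `StreamResult` and `parse_stream` are made up for this example.

```python
import json
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class StreamResult:
    thinking: List[str] = field(default_factory=list)
    response: List[str] = field(default_factory=list)
    message_id: Optional[int] = None

    @property
    def text(self) -> str:
        return "".join(self.response)

    @property
    def thinking_text(self) -> str:
        return "".join(self.thinking)


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from 'data:' lines, skipping everything else."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank keep-alive lines, comments, other fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def parse_stream(lines: Iterable[str]) -> StreamResult:
    """Route each fragment to the right bucket and stop at the close event."""
    result = StreamResult()
    for event in iter_events(lines):
        kind = event.get("type")
        if kind == "thinking":
            result.thinking.append(event.get("content", ""))
        elif kind == "response":
            result.response.append(event.get("content", ""))
        elif kind == "meta":
            result.message_id = event.get("message_id", result.message_id)
        elif kind == "close":
            break
    return result
```

Keeping thinking and response fragments in separate lists is what makes "optionally separate thinking from final output" a one-line decision at the API surface.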

5. Turn it into a real tool

Once the Python API felt stable, I pushed it further:

an interactive CLI
a local FastAPI server

So DeepWrap gradually became three things:

a Python SDK
a terminal chat interface
a local HTTP API wrapper

That was the point where it stopped feeling like a hack and started feeling like a product.

What using it looks like

If you just want to try it, the flow is simple.

Install

pip install deepwrap

Create a client

from deepwrap import Client

client = Client()

If you already have a token configured, that’s enough.

Create a chat session

chat = client.chats.create_session(model="expert")

Get a non-streaming response

response = chat.respond(
    "Explain quantum computing in one short sentence.",
    thinking=True,
    search=True,
    stream=False,
)

print(response)

Stream the output

for chunk in chat.respond(
    "Write a short explanation of black holes.",
    thinking=True,
    search=True,
    stream=True,
):
    print(chunk, end="", flush=True)

print()

Keep context across turns

print(chat.respond("My name is Nika.", stream=False))
print(chat.respond("What is my name?", stream=False))

That works because the session keeps track of the conversation state internally, including the parent_message_id that links each new prompt to the previous message.
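A stripped-down illustration of that idea, with the network call replaced by a stand-in. The `Session` class, the `Transport` signature, and `echo_transport` are hypothetical and only show how a message id can be threaded from one turn to the next. They are not DeepWrap's internals.

```python
from typing import Callable, Optional, Tuple

# A transport takes (prompt, parent_message_id) and returns (reply, new_message_id).
Transport = Callable[[str, Optional[int]], Tuple[str, int]]


class Session:
    """Feeds each reply's message id into the next request."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self.parent_message_id: Optional[int] = None

    def respond(self, prompt: str) -> str:
        reply, message_id = self._transport(prompt, self.parent_message_id)
        self.parent_message_id = message_id  # the next turn replies to this message
        return reply


def echo_transport(prompt: str, parent_message_id: Optional[int]) -> Tuple[str, int]:
    """Stand-in for the real request: reports which parent id it was given."""
    new_id = 1 if parent_message_id is None else parent_message_id + 1
    return f"parent={parent_message_id} prompt={prompt}", new_id
```

The caller never sees the id. It just calls `respond` twice on the same session object.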

CLI usage

I also wanted it to feel good in the terminal.

You can start the interactive interface with:

deepwrap

Or use one-shot terminal calls:

deepwrap chat "Explain recursion in one sentence."

That was important to me because sometimes you don’t want a script. You just want a fast terminal workflow.

God Mode

I also added an experimental feature called God Mode.

This is not some magical hidden model. It is a session-level behavior override implemented through prompt injection on the first user turn.

In practice, that means:

it changes how the model behaves for that session
it is intentionally intrusive
it can diverge from normal behavior in unpredictable ways

I added it mostly as a developer/testing feature, not as a normal mode for everyday use.

So I treat it as exactly what it is:
an experimental override for controlled testing, not something meant for general use.
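Mechanically, "prompt injection on the first user turn" can be as small as this. It is a sketch under assumptions: the class name `FirstTurnOverride` is hypothetical, and the preamble text is a placeholder, not the text DeepWrap actually sends.

```python
class FirstTurnOverride:
    """Prepends an override preamble to the first user message only."""

    def __init__(self, preamble: str) -> None:
        self._preamble = preamble
        self._used = False

    def wrap(self, prompt: str) -> str:
        if self._used:
            return prompt  # later turns pass through untouched
        self._used = True
        return f"{self._preamble}\n\n{prompt}"


if __name__ == "__main__":
    override = FirstTurnOverride("<placeholder override instructions>")
    print(override.wrap("First question"))
    print(override.wrap("Second question"))
```

Because the preamble rides along with the first message, it lives in the conversation history for the rest of the session, which is why the behavior change is session-level and hard to undo without starting a new chat.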

Security notes

A few important notes:

Do not commit your bearer token
Prefer environment variables or local saved config over hardcoding secrets
Tokens saved by DeepWrap are stored locally in the user config directory
Browser auth should only be used in environments you trust
Experimental features like God Mode should be treated as development-only behavior modifiers
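For the environment-variable route, a minimal sketch looks like this. The variable name `DEEPWRAP_TOKEN` and the helper `load_token` are assumptions for illustration. Check the project README for the name DeepWrap actually reads, if any.

```python
import os


def load_token(env_var: str = "DEEPWRAP_TOKEN") -> str:
    """Read the bearer token from the environment instead of source code."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set {env_var} before running this script")
    return token
```

Failing loudly when the variable is missing is deliberate: it is better than silently sending unauthenticated requests, and it keeps the secret out of anything you might commit.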

DeepWrap is an unofficial wrapper, so use it responsibly.

GitHub

The project is open source here:

GitHub: https://github.com/Kuduxaaa/deepwrap

Feedback

This whole project started from a very simple frustration:

“Why is it free in the browser, but awkward the moment I want to use it like a developer?”

Then it turned into a much bigger project than I expected:
reverse-engineering requests, replicating a WASM-based PoW solver, browser auth, session handling, streaming, CLI UX, and local API design.

If you check it out, I’d genuinely love feedback on:

API design
CLI UX
auth flow
architecture
docs
feature ideas

If something feels awkward, overbuilt, underbuilt, or just weird, tell me.

That kind of feedback is the most useful.

DEV Community