DeepSeek Chat was free in the browser.
But the moment I wanted to use it like a developer, it became a different story.
That annoyed me more than it probably should have.
If something is usable in the browser, why should it suddenly feel blocked, awkward, or artificially expensive the moment you want to call it from Python, from your terminal, or from your own local tools?
That mismatch was the whole reason I built DeepWrap.
What started as “let me inspect a few requests and see how hard this is” turned into a full project: a Python SDK, a terminal CLI, and a local HTTP API wrapper for DeepSeek Chat.
Why I built it
The real motivation was simple:
DeepSeek was free in the browser, but not free in the way developers actually want to use it.
And that felt absurd.
I wanted to use it in real workflows:
- Python scripts
- terminal sessions
- reusable chat sessions
- streaming responses
- local tools
- local HTTP integrations
I did not want to babysit browser tabs forever just to use a model in a developer-friendly way.
So the idea behind DeepWrap was basically:
take the browser experience and make it usable from code.
How I built it
At first, I genuinely thought this was going to be easy.
I opened DevTools, went to the Network tab, started looking at the XHR requests, and my first thought was:
“Okay, this is probably just a matter of replaying the right requests with the right headers.”
That optimism did not survive very long.
1. Inspect the browser requests
The first step was just observing the flow:
- how a chat session gets created
- what headers are sent
- what the prompt payload looks like
- how the response stream comes back
At a glance, the flow looked simple:
- create a session
- send a prompt
- read the stream
That was enough to sketch the basic client structure.
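As a rough illustration of that three-step shape (not DeepWrap's actual code), the structure can be sketched with an injected transport so it runs offline. The paths `/session/create` and `/chat/completion`, the payload fields, and the names `SketchClient` and `fake_transport` are all hypothetical placeholders, not the real endpoints.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

# Hypothetical transport signature: (path, payload) -> iterator of text chunks.
# The real endpoints and payload fields are assumptions made for this sketch.
Transport = Callable[[str, dict], Iterator[str]]


@dataclass
class SketchClient:
    transport: Transport
    token: str

    def create_session(self) -> str:
        # Step 1: create a session and return its id.
        lines = self.transport("/session/create", {"token": self.token})
        return next(iter(lines))

    def send_prompt(self, session_id: str, prompt: str) -> Iterator[str]:
        # Steps 2 and 3: send a prompt, then yield the stream chunk by chunk.
        payload = {"token": self.token, "session_id": session_id, "prompt": prompt}
        yield from self.transport("/chat/completion", payload)


def fake_transport(path: str, payload: dict) -> Iterator[str]:
    # Stand-in for real HTTP calls so the sketch runs without a network.
    if path == "/session/create":
        return iter(["session-1"])
    return iter(["Hello", ", ", "world"])


client = SketchClient(transport=fake_transport, token="example-token")
session_id = client.create_session()
reply = "".join(client.send_prompt(session_id, "Say hello"))
print(session_id, reply)
```

Keeping the transport injectable means the same client shape can be exercised in tests before any real request replay works.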
2. Hit the first wall: PoW + WASM
Then came the first real problem: proof-of-work.
Some requests were not just normal authenticated calls. The browser had to solve a PoW challenge, and that logic was implemented with WASM.
So the project instantly got more interesting.
Instead of only replaying requests, I had to:
- inspect how the challenge is fetched
- understand the challenge payload
- find where the browser solves it
- reverse-engineer the WASM module's inputs and outputs
- replicate the solve step from Python
That was the point where this stopped being a “quick wrapper.”
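For readers who have not met proof-of-work before, here is a generic hash-based example of the idea. It is only an illustration: the algorithm, the `challenge:nonce` encoding, and the difficulty rule below are assumptions for the sketch, and they are not the scheme implemented by the site's WASM module.

```python
import hashlib
from itertools import count


def _digest_value(challenge: str, nonce: int) -> int:
    # Hash "challenge:nonce" and read the digest as one big integer.
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    # Valid when the digest has at least `difficulty_bits` leading zero bits.
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


def solve_pow(challenge: str, difficulty_bits: int) -> int:
    # Brute force: try nonces in order until one verifies.
    for nonce in count():
        if verify_pow(challenge, nonce, difficulty_bits):
            return nonce


nonce = solve_pow("example-challenge", difficulty_bits=12)
print(nonce, verify_pow("example-challenge", nonce, 12))
```

The asymmetry is the point: solving takes thousands of hash attempts on average at 12 bits, while checking an answer takes one.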
3. Add auth that feels usable
I also didn’t want the whole thing to rely only on pasting bearer tokens manually.
Yes, that works.
No, that doesn’t feel great.
So I added browser-based auth too.
That meant building a flow that could:
- launch a browser
- connect through DevTools / remote debugging
- observe authenticated traffic
- extract the bearer token
- normalize it
- reuse it for the SDK and CLI
That part was messy, but necessary if I wanted the project to feel real.
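The "normalize it" step is the easiest one to show in isolation. The sketch below assumes normalization means trimming whitespace and quotes and dropping a leading `Bearer ` scheme; the post does not spell that out, and `normalize_token` and `auth_header` are hypothetical names rather than DeepWrap functions.

```python
def normalize_token(raw: str) -> str:
    """Reduce a captured Authorization header value to the bare token."""
    token = raw.strip().strip('"').strip("'").strip()
    # Drop a leading "Bearer " scheme, case-insensitively.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    if not token:
        raise ValueError("empty bearer token")
    return token


def auth_header(token: str) -> dict:
    # Rebuild a clean header no matter how the token was pasted or captured.
    return {"Authorization": f"Bearer {normalize_token(token)}"}


print(auth_header('  "bearer abc123"  '))
```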
4. Parse the stream properly
Once auth and PoW worked, I still had to handle the actual chat stream.
And it was not just “text in, text out.”
The stream included:
- thinking fragments
- response fragments
- partial updates
- metadata events
- close events
So I built a parser that could:
- preserve multi-turn state
- keep track of parent_message_id
- support both streaming and non-streaming
- optionally separate thinking from final output
That let the public API stay simple even if the internals were not.
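To make the parsing problem concrete, here is a minimal sketch that separates thinking from final output and records a message id. The event format is an assumption: it pretends each line looks like `data: {...}` with a `type` of `thinking`, `response`, `metadata`, or `close`. The real stream's field names may differ, and `ParsedReply` and `parse_stream` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ParsedReply:
    thinking: str = ""
    text: str = ""
    message_id: Optional[int] = None
    closed: bool = False


def parse_stream(lines: Iterable[str]) -> ParsedReply:
    reply = ParsedReply()
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data line
        event = json.loads(line[len("data:"):])
        kind = event.get("type")
        if kind == "thinking":
            reply.thinking += event["content"]
        elif kind == "response":
            reply.text += event["content"]
        elif kind == "metadata":
            # Remember the id so the next turn can use it as its parent.
            reply.message_id = event.get("message_id", reply.message_id)
        elif kind == "close":
            reply.closed = True
            break
    return reply


sample = [
    'data: {"type": "metadata", "message_id": 2}',
    'data: {"type": "thinking", "content": "User wants a greeting. "}',
    'data: {"type": "response", "content": "Hel"}',
    "",
    'data: {"type": "response", "content": "lo"}',
    'data: {"type": "close"}',
]
reply = parse_stream(sample)
print(reply.text, reply.message_id)
```

Accumulating thinking and response text in separate fields is what lets a caller choose later whether to show the reasoning or only the final answer.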
5. Turn it into a real tool
Once the Python API felt stable, I pushed it further:
- an interactive CLI
- a local FastAPI server
So DeepWrap gradually became three things:
- a Python SDK
- a terminal chat interface
- a local HTTP API wrapper
That was the point where it stopped feeling like a hack and started feeling like a product.
What using it looks like
If you just want to try it, the flow is simple.
Install
pip install deepwrap
Create a client
from deepwrap import Client
client = Client()
If you already have a token configured, that’s enough.
Create a chat session
chat = client.chats.create_session(model="expert")
Get a non-streaming response
response = chat.respond(
    "Explain quantum computing in one short sentence.",
    thinking=True,
    search=True,
    stream=False,
)
print(response)
Stream the output
for chunk in chat.respond(
    "Write a short explanation of black holes.",
    thinking=True,
    search=True,
    stream=True,
):
    print(chunk, end="", flush=True)
print()
Keep context across turns
print(chat.respond("My name is Nika.", stream=False))
print(chat.respond("What is my name?", stream=False))
That works because the session keeps track of the conversation state internally.
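One plausible way to hold that state is to remember the id of the last reply and send it as the parent of the next turn. The sketch below assumes exactly that and uses a fake backend; `SketchSession`, `fake_backend`, and the `(text, message_id)` return shape are hypothetical, not DeepWrap internals.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical backend: (prompt, parent_message_id) -> (reply_text, reply_message_id)
Backend = Callable[[str, Optional[int]], Tuple[str, int]]


class SketchSession:
    def __init__(self, backend: Backend) -> None:
        self._backend = backend
        self.parent_message_id: Optional[int] = None

    def respond(self, prompt: str) -> str:
        text, message_id = self._backend(prompt, self.parent_message_id)
        # Thread the next turn onto the reply that was just received.
        self.parent_message_id = message_id
        return text


seen_parents: List[Optional[int]] = []


def fake_backend(prompt: str, parent_id: Optional[int]) -> Tuple[str, int]:
    # Record which parent id each turn was sent with, then invent a new id.
    seen_parents.append(parent_id)
    new_id = 1 if parent_id is None else parent_id + 1
    return f"echo: {prompt}", new_id


session = SketchSession(fake_backend)
session.respond("My name is Nika.")
session.respond("What is my name?")
print(seen_parents)
```

The first turn goes out with no parent, and the second is attached to the first reply, which is what makes the conversation a chain rather than two unrelated prompts.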
CLI usage
I also wanted it to feel good in the terminal.
You can start the interactive interface with:
deepwrap
Or use one-shot terminal calls:
deepwrap chat "Explain recursion in one sentence."
That was important to me because sometimes you don’t want a script. You just want a fast terminal workflow.
God Mode
I also added an experimental feature called God Mode.
This is not some magical hidden model. It is a session-level behavior override implemented through prompt injection on the first user turn.
In practice, that means:
- it changes how the model behaves for that session
- it is intentionally intrusive
- it can diverge from normal behavior in unpredictable ways
I added it mostly as a developer/testing feature, not as a normal mode for everyday use.
So I treat it as exactly what it is:
an experimental override for controlled testing, not something meant for general use.
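Mechanically, "on the first user turn" can be as simple as a wrapper that prepends text once and then gets out of the way. This is a sketch of that mechanism only, with a harmless preamble; `FirstTurnOverride` is a hypothetical name and the actual override text and wiring in DeepWrap are not shown here.

```python
class FirstTurnOverride:
    """Prepend a preamble to the first user message of a session only."""

    def __init__(self, preamble: str) -> None:
        self._preamble = preamble
        self._first_turn = True

    def wrap(self, prompt: str) -> str:
        if self._first_turn:
            # Only the opening turn carries the preamble.
            self._first_turn = False
            return f"{self._preamble}\n\n{prompt}"
        return prompt


override = FirstTurnOverride("Answer in one sentence.")
first = override.wrap("What is recursion?")
second = override.wrap("And iteration?")
print(repr(first), repr(second))
```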
Security notes
A few important notes:
- Do not commit your bearer token
- Prefer environment variables or local saved config over hardcoding secrets
- Tokens saved by DeepWrap are stored locally in the user config directory
- Browser auth should only be used in environments you trust
- Experimental features like God Mode should be treated as development-only behavior modifiers
DeepWrap is an unofficial wrapper, so use it responsibly.
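The "environment variables or local saved config" advice can be sketched as a small lookup that never needs a token in source code. The variable name `DEEPWRAP_TOKEN`, the JSON config layout with a `token` key, and the function `load_token` are assumptions for illustration; check the project's docs for the names it actually uses.

```python
import json
import os
from pathlib import Path
from typing import Optional


def load_token(config_path: Path, env_var: str = "DEEPWRAP_TOKEN") -> Optional[str]:
    # The environment variable wins, so shells and CI can override saved config.
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    # Fall back to a local JSON config file such as {"token": "..."}.
    if config_path.is_file():
        saved = json.loads(config_path.read_text(encoding="utf-8"))
        return saved.get("token") or None
    return None


token = load_token(Path.home() / ".config" / "deepwrap-sketch" / "config.json")
print("token found" if token else "no token configured")
```

Either source keeps the secret out of the repository, which is the part that matters.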
GitHub
The project is open source here:
GitHub: https://github.com/Kuduxaaa/deepwrap
Feedback
This whole project started from a very simple frustration:
“Why is it free in the browser, but awkward the moment I want to use it like a developer?”
Then it turned into a much bigger project than I expected:
reverse-engineering requests, dealing with PoW WASM, browser auth, session handling, streaming, CLI UX, and local API design.
If you check it out, I’d genuinely love feedback on:
- API design
- CLI UX
- auth flow
- architecture
- docs
- feature ideas
If something feels awkward, overbuilt, underbuilt, or just weird, tell me.
That kind of feedback is the most useful.
