Plug DeepSeek V4-Pro into Cursor with the default OpenAI-compatible settings and the first tool call can fail with HTTP 400. V4-Pro returns a reasoning_content block, Cursor drops that field on follow-up tool-call requests, and DeepSeek rejects the request because the reasoning chain is missing. The open-source yxlao/deepseek-cursor-proxy fixes this by caching reasoning_content and re-injecting it before forwarding requests to DeepSeek.
TL;DR
- Cursor + DeepSeek V4-Pro can return 400 errors on tool calls because Cursor strips
reasoning_content. -
deepseek-cursor-proxysits between Cursor and DeepSeek, cachesreasoning_content, and restores it on follow-up requests. - Install it with
uvorpip, run the proxy, then configure Cursor with the proxy’s HTTPS ngrok URL and your DeepSeek API key. - V4-Pro inside Cursor uses DeepSeek API pricing. See DeepSeek V4-Pro 75% Price Cut Is Now Permanent for the pricing context.
Why Cursor needs a proxy for V4-Pro
DeepSeek V4-Pro responses include:
-
content: the normal assistant response -
reasoning_content: the model’s reasoning block
For plain chat, dropping reasoning_content may not matter. For tool calls, it does.
DeepSeek’s API contract for thinking models requires follow-up requests to include the previous reasoning_content alongside tool results. Cursor uses an OpenAI-style chat schema, and reasoning_content is not part of that schema, so Cursor drops it.
The next request reaches DeepSeek without the required reasoning chain, and DeepSeek returns HTTP 400.
This is not exactly a Cursor bug. It is an API-contract mismatch between an OpenAI-compatible client and a DeepSeek-specific extension. Until Cursor supports V4-Pro natively, the practical fix is a proxy.
What the proxy does
deepseek-cursor-proxy does three things:
- Listens locally for Cursor chat requests.
- Caches
reasoning_contentfrom DeepSeek responses. - Re-injects the cached
reasoning_contentinto follow-up tool-call requests before forwarding them to DeepSeek.
By default, it listens on port 9000.
It also exposes the local server through ngrok because Cursor’s custom model settings require an HTTPS endpoint and usually reject localhost.
The cache is stored here:
~/.deepseek-cursor-proxy/reasoning_content.sqlite3
The proxy keys cached reasoning blocks by a SHA-256 hash of the canonical conversation prefix, so parallel conversations do not collide.
Prerequisites
You need:
- Cursor 2.0 or newer
- A DeepSeek API key from platform.deepseek.com
- Python 3.11 or newer
- An ngrok account and authtoken
If you do not have uv, install it from the official uv installation docs.
For ngrok setup, follow the ngrok quickstart.
Step 1: Install the proxy
Using uv:
uv tool install deepseek-cursor-proxy
Or with pip:
git clone https://github.com/yxlao/deepseek-cursor-proxy.git
cd deepseek-cursor-proxy
pip install -e .
Verify the command is available:
deepseek-cursor-proxy --help
Step 2: Configure ngrok
Cursor needs a public HTTPS URL, so configure your ngrok authtoken:
ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN
On the free tier, ngrok gives you a random domain each time the tunnel starts.
If you want a stable URL, reserve a domain in the ngrok dashboard and pass it to the proxy:
deepseek-cursor-proxy --ngrok-url https://your-reserved.ngrok-free.app
Step 3: Start the proxy
Run:
deepseek-cursor-proxy
On first run, the proxy creates:
~/.deepseek-cursor-proxy/config.yaml
Example output:
Starting deepseek-cursor-proxy
Tunnel: https://random-name.ngrok-free.app
Local: http://127.0.0.1:9000
Cache: /Users/you/.deepseek-cursor-proxy/reasoning_content.sqlite3
Useful flags:
deepseek-cursor-proxy --port 9001
Change the local port.
deepseek-cursor-proxy --verbose
Print request and response bodies for debugging.
deepseek-cursor-proxy --no-ngrok
Run locally without an ngrok tunnel.
deepseek-cursor-proxy --no-display-reasoning
Hide collapsible reasoning blocks in Cursor while still passing reasoning through to DeepSeek.
Keep the proxy running while using Cursor.
Step 4: Configure Cursor
In Cursor:
- Open Settings
- Go to Models
- Add a custom model
Use these values:
| Field | Value |
|---|---|
| Model name | deepseek-v4-pro |
| Base URL | https://random-name.ngrok-free.app/v1 |
| API key | Your DeepSeek API key |
The model name is forwarded directly to DeepSeek. If you want the cheaper variant, use:
deepseek-v4-flash
Make sure the base URL ends with:
/v1
Cursor will run a model verification request. If it fails, check:
- The proxy is still running
- The ngrok URL is correct
- The URL ends with
/v1 - The DeepSeek API key is valid
Step 5: Test a tool call
Pick the custom model in Cursor’s chat panel.
Use a prompt that forces tool usage:
Open the README in this repo, list every code block, and tell me which ones are missing language hints.
Expected flow:
- Cursor sends the user prompt to the proxy.
- The proxy forwards it to DeepSeek.
- DeepSeek returns
content,reasoning_content, and atool_callsrequest. - The proxy caches
reasoning_content. - Cursor runs the tool and sends the tool result back.
- Cursor omits
reasoning_content. - The proxy restores the cached
reasoning_content. - DeepSeek accepts the request and continues.
To confirm this, run the proxy with:
deepseek-cursor-proxy --verbose
You should see the reasoning injection in the logs.
Cost model
V4-Pro inside Cursor uses DeepSeek’s API pricing, not Cursor’s bundled-credit pricing.
As of May 2026:
| Token type | Rate per 1M tokens |
|---|---|
| Input cache miss | $0.435 |
| Input cache hit | $0.003625 |
| Output | $0.87 |
Example heavy Cursor day:
- 50 chat turns
- 20 tool-call chains
- Around 8,000 prompt tokens per turn
- Around 1,500 output tokens per turn
Worst-case input cost:
50 × 8,000 × $0.435 / 1,000,000 = $0.174
Output cost:
50 × 1,500 × $0.87 / 1,000,000 = $0.065
With prompt-cache hits, repeated system and context prefixes can reduce the input cost further.
For the full pricing breakdown, see DeepSeek V4-Pro 75% Price Cut Is Now Permanent.
For more DeepSeek context, see:
What changes inside Cursor
1. Reasoning blocks become visible
By default, the proxy renders DeepSeek reasoning as a collapsible Markdown block using <details>.
If you do not want to see it:
deepseek-cursor-proxy --no-display-reasoning
2. First tool-call latency is higher
V4-Pro is a thinking model, so it reasons before calling tools. Expect a few seconds before the first tool fires.
3. Complex refactors can improve
The main benefit is multi-step reasoning across files. For renames, signature changes, and config-driven refactors, V4-Pro can catch dependencies that simpler completion models may miss.
For older Cursor + DeepSeek workflows, see:
Testing your DeepSeek setup with Apidog
The Cursor setup only validates requests coming from Cursor. If you use V4-Pro in a CI bot, backend agent, IDE plugin, or internal tool, test the DeepSeek API path directly.
Use Apidog as a repeatable API test harness:
- Create an Apidog environment.
- Set the base URL to:
https://api.deepseek.com/v1
- Add your DeepSeek API key.
- Import the OpenAI Chat Completion schema.
- Create test cases for your prompts and tool-call payloads.
You can use this to:
- Record golden V4-Pro responses and replay them after prompt changes
- Validate
tool_callspayloads with JSON Schema assertions - Compare V4-Pro and GPT-5.5 on the same input batch
- Catch API contract drift before it reaches production
Download Apidog here: Download Apidog.
The same workflow is covered in How to use the DeepSeek V4 API.
Common pitfalls
400 errors after the first tool call
This usually means Cursor is not going through the proxy.
Check:
- The proxy process is running
- Cursor’s base URL points to the ngrok URL
- The base URL ends with
/v1 - The proxy logs show incoming requests
ngrok URL keeps changing
Free ngrok tunnels rotate on restart.
Fix it by reserving a domain in the ngrok dashboard, then starting the proxy with:
deepseek-cursor-proxy --ngrok-url https://your-reserved.ngrok-free.app
Duplicated reasoning content
This can happen if two proxy instances use the same SQLite cache.
Stop both, delete the cache, and start one proxy:
rm ~/.deepseek-cursor-proxy/reasoning_content.sqlite3
deepseek-cursor-proxy
Low prompt-cache hit ratio
DeepSeek prompt caching requires byte-identical prefixes.
Cursor may inject timestamps or session IDs into system prompts, which changes the prefix and kills cache hits.
Possible fixes:
- Remove variable content from the system prompt
- Move changing context into user messages
- Accept the extra input cost for Cursor sessions
Cursor says “model not found”
The model name must match a real DeepSeek model identifier.
Examples:
deepseek-v4-pro
deepseek-v4-flash
deepseek-v3-2-pro
deepseek-r1-1
The proxy does not translate model names.
Alternatives
If you do not want to run the proxy, you have two practical alternatives.
Use V4-Flash directly
deepseek-v4-flash is not a thinking model and does not return reasoning_content, so Cursor can talk to it without the proxy.
You lose the V4-Pro reasoning behavior, but setup is simpler.
Use another IDE assistant
Tools like Cline, Continue, or other AI IDE plugins may support thinking-model fields directly.
If you are not committed to Cursor, switching tools may be easier than running a proxy.
See Best open source coding assistants in 2026: free Cursor alternatives.
Other Cursor model integrations:
FAQ
Why does Cursor not support DeepSeek V4-Pro natively?
Cursor’s chat client follows the OpenAI Chat Completions schema. reasoning_content is a DeepSeek-specific extension, so Cursor would need provider-specific handling to preserve it across tool calls.
Does the proxy work with DeepSeek R1 or V3.2?
Yes. It works with DeepSeek thinking models that return reasoning_content and require it on tool-call follow-ups.
Set Cursor’s model name to the actual DeepSeek model identifier.
Is the proxy safe to leave running?
Yes, but the SQLite cache contains raw reasoning content from your sessions.
If you share the machine or run a multi-user setup, restrict permissions on:
~/.deepseek-cursor-proxy/
Can I use the proxy without ngrok?
Yes:
deepseek-cursor-proxy --no-ngrok
That exposes only:
http://127.0.0.1:9000
Most Cursor builds require HTTPS for custom models, so ngrok or an equivalent tunnel is usually required.
Alternatives include:
- Cloudflare Tunnel
- Tailscale Funnel
- A reverse proxy with HTTPS
Does this work with Cursor Composer?
Yes. Composer uses the same model-routing pipeline as Cursor chat, so the same reasoning_content issue applies and the proxy fixes it the same way.
What is the proxy latency overhead?
The proxy adds:
- One local network hop
- One SQLite lookup
- Small JSON modifications
The overhead is typically negligible compared with model latency. ngrok may add extra network latency depending on the edge location.
How does the proxy decide what to cache?
It hashes the conversation prefix and stores the matching reasoning_content in SQLite.
On the next request, it hashes the new prefix and looks up the cached reasoning block. Partial-prefix matches do not count, which prevents similar conversations from polluting each other.
Next steps
DeepSeek V4-Pro is usable in Cursor today if you handle the reasoning_content contract correctly. The proxy does that with a small local service and an HTTPS tunnel.
Recommended workflow:
- Install and run
deepseek-cursor-proxy. - Add
deepseek-v4-proas a Cursor custom model. - Test with a prompt that forces tool usage.
- Compare it against your current Cursor default on real pull requests.
- Use Apidog to build regression tests against
api.deepseek.com.
The thinking-token tax is paid. The price tag is not.

Top comments (0)