Hassann

Posted on May 25 • Originally published at apidog.com

How to Use DeepSeek V4-Pro with Cursor: The Reasoning Proxy Setup Guide (2026)

Plug DeepSeek V4-Pro into Cursor with the default OpenAI-compatible settings and the first tool call can fail with HTTP 400. V4-Pro returns a reasoning_content block, Cursor drops that field on follow-up tool-call requests, and DeepSeek rejects the request because the reasoning chain is missing. The open-source yxlao/deepseek-cursor-proxy fixes this by caching reasoning_content and re-injecting it before forwarding requests to DeepSeek.

Try Apidog today

TL;DR

Cursor + DeepSeek V4-Pro can return 400 errors on tool calls because Cursor strips reasoning_content.
deepseek-cursor-proxy sits between Cursor and DeepSeek, caches reasoning_content, and restores it on follow-up requests.
Install it with uv or pip, run the proxy, then configure Cursor with the proxy’s HTTPS ngrok URL and your DeepSeek API key.
V4-Pro inside Cursor uses DeepSeek API pricing. See DeepSeek V4-Pro 75% Price Cut Is Now Permanent for the pricing context.

Why Cursor needs a proxy for V4-Pro

DeepSeek V4-Pro responses include:

content: the normal assistant response
reasoning_content: the model’s reasoning block

For plain chat, dropping reasoning_content may not matter. For tool calls, it does.

DeepSeek’s API contract for thinking models requires follow-up requests to include the previous reasoning_content alongside tool results. Cursor uses an OpenAI-style chat schema, and reasoning_content is not part of that schema, so Cursor drops it.

The next request reaches DeepSeek without the required reasoning chain, and DeepSeek returns HTTP 400.

This is not exactly a Cursor bug. It is an API-contract mismatch between an OpenAI-compatible client and a DeepSeek-specific extension. Until Cursor supports V4-Pro natively, the practical fix is a proxy.

What the proxy does

deepseek-cursor-proxy does three things:

Listens locally for Cursor chat requests.
Caches reasoning_content from DeepSeek responses.
Re-injects the cached reasoning_content into follow-up tool-call requests before forwarding them to DeepSeek.

By default, it listens on port 9000.

It also exposes the local server through ngrok because Cursor’s custom model settings require an HTTPS endpoint and usually reject localhost.

The cache is stored here:

~/.deepseek-cursor-proxy/reasoning_content.sqlite3

The proxy keys cached reasoning blocks by a SHA-256 hash of the canonical conversation prefix, so parallel conversations do not collide.

Prerequisites

You need:

Cursor 2.0 or newer
A DeepSeek API key from platform.deepseek.com
Python 3.11 or newer
An ngrok account and authtoken

If you do not have uv, install it from the official uv installation docs.

For ngrok setup, follow the ngrok quickstart.

Step 1: Install the proxy

Using uv:

uv tool install deepseek-cursor-proxy

Or with pip:

git clone https://github.com/yxlao/deepseek-cursor-proxy.git
cd deepseek-cursor-proxy
pip install -e .

Verify the command is available:

deepseek-cursor-proxy --help

Step 2: Configure ngrok

Cursor needs a public HTTPS URL, so configure your ngrok authtoken:

ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN

On the free tier, ngrok gives you a random domain each time the tunnel starts.

If you want a stable URL, reserve a domain in the ngrok dashboard and pass it to the proxy:

deepseek-cursor-proxy --ngrok-url https://your-reserved.ngrok-free.app

Step 3: Start the proxy

Run:

deepseek-cursor-proxy

On first run, the proxy creates:

~/.deepseek-cursor-proxy/config.yaml

Example output:

Starting deepseek-cursor-proxy
Tunnel: https://random-name.ngrok-free.app
Local:  http://127.0.0.1:9000
Cache:  /Users/you/.deepseek-cursor-proxy/reasoning_content.sqlite3

Useful flags:

deepseek-cursor-proxy --port 9001

Change the local port.

deepseek-cursor-proxy --verbose

Print request and response bodies for debugging.

deepseek-cursor-proxy --no-ngrok

Run locally without an ngrok tunnel.

deepseek-cursor-proxy --no-display-reasoning

Hide collapsible reasoning blocks in Cursor while still passing reasoning through to DeepSeek.

Keep the proxy running while using Cursor.

Step 4: Configure Cursor

In Cursor:

Open Settings
Go to Models
Add a custom model

Use these values:

Field	Value
Model name	`deepseek-v4-pro`
Base URL	`https://random-name.ngrok-free.app/v1`
API key	Your DeepSeek API key

The model name is forwarded directly to DeepSeek. If you want the cheaper variant, use:

deepseek-v4-flash

Make sure the base URL ends with:

/v1

Cursor will run a model verification request. If it fails, check:

The proxy is still running
The ngrok URL is correct
The URL ends with /v1
The DeepSeek API key is valid

Step 5: Test a tool call

Pick the custom model in Cursor’s chat panel.

Use a prompt that forces tool usage:

Open the README in this repo, list every code block, and tell me which ones are missing language hints.

Expected flow:

Cursor sends the user prompt to the proxy.
The proxy forwards it to DeepSeek.
DeepSeek returns content, reasoning_content, and a tool_calls request.
The proxy caches reasoning_content.
Cursor runs the tool and sends the tool result back.
Cursor omits reasoning_content.
The proxy restores the cached reasoning_content.
DeepSeek accepts the request and continues.

To confirm this, run the proxy with:

deepseek-cursor-proxy --verbose

You should see the reasoning injection in the logs.

Cost model

V4-Pro inside Cursor uses DeepSeek’s API pricing, not Cursor’s bundled-credit pricing.

As of May 2026:

Token type	Rate per 1M tokens
Input cache miss	`$0.435`
Input cache hit	`$0.003625`
Output	`$0.87`

Example heavy Cursor day:

50 chat turns
20 tool-call chains
Around 8,000 prompt tokens per turn
Around 1,500 output tokens per turn

Worst-case input cost:

50 × 8,000 × $0.435 / 1,000,000 = $0.174

Output cost:

50 × 1,500 × $0.87 / 1,000,000 = $0.065

With prompt-cache hits, repeated system and context prefixes can reduce the input cost further.

For the full pricing breakdown, see DeepSeek V4-Pro 75% Price Cut Is Now Permanent.

For more DeepSeek context, see:

What changes inside Cursor

1. Reasoning blocks become visible

By default, the proxy renders DeepSeek reasoning as a collapsible Markdown block using <details>.

If you do not want to see it:

deepseek-cursor-proxy --no-display-reasoning

2. First tool-call latency is higher

V4-Pro is a thinking model, so it reasons before calling tools. Expect a few seconds before the first tool fires.

3. Complex refactors can improve

The main benefit is multi-step reasoning across files. For renames, signature changes, and config-driven refactors, V4-Pro can catch dependencies that simpler completion models may miss.

For older Cursor + DeepSeek workflows, see:

Testing your DeepSeek setup with Apidog

The Cursor setup only validates requests coming from Cursor. If you use V4-Pro in a CI bot, backend agent, IDE plugin, or internal tool, test the DeepSeek API path directly.

Use Apidog as a repeatable API test harness:

Create an Apidog environment.
Set the base URL to:

https://api.deepseek.com/v1

Add your DeepSeek API key.
Import the OpenAI Chat Completion schema.
Create test cases for your prompts and tool-call payloads.

You can use this to:

Record golden V4-Pro responses and replay them after prompt changes
Validate tool_calls payloads with JSON Schema assertions
Compare V4-Pro and GPT-5.5 on the same input batch
Catch API contract drift before it reaches production

Download Apidog here: Download Apidog.

The same workflow is covered in How to use the DeepSeek V4 API.

Common pitfalls

400 errors after the first tool call

This usually means Cursor is not going through the proxy.

Check:

The proxy process is running
Cursor’s base URL points to the ngrok URL
The base URL ends with /v1
The proxy logs show incoming requests

ngrok URL keeps changing

Free ngrok tunnels rotate on restart.

Fix it by reserving a domain in the ngrok dashboard, then starting the proxy with:

deepseek-cursor-proxy --ngrok-url https://your-reserved.ngrok-free.app

Duplicated reasoning content

This can happen if two proxy instances use the same SQLite cache.

Stop both, delete the cache, and start one proxy:

rm ~/.deepseek-cursor-proxy/reasoning_content.sqlite3
deepseek-cursor-proxy

Low prompt-cache hit ratio

DeepSeek prompt caching requires byte-identical prefixes.

Cursor may inject timestamps or session IDs into system prompts, which changes the prefix and kills cache hits.

Possible fixes:

Remove variable content from the system prompt
Move changing context into user messages
Accept the extra input cost for Cursor sessions

Cursor says “model not found”

The model name must match a real DeepSeek model identifier.

Examples:

deepseek-v4-pro
deepseek-v4-flash
deepseek-v3-2-pro
deepseek-r1-1

The proxy does not translate model names.

Alternatives

If you do not want to run the proxy, you have two practical alternatives.

Use V4-Flash directly

deepseek-v4-flash is not a thinking model and does not return reasoning_content, so Cursor can talk to it without the proxy.

You lose the V4-Pro reasoning behavior, but setup is simpler.

Use another IDE assistant

Tools like Cline, Continue, or other AI IDE plugins may support thinking-model fields directly.

If you are not committed to Cursor, switching tools may be easier than running a proxy.

See Best open source coding assistants in 2026: free Cursor alternatives.

Other Cursor model integrations:

FAQ

Why does Cursor not support DeepSeek V4-Pro natively?

Cursor’s chat client follows the OpenAI Chat Completions schema. reasoning_content is a DeepSeek-specific extension, so Cursor would need provider-specific handling to preserve it across tool calls.

Does the proxy work with DeepSeek R1 or V3.2?

Yes. It works with DeepSeek thinking models that return reasoning_content and require it on tool-call follow-ups.

Set Cursor’s model name to the actual DeepSeek model identifier.

Is the proxy safe to leave running?

Yes, but the SQLite cache contains raw reasoning content from your sessions.

If you share the machine or run a multi-user setup, restrict permissions on:

~/.deepseek-cursor-proxy/

Can I use the proxy without ngrok?

Yes:

deepseek-cursor-proxy --no-ngrok

That exposes only:

http://127.0.0.1:9000

Most Cursor builds require HTTPS for custom models, so ngrok or an equivalent tunnel is usually required.

Alternatives include:

Cloudflare Tunnel
Tailscale Funnel
A reverse proxy with HTTPS

Does this work with Cursor Composer?

Yes. Composer uses the same model-routing pipeline as Cursor chat, so the same reasoning_content issue applies and the proxy fixes it the same way.

What is the proxy latency overhead?

The proxy adds:

One local network hop
One SQLite lookup
Small JSON modifications

The overhead is typically negligible compared with model latency. ngrok may add extra network latency depending on the edge location.

How does the proxy decide what to cache?

It hashes the conversation prefix and stores the matching reasoning_content in SQLite.

On the next request, it hashes the new prefix and looks up the cached reasoning block. Partial-prefix matches do not count, which prevents similar conversations from polluting each other.

Next steps

DeepSeek V4-Pro is usable in Cursor today if you handle the reasoning_content contract correctly. The proxy does that with a small local service and an HTTPS tunnel.

Recommended workflow:

Install and run deepseek-cursor-proxy.
Add deepseek-v4-pro as a Cursor custom model.
Test with a prompt that forces tool usage.
Compare it against your current Cursor default on real pull requests.
Use Apidog to build regression tests against api.deepseek.com.

The thinking-token tax is paid. The price tag is not.

DEV Community