re-ten

DeepSeek-R1 on Cursor with Ollama

So guys, DeepSeek-R1 dropped a few weeks ago, and there are many options for running a local LLM. If you want to use Ollama (or another local LLM) in Cursor, I've got you.

First you need Ollama. After installing it, you need to enable CORS for Ollama; this is required, or Cursor will respond with 403 Forbidden. As you can see below, we define OLLAMA_ORIGINS in the Windows environment variables.

(Screenshot: setting OLLAMA_ORIGINS in the Windows environment variables)
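If you prefer the terminal, here's a minimal sketch of the same thing, assuming a permissive wildcard origin is acceptable (scope it down if not):

```powershell
# Persist OLLAMA_ORIGINS for future processes; "*" allows any origin
setx OLLAMA_ORIGINS "*"
```

setx only affects processes started after it runs, which is why Ollama needs a restart later.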

Ok, next we need a DeepSeek-R1 model. I tried deepseek-r1:8b because it has good benchmarks and runs on my PC with an Nvidia RTX 3070 8GB (enough VRAM; I get 60-70 t/s). We can use:

ollama run deepseek-r1:8b

The model then starts downloading. Once that's done, quit Ollama via the Windows tray icon (or however you prefer); we need to restart Ollama so the CORS variable we defined takes effect.

(Screenshot: quitting Ollama from the Windows tray icon)
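One gotcha, assuming you used setx (or the environment-variables dialog): the change only reaches newly started processes, so you can confirm it from a fresh terminal before relaunching Ollama:

```powershell
# Run in a NEW PowerShell window; existing sessions won't see the change
echo $env:OLLAMA_ORIGINS
```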

Then you can run Ollama again from the Start menu.

(Screenshot: launching Ollama from the Start menu)

By default, Ollama serves its endpoint at http://127.0.0.1:11434, but if you point Cursor directly at that local endpoint, it can't be used. So we need ngrok. Download it and log in; ngrok will then instruct you to authenticate via your auth token.
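That login step is roughly the following one-time command (the token is a placeholder for your own, shown on the ngrok dashboard):

```powershell
# Register your ngrok auth token (one-time setup)
.\ngrok.exe config add-authtoken <YOUR_AUTHTOKEN>
```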

Next, we use ngrok to expose a public URL for Ollama:
.\ngrok.exe http 11434 --host-header="localhost:11434"

Like this:

(Screenshot: the running ngrok session with its forwarding URL)

Then we get the public endpoint to use as the OpenAI Base URL in Cursor.

(Screenshot: the ngrok public URL)

You can check whether your endpoint is active.

(Screenshot: the endpoint responding in the browser)
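You can also check from the terminal; hitting the bare base URL should return the message "Ollama is running" (the ngrok URL below is a placeholder for yours):

```powershell
# Should print "Ollama is running" if the tunnel and the server are both up
curl.exe https://xxxxxx.ngrok-free.app
```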

Ok, now we move to Cursor.

(Screenshot: Cursor model settings)
We need to define the model we'll use in Cursor; you can check with ollama list to see which models you have.

(Screenshot: adding the model name in Cursor)
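For reference, a sketch of what ollama list prints; IDs, sizes, and dates will vary:

```powershell
ollama list
# NAME              ID              SIZE      MODIFIED
# deepseek-r1:8b    ...             4.9 GB    ...
```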
In the OpenAI API key settings, use your public URL https://xxxxxx.ngrok-free.app as the base URL, with API key ollama, and you're done.
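If you want to sanity-check the OpenAI-compatible route outside Cursor first, here's a minimal PowerShell sketch; the ngrok URL is a placeholder, Ollama serves the OpenAI-style API under /v1, and the model name must be one you actually pulled:

```powershell
$headers = @{
    "Content-Type"  = "application/json"
    "Authorization" = "Bearer ollama"  # any key works for a local Ollama; "ollama" matches the Cursor setup
}

$body = @{
    "model"    = "deepseek-r1:8b"      # must match a name from `ollama list`
    "messages" = @(
        @{ "role" = "user"; "content" = "Say hi and nothing else." }
    )
} | ConvertTo-Json -Depth 5            # default depth would truncate the nested messages

Invoke-RestMethod -Uri "https://xxxxxx.ngrok-free.app/v1/chat/completions" -Method Post -Headers $headers -Body $body
```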

Once those steps are done, we can try the model in Cursor chat.

(Screenshot: deepseek-r1 responding in Cursor chat)
As you can see, the local LLM works properly. In some cases it isn't supported, though; Composer doesn't work because Cursor only allows Anthropic and GPT models there.


Top comments (9)

Jakub Hořínek

Any idea why it is not working for me??

(Screenshots: the failing setup)

$headers = @{
        "Content-Type" = "application/json"
        "Authorization" = "Bearer ollama"
}

$body = @{
        "messages" = @(
                @{
                        "role" = "system"
                        "content" = "You are a test assistant."
                },
                @{
                        "role" = "user"
                        "content" = "Testing. Just say hi and nothing else."
                }
        )
        "model" = "gpt-4o-mini"
} | ConvertTo-Json

Invoke-WebRequest -Uri "https://4ccc-88-101-25-25.ngrok-free.app/chat/completions" -Method Post -Headers $headers -Body $body

(Screenshot: the error response)

Isn't that because the page loads like this instead of the API? Can I somehow bypass this?

(Screenshot: the ngrok browser warning page)

re-ten

That's not necessary; we can check with curl whether Ollama is running, and if it is, it will show the message "Ollama is running".

(Screenshot: curl returning "Ollama is running")

Also, did you save the base URL? The base URL isn't saved automatically; you have to save and verify it.

re-ten

And enable only the Ollama model.
(Screenshot: Cursor model list with only the Ollama model enabled)

If you check both models, e.g. sonnet and r1:8b, you'll get an undefined-model error, because you only have the R1 model.

Jakub Hořínek

It's working!

re-ten

Nice!

FuneralFolio

Nice!

re-ten

Thanks!