As a developer, I got tired of waiting 5 days for Claude resets and burning through Cursor’s $20 credit pool.
I was building an editorial platform and had reached a critical point: a new feature that meant building on the admin, the server, and the platform concurrently so the model had complete context. Right in the middle of it, I hit Claude's weekly limit, and because of all that context I had to wait 5 days.
Frustrating!
I'm building at a senior engineering level with Claude's Opus 4.7 in VS Code, and I want to be able to do that without usage limits, weekly caps, or surprise bills.
After lots of research online, I finally set up completely free, local AI models running directly in VS Code using Ollama + Continue.dev.
These models now handle most of my daily coding work and they’re shockingly good (some tasks feel close to Claude Opus 4.7 level).
This is the exact step-by-step guide that worked for me.
Why You Need This Setup
- No more waiting for resets
- No credit pools or usage-based billing traps
- Works completely offline after download
- Private (your code never leaves your machine)
- Easy to switch between multiple free models
Step-by-Step Setup (Tested on MacBook Air M1 model)
Step 1: Install Ollama
Go to ollama.com, then download and install Ollama.
Make sure to select your operating system before clicking download. After installation, open Ollama; it runs in the background.
Open any terminal and run:
ollama list
Running ollama list should show no models at first.
You can use VS Code's terminal, since all the work happens in VS Code.
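If ollama list reports an error instead of an empty list, the background server may not be running. Here is a minimal sketch of a check, assuming Ollama listens on its default local address (http://localhost:11434) and that curl is installed. The ollama_is_up helper name is hypothetical.

```shell
# Hypothetical helper: succeeds only if an Ollama server answers at the given URL.
# Assumes the default address; pass another URL if you changed OLLAMA_HOST.
ollama_is_up() {
  curl -fsS --max-time 2 "${1:-http://localhost:11434}" 2>/dev/null \
    | grep -q "Ollama is running"
}

if ollama_is_up; then
  echo "Ollama server is running"
else
  echo "Ollama server is not reachable; open the Ollama app and try again"
fi
```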
Step 2: Download Good Free Models
I started with these two:
In your terminal, run the following and wait for the download to be complete
ollama pull qwen2.5-coder:7b
ollama pull deepseek-coder-v2:16b
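Wait until both finish downloading. qwen2.5-coder:7b is roughly 4.7 GB and deepseek-coder-v2:16b is roughly 8.9 GB. The configuration in Step 4 also references qwen2.5-coder:1.5b for autocomplete, so pull that tag the same way if you want autocomplete to work.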
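Those two downloads add up quickly on a small SSD. Here is a rough sketch that sums the sizes quoted above and compares the total against free disk space. It assumes the models end up under your home directory (Ollama's default on macOS is ~/.ollama), and the variable names are only illustrative.

```shell
# Approximate download sizes in GB, taken from the figures quoted above.
qwen_gb=4.7
deepseek_gb=8.9

# Sum with awk, since POSIX shell arithmetic is integer-only.
total_gb=$(awk -v a="$qwen_gb" -v b="$deepseek_gb" 'BEGIN { printf "%.1f", a + b }')
echo "Models need roughly ${total_gb} GB"

# Free space on the volume holding the home directory, in whole GB.
# df -Pk reports 1024-byte blocks; column 4 of row 2 is the available space.
free_gb=$(df -Pk "$HOME" 2>/dev/null | awk 'NR == 2 { printf "%d", $4 / 1024 / 1024 }')
echo "Free space: ${free_gb:-unknown} GB"

# Compare against the whole-GB part of the total (13 for 13.6).
if [ "${free_gb:-0}" -gt "${total_gb%.*}" ]; then
  echo "Enough room for both models"
else
  echo "Free up some disk space before pulling"
fi
```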
Step 3: Install Continue.dev in VS Code
- Open VS Code if it isn't already open
- Click the Extensions tab in the sidebar (on the left or right side of VS Code)
- In the Extensions tab, search for "Continue" (published by Continue); it should be the first result
Then install Continue. It becomes available in the sidebar immediately after installation.
Step 4: Configure Continue.dev
Now this is the slightly tricky part.
In the Continue top bar, click the gear icon to open settings.
Click Configs in the left tab to open the config panel.
Replace everything in the config.yaml file with this configuration:
name: Local Config
version: 1.0.0
schema: v1
models:
  - name: Qwen2.5 Coder 7B
    provider: ollama
    model: qwen2.5-coder:7b
  - name: DeepSeek Coder V2 16B
    provider: ollama
    model: deepseek-coder-v2:16b
  - name: Qwen 1.5B Autocomplete
    provider: ollama
    model: qwen2.5-coder:1.5b
    roles:
      - autocomplete
tabAutocompleteModel:
  name: Qwen 1.5B Autocomplete
  provider: ollama
  model: qwen2.5-coder:1.5b
Save the file, then close and reopen VS Code. Click the Continue tab in the sidebar again; you should now see a chat panel.
Open the models dropdown to check whether any models are listed. If none are, reload the config: click the Local Config dropdown right above the chat interface, then click Reload.
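One thing worth checking when a model does not respond is whether every model: entry in the config was actually pulled. Here is a small sketch that compares the config against what Ollama has installed. The helper names are hypothetical, and the ~/.continue/config.yaml path is an assumption about where Continue keeps the local config on your machine.

```shell
# Hypothetical helper: print each distinct "model:" value in a Continue config file.
config_models() {
  sed -n 's/^[[:space:]]*model:[[:space:]]*//p' "$1" | sort -u
}

# Hypothetical helper: print config models that are absent from a list of
# installed names. $1 = config file, $2 = file with one model name per line.
missing_models() {
  config_models "$1" | while IFS= read -r model; do
    grep -qxF -- "$model" "$2" || echo "$model"
  done
}

# Assumed location of the local config; adjust if yours lives elsewhere.
config="$HOME/.continue/config.yaml"

if command -v ollama >/dev/null 2>&1 && [ -f "$config" ]; then
  installed=$(mktemp)
  # ollama list prints a header row, then one model per row with the name first.
  ollama list | awk 'NR > 1 { print $1 }' > "$installed"
  missing_models "$config" "$installed"
  rm -f "$installed"
fi
```

Any name it prints still needs an ollama pull. No output means every configured model is installed.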
Step 5: Test It
- In the Continue sidebar, select Qwen2.5 Coder 7B
- Turn Agent mode OFF (very important for stability)
- Type “Hello” and send.
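If Continue stays silent, it helps to know whether the model itself answers. Here is a minimal sketch that sends the same prompt straight to Ollama's local HTTP API, assuming the default address http://localhost:11434. The generate_body helper is hypothetical and does no JSON escaping, so keep the prompt free of quotes and backslashes.

```shell
# Hypothetical helper: build a request body for Ollama's /api/generate endpoint.
# "stream": false asks for a single JSON reply instead of a stream of chunks.
# No JSON escaping is done, so avoid quotes and backslashes in the arguments.
generate_body() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Only call the API when Ollama is installed; the reply text is in the "response" field.
if command -v ollama >/dev/null 2>&1; then
  curl -sS --max-time 120 http://localhost:11434/api/generate \
    -d "$(generate_body "qwen2.5-coder:7b" "Hello")" \
    || echo "Request failed; is the Ollama server running?"
  echo
fi
```

A JSON reply here combined with silence in Continue points at the extension config rather than the model.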
Common Problems & Fixes (From My Own Struggle)
- “No models configured” → Reload config after saving yaml, or restart VS Code.
- DeepSeek shows “does not support tools” → Switch to Qwen2.5 7B and turn Agent OFF.
- Too slow on MacBook Air → Use the 1.5B model for daily work, keep 7B for serious tasks.
- First response is very slow → Normal on CPU. Be patient for the first message.
Model Recommendations
- Daily driver: Qwen2.5 Coder 7B (best balance)
- Fast autocomplete: Qwen 1.5B
- Strong reasoning: DeepSeek Coder V2 16B (when speed is not critical)
I no longer worry about Claude resets or Cursor limit traps. I now have reliable, free, private AI coding assistance running locally in VS Code.
This setup won’t always beat the absolute latest Claude Opus on every single task, but for 60-70% of my work, it’s more than enough. It’s frustrating to have to wait for a limit reset or pay extra, and for that share of the work I no longer have to.
Direct Links to Resources
Ollama Download
https://ollama.com/download
VS Code IDE Download
https://code.visualstudio.com/download





