AI-assisted coding has become a daily tool for many developers — from explaining complex code to generating entire functions in seconds. But most AI coding assistants, such as GitHub Copilot or ChatGPT, rely on cloud-hosted models, which means you’re always dependent on an internet connection, API tokens, and third-party privacy policies.
What if you could bring that power entirely local, right inside VS Code, with no external API calls and the ability to switch between multiple models at will?
That’s exactly what we’ll cover in this guide. You’ll learn how to use the Continue plugin in VS Code to run AI models locally using Ollama, and even set up multi-model switching for different coding scenarios.
What You’ll Need
Before you begin, make sure you have the following:
- Visual Studio Code (latest version)
- Internet connection (only for installation)
- Ollama (for running local AI models)
- System resources — at least 8 GB RAM (16 GB recommended)
- Basic familiarity with JSON configuration files
Step 1: Install the Continue Plugin in VS Code
- Open VS Code.
- Go to the Extensions Marketplace (Ctrl+Shift+X / Cmd+Shift+X).
- Search for “Continue” by Continue.dev.
- Click Install.
Once installed, you’ll notice a new 🧠 Continue icon on your left sidebar. Clicking it will open the Continue chat panel.
Step 2: Set Up Ollama for Local Models
Ollama lets you run open-source AI models like Llama 3, Mistral, CodeLlama, and more — all locally on your machine.
Install Ollama
Run this command in your terminal:
curl -fsSL https://ollama.com/install.sh | sh
Then start Ollama:
ollama serve
Pull a Model
Pull the models you want to use (one model per command):
ollama pull llama3
ollama pull mistral
ollama pull codellama
Once a model has been pulled, Ollama serves it locally at http://localhost:11434, ready to respond to requests.
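If you want to sanity-check the server before wiring it into VS Code, you can send it a request directly. Below is a minimal sketch using only the Python standard library; it assumes the default port and that you have already pulled the mistral model:

```python
import json
import urllib.request

# Ask the locally running Ollama server for a short completion.
# Assumes `ollama serve` is running and `ollama pull mistral` has completed.
payload = json.dumps({
    "model": "mistral",
    "prompt": "Say hello in one short sentence.",
    "stream": False,  # return a single JSON object instead of a token stream
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read())

print(body["response"])  # the model's reply as plain text
```

If this prints a reply, the server is up and the model is ready for Continue to use.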
Step 3: Configure Continue to Use Local Models
Now configure Continue to use Ollama as the provider for your local AI models.
Open Continue’s Configuration File
- In VS Code, open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P).
- Search for “Continue: Open Config File”.
- This opens a file named .continue/config.json.
Add a Local Model
{
"models": [
{
"name": "Mistral Local",
"provider": "ollama",
"model": "mistral"
}
]
}
Save the file.
Now Continue will use your locally hosted Mistral model via Ollama.
Step 4: Add Multiple Models and Enable Switching
One of Continue’s most powerful features is multi-model support. You can define multiple models — local or remote — and switch between them instantly from the Continue sidebar.
Here’s an example setup:
{
"models": [
{ "title": "Llama 3", "provider": "ollama", "model": "llama3" },
{ "title": "Mistral", "provider": "ollama", "model": "mistral" },
{ "title": "CodeLlama", "provider": "ollama", "model": "codellama" }
]
}
How to Switch Models
- Click the model dropdown in the Continue sidebar and choose one.
- Or, use the slash command in the chat:
/switch llama 3
Step 5: Start Using Continue Locally
Now you’re ready to use your local AI assistant — completely offline.
Some great use cases include:
- Explaining or summarizing existing code
- Generating unit tests (see the sketch after this list)
- Suggesting function names or documentation
- Refactoring large files with reasoning
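For instance, asking a local model to write tests for a small helper might produce something along these lines (a purely illustrative sketch; the slugify helper and its tests are hypothetical, not actual model output):

```python
# A small helper you might ask the assistant to cover with tests.
def slugify(text: str) -> str:
    """Lowercase a string and replace spaces with hyphens."""
    return text.strip().lower().replace(" ", "-")

# The kind of pytest-style tests a local model could suggest.
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_strips_whitespace():
    assert slugify("  Local AI  ") == "local-ai"
```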
Example Interaction
Prompt: “Using Python how to get the index of specific string from a sentence”
Model (Mistral): “To find the index (position) of a specific string in a sentence using Python, you can use the built-in str.find() method.”
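Here is a minimal sketch of what the model is describing; str.find() returns the starting index of the first match, or -1 if the substring is not present:

```python
sentence = "Local models keep your code on your machine"

index = sentence.find("code")  # index of the first occurrence, or -1 if absent
if index != -1:
    print(f"'code' starts at index {index}")  # 'code' starts at index 23
else:
    print("substring not found")
```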
Step 6: Troubleshooting and Optimization
Here are a few quick tips to make your setup smoother:
| Issue | Possible Fix |
|---|---|
| Model not found | Make sure the model is pulled via Ollama (ollama pull mistral) |
| Slow responses | Try smaller models like phi3 or codellama |
| JSON config errors | Validate using VS Code’s built-in JSON formatter |
| High memory use | Limit concurrency in Ollama or close other running models |
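For the “Model not found” case in particular, you can check which models your local Ollama instance actually has by querying its API. A minimal sketch, assuming the default port and using only the Python standard library:

```python
import json
import urllib.request

# List the models that have been pulled into the local Ollama instance.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    data = json.loads(resp.read())

for model in data.get("models", []):
    print(model["name"])  # e.g. "mistral:latest", "llama3:latest"
```

If the model Continue is configured to use is missing from this list, pull it and try again.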
⚡ Optional: Hybrid Setup (Local + Cloud Models)
Continue lets you combine local and remote models in the same workspace.
You can use OpenAI or Anthropic APIs for high-power reasoning tasks while keeping local models for everyday completions.
Example hybrid config:
{
"models": [
{ "name": "Llama 3 Local", "provider": "ollama", "model": "llama3" },
{ "name": "GPT-4 Turbo", "provider": "openai", "model": "gpt-4-turbo" }
]
}
Then simply switch between them based on your needs — offline vs. advanced reasoning.
Conclusion
And that’s it! You now have a fully local AI coding assistant in VS Code — powered by the Continue plugin and Ollama.
✅ You installed the Continue plugin
✅ Configured local models like Mistral and Llama
✅ Added multiple models with seamless switching
✅ Used your AI assistant completely offline
This setup gives you the best of both worlds — privacy, flexibility, and zero dependency on cloud APIs.
So go ahead and experiment — try different open models, optimize your workflows, and experience the power of AI-assisted coding locally.