How to Use AI Models Locally in VS Code with the Continue Plugin (with Multi-Model Switching Support)

AI-assisted coding has become a daily tool for many developers — from explaining complex code to generating entire functions in seconds. But most AI coding tools rely on cloud-based models like GitHub Copilot or ChatGPT, which means you’re always dependent on an internet connection, API tokens, and third-party privacy policies.

What if you could bring that power entirely local, right inside VS Code, with no external API calls and the ability to switch between multiple models at will?

That’s exactly what we’ll cover in this guide. You’ll learn how to use the Continue plugin in VS Code to run AI models locally using Ollama, and even set up multi-model switching for different coding scenarios.

What You’ll Need

Before you begin, make sure you have the following:

  • Visual Studio Code (latest version)
  • Internet connection (only for installation)
  • Ollama (for running local AI models)
  • System resources — at least 8 GB RAM (16 GB recommended)
  • Basic familiarity with JSON configuration files

Step 1: Install the Continue Plugin in VS Code

  1. Open VS Code.
  2. Go to the Extensions Marketplace (Ctrl+Shift+X / Cmd+Shift+X).
  3. Search for “Continue” by Continue.dev.
  4. Click Install.

Once installed, you’ll notice a new 🧠 Continue icon on your left sidebar. Clicking it will open the Continue chat panel.

_Show the Continue plugin installation screen in the VS Code Marketplace._

_Show the Continue panel in VS Code after installation._
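
If you prefer the terminal, you can also install or verify the extension with the VS Code CLI (a small sketch, assuming the marketplace extension ID is Continue.continue):

code --install-extension Continue.continue    # install from the command line
code --list-extensions | grep -i continue     # confirm it appears in the installed list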

Step 2: Set Up Ollama for Local Models

Ollama lets you run open-source AI models like Llama 3, Mistral, CodeLlama, and more — all locally on your machine.

Install Ollama

Run this command in your terminal:

curl -fsSL https://ollama.com/install.sh | sh
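
The script above targets Linux. On macOS you can install Ollama with Homebrew instead (assuming the ollama formula), or download the desktop app from ollama.com:

brew install ollama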

Then start Ollama:

ollama serve
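
To confirm the server is up, send a plain GET request to the root URL from another terminal; it should answer with a short "Ollama is running" message:

curl http://localhost:11434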

Pull a Model

Pull the models you want to use (each command downloads one model):

ollama pull llama3
ollama pull mistral
ollama pull codellama
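
You can check which models have been downloaded at any point:

ollama list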

Once a model is pulled, Ollama serves it locally at http://localhost:11434, ready to respond to requests.

_Show Ollama running in the terminal with “Listening on port 11434”._
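
If you want to sanity-check a model before wiring up the editor, you can call the local API directly. This uses Ollama's /api/generate endpoint; the model and prompt below are just examples:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Write a one-line docstring for a function that reverses a string.",
  "stream": false
}'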

Step 3: Configure Continue to Use Local Models

Now configure Continue to use Ollama as the provider for your local AI models.

Open Continue’s Configuration File

  1. In VS Code, open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P).
  2. Search “Continue: Open Config File”.
  3. This opens a file named config.json in the ~/.continue directory.

Add a Local Model

{
  "models": [
    {
      "name": "Mistral Local",
      "provider": "ollama",
      "model": "mistral"
    }
  ]
}

Save the file.

Now Continue will use your locally hosted Mistral model via Ollama.

_Show `.continue/config.json` with the model configuration._
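
By default, Continue talks to Ollama at http://localhost:11434. If you ever serve Ollama on a different address or port (the value below is purely illustrative), start the server with an explicit bind address and point that model's apiBase setting in config.json at the new URL:

OLLAMA_HOST=127.0.0.1:11500 ollama serve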

Step 4: Add Multiple Models and Enable Switching

One of Continue’s most powerful features is multi-model support. You can define multiple models — local or remote — and switch between them instantly from the Continue sidebar.

Here’s an example setup:

{
  "models": [
    { "title": "Llama 3", "provider": "ollama", "model": "llama3" },
    { "title": "Mistral", "provider": "ollama", "model": "mistral" },
    { "title": "CodeLlama", "provider": "ollama", "model": "codellama" }
  ]
}

How to Switch Models

  • Click the model dropdown in the Continue sidebar and choose one.
  • Or, use the shortcut command in the chat:
  /switch llama 3

_Show the Continue sidebar with a model selection dropdown._
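
Switching only changes which model Continue sends requests to; Ollama loads models into memory on demand. On recent Ollama releases you can see which models are currently loaded:

ollama ps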

Step 5: Start Using Continue Locally

Now you’re ready to use your local AI assistant — completely offline.

Some great use cases include:

  • Explaining or summarizing existing code
  • Generating unit tests
  • Suggesting function names or documentation
  • Refactoring large files with reasoning

Example Interaction

Prompt: “Using Python, how do I get the index of a specific string in a sentence?”

Model (Mistral): “To find the index (position) of a specific string in a sentence using Python, you can utilize the built-in str.find() method.”

_Show the model interaction in the Continue chat panel._
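
You can verify the suggestion straight from a terminal. A one-liner along these lines (assuming Python 3 is on your PATH) prints the starting index of the substring:

python3 -c 'sentence = "run AI models locally in VS Code"; print(sentence.find("locally"))'
# prints 14, the index where "locally" starts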

Step 6: Troubleshooting and Optimization

Here are a few quick tips to make your setup smoother:

Issue                 Possible Fix
Model not found       Make sure the model is pulled via Ollama (ollama pull mistral)
Slow responses        Try smaller models like phi3 or codellama
JSON config errors    Validate using VS Code’s built-in JSON formatter
High memory use       Limit concurrency in Ollama or close other running models
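
For the memory-related issues, Ollama can be tuned with environment variables when you start the server. The variables below are standard Ollama settings, and the ollama stop subcommand (available on recent releases) unloads a model immediately; treat the exact values as a starting point for your own hardware:

OLLAMA_MAX_LOADED_MODELS=1 OLLAMA_NUM_PARALLEL=1 ollama serve
ollama stop mistral    # unload a model you are done with to free RAM/VRAM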

⚡ Optional: Hybrid Setup (Local + Cloud Models)

Continue lets you combine local and remote models in the same workspace.

You can use OpenAI or Anthropic APIs for high-power reasoning tasks while keeping local models for everyday completions.

Example hybrid config:

{
  "models": [
    { "name": "Llama 3 Local", "provider": "ollama", "model": "llama3" },
    { "name": "GPT-4 Turbo", "provider": "openai", "model": "gpt-4-turbo" }
  ]
}

Then simply switch between them based on your needs — offline vs. advanced reasoning.

Conclusion

And that’s it! You now have a fully local AI coding assistant in VS Code — powered by the Continue plugin and Ollama.

✅ You installed the Continue plugin

✅ Configured local models like Mistral and Llama

✅ Added multiple models with seamless switching

✅ Used your AI assistant completely offline

This setup gives you the best of both worlds — privacy, flexibility, and zero dependency on cloud APIs.

So go ahead and experiment — try different open models, optimize your workflows, and experience the power of AI-assisted coding locally.
