Run Powerful AI Coding Locally on a Normal Laptop
A Developer-Friendly Guide to Setting Up Roo Code + Ollama + Qwen (8GB/16GB RAM)
If you are a developer who wants to use AI coding assistants locally without paying for cloud APIs or owning a high-end GPU, this guide is for you.
In this article, we will set up:
- Roo Code inside Visual Studio Code
- Ollama for running local AI models
- The Qwen2.5-Coder model, running locally
The setup is optimized for:
- 8GB RAM laptops
- 16GB RAM laptops
- Machines with no dedicated GPU (no VRAM)
By the end, you’ll have your own private AI coding assistant that runs fully offline once the model has been downloaded.
Why Run AI Locally?
Running AI locally gives developers:
✅ No API cost
✅ Better privacy
✅ Faster experimentation
✅ Offline development
✅ Full control over models
✅ No dependency on cloud providers
Recommended Hardware
| Configuration | Recommended Model |
|---|---|
| 8GB RAM | qwen2.5-coder:1.5b |
| 16GB RAM | qwen2.5-coder:7b |
| 16GB+ RAM | qwen2.5-coder:14b (slow but possible) |
If you have no GPU, don’t worry. Ollama can run models entirely on CPU.
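The recommendations above can be expressed as a small lookup. This is only a sketch in Python: the function name `recommend_model`, the `allow_slow` flag, and the exact thresholds are assumptions made for illustration, and the 14b model is treated as opt-in because the guide calls it slow.

```python
def recommend_model(ram_gb: float, allow_slow: bool = False) -> str:
    """Map installed RAM (in GB) to one of the qwen2.5-coder tags from this guide."""
    if ram_gb >= 16 and allow_slow:
        # The guide describes 14b as "slow but possible", so it is opt-in here.
        return "qwen2.5-coder:14b"
    if ram_gb >= 16:
        return "qwen2.5-coder:7b"
    # 8GB machines, and anything else below 16GB, get the smallest model.
    return "qwen2.5-coder:1.5b"


for ram in (8, 16):
    print(f"{ram}GB RAM: {recommend_model(ram)}")
```

Passing `allow_slow=True` on a machine with 16GB or more selects the 14b tag instead.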
Step 1 — Install Visual Studio Code
Download and install Visual Studio Code from the official website.
After installation:
code --version
Verify VS Code is properly installed.
Step 2 — Install Ollama
Install Ollama.
Windows
Download the installer from the official Ollama website.
Verify installation:
ollama --version
Step 3 — Start Ollama
Run:
ollama serve
This starts the local AI server at:
Keep this terminal running.
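To confirm the server is actually up, you can probe it from a second terminal. The sketch below assumes Ollama's default local address, `http://localhost:11434`, and its `/api/version` endpoint; the address constant and both function names are illustrative, so adjust them if your setup differs.

```python
import json
import urllib.request

# Assumption: Ollama's default local address. Change it if your server differs.
OLLAMA_URL = "http://localhost:11434"


def ollama_version(base_url: str = OLLAMA_URL, timeout: float = 3.0):
    """Return the server's version string, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.load(resp).get("version")
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON.
        return None


def describe_server(version) -> str:
    """Turn the probe result into a human-readable status line."""
    if version is None:
        return "Ollama is not reachable; is 'ollama serve' still running?"
    return f"Ollama {version} is running"
```

Running `print(describe_server(ollama_version()))` prints one status line either way, so it is safe to use as a quick check before opening VS Code.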
Step 4 — Install Qwen Coding Model
For 8GB RAM Systems
Recommended:
ollama run qwen2.5-coder:1.5b
Why?
- Lightweight
- Fast on CPU
- Good enough for:
- Code generation
- Refactoring
- Unit tests
- Small automation tasks
For 16GB RAM Systems
Recommended:
ollama run qwen2.5-coder:7b
This gives much better:
- Reasoning
- Architecture suggestions
- Refactoring quality
- Multi-file understanding
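Once the download finishes, you may want to check that the model tag is really installed before pointing an editor at it. This sketch assumes the server is running at Ollama's default address and uses its `/api/tags` endpoint, which lists local models; the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def model_names(tags_response: dict) -> list:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def fetch_installed(base_url: str = OLLAMA_URL) -> list:
    """Ask the running server which models are installed locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))


def is_installed(wanted: str, installed: list) -> bool:
    """True if the exact tag (for example 'qwen2.5-coder:7b') is present."""
    return wanted in installed
```

For example, `is_installed("qwen2.5-coder:7b", fetch_installed())` should return `True` after the 7b model has been pulled. The match is on the exact tag, so `qwen2.5-coder:7b` and `qwen2.5-coder:1.5b` are treated as different models.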
Step 5 — Test the Model
Start the model you installed (on 8GB machines, substitute qwen2.5-coder:1.5b):
ollama run qwen2.5-coder:7b
Then ask:
Who are you and create a hello world example in python
If the model responds, you’re ready.
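The same test can be scripted instead of typed into the interactive prompt. The sketch below posts a prompt to Ollama's `/api/generate` endpoint with streaming turned off; the server address is Ollama's default and is an assumption here, as are the function names and the generous timeout chosen for CPU-only machines.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def build_generate_request(model: str, prompt: str, base_url: str = OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (CPU inference can be slow)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

With the server running, `print(ask("qwen2.5-coder:7b", "Create a hello world example in python"))` should print the model's answer.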
Step 6 — Install ROO Code Extension
Inside VS Code:
- Open the Extensions view
- Search for Roo Code
- Install the extension
Roo Code turns VS Code into an AI-powered development environment.
Step 7 — Configure ROO Code for Ollama
Open the Roo Code settings.
Set:
Provider: Ollama
API Endpoint:
Model:
For 8GB RAM:
qwen2.5-coder:1.5b
For 16GB RAM:
qwen2.5-coder:7b
Save settings.
Step 8 — First AI Coding Test
Open a project and ask Roo Code:
Create a Java Spring Boot CRUD API with Controller, Service, Repository
Or:
Generate Cypress automation for login page
You now have a local AI coding assistant.
Best Practices for Low-RAM Systems
For 8GB RAM Machines
Recommended Settings
| Setting | Value |
|---|---|
| Context Window | Small |
| Concurrent Apps | Minimal |
| Model | 1.5B |
| Browser Tabs | Limited |
Avoid
❌ Running Docker + AI together
❌ Opening large IDE projects
❌ Using 7B models continuously
Best Practices for 16GB RAM Machines
You can comfortably use:
- qwen2.5-coder:7b
- Medium-size repositories
- Spring Boot projects
- React applications
- Cypress automation generation
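Recommended: set this environment variable before starting Ollama:
OLLAMA_NUM_PARALLEL=1
This limits Ollama to one request at a time per model, which helps avoid RAM spikes.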
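How you set an environment variable depends on your shell and operating system, so one portable option is to launch the server from a small script. This is a sketch only: it assumes the `ollama` executable is on your PATH and that no other Ollama server is already running, and the function names are made up for illustration.

```python
import os
import subprocess


def ollama_env(num_parallel: int = 1) -> dict:
    """Copy the current environment and add the OLLAMA_NUM_PARALLEL setting."""
    env = os.environ.copy()
    env["OLLAMA_NUM_PARALLEL"] = str(num_parallel)
    return env


def start_ollama(num_parallel: int = 1) -> subprocess.Popen:
    """Launch 'ollama serve' as a child process with the variable set."""
    return subprocess.Popen(["ollama", "serve"], env=ollama_env(num_parallel))
```

Calling `start_ollama()` returns a process handle; call `.terminate()` on it when you want to stop the server. The variable only affects the process started this way, not your wider system settings.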
Performance Optimization Tips
Reduce Model Temperature
A lower temperature makes output more deterministic, which improves coding consistency:
temperature = 0.2
Keep Context Smaller
Instead of entire repositories:
✅ Open only relevant folders
This improves response quality and speed.
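If you call Ollama's API directly rather than through the editor, the temperature and context tips map onto the `options` field of a request. The sketch below uses the `temperature` and `num_ctx` options; the value 0.2 comes from this guide, while 2048 for `num_ctx` is only an illustrative "small" context size, not a tested recommendation.

```python
def coding_payload(model: str, prompt: str,
                   temperature: float = 0.2, num_ctx: int = 2048) -> dict:
    """Build a /api/generate request body tuned for low-RAM coding use."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": temperature,  # lower values give more consistent code
            "num_ctx": num_ctx,          # context window size in tokens
        },
    }


payload = coding_payload("qwen2.5-coder:1.5b", "Write a unit test for add(a, b)")
print(payload["options"])
```

A smaller `num_ctx` reduces memory use but also limits how much code the model can see at once, so it is a trade-off to tune per machine.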
Restart Ollama Occasionally
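Long sessions can consume memory.
Unload the model to free RAM (ollama stop requires a model name):
ollama stop qwen2.5-coder:7b
To restart the server itself, press Ctrl+C in the terminal running ollama serve, then start it again:
ollama serve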
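Memory can also be released without restarting anything, by asking the server to unload a model. The sketch below relies on the behavior documented for Ollama's `/api/generate` endpoint, where a request with no prompt and `keep_alive` set to 0 unloads the model; the server address is Ollama's default and the function names are assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def build_unload_request(model: str, base_url: str = OLLAMA_URL):
    """Build a request that asks Ollama to unload the model from memory."""
    # No prompt plus keep_alive=0 means "load nothing, keep nothing resident".
    payload = {"model": model, "keep_alive": 0, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload(model: str) -> dict:
    """Send the unload request and return the server's JSON reply."""
    with urllib.request.urlopen(build_unload_request(model), timeout=30) as resp:
        return json.load(resp)
```

For example, `unload("qwen2.5-coder:7b")` frees the RAM held by the 7b model; it is loaded again automatically on the next request.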
Recommended Models by Use Case
| Use Case | Recommended Model |
|---|---|
| Basic coding | qwen2.5-coder:1.5b |
| Java development | qwen2.5-coder:7b |
| Test automation | qwen2.5-coder:7b |
| Architecture discussion | qwen2.5-coder:7b |
| Large enterprise code | qwen2.5-coder:14b (16GB+ RAM, slow) |
What Works Surprisingly Well Locally?
Even without a GPU, local models perform very well for:
✅ Boilerplate generation
✅ Refactoring
✅ Unit tests
✅ Cypress automation
✅ SQL generation
✅ Spring Boot scaffolding
✅ API creation
✅ Debugging suggestions
✅ Documentation generation
Limitations
Be realistic about CPU-only setups.
You may experience:
- Slower response time
- Limited context handling
- Occasional hallucinations
- Reduced multi-file reasoning
But for day-to-day development, the experience is still highly productive.
My Recommended Setup
For Most Developers
8GB RAM
Ollama + qwen2.5-coder:1.5b + Roo Code
16GB RAM
Ollama + qwen2.5-coder:7b + Roo Code
This provides the best balance between:
- Performance
- Memory usage
- Coding quality
- Stability
Final Thoughts
Local AI development is no longer limited to expensive GPUs.
Today, even a normal laptop can run surprisingly capable coding assistants using:
- Ollama
- Qwen2.5-Coder
- Visual Studio Code
- Roo Code
For developers working in Java, Spring Boot, React, Cypress, AI automation, and system design, this setup is an excellent starting point for local AI engineering.
Useful Commands Cheat Sheet
Start Ollama
ollama serve
Run 1.5B model
ollama run qwen2.5-coder:1.5b
Run 7B model
ollama run qwen2.5-coder:7b
List installed models
ollama list
Remove model
ollama rm qwen2.5-coder:7b