upendra manike
# πŸš€ Running Claude Code Locally with Ollama (No Token Cost)

Until recently, using Claude for coding workflows meant relying on paid API usage.

Now there's a practical workaround:

👉 You can point Claude Code at a local Ollama endpoint and use open-source models like `qwen2.5:3b`.

The result is a fully local AI coding assistant: no per-token billing and full control over your environment.


βš™οΈ Setup Guide

### 1. Install Ollama

```bash
# macOS (Homebrew); Linux and Windows installers are at ollama.com/download
brew install ollama
```

### 2. Pull a Coding Model

```bash
ollama pull qwen2.5:3b
```

### 3. Install Claude Code

```bash
npm install -g @anthropic-ai/claude-code
```

### 4. Configure the Local Endpoint

```bash
# Claude Code reads these to decide where to send requests
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
```
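Note that `export` only affects the current shell session. If you want the override to survive new terminals, one option is to append the variables to your shell profile (a zsh example; use `~/.bashrc` or equivalent for your shell):

```shell
# Persist the local-endpoint override across sessions (zsh example)
echo 'export ANTHROPIC_AUTH_TOKEN=ollama' >> ~/.zshrc
echo 'export ANTHROPIC_BASE_URL=http://localhost:11434' >> ~/.zshrc
source ~/.zshrc
```

To go back to Anthropic's hosted API later, remove those lines and open a fresh shell.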

### 5. Run Claude Code Locally

```bash
claude --model qwen2.5:3b
```
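By default `claude` starts an interactive session. Recent versions also offer a non-interactive print mode, which is handy for one-off questions or scripting (flag availability may vary with your Claude Code version):

```shell
# Interactive session against the local model
claude --model qwen2.5:3b

# One-shot prompt (print mode): answer and exit
claude -p "Summarize what this directory contains" --model qwen2.5:3b
```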

## 🧠 What This Actually Does

Instead of sending requests to Anthropic’s servers, Claude Code:

  • Calls a local API (Ollama)
  • Uses an open-source LLM
  • Executes agentic workflows on your machine
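You can poke at that local API directly. Ollama serves plain HTTP on port 11434, so a quick sanity check (assuming the Ollama server is running and `qwen2.5:3b` is pulled) looks like this:

```shell
# Confirm the Ollama server is reachable
curl http://localhost:11434/api/version

# Send a one-off prompt to the same model Claude Code will use
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:3b",
  "prompt": "Write a one-line Python hello world.",
  "stream": false
}'
```

If both calls return JSON, Claude Code has everything it needs on your machine.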

## ✨ Benefits

  • No API cost β†’ completely free usage
  • Privacy-first β†’ your code never leaves your system
  • Flexible models β†’ switch between different open-source LLMs
  • Offline capability → works without internet once a model is pulled
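Switching models is just a pull plus a flag. For instance, to try a larger code-focused model (the tag below is an example; browse the Ollama model library for current options):

```shell
# Pull an alternative open-source model
ollama pull qwen2.5-coder:7b

# Point Claude Code at it instead
claude --model qwen2.5-coder:7b
```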

## ⚠️ Limitations

Let’s be honest:

  • Not equivalent to Claude Sonnet/Opus quality
  • Smaller models struggle with complex reasoning
  • Performance depends on your hardware

For example:

  • 3B models β†’ fast but limited
  • 7B–13B β†’ balanced
  • 30B+ β†’ powerful but slow on laptops
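A quick way to see whether a model actually fits your hardware is to check what Ollama has on disk and what it has loaded into memory:

```shell
# List locally available models and their sizes on disk
ollama list

# Show which models are currently loaded and how much RAM/VRAM they occupy
ollama ps
```

If `ollama ps` shows a model split between GPU and CPU, expect noticeably slower generations.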

## 💡 When to Use This

Best use cases:

  • Local development assistant
  • Code autocomplete / small tasks
  • Privacy-sensitive projects
  • Cost-sensitive workflows

## 🚀 Final Thoughts

This setup represents a shift toward:

> Local-first AI development

While cloud models still lead in performance, local setups are becoming increasingly practical for everyday workflows.

And for developers, this means:

πŸ‘‰ More control
πŸ‘‰ Lower cost
πŸ‘‰ Faster experimentation


⭐ If you're building with local AI agents, I’d love to hear your setup.
