Tsunamayo
Your Local LLM Just Learned to Think: Building an Autonomous ReAct Agent with Ollama + MCP

Your local Ollama model just learned to think for itself.

With helix-agent v0.4.0, your local LLM doesn't just answer questions — it reasons step by step, uses tools, and iterates until it solves the problem. All through Claude Code, at zero API cost.

What Changed

helix-agent started as a simple proxy: send a prompt to Ollama, get text back. Now it's an autonomous ReAct agent.

Here's what that looks like in practice:

Task: "Read pyproject.toml and summarize the project"

Step 1: LLM thinks "I need to read the file"
        -> calls read_file("pyproject.toml")
        -> gets file contents

Step 2: LLM analyzes the contents
        -> calls finish("v0.4.0, deps: fastmcp + httpx, MIT license")

Done. 2 steps. Correct answer.

The LLM decided what to do, executed it, observed the result, and formed its answer. No human guidance needed.
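That observe-think-act cycle can be sketched as a small loop. The function and field names below (`react_loop`, `action`, `args`) are illustrative, not helix-agent's actual API; the post only confirms that each step carries a `thought` field and ends with a `finish` call:

```python
import json

def react_loop(llm, tools, task, max_steps=10):
    """Minimal ReAct loop sketch: ask the model for a JSON step,
    run the chosen tool, feed the observation back, repeat."""
    history = []
    for _ in range(max_steps):
        # The model returns its next step as structured JSON.
        step = json.loads(llm(task, history))
        thought, action, args = step["thought"], step["action"], step["args"]
        if action == "finish":
            return args["answer"]
        # Execute the chosen tool and record the observation.
        observation = tools[action](**args)
        history.append({"thought": thought, "action": action,
                        "args": args, "observation": observation})
    return None  # step budget exhausted
```

The key property is that the loop, not the human, decides when to stop: the model keeps acting until it emits `finish` or runs out of steps.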

Built-in Tools

The agent has 7 tools it can use autonomously:

| Tool | What it does |
| --- | --- |
| read_file | Read any file (security-guarded) |
| write_file | Create or modify files |
| list_files | Browse directories |
| search_in_file | Regex search within files |
| run_command | Execute git, python, uv, ollama |
| calculate | Evaluate math expressions |
| search_memory | Query Qdrant knowledge base |
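As a rough illustration of how tools like these could be wired up — the decorator, registry, and signatures below are hypothetical, not helix-agent's actual code:

```python
import math
import re

TOOLS = {}

def tool(fn):
    """Register a function as an agent-callable tool by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def calculate(expression: str) -> str:
    # Evaluate with builtins stripped and only math functions in scope.
    return str(eval(expression, {"__builtins__": {}}, vars(math)))

@tool
def search_in_file(path: str, pattern: str) -> list[str]:
    # Return every line matching the regex pattern.
    with open(path) as f:
        return [line.rstrip() for line in f if re.search(pattern, line)]
```

A registry keyed by name is what lets the ReAct loop dispatch whatever `action` string the model emits without a hardcoded if/else chain.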

Security: PathGuard

Letting an LLM touch your filesystem sounds dangerous. PathGuard makes it safe:

  • Directory allowlist — agent can only access specified folders
  • Sensitive file blocking — .env, credentials, SSH keys are untouchable
  • Path traversal prevention — ../../ attacks are caught and blocked
  • Command allowlist — only git, python, uv, ollama can be executed
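A minimal sketch of how the first three checks could be combined (illustrative only — the allowlist, blocked names, and function name are assumptions, not PathGuard's real implementation):

```python
from pathlib import Path

# Hypothetical allowlist and blocklist for illustration.
ALLOWED_DIRS = [Path("/tmp/project").resolve()]
BLOCKED_NAMES = {".env", "credentials", "id_rsa", "id_ed25519"}

def is_path_allowed(raw: str) -> bool:
    # resolve() collapses ../ segments, defeating traversal tricks
    # like "allowed/dir/../../etc/passwd".
    p = Path(raw).resolve()
    if p.name in BLOCKED_NAMES:
        return False
    # Only paths inside an allowlisted directory pass.
    return any(p.is_relative_to(d) for d in ALLOWED_DIRS)
```

Resolving before comparing is the important step: checking the raw string against a prefix would let `../../` escapes through.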

Why ReAct Instead of Native Function Calling?

Ollama's native tools API only works with a few models (Llama 3.1, Mistral Nemo). Worse, Qwen3.5 has known bugs with it.

helix-agent uses prompt-based ReAct with JSON structured output. This means:

  • Works with every Ollama model
  • Reasoning is visible (the thought field)
  • Easy to debug
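Concretely, each step the model emits might look something like this — the post confirms the `thought` field; the other field names here are an illustrative guess at the shape, not helix-agent's exact schema:

```json
{
  "thought": "I need to read the file before summarizing it",
  "action": "read_file",
  "args": { "path": "pyproject.toml" }
}
```

Because every model can emit JSON text, this works without native function-calling support, and the `thought` field doubles as a debugging trace.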

Setup (2 Minutes)

# 1. Make sure Ollama is running, then pull a model
ollama pull gemma3

# 2. Clone and install
git clone https://github.com/tsunamayo7/helix-agent.git
cd helix-agent && uv sync

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "helix-agent": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/helix-agent", "python", "server.py"]
    }
  }
}

Replace /path/to/helix-agent with your actual clone path. Restart Claude Code.

What You Can Do With It

Single-shot reasoning:

"Use helix-agent to review this function for bugs"

Multi-step agent tasks:

"Use helix-agent agent to explore the src directory and explain the architecture"

Benchmarking:

"Run helix-agent models benchmark to rank my local models"

The Numbers

  • 144 tests passing
  • 7 built-in agent tools
  • <5% context overhead (PAL MCP uses ~50%)
  • Works with any Ollama model
  • MIT license

GitHub: tsunamayo7/helix-agent

Feedback and stars welcome.
