DEV Community

Amariah Kamau


Build an AI Agent That Actually Understands Your Codebase (Without Switching Editors)

I use Neovim. I'm not switching.

But I also want an agent that can refactor across 50 files, run tests, debug failures, and come back with a working PR — not just suggest the next line.

The problem: every AI coding tool that does serious agentic work wants to be your editor. Cursor, Windsurf, and GitHub Copilot Workspace are all built on or around VS Code. If you use anything else, you're a second-class citizen.

So I built something different. An agent that has its own workspace beside my editor, not inside it. I stay in Neovim. The agent gets a live map of the codebase, a terminal, file tools, and an approval queue.

This post walks through how it works, how to set it up, and what it actually looks like in practice.


The Core Problem with Raw Code Injection

Before getting into the setup, it's worth understanding why most agents struggle with large codebases.

The standard approach is to dump files into the context window:

Here is your codebase:
[file 1 - 500 lines]
[file 2 - 800 lines]
[file 3 - 1200 lines]
...

This works at small scale. At 20+ files it starts to break down. The model spends most of its reasoning budget reconstructing architecture from raw text instead of actually solving the problem.

The better approach: give the agent a structured map of the codebase and let it query specific parts on demand. Think of it like the difference between handing someone a stack of printed pages vs giving them a searchable database with a good schema.

That's the core idea behind Atlarix's Live Code Map — your repo gets parsed into a node/edge graph that the agent navigates instead of reading raw files linearly.
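To make the node/edge idea concrete, here is a minimal sketch of what such a graph and a targeted query could look like. The type names, relation labels, and the src/services/user.ts path are assumptions for illustration; the post does not document Atlarix's actual schema.

```typescript
// Hypothetical shapes: the real Live Code Map schema is not shown in this post.
type NodeKind = "file" | "function" | "route" | "class";

interface CodeNode {
  id: string;
  kind: NodeKind;
  path: string;
}

interface CodeEdge {
  from: string;
  to: string;
  relation: "imports" | "calls" | "defines";
}

class CodeMap {
  private nodes = new Map<string, CodeNode>();
  private outgoing = new Map<string, CodeEdge[]>();

  addNode(node: CodeNode): void {
    this.nodes.set(node.id, node);
  }

  addEdge(edge: CodeEdge): void {
    const list = this.outgoing.get(edge.from) ?? [];
    list.push(edge);
    this.outgoing.set(edge.from, list);
  }

  // Return only the nodes directly reachable from `id`, optionally filtered by relation.
  neighbors(id: string, relation?: CodeEdge["relation"]): CodeNode[] {
    const result: CodeNode[] = [];
    for (const edge of this.outgoing.get(id) ?? []) {
      if (relation !== undefined && edge.relation !== relation) continue;
      const node = this.nodes.get(edge.to);
      if (node !== undefined) result.push(node);
    }
    return result;
  }
}

const codeMap = new CodeMap();
codeMap.addNode({ id: "routes/auth", kind: "file", path: "src/routes/auth.ts" });
codeMap.addNode({ id: "services/user", kind: "file", path: "src/services/user.ts" });
codeMap.addNode({ id: "lib/errors", kind: "file", path: "src/lib/errors.ts" });
codeMap.addEdge({ from: "routes/auth", to: "services/user", relation: "imports" });
codeMap.addEdge({ from: "routes/auth", to: "lib/errors", relation: "imports" });

// The agent asks for one node's imports instead of reading every file.
const authDeps = codeMap.neighbors("routes/auth", "imports").map((n) => n.path);
console.log(authDeps);
```

The point of the query method is that the answer is a handful of paths, not thousands of lines of source, so the context window is spent on the task instead of on reconstruction.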


Setup

1. Download and Install

Grab the installer from atlarix.dev.

  • macOS: .dmg, notarized, installs to Applications
  • Linux: .AppImage (auto-updates), .deb, or .rpm
  • Windows: unsigned .exe (code signing coming)

2. Install the CLI

Open Atlarix → Settings → General → Install CLI.

This drops an atlarix binary into your PATH. After that:

# Open current directory as a workspace
atlarix .

# Open a specific path
atlarix ~/projects/my-api

Same muscle memory as "code ." in VS Code. Atlarix opens in the background and your terminal returns immediately.

3. Connect a Model

Go to Settings → AI. You have three options:

BYOK (Bring Your Own Key) — paste in an API key for any of the supported providers:

OpenAI, Anthropic, Google Gemini, Groq, Together AI,
Mistral, xAI, OpenRouter, AWS Bedrock

Local models via Ollama or LM Studio — set the base URL and pick your model. No API key needed. Works on the free Solo tier.

Ollama base URL: http://localhost:11434
Model: qwen2.5-coder:7b

For local models I've had good results with qwen2.5-coder:7b and deepseek-coder-v2:16b on an M2 MacBook. The structured context from the code map means you don't need a massive model to get useful results.

Model Bridge (managed) — Atlarix's own managed tier (Speed/Standard/Deep) if you don't want to manage keys. Speed tier is included in the free Solo plan.


Opening a Workspace

cd ~/projects/my-saas-api
atlarix .

Atlarix opens and loads your workspace. The first thing it does is a lightweight repo scan via git ls-files — fast, respects your .gitignore, and doesn't run any heavy parsing upfront.
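As a rough illustration of why this scan is cheap: git ls-files prints one tracked path per line, so a first pass only has to split text. The helper below is a hypothetical sketch of that step (the function name and the per-extension summary are my assumptions, not Atlarix internals). It takes the command's output as a string so it stays dependency-free; in practice you would capture that output with a process API.

```typescript
// Pure helper: takes the text printed by `git ls-files` (one path per line)
// and counts files per extension. No file contents are read.
function summarizeFileListing(listing: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of listing.split("\n")) {
    const path = line.trim();
    if (path === "") continue;
    const name = path.slice(path.lastIndexOf("/") + 1);
    const dot = name.lastIndexOf(".");
    // Dotfiles such as ".gitignore" and names without a dot are grouped as "(none)".
    const ext = dot > 0 ? name.slice(dot) : "(none)";
    counts.set(ext, (counts.get(ext) ?? 0) + 1);
  }
  return counts;
}

// Sample listing; a real one would come from running `git ls-files` in the repo.
const sampleListing = [
  "src/routes/auth.ts",
  "src/middleware/rateLimiter.ts",
  "tests/auth.test.ts",
  "package.json",
  ".gitignore",
].join("\n");

const summary = summarizeFileListing(sampleListing);
console.log(summary.get(".ts"), summary.get(".json"), summary.get("(none)"));
```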

The Live Code Map is built on-demand when the agent first calls get_blueprint. You can also trigger it manually from the Blueprint tab in the right panel.

For a 500-file TypeScript repo, the initial parse takes 3–5 seconds. After that it's cached and incremental.


Your First Agent Task

Let's say I want to add rate limiting to my auth routes. Here's what that actually looks like.

Step 1: Start in Explore Mode

Pick Explore from the mode selector. This is read-only — the agent can query the code map, read files, and search, but can't write anything. Good for orientation.

Me: Map the auth flow. Where does a login request go from the 
    route handler to the database?

The agent calls get_blueprint to pull the architectural graph, then navigates from the auth route to the middleware stack to the database layer. It comes back with a precise answer and which files are involved — without reading every file in the codebase.
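Navigating from a route to the database is essentially a path search over the graph. Here's a minimal sketch using breadth-first search; the node names and the adjacency-list shape are made up for illustration and are not what get_blueprint actually returns.

```typescript
// Hypothetical adjacency list; node names are illustrative, not Atlarix output.
const authGraph: Record<string, string[]> = {
  "POST /auth/login": ["authMiddleware"],
  authMiddleware: ["authService"],
  authService: ["userRepository", "tokenService"],
  userRepository: ["database"],
  tokenService: [],
  database: [],
};

// Breadth-first search: returns the shortest path from `start` to `goal`,
// or null when `goal` cannot be reached.
function findPath(
  edges: Record<string, string[]>,
  start: string,
  goal: string,
): string[] | null {
  const previous = new Map<string, string | null>([[start, null]]);
  const pending: string[] = [start];
  while (pending.length > 0) {
    const current = pending.shift() as string;
    if (current === goal) {
      // Walk the predecessor chain back to the start.
      const path: string[] = [];
      let step: string | null = current;
      while (step !== null) {
        path.unshift(step);
        step = previous.get(step) ?? null;
      }
      return path;
    }
    for (const next of edges[current] ?? []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        pending.push(next);
      }
    }
  }
  return null;
}

const loginPath = findPath(authGraph, "POST /auth/login", "database");
console.log(loginPath?.join(" -> "));
```

Only the nodes on that path need to be opened as files, which is why the answer can be precise without a full read of the repo.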

Step 2: Switch to Plan Mode

Switch to Plan mode. The agent can now draft a plan but still can't execute writes.

Me: I want to add rate limiting to the login and registration 
    routes. Use express-rate-limit. Failed attempts should be 
    tracked by IP. After 5 failures in 10 minutes, lock out 
    for 30 minutes.

The agent knows the architecture from the previous explore session. It drafts a step-by-step plan:

Plan: Add rate limiting to auth routes

[ ] Install express-rate-limit
[ ] Create rate limiter config in src/middleware/rateLimiter.ts
[ ] Apply limiter to POST /auth/login in src/routes/auth.ts
[ ] Apply limiter to POST /auth/register in src/routes/auth.ts
[ ] Add error response handler for 429 status
[ ] Update integration tests in tests/auth.test.ts

Review the plan. If it looks right, switch to Build mode.

Step 3: Build — Approval Queue in Action

In Build mode, the agent starts executing. For every file write, you get an approval prompt:

Write: src/middleware/rateLimiter.ts

+ import rateLimit from 'express-rate-limit';
+
+ export const loginRateLimiter = rateLimit({
+   windowMs: 10 * 60 * 1000, // 10 minutes
+   max: 5,
+   message: { error: 'Too many login attempts. Try again later.' },
+   standardHeaders: true,
+   legacyHeaders: false,
+ });
+
+ export const registerRateLimiter = rateLimit({
+   windowMs: 60 * 60 * 1000, // 1 hour
+   max: 3,
+   message: { error: 'Too many registration attempts.' },
+   standardHeaders: true,
+   legacyHeaders: false,
+ });

[Approve] [Reject]

You approve. The file gets written. The agent continues through the plan.
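One thing worth checking at this approval step: the limiter config caps requests per window, but on its own it does not express the "5 failures in 10 minutes, then locked out for 30 minutes" rule from the prompt. A minimal sketch of that rule as plain state tracking follows. It is independent of express-rate-limit, keeps everything in memory, and all names are hypothetical.

```typescript
const MAX_FAILURES = 5;
const WINDOW_MS = 10 * 60 * 1000; // failures older than this are forgotten
const LOCKOUT_MS = 30 * 60 * 1000; // how long an IP stays locked

class FailedLoginTracker {
  private failures = new Map<string, number[]>();
  private lockedUntil = new Map<string, number>();

  isLocked(ip: string, now: number): boolean {
    const until = this.lockedUntil.get(ip);
    if (until === undefined) return false;
    if (now >= until) {
      this.lockedUntil.delete(ip); // lock expired
      return false;
    }
    return true;
  }

  // Record one failed attempt; returns true if this failure triggered a lockout.
  recordFailure(ip: string, now: number): boolean {
    const recent = (this.failures.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
    recent.push(now);
    if (recent.length >= MAX_FAILURES) {
      this.lockedUntil.set(ip, now + LOCKOUT_MS);
      this.failures.delete(ip);
      return true;
    }
    this.failures.set(ip, recent);
    return false;
  }

  // A successful login clears the failure history for that IP.
  recordSuccess(ip: string): void {
    this.failures.delete(ip);
  }
}

// Five failures, one minute apart, from a documentation-range IP.
const tracker = new FailedLoginTracker();
const minute = 60 * 1000;
let lockedOnFifth = false;
for (let i = 0; i < 5; i++) {
  lockedOnFifth = tracker.recordFailure("203.0.113.7", i * minute);
}
const lockedAt20 = tracker.isLocked("203.0.113.7", 20 * minute);
const lockedAt35 = tracker.isLocked("203.0.113.7", 35 * minute);
console.log(lockedOnFifth, lockedAt20, lockedAt35);
```

Timestamps are passed in rather than read from the clock so the behavior is easy to test. A production version would need shared storage if the API runs on more than one process.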

For terminal commands:

Terminal: npm install express-rate-limit

[Approve] [Reject]

You approve. It runs, shows you the output, continues.

This isn't friction — it's the same review cycle as a PR, but live. You're watching the work happen and approving as it goes rather than reviewing after the fact.
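If it helps to think of the approval cycle as data, here's a rough model of it: a list of pending actions, a decision per action, and rejection reasons collected as feedback for the agent. The types and the stop-at-first-rejection behavior are assumptions made for this sketch, not a description of Atlarix's internals.

```typescript
// Hypothetical model of an approval queue.
type PendingAction =
  | { kind: "write"; path: string; content: string }
  | { kind: "terminal"; command: string };

type Decision =
  | { approved: true }
  | { approved: false; reason?: string };

interface QueueResult {
  executed: PendingAction[];
  feedback: string[]; // rejection reasons handed back to the agent for replanning
}

function processQueue(
  actions: PendingAction[],
  decide: (action: PendingAction) => Decision,
): QueueResult {
  const result: QueueResult = { executed: [], feedback: [] };
  for (const action of actions) {
    const decision = decide(action);
    if (decision.approved) {
      // A real agent would write the file or run the command here.
      result.executed.push(action);
    } else {
      if (decision.reason !== undefined) result.feedback.push(decision.reason);
      break; // assumed behavior: stop at the first rejection so the agent can replan
    }
  }
  return result;
}

const pendingActions: PendingAction[] = [
  { kind: "terminal", command: "npm install express-rate-limit" },
  { kind: "write", path: "src/middleware/rateLimiter.ts", content: "// ..." },
  { kind: "write", path: "src/routes/auth.ts", content: "// ..." },
];

// Approve everything except the route change, and say why.
const outcome = processQueue(pendingActions, (action) =>
  action.kind === "write" && action.path === "src/routes/auth.ts"
    ? { approved: false, reason: "Use the existing AppError class, not a raw throw" }
    : { approved: true },
);
console.log(outcome.executed.length, outcome.feedback);
```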

Step 4: Review in Your Editor

The agent writes the files. You review them in Neovim, VS Code, IntelliJ, or wherever. The agent doesn't care what you're using to review — it just cares about the approval queue.

# Meanwhile in my terminal
nvim src/middleware/rateLimiter.ts
# Looks good, approve the rest in Atlarix

Step 5: Tests

If the agent runs tests as part of the plan and they fail, it reads the failure output and iterates. Self-correction is built in — up to a configurable number of retry attempts before it stops and asks you what to do.

Test run: npm test
FAIL tests/auth.test.ts
  ● Auth › POST /login › should return 429 after 5 attempts
    Expected: 429
    Received: 200

Diagnosing failure...

It reads the test, traces the failure to a missing dev dependency, installs it, and re-runs. Pass.
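The self-correction loop is easy to picture as a bounded retry. Below is a minimal sketch under stated assumptions: the function names are hypothetical, the test runner and the fix step are injected as callbacks, and everything is synchronous to keep it short (a real runner would be async).

```typescript
interface TestResult {
  passed: boolean;
  output: string;
}

// Run tests, and on failure hand the output to `attemptFix` before retrying.
// Stops after `maxAttempts` test runs so a human can decide what to do next.
function runWithSelfCorrection(
  runTests: () => TestResult,
  attemptFix: (failureOutput: string) => void,
  maxAttempts: number,
): { passed: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runTests();
    if (result.passed) return { passed: true, attempts: attempt };
    if (attempt < maxAttempts) attemptFix(result.output);
  }
  return { passed: false, attempts: maxAttempts };
}

// Simulated project state: tests fail until the fix has been applied once.
let fixed = false;
const report = runWithSelfCorrection(
  () =>
    fixed
      ? { passed: true, output: "PASS" }
      : { passed: false, output: "Expected: 429, Received: 200" },
  () => {
    fixed = true;
  },
  3,
);
console.log(report);
```

The cap matters more than the loop: without it, an agent that misdiagnoses the failure can burn tokens indefinitely.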


Modes Reference

| Mode | What the agent can do |
| --- | --- |
| Explore | Read files, query code map, search, web search. No writes. |
| Plan | Everything in Explore + draft plans, create .atlarix/ATLARIX_PLAN.md |
| Build | Full tool access: file writes, terminal, tests, MCP calls |
| Fix | Focused on diagnosing and fixing specific errors |
| Review | Read-only analysis, code quality feedback, architectural suggestions |

You can switch modes mid-session. The agent's context persists across the switch.
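One way to read the modes is as capability sets that gate each tool call. The sketch below is a simplification with hypothetical names; in particular, the post doesn't spell out what Fix mode may write or run, so its entry here is an assumption.

```typescript
type Mode = "explore" | "plan" | "build" | "fix" | "review";
type Capability = "read" | "search" | "draftPlan" | "write" | "terminal";

// Simplified from the modes reference; the "fix" row is assumed, not documented.
const MODE_CAPABILITIES: Record<Mode, readonly Capability[]> = {
  explore: ["read", "search"],
  plan: ["read", "search", "draftPlan"],
  build: ["read", "search", "draftPlan", "write", "terminal"],
  fix: ["read", "search", "write", "terminal"],
  review: ["read", "search"],
};

// Check a requested tool call against the current mode before running it.
function isAllowed(mode: Mode, capability: Capability): boolean {
  return MODE_CAPABILITIES[mode].includes(capability);
}

console.log(isAllowed("explore", "write"), isAllowed("build", "terminal"));
```

Because the check depends only on the current mode, switching modes mid-session changes what the agent may do without touching its context.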


What Works Well With Local Models

If you're running Ollama, here's what I've found works well and what doesn't:

Works great:

  • Explore mode — querying the code map, finding files, understanding architecture. Even a 7B model does this well when it has the graph.
  • Simple, scoped Build tasks — "add a field to this schema and update the related API endpoint"
  • Fix mode — diagnosing TypeScript errors with LSP output injected into context

Works better with larger models:

  • Multi-step autonomous plans across many files
  • Complex refactors with non-obvious dependency chains
  • Tasks that require reasoning about edge cases and side effects

The practical threshold I've found: for anything touching more than 10 files or involving significant architectural decisions, I switch from my local 7B to a Standard tier cloud model. For everything else, local is fast and free.
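Written down as a rule, that threshold looks something like this. The type names and model labels are placeholders I made up for the sketch; only the "more than 10 files or significant architectural decisions" cutoff comes from the paragraph above.

```typescript
type ModelChoice = "local-7b" | "cloud-standard";

interface TaskEstimate {
  filesTouched: number;
  architectural: boolean; // true when the task involves significant design decisions
}

// More than `fileThreshold` files, or architectural work, goes to the cloud tier.
function pickModel(task: TaskEstimate, fileThreshold = 10): ModelChoice {
  return task.filesTouched > fileThreshold || task.architectural
    ? "cloud-standard"
    : "local-7b";
}

console.log(pickModel({ filesTouched: 3, architectural: false }));
console.log(pickModel({ filesTouched: 25, architectural: false }));
```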


Tips

Use .atlarix/ATLARIX.md for persistent context. This file gets injected into every session for this workspace. Put your tech stack, conventions, and any context the agent should always know.

# Project Context

Stack: Node.js, Express, TypeScript, PostgreSQL, Prisma
Auth: JWT with refresh tokens, stored in httpOnly cookies
Testing: Jest + Supertest
Conventions: 
- All database queries go through service layer, never directly in routes
- Error handling via custom AppError class in src/lib/errors.ts

Start tasks in Explore mode. Even for tasks you think are simple, a quick explore turn first means the agent navigates correctly from the start rather than making wrong assumptions.

Reject and explain, don't just reject. When you reject an approval queue item, add a reason. The agent uses it to replan. "Reject — use the existing AppError class for error handling, not a raw throw" gets you a better next attempt than a silent reject.

Check the Blueprint canvas on new codebases. The visual graph in the Blueprint tab is useful for understanding unfamiliar repos. Filter by file type, zoom into a module, and let the graph show you the dependency shape before you start prompting.


Free Tier Summary

| Feature | Solo (Free) |
| --- | --- |
| Workspaces | 1 |
| Local models (Ollama, LM Studio) | ✓ Unlimited |
| Model Bridge Speed tier | ✓ Included usage |
| Core tools (file, terminal, search, blueprint) | |
| MCP marketplace | ✗ (1 manual MCP) |
| Agent Behaviors marketplace | |

For a solo developer using local models, the free tier has no usage cap: there are no token limits on Ollama usage.


Getting Started

# 1. Download from atlarix.dev
# 2. Install CLI from Settings → General
# 3. Open your project
atlarix .

# 4. Connect Ollama (or paste an API key)
# 5. Start in Explore mode, describe your codebase
# 6. Switch to Build when you're ready to execute

The learning curve is less steep than it sounds. The hardest part is resisting the urge to jump straight into Build mode — a quick Explore turn first makes everything else go smoother.


atlarix.dev — macOS + Linux, free Solo tier

Open source: atlarix-skills · atlarix-mcps

Questions? Drop them in the comments — happy to go deep on any part of this.
