DEV Community

Eresh Gorantla

Architecture Breaks Silently. Now There's a Tool That Investigates.

There's a class of bugs that no single-file view will ever reveal.

Working in IoT, I've seen failures surface in one place while the real cause lived somewhere completely different. A device message handler would fail under load -- but the handler code looked correct. The root cause was spread across multiple files: a config value that disabled a cache, three async paths racing against the same shared resource, and a state recomputation that should never have been happening on every request. Each file was fine in isolation. The bug only existed in how they interacted at runtime.

This pattern keeps showing up in every complex system I've worked on. The hardest bugs aren't syntax errors. They're architectural -- config that silently changes downstream behavior, interface contracts that drift, concurrent paths that only break under real load.

Fixing them requires investigation: trace the failure, read the caller chain, inspect config, connect behavior across modules. That's exactly the workflow I wanted to automate.

So I built Archexa.

Demo Of Archexa

What it does

Archexa is a VS Code extension powered by the Archexa CLI -- a self-contained binary that scans your project with tree-sitter, then uses an LLM agent to investigate across files.

Six commands, all from right-click or keyboard shortcuts:

  • Review (Cmd+Shift+R) -- Architecture-aware code review. Findings appear as inline squiggles, just like ESLint or TypeScript diagnostics.
  • Diagnose (Cmd+Shift+D) -- Paste an error or stack trace. It traces the call chain to the root cause across files.
  • Impact (Cmd+Shift+I) -- "What breaks if I change this?" Traces callers and interface contracts.
  • Query (Cmd+Alt+Q) -- Ask anything about your codebase. Evidence-backed answers with file references.
  • Gist -- Quick codebase overview. Great for day-one onboarding.
  • Analyze -- Full architecture documentation with Mermaid diagrams.

Review findings

Setup (60 seconds)

# From VS Code Marketplace — search "Archexa" and click Install
# Or from the command line:
code --install-extension EreshGorantla.archexa

The setup wizard downloads the CLI binary automatically (~20 MB). No Python, no pip, no runtime dependencies.

Then: Sidebar > Settings > Connection > set your API key + endpoint.

Works with any OpenAI-compatible provider:

  • OpenAI (gpt-4o)
  • OpenRouter (gemini-2.5-flash, claude-sonnet-4-20250514, etc.)
  • Ollama (llama3.1 -- fully local, zero code leaves your machine)
  • vLLM, Azure OpenAI, LiteLLM
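All of these providers expose the same chat-completions protocol, which is why one client works against every one of them. A minimal sketch of what such a request looks like, using only the Python standard library (`build_chat_request` is an illustrative helper; the endpoint, key, and model below are placeholders):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build a chat-completions request for any OpenAI-compatible endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Against a local Ollama server (placeholder values):
req = build_chat_request(
    "http://localhost:11434/v1/", "ollama", "llama3.1",
    [{"role": "user", "content": "What does verify_token do?"}],
)
print(req.full_url)  # http://localhost:11434/v1/chat/completions
# urllib.request.urlopen(req) would send it; the response body is the
# usual {"choices": [...]} JSON shape.
```

Swapping providers means changing only the base URL, key, and model name; the request shape stays identical.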

How it works

Every command follows the same three-phase approach:

1. Scan -- The CLI parses your project into syntax trees with tree-sitter, mapping imports, function signatures, call sites, and module boundaries. This runs offline in seconds, and results are cached.
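Archexa uses tree-sitter so the scan works across languages; as a stdlib-only illustration of what a structural scan collects, here is the same idea applied to a single Python module (the `scan_module` helper is hypothetical, not part of Archexa):

```python
import ast

def scan_module(source: str) -> dict:
    """Map a module's imports, function signatures, and call sites --
    the structural facts a scan gathers before any LLM is involved."""
    tree = ast.parse(source)
    imports, functions, calls = [], [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            imports.append(node.module)
        elif isinstance(node, ast.FunctionDef):
            functions.append((node.name, [a.arg for a in node.args.args]))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append(node.func.id)
    return {"imports": imports, "functions": functions, "calls": calls}

index = scan_module(
    "import cache\n"
    "def handler(request):\n"
    "    return verify_token(request.token)\n"
)
print(index["imports"])    # ['cache']
print(index["functions"])  # [('handler', ['request'])]
print(index["calls"])      # ['verify_token']
```

An index like this is what lets the agent later answer "who calls `verify_token`?" without re-reading every file.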

2. Investigate -- The LLM becomes an agent with tools: read_file, grep_pattern, trace_callers, list_directory. It decides what to read and when it has enough evidence. Typical run: 3-6 iterations, 10-20 tool calls, 5-15 files examined.

Here's a real investigation trace:

Step 1: Read error handler -> found verify_token()
Step 2: Search verify_token -> 8 references, 5 files. Read cache module -> no synchronization
Step 3: Trace callers of cache.get_token -> 3 async handlers. Read config -> TTL=0
Step 4: Root cause: race condition. Concurrent handlers, unprotected cache, caching disabled by config.

The agent followed the trail across files, just like an experienced engineer would -- in under a minute.
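The loop driving those steps can be sketched as a tool-dispatch cycle. This is a minimal illustration only: the tool names (`read_file`, `grep_pattern`) come from the article above, while the scripted `decide` policy stands in for the real LLM call:

```python
def investigate(decide, tools, question, max_iters=6):
    """Run tool calls until the agent reports a conclusion."""
    evidence = []
    for _ in range(max_iters):
        step = decide(question, evidence)   # an LLM call in the real system
        if step["action"] == "conclude":
            return step["answer"], evidence
        result = tools[step["action"]](**step["args"])
        evidence.append((step["action"], step["args"], result))
    return "inconclusive", evidence

# Stub tools and a scripted three-step policy, standing in for the LLM:
tools = {
    "read_file": lambda path: f"<contents of {path}>",
    "grep_pattern": lambda pattern: [f"{pattern} found in cache.py"],
}
script = iter([
    {"action": "grep_pattern", "args": {"pattern": "verify_token"}},
    {"action": "read_file", "args": {"path": "cache.py"}},
    {"action": "conclude", "answer": "unprotected shared cache"},
])
answer, evidence = investigate(
    lambda q, e: next(script), tools, "why does the handler fail?"
)
print(answer)          # unprotected shared cache
print(len(evidence))   # 2
```

The key property is that the agent, not a fixed script, decides when the evidence list is sufficient to conclude.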

3. Synthesize -- All evidence assembled into one context-optimized prompt. The LLM generates the final output with file references and severity ratings.
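Archexa's exact pruning strategy isn't documented here, but one simple way to keep such a prompt within budget is to favor the newest evidence and drop older snippets once a character budget is spent -- a sketch under that assumption (`build_prompt` is hypothetical):

```python
def build_prompt(question, evidence, budget_chars=4000):
    """Assemble investigation evidence into one prompt, newest-first,
    dropping the oldest snippets once the character budget is spent."""
    sections, used = [], 0
    for tool, args, result in reversed(evidence):  # newest evidence first
        snippet = f"## {tool}({args})\n{result}\n"
        if used + len(snippet) > budget_chars:
            break
        sections.append(snippet)
        used += len(snippet)
    header = f"# Question\n{question}\n\n# Evidence\n"
    return header + "".join(reversed(sections))  # restore original order

prompt = build_prompt(
    "why does the handler fail under load?",
    [("read_file", "config.py", "TOKEN_CACHE_TTL = 0"),
     ("trace_callers", "cache.get_token", "3 async handlers")],
)
print("TOKEN_CACHE_TTL = 0" in prompt)  # True
```

Keeping evidence in discovery order, with the oldest items the first to go, means the final prompt reads like the investigation itself.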

The Archexa CLI

The extension is the UI layer. The real engine is the Archexa CLI -- a standalone binary that handles:

  • Tree-sitter AST parsing across 13 languages
  • Agent orchestration with tool-calling loops
  • Progressive context pruning as evidence grows
  • Streaming LLM communication via any OpenAI-compatible API

The binary communicates with the extension over stdout (streaming markdown) and stderr (JSON events). One download, chmod +x, works everywhere.
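That two-channel protocol is straightforward to consume from any host process. A sketch in Python, with a stand-in script in place of the real binary (the real extension reads both streams incrementally rather than waiting for exit):

```python
import json
import subprocess
import sys

# A stand-in "CLI" that writes markdown to stdout and JSON events to
# stderr, mimicking the two-channel protocol described above:
fake_cli = (
    "import sys, json\n"
    "sys.stdout.write('## Findings\\n- race condition in cache\\n')\n"
    "sys.stderr.write(json.dumps({'event': 'done', 'files': 5}) + '\\n')\n"
)

proc = subprocess.run(
    [sys.executable, "-c", fake_cli],
    capture_output=True, text=True, check=True,
)
markdown = proc.stdout  # the streamed report, rendered in the editor
events = [json.loads(line) for line in proc.stderr.splitlines()]
print(events[0]["event"])  # done
```

Separating human-readable output from machine-readable events means the same binary works equally well piped into a terminal or driven by an editor.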

Design principles

Fresh investigation every time. No memory between runs. Every analysis starts from a fresh scan of current code. The answer is always grounded in your codebase as it exists right now, not a cached version from last week. Tree-sitter caching keeps the structural scan fast.

Your data stays yours. Zero telemetry -- no usage stats, no crash reports. The binary talks to exactly one service: the LLM endpoint you configure. API keys live in VS Code's encrypted credential store and are passed via environment variable, never written to config files.

No gatekeeping. No accounts, no sign-up, no Archexa API key. Bring your own LLM provider. Open source under Apache 2.0.

Fully local option

brew install ollama
ollama pull llama3.1

# In Archexa settings:
# Endpoint: http://localhost:11434/v1/
# Model: llama3.1

Zero code leaves your machine. AST parsing is offline. LLM runs locally.

Platform support

  • OS -- macOS (Apple Silicon + Intel), Linux (x86_64 + ARM), Windows (x64)
  • Languages -- Python, TypeScript, JavaScript, Go, Java, Rust, Ruby, C#, Kotlin, Scala, C++, C, PHP
  • LLM providers -- Any OpenAI-compatible endpoint

Beta notice

Archexa is currently in beta. The core pipeline is stable, but the CLI binary is not yet code-signed.

macOS users: Gatekeeper may block the binary on first run since it's unsigned. The extension handles quarantine removal automatically in most cases. If you see a "cannot be opened" dialog, a notification with a "Fix Permissions" button will appear. Full troubleshooting steps are in the Usage Guide.

What's next

  • CI integration (findings as GitHub PR comments)
  • Custom agent tools (project-specific scripts during deep mode investigation)
  • Auto-fix after review (investigate a finding, generate the fix, apply with your approval)

Links


This is a beta release -- the core is stable and I'm iterating fast. Feedback, bug reports, and feature requests are all welcome on GitHub. Star the repo if you find it useful.

Built by Eresh Gorantla.
