DEV Community

XINMurat
# I Built an AI Agent That Analyzes Your Entire Codebase in One Command

*From a static prompt library to a fully autonomous agentic framework — the story of Beyan v2.0*

Most code analysis tools tell you what to look for. Beyan tells you everything — automatically.

I've been building Beyan, an open-source agentic framework that scans your project, detects its tech stack, compiles a custom analysis prompt, and runs a deep technical audit — all in a single CLI command. No manual configuration. No copy-pasting prompts. Just point it at your codebase and let it work.

Here's what that looks like:

```bash
python cli/analyzer.py --target /your/project --mode 1 --lang en --api anthropic
```

That one command produces a structured report covering security vulnerabilities, performance bottlenecks, code quality patterns, API design issues, accessibility gaps, and more — calibrated specifically for your project's tech stack.


## The Problem I Was Solving

I started Beyan as a prompt library — v1.0 was a carefully structured collection of markdown prompts for technical audits. Each prompt was hand-crafted for a specific project type: web apps, OS/firmware, AI/ML research, DevOps infrastructure, blockchain, and so on.

The core insight behind v1.0 was simple but powerful: applying the wrong analysis prompt gives you wrong results. A security audit prompt designed for a React app doesn't ask the right questions about a Kubernetes cluster. Type-awareness matters.

But v1.0 had a friction problem. Using it required:

  1. Reading the triage prompt to figure out which analysis to run
  2. Opening the right prompt file
  3. Copy-pasting it into your AI assistant
  4. Manually attaching your project files
  5. Repeating for each dimension (security, performance, etc.)

That's 15-20 minutes of setup before you get a single line of analysis. For a daily tool, that's too much.

v2.0 eliminates all of that.


## How Beyan v2.0 Works

The architecture has four layers:

### 1. Discovery Engine (`discovery.py`)

Beyan scans your project directory and fingerprints it using a multi-stage detection system:

```python
FINGERPRINTS = {
    "node": {
        "files": ["package.json", "yarn.lock"],
        "extensions": [".js", ".ts", ".jsx", ".tsx"],
        "content": {"package.json": r'"dependencies":'}
    },
    "infrastructure": {
        "files": ["main.tf", "kustomization.yaml"],
        "extensions": [".tf", ".yaml"],
        "content": {".yaml": r"apiVersion:|kind:\s*Deployment"}
    },
    # ... 20+ technology fingerprints
}
```

It goes beyond file extensions — it reads file content to confirm. A .yaml file only triggers the infrastructure modules if it actually contains Kubernetes or Terraform syntax. This prevents false positives.
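The confirmation step might look roughly like this. This is an illustrative sketch, not Beyan's actual code: when a fingerprint defines a content pattern, that pattern has to match before the detection counts.

```python
import re
from pathlib import Path

# Illustrative fingerprint (mirrors the shape shown above, trimmed down).
FINGERPRINTS = {
    "node": {
        "files": ["package.json"],
        "extensions": [".js", ".ts"],
        "content": {"package.json": r'"dependencies":'},
    },
}

def matches_fingerprint(root: str, spec: dict) -> bool:
    """Two-stage check: file/extension hit first, content confirmation second."""
    root_path = Path(root)
    # Stage 1: are the marker files or extensions present at all?
    has_marker = any((root_path / f).exists() for f in spec["files"])
    has_ext = any(p.suffix in spec["extensions"] for p in root_path.rglob("*"))
    if not (has_marker or has_ext):
        return False
    # Stage 2: confirm via content patterns, when any are defined
    for fname, pattern in spec.get("content", {}).items():
        candidate = root_path / fname
        if candidate.exists() and re.search(pattern, candidate.read_text(errors="ignore")):
            return True
    # No content rules defined: the stage-1 hit is enough
    return not spec.get("content")
```

A `package.json` without a `"dependencies":` key, for example, would pass stage 1 but fail stage 2.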

### 2. Compiler (`compiler.py`)

Once the tech stack is identified, the compiler loads the relevant analysis modules from MANIFEST.yaml and assembles them into a single, context-dense prompt:

```yaml
modules:
  security_analysis:
    priority: P0
    auto_load_if: [production, handles_pii, financial]
  react_typescript_analysis:
    priority: P1
    auto_load_if: [react, typescript, tsx]
```

P0 modules always load. P1+ modules load based on detected tags. If the compiled prompt exceeds the LLM's token limit, low-priority modules are pruned automatically — no truncation, no silent failures.
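The selection rule itself fits in a few lines. The sketch below is an illustrative reimplementation, not the real `compiler.py`:

```python
# Toy manifest in the same shape as MANIFEST.yaml above.
MANIFEST = {
    "security_analysis": {"priority": "P0", "auto_load_if": ["production"]},
    "react_typescript_analysis": {"priority": "P1", "auto_load_if": ["react", "typescript"]},
    "blockchain_analysis": {"priority": "P2", "auto_load_if": ["solidity"]},
}

def select_modules(manifest: dict, detected_tags: set) -> list:
    """P0 always loads; P1+ loads only when one of its tags was detected."""
    selected = []
    for name, spec in manifest.items():
        if spec["priority"] == "P0":
            selected.append(name)  # unconditional
        elif detected_tags & set(spec["auto_load_if"]):
            selected.append(name)  # at least one detected tag matches
    return selected
```

With `{"react"}` as the detected tags, this loads the security and React modules and skips blockchain.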

### 3. Three Operating Modes

**Mode 1 — Analysis Only:** Generates a structured report. No code changes. Safe to run on any project.

**Mode 2 — Analysis + Plan:** Runs the full analysis, then produces an `implementation_plan.md` with sprint-ready task breakdowns:

```markdown
## Sprint 1 — Critical Fixes (P0)
| Task | File:Line | Effort | Risk |
|------|-----------|--------|------|
| Fix SQL injection in UserService | services/user.py:142 | 2h | High |
```

**Mode 3 — Semi-Autonomous Fix:** The full agentic loop. Beyan analyzes, plans, writes code, runs tests, and commits — but pauses at three human checkpoints before touching anything:

```text
CHECKPOINT #1: "Here are the P0 issues. Proceed with auto-fix?"
CHECKPOINT #2: "Here's the diff. Apply this change?"
CHECKPOINT #3: "Tests pass. Commit to branch fix/beyan-p0-2024?"
```

Safety rules are hard-coded: never touch database migration files, never modify production configs, max 20 files per run, always create a safety branch first.
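A minimal sketch of how such a gate can be wired up. The pattern list and function names here are assumptions for illustration, not Beyan's actual rules:

```python
# Assumed protected-path substrings and file cap, mirroring the rules above.
PROTECTED_PATTERNS = ("migrations/", "production.")
MAX_FILES_PER_RUN = 20

def checkpoint(question: str) -> bool:
    """Block until a human explicitly approves the next destructive step."""
    return input(f"CHECKPOINT: {question} [y/N] ").strip().lower() == "y"

def safe_to_modify(paths: list) -> bool:
    """Hard rules checked before any file is touched."""
    if len(paths) > MAX_FILES_PER_RUN:
        return False  # too many files for one run
    return not any(pat in p for pat in PROTECTED_PATTERNS for p in paths)
```

Any plan touching `migrations/` or a `production.*` config is rejected outright; everything else still has to clear the interactive checkpoints.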

### 4. Multi-Provider API Support

Works with both OpenAI and Anthropic out of the box:

```bash
# OpenAI
python cli/analyzer.py --target . --mode 1 --api openai

# Anthropic
python cli/analyzer.py --target . --mode 3 --api anthropic
```

Session persistence means Mode 3 can be interrupted and resumed — the conversation history is saved to sessions/ as JSON.
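Resumability mostly comes down to serializing the conversation history after each turn. A sketch, with field names that are assumptions rather than Beyan's actual schema:

```python
import json
from pathlib import Path

def save_session(session_id: str, history: list, sessions_dir: str = "sessions") -> Path:
    """Dump the running conversation to sessions/<id>.json."""
    path = Path(sessions_dir)
    path.mkdir(exist_ok=True)
    out = path / f"{session_id}.json"
    out.write_text(json.dumps({"id": session_id, "history": history}, indent=2))
    return out

def resume_session(session_id: str, sessions_dir: str = "sessions") -> list:
    """Reload a saved conversation so an interrupted run can continue."""
    data = json.loads((Path(sessions_dir) / f"{session_id}.json").read_text())
    return data["history"]
```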


## The Module System

Beyan ships with 52 analysis modules organized into six categories:

| Category | Count | Examples |
|----------|-------|----------|
| Core | 23 | security, performance, database, UI/UX, accessibility |
| Domain | 7 | web/mobile, DevOps, AI/ML, blockchain, OS/firmware |
| Specialized | 10 | React+TypeScript, .NET Core, Turkish market compliance |
| Focus | 4 | security audit, performance audit, API audit, compliance |
| Testing | 3 | test generation, UI interaction tests, collaboration tests |
| Guides | 4 | security fixes, DB migration, performance optimization |

Every module follows the same two-layer analysis principle inherited from v1.0:

  • Descriptive layer: Documents what exists, no judgment
  • Evaluative layer: Assesses quality, identifies gaps, produces recommendations

The evaluative layer never starts until the descriptive layer is complete. This prevents premature conclusions from contaminating the factual record.


## The NOT DETECTED Contract

One of Beyan's core principles, carried forward from v1.0: if information cannot be found in the codebase, the output must say:

```text
⚠️ NOT DETECTED [which file/directory was searched]
```

Never guess. Never fabricate. This single rule is responsible for most of the hallucination resistance. An LLM that's forced to flag gaps instead of filling them with invented content produces dramatically more trustworthy output.
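Programmatically, the contract is a one-liner. This hypothetical helper (not from Beyan's codebase) shows the shape: a lookup that comes back empty emits the marker naming what was searched, never a guess.

```python
def report_or_flag(value, searched_location: str) -> str:
    """Return the finding, or the NOT DETECTED marker if there is none."""
    if value is None:
        return f"⚠️ NOT DETECTED [{searched_location}]"
    return str(value)
```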


## Token Budget Management

Context windows are finite. Beyan handles this automatically:

```python
def prune_modules_by_priority(manifest, modules_to_load, lang):
    """Drop one low-priority module per call: P3 first, then P2.
    P0 and P1 modules are always preserved."""
    for priority in ("P3", "P2"):
        prunable = [m for m in modules_to_load
                    if manifest["modules"][m]["priority"] == priority]
        if prunable:
            modules_to_load.remove(prunable[0])
            break
    return modules_to_load
```

The analyzer loop runs this iteratively until the compiled prompt fits within the configured token limit. You get the most important analysis every time, regardless of project size.
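Sketched end to end, with an assumed four-characters-per-token estimate (illustrative only, not the actual analyzer code), the loop looks like this:

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly 4 characters per token."""
    return len(text) // 4

def fit_to_budget(manifest: dict, modules: list, bodies: dict, limit: int) -> list:
    """Drop the lowest-priority module (P3 first, then P2) until the
    compiled prompt fits. P0/P1 modules are never pruned."""
    modules = list(modules)
    while estimate_tokens("".join(bodies[m] for m in modules)) > limit:
        for prio in ("P3", "P2"):
            prunable = [m for m in modules if manifest[m]["priority"] == prio]
            if prunable:
                modules.remove(prunable[0])
                break
        else:
            break  # only P0/P1 remain: stop pruning, ship what we have
    return modules
```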


## Quick Start

```bash
# Clone
git clone https://github.com/XINMurat/beyan.git
cd beyan/v2

# Install
pip install -r requirements.txt

# Set your API key
export ANTHROPIC_API_KEY=your_key_here
# or
export OPENAI_API_KEY=your_key_here

# Run analysis on any project
python cli/analyzer.py --target /path/to/your/project --mode 1 --lang en --api anthropic
```

The compiled prompt is also saved to beyan_compiled_prompt.md so you can inspect exactly what gets sent to the LLM — full transparency, no black boxes.


## What's Still in v1.0

The original prompt library lives in en/ and tr/ — 15 hand-crafted prompts covering every project type. If you want to run a manual deep-dive audit without the CLI, those are still there. v2.0 uses them as its knowledge base; they didn't go anywhere.

The self-referential development story is also documented: Beyan v1.0 was audited using its own Meta Audit prompt before release. Health score went from 2.45 to 4.75 after applying the findings. The full cycle is in tr/meta-analysis/.


## What's Next

A few things on the roadmap:

  • PyPI package — `pip install beyan-agentic` is coming
  • GitHub Actions integration — run Beyan as a CI step on every PR
  • More domain modules — game engine analysis, mobile-native deep dive
  • Web UI — for teams who don't want a CLI

## Links

If you try it on your project, I'd genuinely love to hear what the output looks like. Open an issue, start a discussion, or just drop a comment here.


Built with Claude, audited with Beyan, shipped with way too many git branch adventures.
