DEV Community

XINMurat
# I Built an AI Agent That Analyzes Your Entire Codebase in One Command

*From a static prompt library to a fully autonomous agentic framework — the story of Beyan v2.0*

Most code analysis tools tell you what to look for. Beyan tells you everything — automatically.

I've been building Beyan, an open-source agentic framework that scans your project, detects its tech stack, compiles a custom analysis prompt, and runs a deep technical audit — all in a single CLI command. No manual configuration. No copy-pasting prompts. Just point it at your codebase and let it work.

Here's what that looks like:

```bash
python cli/analyzer.py --target /your/project --mode 1 --lang en --api anthropic
```

That one command produces a structured report covering security vulnerabilities, performance bottlenecks, code quality patterns, API design issues, accessibility gaps, and more — calibrated specifically for your project's tech stack.


## The Problem I Was Solving

I started Beyan as a prompt library — v1.0 was a carefully structured collection of markdown prompts for technical audits. Each prompt was hand-crafted for a specific project type: web apps, OS/firmware, AI/ML research, DevOps infrastructure, blockchain, and so on.

The core insight behind v1.0 was simple but powerful: applying the wrong analysis prompt gives you wrong results. A security audit prompt designed for a React app doesn't ask the right questions about a Kubernetes cluster. Type-awareness matters.

But v1.0 had a friction problem. Using it required:

  1. Reading the triage prompt to figure out which analysis to run
  2. Opening the right prompt file
  3. Copy-pasting it into your AI assistant
  4. Manually attaching your project files
  5. Repeating for each dimension (security, performance, etc.)

That's 15-20 minutes of setup before you get a single line of analysis. For a daily tool, that's too much.

v2.0 eliminates all of that.


## How Beyan v2.0 Works

The architecture has four layers:

### 1. Discovery Engine (`discovery.py`)

Beyan scans your project directory and fingerprints it using a multi-stage detection system:

```python
FINGERPRINTS = {
    "node": {
        "files": ["package.json", "yarn.lock"],
        "extensions": [".js", ".ts", ".jsx", ".tsx"],
        "content": {"package.json": r'"dependencies":'}
    },
    "infrastructure": {
        "files": ["main.tf", "kustomization.yaml"],
        "extensions": [".tf", ".yaml"],
        "content": {".yaml": r"apiVersion:|kind:\s*Deployment"}
    },
    # ... 20+ technology fingerprints
}
```

It goes beyond file extensions — it reads file content to confirm. A .yaml file only triggers the infrastructure modules if it actually contains Kubernetes or Terraform syntax. This prevents false positives.
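The confirmation step might look roughly like this. This is an illustrative sketch, not Beyan's actual code: when a fingerprint defines a content pattern, that pattern has to match before the detection counts.

```python
import re
from pathlib import Path

# Illustrative fingerprint (mirrors the shape shown above, trimmed down).
FINGERPRINTS = {
    "node": {
        "files": ["package.json"],
        "extensions": [".js", ".ts"],
        "content": {"package.json": r'"dependencies":'},
    },
}

def matches_fingerprint(root: str, spec: dict) -> bool:
    """Two-stage check: file/extension hit first, content confirmation second."""
    root_path = Path(root)
    # Stage 1: are the marker files or extensions present at all?
    has_marker = any((root_path / f).exists() for f in spec["files"])
    has_ext = any(p.suffix in spec["extensions"] for p in root_path.rglob("*"))
    if not (has_marker or has_ext):
        return False
    # Stage 2: confirm via content patterns, when any are defined
    for fname, pattern in spec.get("content", {}).items():
        candidate = root_path / fname
        if candidate.exists() and re.search(pattern, candidate.read_text(errors="ignore")):
            return True
    # No content rules defined: the stage-1 hit is enough
    return not spec.get("content")
```

A `package.json` without a `"dependencies":` key, for example, would pass stage 1 but fail stage 2.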

### 2. Compiler (`compiler.py`)

Once the tech stack is identified, the compiler loads the relevant analysis modules from MANIFEST.yaml and assembles them into a single, context-dense prompt:

```yaml
modules:
  security_analysis:
    priority: P0
    auto_load_if: [production, handles_pii, financial]
  react_typescript_analysis:
    priority: P1
    auto_load_if: [react, typescript, tsx]
```

P0 modules always load. P1+ modules load based on detected tags. If the compiled prompt exceeds the LLM's token limit, low-priority modules are pruned automatically — no truncation, no silent failures.
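The selection rule itself fits in a few lines. The sketch below is an illustrative reimplementation, not the real `compiler.py`:

```python
# Toy manifest in the same shape as MANIFEST.yaml above.
MANIFEST = {
    "security_analysis": {"priority": "P0", "auto_load_if": ["production"]},
    "react_typescript_analysis": {"priority": "P1", "auto_load_if": ["react", "typescript"]},
    "blockchain_analysis": {"priority": "P2", "auto_load_if": ["solidity"]},
}

def select_modules(manifest: dict, detected_tags: set) -> list:
    """P0 always loads; P1+ loads only when one of its tags was detected."""
    selected = []
    for name, spec in manifest.items():
        if spec["priority"] == "P0":
            selected.append(name)  # unconditional
        elif detected_tags & set(spec["auto_load_if"]):
            selected.append(name)  # at least one detected tag matches
    return selected
```

With `{"react"}` as the detected tags, this loads the security and React modules and skips blockchain.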

### 3. Three Operating Modes

**Mode 1 — Analysis Only:** Generates a structured report. No code changes. Safe to run on any project.

**Mode 2 — Analysis + Plan:** Runs the full analysis, then produces an `implementation_plan.md` with sprint-ready task breakdowns:

```markdown
## Sprint 1 — Critical Fixes (P0)
| Task | File:Line | Effort | Risk |
|------|-----------|--------|------|
| Fix SQL injection in UserService | services/user.py:142 | 2h | High |
```

**Mode 3 — Semi-Autonomous Fix:** The full agentic loop. Beyan analyzes, plans, writes code, runs tests, and commits — but pauses at three human checkpoints before touching anything:

```text
CHECKPOINT #1: "Here are the P0 issues. Proceed with auto-fix?"
CHECKPOINT #2: "Here's the diff. Apply this change?"
CHECKPOINT #3: "Tests pass. Commit to branch fix/beyan-p0-2024?"
```

Safety rules are hard-coded: never touch database migration files, never modify production configs, max 20 files per run, always create a safety branch first.
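A minimal sketch of how such a gate can be wired up. The pattern list and function names here are assumptions for illustration, not Beyan's actual rules:

```python
# Assumed protected-path substrings and file cap, mirroring the rules above.
PROTECTED_PATTERNS = ("migrations/", "production.")
MAX_FILES_PER_RUN = 20

def checkpoint(question: str) -> bool:
    """Block until a human explicitly approves the next destructive step."""
    return input(f"CHECKPOINT: {question} [y/N] ").strip().lower() == "y"

def safe_to_modify(paths: list) -> bool:
    """Hard rules checked before any file is touched."""
    if len(paths) > MAX_FILES_PER_RUN:
        return False  # too many files for one run
    return not any(pat in p for pat in PROTECTED_PATTERNS for p in paths)
```

Any plan touching `migrations/` or a `production.*` config is rejected outright; everything else still has to clear the interactive checkpoints.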

### 4. Multi-Provider API Support

Works with both OpenAI and Anthropic out of the box:

```bash
# OpenAI
python cli/analyzer.py --target . --mode 1 --api openai

# Anthropic
python cli/analyzer.py --target . --mode 3 --api anthropic
```

Session persistence means Mode 3 can be interrupted and resumed — the conversation history is saved to sessions/ as JSON.
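Resumability mostly comes down to serializing the conversation history after each turn. A sketch, with field names that are assumptions rather than Beyan's actual schema:

```python
import json
from pathlib import Path

def save_session(session_id: str, history: list, sessions_dir: str = "sessions") -> Path:
    """Dump the running conversation to sessions/<id>.json."""
    path = Path(sessions_dir)
    path.mkdir(exist_ok=True)
    out = path / f"{session_id}.json"
    out.write_text(json.dumps({"id": session_id, "history": history}, indent=2))
    return out

def resume_session(session_id: str, sessions_dir: str = "sessions") -> list:
    """Reload a saved conversation so an interrupted run can continue."""
    data = json.loads((Path(sessions_dir) / f"{session_id}.json").read_text())
    return data["history"]
```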


## The Module System

Beyan ships with 52 analysis modules organized into six categories:

| Category | Count | Examples |
|----------|-------|----------|
| Core | 23 | security, performance, database, UI/UX, accessibility |
| Domain | 7 | web/mobile, DevOps, AI/ML, blockchain, OS/firmware |
| Specialized | 10 | React+TypeScript, .NET Core, Turkish market compliance |
| Focus | 4 | security audit, performance audit, API audit, compliance |
| Testing | 3 | test generation, UI interaction tests, collaboration tests |
| Guides | 4 | security fixes, DB migration, performance optimization |

Every module follows the same two-layer analysis principle inherited from v1.0:

  • Descriptive layer: Documents what exists, no judgment
  • Evaluative layer: Assesses quality, identifies gaps, produces recommendations

The evaluative layer never starts until the descriptive layer is complete. This prevents premature conclusions from contaminating the factual record.


## The NOT DETECTED Contract

One of Beyan's core principles, carried forward from v1.0: if information cannot be found in the codebase, the output must say:

```text
⚠️ NOT DETECTED [which file/directory was searched]
```

Never guess. Never fabricate. This single rule is responsible for most of the hallucination resistance. An LLM that's forced to flag gaps instead of filling them with invented content produces dramatically more trustworthy output.
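Programmatically, the contract is a one-liner. This hypothetical helper (not from Beyan's codebase) shows the shape: a lookup that comes back empty emits the marker naming what was searched, never a guess.

```python
def report_or_flag(value, searched_location: str) -> str:
    """Return the finding, or the NOT DETECTED marker if there is none."""
    if value is None:
        return f"⚠️ NOT DETECTED [{searched_location}]"
    return str(value)
```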


## Token Budget Management

Context windows are finite. Beyan handles this automatically:

```python
def prune_modules_by_priority(manifest, modules_to_load, lang):
    """Drop one low-priority module per call: P3 first, then P2.
    P0 and P1 modules are always preserved."""
    for priority in ("P3", "P2"):
        prunable = [m for m in modules_to_load
                    if manifest["modules"][m]["priority"] == priority]
        if prunable:
            modules_to_load.remove(prunable[0])
            break
    return modules_to_load
```

The analyzer loop runs this iteratively until the compiled prompt fits within the configured token limit. You get the most important analysis every time, regardless of project size.
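Sketched end to end, with an assumed four-characters-per-token estimate (illustrative only, not the actual analyzer code), the loop looks like this:

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly 4 characters per token."""
    return len(text) // 4

def fit_to_budget(manifest: dict, modules: list, bodies: dict, limit: int) -> list:
    """Drop the lowest-priority module (P3 first, then P2) until the
    compiled prompt fits. P0/P1 modules are never pruned."""
    modules = list(modules)
    while estimate_tokens("".join(bodies[m] for m in modules)) > limit:
        for prio in ("P3", "P2"):
            prunable = [m for m in modules if manifest[m]["priority"] == prio]
            if prunable:
                modules.remove(prunable[0])
                break
        else:
            break  # only P0/P1 remain: stop pruning, ship what we have
    return modules
```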


## Quick Start

```bash
# Clone
git clone https://github.com/XINMurat/beyan.git
cd beyan/v2

# Install
pip install -r requirements.txt

# Set your API key
export ANTHROPIC_API_KEY=your_key_here
# or
export OPENAI_API_KEY=your_key_here

# Run analysis on any project
python cli/analyzer.py --target /path/to/your/project --mode 1 --lang en --api anthropic
```

The compiled prompt is also saved to beyan_compiled_prompt.md so you can inspect exactly what gets sent to the LLM — full transparency, no black boxes.


## What's Still in v1.0

The original prompt library lives in en/ and tr/ — 15 hand-crafted prompts covering every project type. If you want to run a manual deep-dive audit without the CLI, those are still there. v2.0 uses them as its knowledge base; they didn't go anywhere.

The self-referential development story is also documented: Beyan v1.0 was audited using its own Meta Audit prompt before release. Health score went from 2.45 to 4.75 after applying the findings. The full cycle is in tr/meta-analysis/.


## What's Next

A few things on the roadmap:

  • PyPI package — `pip install beyan-agentic` is coming
  • GitHub Actions integration — run Beyan as a CI step on every PR
  • More domain modules — game engine analysis, mobile-native deep dive
  • Web UI — for teams who don't want a CLI

## Links

If you try it on your project, I'd genuinely love to hear what the output looks like. Open an issue, start a discussion, or just drop a comment here.


Built with Claude, audited with Beyan, shipped with way too many git branch adventures.
