<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rudraneel Das</title>
    <description>The latest articles on DEV Community by Rudraneel Das (@rudraneel_das_d5f9749a93e).</description>
    <link>https://dev.to/rudraneel_das_d5f9749a93e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3909699%2Fafdd407b-2745-4a84-bbdd-5c56eb6095af.jpg</url>
      <title>DEV Community: Rudraneel Das</title>
      <link>https://dev.to/rudraneel_das_d5f9749a93e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rudraneel_das_d5f9749a93e"/>
    <language>en</language>
    <item>
      <title>Building an AI Coding Agent That Remembers Your Codebase Across Sessions</title>
      <dc:creator>Rudraneel Das</dc:creator>
      <pubDate>Sun, 03 May 2026 07:53:14 +0000</pubDate>
      <link>https://dev.to/rudraneel_das_d5f9749a93e/building-an-ai-coding-agent-that-remembers-your-codebase-across-sessions-p2d</link>
      <guid>https://dev.to/rudraneel_das_d5f9749a93e/building-an-ai-coding-agent-that-remembers-your-codebase-across-sessions-p2d</guid>
      <description>&lt;p&gt;&lt;strong&gt;Every AI coding agent has amnesia. I fixed it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two months ago, Ramp's engineering team published a fascinating blog post about &lt;a href="https://builders.ramp.com/post/why-we-built-our-background-agent" rel="noopener noreferrer"&gt;Inspect&lt;/a&gt; — their internal background coding agent. It was impressive: 30% of all pull requests merged to their frontend and backend repos are AI-written. The architecture was elegant: Modal sandboxes, OpenCode as the agent runtime, Cloudflare Durable Objects for state.&lt;/p&gt;

&lt;p&gt;But there was a problem hiding in plain sight.&lt;/p&gt;

&lt;p&gt;Every session treated the codebase like it had never seen it before. Session 1 knew nothing. Session 100 knew nothing. Session 500… still nothing. The agent could write code, but it couldn't learn.&lt;/p&gt;

&lt;p&gt;That's when I realized: &lt;strong&gt;no background agent framework in existence retains cross-session state.&lt;/strong&gt; Ramp Inspect, Open-Inspect, Copilot, Claude Code, Aider, Sweep, Devin — they all have one thing in common: they forget everything between sessions.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Shadow Engineer&lt;/strong&gt; — a learning layer that compounds over time.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Why Agent Amnesia Matters
&lt;/h2&gt;

&lt;p&gt;Imagine you're a senior engineer mentoring a junior developer. Every morning, the junior walks in and asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How does our authentication work again?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You answer. They fix a bug. They leave. Next morning, same question.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How does our authentication work again?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You'd be frustrated. You'd wonder why they're not learning. You'd start documenting things. You'd build shared knowledge.&lt;/p&gt;

&lt;p&gt;Now replace "junior developer" with "your AI coding agent." That's the state of every background agent today.&lt;/p&gt;

&lt;p&gt;Current agents have three critical failures:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No memory&lt;/strong&gt; — Every session starts by fumbling through the codebase, trying to figure out what files handle what&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No learning&lt;/strong&gt; — If an approach fails 10 times, it'll try it an 11th time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No experimentation&lt;/strong&gt; — It's a single attempt. If it fails, you start over&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Shadow Engineer fixes all three.&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works: Three Engines That Compound
&lt;/h2&gt;

&lt;p&gt;Shadow Engineer is a Python library that sits between your task and your background agent (Open-Inspect, Claude Code, or any custom agent). It has three engines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌────────────────────────────────────────┐
                    │            SHADOW ENGINEER             │
                    │                                        │
   User sends task  │  ┌────────────────┐  ┌──────────────┐  │
          │         │  │ Knowledge Graph│  │  Laboratory  │  │
          ▼         │  │                │  │              │  │
   ┌──────────┐     │  │ • 7 languages  │  │ • N variants │  │
   │ Classify │─────┼──│ • Semantic     │  │ • Auto-scored│  │
   │ problem  │     │  │   search       │  │ • Winner pick│  │
   └────┬─────┘     │  │ • Dependencies │  │              │  │
        │           │  └───────┬────────┘  └──────┬───────┘  │
   ┌────▼─────┐     │          │                  │          │
   │  Build   │     │          │                  ▼          │
   │ context  │─────┼──────────┼──▶ Agent Prompt + Approach  │
   └──────────┘     │          │                             │
                    │          │       ┌──────────────────┐  │
                    │          │       │     Learning     │  │
                    │          └───────│      Engine      │  │
                    │                  │ • Pattern extract│  │
                    │                  │ • Efficacy track │  │
                    │                  │ • Failure analyze│  │
                    │                  └────────┬─────────┘  │
                    │                           │            │
                    │                           ▼            │
                    │         ┌───────────────────────────┐  │
                    │         │  Next session is SMARTER  │  │
                    │         └───────────────────────────┘  │
                    └────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Engine 1: Knowledge Graph — "Remember"
&lt;/h3&gt;

&lt;p&gt;The knowledge graph indexes your entire codebase into a persistent, searchable structure. It extracts symbols (functions, classes, methods, interfaces) from source files in 7 languages, including Python, TypeScript, JavaScript, Go, and Rust.&lt;/p&gt;

&lt;p&gt;But unlike a simple grep tool, it uses &lt;strong&gt;ChromaDB vector embeddings&lt;/strong&gt; for semantic search. "Fix the login rate limiting" will find &lt;code&gt;throttle_requests()&lt;/code&gt; even if "throttle" doesn't appear in your query.&lt;/p&gt;
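
&lt;p&gt;Before anything can be embedded, the symbols have to be pulled out of the source. As a rough illustration of that step for Python only, here is a minimal sketch built on the standard-library &lt;code&gt;ast&lt;/code&gt; module. It is not Shadow Engineer's actual indexer: the &lt;code&gt;Symbol&lt;/code&gt; dataclass, the &lt;code&gt;extract_symbols&lt;/code&gt; function, and the sample source are hypothetical names made up for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
import ast
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical record for one indexed symbol (not the library's schema)."""
    name: str
    kind: str  # "function", "class", or "method"
    line: int
    doc: str


def extract_symbols(source: str) -> list[Symbol]:
    """Collect top-level functions, classes, and their methods from Python source."""
    symbols = []
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.parse(source).body:
        if isinstance(node, func_types):
            symbols.append(Symbol(node.name, "function", node.lineno, ast.get_docstring(node) or ""))
        elif isinstance(node, ast.ClassDef):
            symbols.append(Symbol(node.name, "class", node.lineno, ast.get_docstring(node) or ""))
            # Methods are recorded with a qualified name so they stay unique.
            for item in node.body:
                if isinstance(item, func_types):
                    qualified = f"{node.name}.{item.name}"
                    symbols.append(Symbol(qualified, "method", item.lineno, ast.get_docstring(item) or ""))
    return symbols


# Made-up sample source, used only to exercise the sketch.
SAMPLE = '''
def throttle_requests(user_id):
    """Limit how often a user may hit the login endpoint."""
    return True

class TokenService:
    """Issue and verify tokens."""
    def verify(self, token):
        return bool(token)
'''

for sym in extract_symbols(SAMPLE):
    print(sym.kind, sym.name, sym.doc)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The docstring is kept next to each name because that is the text a semantic index would most plausibly embed: it is what lets a query about rate limiting land on a function whose name only mentions throttling.&lt;/p&gt;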

&lt;p&gt;&lt;strong&gt;Before every agent session&lt;/strong&gt;, Shadow Engineer injects a context block into the prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### Semantically Relevant Symbols&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="gs"&gt;**authenticate_user**&lt;/span&gt; (function) in src/auth/service.py (relevance: 0.89)
  Authenticate a user from a JWT token.
  Depends on: UserModel, TokenService
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="gs"&gt;**login_handler**&lt;/span&gt; (function) in src/auth/views.py (relevance: 0.82)
  Handle login POST requests. Validates credentials.
  Complexity: 6.5

&lt;span class="gu"&gt;### Learned Codebase Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**error_handling**&lt;/span&gt;: Auth errors return 401 with JSON body {error, code}
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**testing**&lt;/span&gt;: Tests are written alongside code changes

&lt;span class="gu"&gt;### Historically Effective Approaches&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Targeted Fix**&lt;/span&gt;: 85% success rate (17/20) — best model: claude-sonnet-4-6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent now starts with informed context. No more guessing.&lt;/p&gt;
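
&lt;p&gt;To show how a block like the one above could be assembled, here is a small sketch that renders search hits and learned conventions into Markdown. The input shape (plain dicts with &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;kind&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;relevance&lt;/code&gt;, and &lt;code&gt;doc&lt;/code&gt; keys) and the &lt;code&gt;render_context&lt;/code&gt; function are assumptions for illustration, not the library's real API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
def render_context(symbols: list[dict], conventions: dict[str, str]) -> str:
    """Render a Markdown context block from plain dicts (hypothetical input shape)."""
    lines = ["### Semantically Relevant Symbols"]
    # Most relevant symbols first, so the agent reads the best matches early.
    for sym in sorted(symbols, key=lambda s: s["relevance"], reverse=True):
        lines.append(
            f"- **{sym['name']}** ({sym['kind']}) in {sym['path']} "
            f"(relevance: {sym['relevance']:.2f})"
        )
        if sym.get("doc"):
            lines.append(f"  {sym['doc']}")
    lines.append("")
    lines.append("### Learned Codebase Conventions")
    for topic, rule in conventions.items():
        lines.append(f"- **{topic}**: {rule}")
    return "\n".join(lines)


# Values copied from the example context block above.
SYMBOLS = [
    {"name": "login_handler", "kind": "function", "path": "src/auth/views.py",
     "relevance": 0.82, "doc": "Handle login POST requests. Validates credentials."},
    {"name": "authenticate_user", "kind": "function", "path": "src/auth/service.py",
     "relevance": 0.89, "doc": "Authenticate a user from a JWT token."},
]
CONVENTIONS = {"testing": "Tests are written alongside code changes"}

print(render_context(SYMBOLS, CONVENTIONS))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;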

&lt;h3&gt;
  
  
  Engine 2: Laboratory — "Experiment"
&lt;/h3&gt;

&lt;p&gt;Instead of one attempt, Shadow Engineer spawns &lt;strong&gt;N parallel sessions&lt;/strong&gt; with different strategies and picks the winner.&lt;/p&gt;

&lt;p&gt;For a bug fix, it might try all three simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Targeted Fix&lt;/strong&gt; — Find the minimal change. Write a focused fix with regression test.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Root Cause + Guard&lt;/strong&gt; — Fix the bug, add input validation to prevent similar issues.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Defense in Depth&lt;/strong&gt; — Comprehensive fix with error handling at every layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each variant runs in its own sandbox. When they complete, Shadow Engineer scores them using a &lt;strong&gt;configurable logistic curve&lt;/strong&gt;, so there are no arbitrary cliffs: a 51-line change scores almost the same as a 50-line one. The winner is selected automatically from a weighted combination of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test pass rate (40%)&lt;/li&gt;
&lt;li&gt;Code change size (20%)&lt;/li&gt;
&lt;li&gt;Execution speed (15%)&lt;/li&gt;
&lt;li&gt;Token efficiency (10%)&lt;/li&gt;
&lt;li&gt;File count scope (15%)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All weights are configurable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;shadow_engine.laboratory.experiment&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ScoringConfig&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ScoringConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;test_pass_weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# Prioritize tests
&lt;/span&gt;    &lt;span class="n"&gt;speed_weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# Care more about speed
&lt;/span&gt;    &lt;span class="n"&gt;change_size_weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
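
&lt;p&gt;To make the idea of smooth, weighted scoring concrete, here is a self-contained sketch. The weights are the defaults listed above. Everything else is an assumption for illustration: the midpoints and steepness values in &lt;code&gt;CURVES&lt;/code&gt;, the metric names, and the &lt;code&gt;logistic_penalty&lt;/code&gt; and &lt;code&gt;score_variant&lt;/code&gt; functions are hypothetical and do not describe how &lt;code&gt;ScoringConfig&lt;/code&gt; works internally.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
import math

# Default weights as listed in the article; they sum to 1.0.
WEIGHTS = {
    "test_pass": 0.40,
    "change_size": 0.20,
    "speed": 0.15,
    "token_efficiency": 0.10,
    "file_count": 0.15,
}

# Hypothetical (midpoint, steepness) pairs for "smaller is better" metrics.
CURVES = {
    "change_size": (50.0, 0.05),             # lines changed
    "speed": (300.0, 0.01),                  # seconds
    "token_efficiency": (20000.0, 0.0002),   # tokens used
    "file_count": (3.0, 0.8),                # files touched
}


def logistic_penalty(value: float, midpoint: float, steepness: float) -> float:
    """Map a cost to (0, 1): 0.5 at the midpoint, falling smoothly as the cost grows."""
    return 1.0 / (1.0 + math.exp(steepness * (value - midpoint)))


def score_variant(metrics: dict[str, float]) -> float:
    """Weighted sum of the test pass rate and the logistic sub-scores."""
    total = WEIGHTS["test_pass"] * metrics["test_pass_rate"]
    for name, (midpoint, steepness) in CURVES.items():
        total += WEIGHTS[name] * logistic_penalty(metrics[name], midpoint, steepness)
    return total


# Made-up metrics for two variants of the same bug fix.
variants = {
    "Targeted Fix": {"test_pass_rate": 1.0, "change_size": 12, "speed": 90,
                     "token_efficiency": 8000, "file_count": 2},
    "Defense in Depth": {"test_pass_rate": 1.0, "change_size": 240, "speed": 400,
                         "token_efficiency": 45000, "file_count": 9},
}
winner = max(variants, key=lambda name: score_variant(variants[name]))
print(winner)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because every sub-score is a continuous function of its input, nudging a metric by one unit can only nudge the total slightly, which is the property a hard threshold would break.&lt;/p&gt;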



&lt;h3&gt;
  
  
  Engine 3: Learning Engine — "Improve"
&lt;/h3&gt;

&lt;p&gt;This is where the compounding happens. After every session completes, the Learning Engine analyzes the outcome and extracts:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Patterns&lt;/strong&gt; — It notices things about your codebase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"This team always writes tests alongside code changes"&lt;/li&gt;
&lt;li&gt;"Successful PRs modify ≤3 files"&lt;/li&gt;
&lt;li&gt;"Clean PRs with no review comments follow existing conventions closely"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Efficacy&lt;/strong&gt; — It tracks which approaches work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Targeted Fix succeeds 85% of the time for bug fixes with claude-sonnet-4-6"&lt;/li&gt;
&lt;li&gt;"Aggressive Rewrite fails 90% of the time — don't use this for your codebase"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Failures&lt;/strong&gt; — It understands why things fail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"12 tests failed — changes broke existing functionality"&lt;/li&gt;
&lt;li&gt;"Agent modified 15 files — too broad, likely introduced risk"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The next time you ask for a bug fix, Shadow Engineer suggests "Targeted Fix" with 85% expected success rate. The agent gets smarter at &lt;strong&gt;your&lt;/strong&gt; codebase.&lt;/p&gt;
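
&lt;p&gt;The efficacy part of this loop can be pictured as a running tally per problem type and approach. The sketch below is a deliberately simple in-memory version under that assumption. The &lt;code&gt;EfficacyTracker&lt;/code&gt; class and its &lt;code&gt;min_attempts&lt;/code&gt; cutoff are hypothetical, and the real engine persists its data rather than holding it in a dict.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
from collections import defaultdict


class EfficacyTracker:
    """Hypothetical in-memory tally of outcomes per (problem type, approach)."""

    def __init__(self) -> None:
        # Maps (problem_type, approach) to [successes, attempts].
        self._stats = defaultdict(lambda: [0, 0])

    def record(self, problem_type: str, approach: str, success: bool) -> None:
        entry = self._stats[(problem_type, approach)]
        entry[0] += int(success)
        entry[1] += 1

    def suggest(self, problem_type: str, min_attempts: int = 3):
        """Return (approach, success_rate) for the best-performing approach, or None."""
        candidates = [
            (wins / tries, approach)
            for (ptype, approach), (wins, tries) in self._stats.items()
            if ptype == problem_type and tries >= min_attempts
        ]
        if not candidates:
            return None
        rate, approach = max(candidates)
        return approach, rate


tracker = EfficacyTracker()
# 17 successes out of 20, matching the 85% figure quoted above.
for ok in [True] * 17 + [False] * 3:
    tracker.record("bug_fix", "Targeted Fix", ok)
# 1 success out of 10, matching the 90% failure figure quoted above.
for ok in [True] + [False] * 9:
    tracker.record("bug_fix", "Aggressive Rewrite", ok)

print(tracker.suggest("bug_fix"))  # ('Targeted Fix', 0.85)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The minimum-attempts cutoff is there so that a single lucky run does not outrank an approach with a long track record.&lt;/p&gt;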




&lt;h2&gt;
  
  
  Real Pipeline Verified
&lt;/h2&gt;

&lt;p&gt;This isn't theoretical. I tested it end-to-end with a real LLM:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bootstrap&lt;/td&gt;
&lt;td&gt;211 symbols from 26 files (indexed the shadow-engine source code itself)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context&lt;/td&gt;
&lt;td&gt;89-line semantic context block (ChromaDB embeddings + SQLite lookups)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Call&lt;/td&gt;
&lt;td&gt;Ollama &lt;code&gt;qwen3:8b&lt;/code&gt; (5.2GB local model) generated ~1,216 tokens in 88 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Quality&lt;/td&gt;
&lt;td&gt;Correctly identified &lt;code&gt;CodebaseIndexer&lt;/code&gt; as the key component for the task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion&lt;/td&gt;
&lt;td&gt;Session recorded with real duration, tokens, and approach&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Report&lt;/td&gt;
&lt;td&gt;100% success rate, 1 pattern learned&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The LLM received the knowledge graph context and correctly reasoned through the task from first principles — identifying which files to modify, how to approach the refactoring, and what tests to write.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technical Architecture
&lt;/h2&gt;

&lt;p&gt;Shadow Engineer is production-grade:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SQLite WAL (15 tables, 6 indexes, thread-safe), JSON fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ChromaDB vector embeddings → SQLite text → JSON text → in-memory (layered degradation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;FastAPI server with optional API key auth, Redis rate limiting, &lt;code&gt;/v1/&lt;/code&gt; prefix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deployment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Docker + docker-compose (includes Redis)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CI/CD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GitHub Actions — test matrix (Python 3.12/3.13), lint, build&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Testing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;80 tests, 73% coverage, zero failures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Async Open-Inspect bridge via &lt;code&gt;run_in_executor&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every layer handles failure gracefully. ChromaDB unavailable? Falls back to SQLite text search. Redis unavailable? Falls back to in-memory rate limiting. JSON file corrupted? Recovers per-file.&lt;/p&gt;
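
&lt;p&gt;The fallback behavior described above follows a common pattern: try each backend in priority order and return the first answer that arrives. Here is a minimal sketch of that pattern. The two backend functions are stand-ins invented for the example (one always fails, one does a substring match) and &lt;code&gt;search_with_fallback&lt;/code&gt; is a hypothetical name, not the library's API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
def vector_search(query: str) -> list[str]:
    """Stand-in for a vector backend that is currently unavailable."""
    raise ConnectionError("vector store unavailable")


def text_search(query: str) -> list[str]:
    """Stand-in for a plain substring search over indexed symbol names."""
    names = ["authenticate_user", "login_handler", "throttle_requests"]
    return [name for name in names if query.lower() in name]


def search_with_fallback(query: str, backends) -> tuple[str, list[str]]:
    """Try each (label, function) pair in order; return the first that succeeds."""
    errors = []
    for label, backend in backends:
        try:
            return label, backend(query)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all search backends failed: " + "; ".join(errors))


BACKENDS = [("vector", vector_search), ("text", text_search)]
print(search_with_fallback("login", BACKENDS))  # ('text', ['login_handler'])
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning the label alongside the results makes it visible to the caller (and to logs) which layer actually answered, so silent degradation does not go unnoticed.&lt;/p&gt;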




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;shadow-engine

&lt;span class="c"&gt;# Or from source&lt;/span&gt;
git clone https://github.com/rudraneel93/shadow-engine
&lt;span class="nb"&gt;cd &lt;/span&gt;shadow-engine &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Index Your Codebase (CLI)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/your/project
shadow-engine bootstrap
&lt;span class="c"&gt;# → Bootstrapped: 347 symbols, 52 files indexed&lt;/span&gt;

shadow-engine search &lt;span class="s2"&gt;"authenticate"&lt;/span&gt;
&lt;span class="c"&gt;# → [function] authenticate_user — src/auth/service.py&lt;/span&gt;
&lt;span class="c"&gt;#     Authenticates a user with email and password. Returns JWT token.&lt;/span&gt;

shadow-engine context &lt;span class="s2"&gt;"fix the login rate-limiting bug"&lt;/span&gt;
&lt;span class="c"&gt;# → Output: full knowledge graph context for agent prompt&lt;/span&gt;

shadow-engine suggest &lt;span class="s2"&gt;"fix the login rate-limiting bug"&lt;/span&gt;
&lt;span class="c"&gt;# → {"problem_type": "bug_fix", "recommended_approach": "Targeted Fix", "expected_success_rate": 0.85}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  REST API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the server&lt;/span&gt;
uvicorn shadow_engine.api_server.server:app &lt;span class="nt"&gt;--reload&lt;/span&gt;

&lt;span class="c"&gt;# Index codebase&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/bootstrap

&lt;span class="c"&gt;# Get context for agent&lt;/span&gt;
curl &lt;span class="s2"&gt;"http://localhost:8000/context?task=fix+the+login+rate+limiting+bug"&lt;/span&gt;

&lt;span class="c"&gt;# Record session result&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/sessions/ingest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"session_id":"sess-001","outcome":"success","prompt":"fix login bug",...}'&lt;/span&gt;

&lt;span class="c"&gt;# View improvement report&lt;/span&gt;
curl http://localhost:8000/report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;shadow_engine.main&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ShadowEngine&lt;/span&gt;

&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ShadowEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./my-project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bootstrap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Get context for agent prompt
&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix the login rate-limiting bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get approach suggestion
&lt;/span&gt;&lt;span class="n"&gt;suggestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;suggest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix the login rate-limiting bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recommended_approach&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;expected_success_rate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; expected)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Record a completed session
&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sess-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix the login bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;approach&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Targeted Fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;files_changed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/auth.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tests/test_auth.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;tests_passed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tests_failed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# View improvement
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_report&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;The AI coding tool market is exploding. Every major company is building internal agents. But they're all building them wrong — they optimize for individual sessions when the real value is &lt;strong&gt;cross-session compounding.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Shadow Engineer makes the 100th session smarter than the 1st. It's MIT-licensed. It works with any background agent. And it's been tested with real LLMs.&lt;/p&gt;

&lt;p&gt;The knowledge graph grows. The patterns accumulate. The efficacy data deepens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the moat that no other framework ships.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm actively working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Same-file dependency tracking (intra-file call graphs)&lt;/li&gt;
&lt;li&gt;PostgreSQL backend for 100K+ session scale&lt;/li&gt;
&lt;li&gt;Distributed experiment execution (multi-node parallel agents)&lt;/li&gt;
&lt;li&gt;Fine-tuning on organization-specific coding patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you use background coding agents — or if you're building one — I'd love your feedback. What would make you switch? What's missing?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rudraneel93/shadow-engine" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt; | &lt;a href="https://github.com/rudraneel93/shadow-engine/blob/main/API_DOCS.md" rel="noopener noreferrer"&gt;API Documentation&lt;/a&gt; | &lt;a href="https://github.com/rudraneel93/shadow-engine/blob/main/FINDINGS_REPORT.md" rel="noopener noreferrer"&gt;Research Report&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in Python. MIT licensed. 80 tests. 7 languages indexed. ChromaDB semantic search. SQLite WAL. Docker ready.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>github</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
