<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rakshit</title>
    <description>The latest articles on DEV Community by Rakshit (@rkstlohchab).</description>
    <link>https://dev.to/rkstlohchab</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3616181%2Ff0ccc2ab-724e-4815-9ce9-c91238b0a560.png</url>
      <title>DEV Community: Rakshit</title>
      <link>https://dev.to/rkstlohchab</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rkstlohchab"/>
    <language>en</language>
    <item>
      <title>Why I Built the "Infrastructure Layer" Under Every AI Coding Agent</title>
      <dc:creator>Rakshit</dc:creator>
      <pubDate>Wed, 27 May 2026 07:11:38 +0000</pubDate>
      <link>https://dev.to/rkstlohchab/why-i-built-the-infrastructure-layer-under-every-ai-coding-agents-2kci</link>
      <guid>https://dev.to/rkstlohchab/why-i-built-the-infrastructure-layer-under-every-ai-coding-agents-2kci</guid>
      <description>&lt;h1&gt;
  
  
  AI Coding Agents Still Forget Everything — So I Built the Memory Layer Underneath Them
&lt;/h1&gt;

&lt;p&gt;AI coding agents are getting very good at editing files, running tests, and opening PRs.&lt;/p&gt;

&lt;p&gt;After heavily using tools like Cursor, Claude Code, and GitHub Copilot, I noticed they all share the same core limitation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;They have no persistent understanding of your system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Ask the same question next week and they:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;re-read the repo from scratch&lt;/li&gt;
&lt;li&gt;re-run expensive LLM calls&lt;/li&gt;
&lt;li&gt;forget prior incidents&lt;/li&gt;
&lt;li&gt;lose architectural context&lt;/li&gt;
&lt;li&gt;and still don’t know what actually happened in production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of building another coding agent, I built the layer underneath them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing ASIL
&lt;/h2&gt;

&lt;p&gt;ASIL is &lt;strong&gt;engineering intelligence infrastructure&lt;/strong&gt;: a persistent, temporal, causal knowledge graph for software systems.&lt;/p&gt;

&lt;p&gt;It connects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;code&lt;/li&gt;
&lt;li&gt;commits&lt;/li&gt;
&lt;li&gt;deployments&lt;/li&gt;
&lt;li&gt;incidents&lt;/li&gt;
&lt;li&gt;logs&lt;/li&gt;
&lt;li&gt;metrics&lt;/li&gt;
&lt;li&gt;architecture drift&lt;/li&gt;
&lt;li&gt;AI memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;into one queryable system that any AI agent can access through MCP.&lt;/p&gt;

&lt;p&gt;The goal is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Stop making AI agents rediscover the same engineering knowledge over and over again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykihbrrlvln4wmsx0tb9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykihbrrlvln4wmsx0tb9.gif" alt=" " width="748" height="806"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  The Core Idea
&lt;/h1&gt;

&lt;p&gt;Most coding agents understand only the current codebase.&lt;/p&gt;

&lt;p&gt;ASIL understands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how the system evolved&lt;/li&gt;
&lt;li&gt;what changed&lt;/li&gt;
&lt;li&gt;what broke&lt;/li&gt;
&lt;li&gt;why it broke&lt;/li&gt;
&lt;li&gt;and what evidence supports that conclusion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“GPT thinks this caused the outage”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;ASIL derives causal chains from observable system state:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deployment timelines&lt;/li&gt;
&lt;li&gt;incident timestamps&lt;/li&gt;
&lt;li&gt;metric shifts&lt;/li&gt;
&lt;li&gt;runtime dependencies&lt;/li&gt;
&lt;li&gt;postmortems&lt;/li&gt;
&lt;li&gt;service relationships&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every conclusion includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;evidence&lt;/li&gt;
&lt;li&gt;confidence scores&lt;/li&gt;
&lt;li&gt;derivation chains&lt;/li&gt;
&lt;li&gt;citations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No black-box “AI intuition.”&lt;/p&gt;
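&lt;p&gt;To make that concrete, here is a minimal sketch of what an evidence-backed conclusion could look like as a data structure. This is not ASIL's actual schema: the &lt;code&gt;Evidence&lt;/code&gt; and &lt;code&gt;Conclusion&lt;/code&gt; names, their fields, and the noisy-OR confidence formula are all assumptions chosen for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One observation supporting a claim (hypothetical shape)."""
    kind: str      # e.g. "deployment", "metric_shift", "postmortem"
    source: str    # citation: where the observation came from
    weight: float  # assumed contribution to confidence, in [0, 1]


@dataclass
class Conclusion:
    """A claim plus the evidence and reasoning steps behind it."""
    claim: str
    evidence: list = field(default_factory=list)
    derivation: list = field(default_factory=list)  # ordered reasoning steps

    def confidence(self):
        """Noisy-OR combination: 1 minus the product of (1 - weight)."""
        remaining = 1.0
        for item in self.evidence:
            remaining *= 1.0 - item.weight
        return 1.0 - remaining

    def citations(self):
        """Sources of all supporting evidence, in insertion order."""
        return [item.source for item in self.evidence]


# Toy data, invented for this example only.
conclusion = Conclusion(
    claim="deploy d-101 preceded the latency regression",
    evidence=[
        Evidence("deployment", "deploy-log:d-101", 0.5),
        Evidence("metric_shift", "metrics:p99-latency", 0.5),
    ],
    derivation=["deploy d-101 shipped", "p99 latency rose afterwards"],
)
print(conclusion.confidence())
print(conclusion.citations())
```

&lt;p&gt;With no evidence attached the confidence is zero, so a claim cannot score well on wording alone.&lt;/p&gt;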




&lt;h1&gt;
  
  
  What ASIL Can Do
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Ask Questions About a Repo
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run asil ask &lt;span class="s2"&gt;"How does auth work in this repo?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ASIL combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;graph retrieval&lt;/li&gt;
&lt;li&gt;vector search&lt;/li&gt;
&lt;li&gt;verifier passes&lt;/li&gt;
&lt;li&gt;episodic memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;to return:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cited answers&lt;/li&gt;
&lt;li&gt;confidence scoring&lt;/li&gt;
&lt;li&gt;cached reasoning for future sessions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Replay Production Incidents
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run asil replay INC-2026-04-12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ASIL reconstructs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deployment timelines&lt;/li&gt;
&lt;li&gt;causal chains&lt;/li&gt;
&lt;li&gt;affected services&lt;/li&gt;
&lt;li&gt;architecture drift&lt;/li&gt;
&lt;li&gt;metric changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;as a dependency-aware replay graph.&lt;/p&gt;

&lt;p&gt;Think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Time-travel debugging for distributed systems.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Detect Architecture Drift
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run asil drift report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ASIL learns expected dependency boundaries and flags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;undocumented coupling&lt;/li&gt;
&lt;li&gt;boundary violations&lt;/li&gt;
&lt;li&gt;dependency creep&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;before the PR merges.&lt;/p&gt;
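&lt;p&gt;The simplest version of a boundary check is a comparison between the dependencies a module is allowed to have and the ones actually observed. The sketch below assumes the allowed boundaries are already known as a plain mapping; the function name and the service names are hypothetical, and ASIL's real drift detection may work quite differently.&lt;/p&gt;

```python
def find_boundary_violations(allowed, observed_edges):
    """Return observed (source, target) edges that the allowed map does not permit.

    allowed: dict mapping a module name to the set of modules it may depend on.
    observed_edges: iterable of (source, target) pairs seen in the code or at runtime.
    """
    violations = []
    for source, target in observed_edges:
        if target not in allowed.get(source, set()):
            violations.append((source, target))
    return sorted(violations)


# Hypothetical services, for illustration only.
allowed = {
    "api": {"auth", "billing"},
    "billing": {"db"},
}
observed = [
    ("api", "auth"),
    ("api", "db"),      # api reaching into the database directly
    ("billing", "db"),
]
print(find_boundary_violations(allowed, observed))
```

&lt;p&gt;Run against the dependency edges a PR introduces, a check like this can fail the build before the new coupling lands.&lt;/p&gt;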




&lt;h2&gt;
  
  
  Work With Any Coding Agent
&lt;/h2&gt;

&lt;p&gt;ASIL exposes 13 MCP tools usable from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cursor&lt;/li&gt;
&lt;li&gt;Claude Code&lt;/li&gt;
&lt;li&gt;OpenHands&lt;/li&gt;
&lt;li&gt;Aider&lt;/li&gt;
&lt;li&gt;or custom agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agents become clients of the intelligence layer.&lt;/p&gt;




&lt;h1&gt;
  
  
  The Unexpected Benefit: Massive LLM Cost Reduction
&lt;/h1&gt;

&lt;p&gt;ASIL stores every verified engineering conclusion in persistent memory.&lt;/p&gt;

&lt;p&gt;When someone asks a semantically similar question later, ASIL can reuse the prior verified reasoning instead of re-running the full LLM pipeline.&lt;/p&gt;

&lt;p&gt;On cache hits, the cost drops close to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;just the embedding lookup&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Repeated engineering queries get cheaper over time, especially across teams that ask overlapping questions.&lt;/p&gt;
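&lt;p&gt;Here is a rough sketch of that reuse pattern: compare the embedding of a new question against the embeddings of questions that already have a verified answer, and return the stored answer when they are similar enough. The &lt;code&gt;SemanticCache&lt;/code&gt; class, the 0.9 threshold, and the three-dimensional toy vectors are assumptions made for this example. A real setup would use an embedding model and a vector store such as Qdrant rather than a Python list.&lt;/p&gt;

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical cache mapping question embeddings to verified answers."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed similarity cutoff
        self.entries = []           # list of (embedding, answer) pairs

    def put(self, embedding, answer):
        self.entries.append((embedding, answer))

    def get(self, embedding):
        """Return the most similar cached answer at or above the threshold, else None."""
        best_score, best_answer = 0.0, None
        for cached, answer in self.entries:
            score = cosine(embedding, cached)
            if score >= self.threshold and score > best_score:
                best_score, best_answer = score, answer
        return best_answer


cache = SemanticCache()
cache.put([1.0, 0.0, 0.2], "verified answer about auth")

hit = cache.get([0.9, 0.1, 0.2])   # a close paraphrase of the cached question
miss = cache.get([0.0, 1.0, 0.0])  # an unrelated question
print(hit, miss)
```

&lt;p&gt;On a hit the only work done is the embedding and the similarity search; the full LLM pipeline runs only on a miss.&lt;/p&gt;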




&lt;h1&gt;
  
  
  The Part I Care About Most
&lt;/h1&gt;

&lt;p&gt;ASIL does &lt;strong&gt;not&lt;/strong&gt; let the LLM invent causality.&lt;/p&gt;

&lt;p&gt;That rule shapes the entire architecture.&lt;/p&gt;

&lt;p&gt;Causal links come from deterministic signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;temporal proximity&lt;/li&gt;
&lt;li&gt;lagged correlation&lt;/li&gt;
&lt;li&gt;explicit references&lt;/li&gt;
&lt;li&gt;runtime graph relationships&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The LLM consumes evidence.&lt;/p&gt;

&lt;p&gt;It does not fabricate it.&lt;/p&gt;

&lt;p&gt;That distinction matters once AI systems start participating in production engineering workflows.&lt;/p&gt;
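&lt;p&gt;Two of those signals are easy to illustrate without any LLM in the loop. The sketch below scores temporal proximity between a deploy and an incident, and finds the lag at which one time series best correlates with another. The function names, the one-hour window, and the linear decay are assumptions for this example, not ASIL's real scoring.&lt;/p&gt;

```python
import math
from datetime import datetime, timedelta


def temporal_proximity(deployed_at, incident_at, window=timedelta(hours=1)):
    """Score in [0, 1]: 1.0 if the incident starts at deploy time, decaying
    linearly to 0.0 at the end of the (positive) window. Deploys that
    happen after the incident score 0.0."""
    gap = incident_at - deployed_at
    if gap >= timedelta(0) and window >= gap:
        return 1.0 - gap / window
    return 0.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(cause, effect, max_lag):
    """Return (lag, correlation) for the shift of `effect` behind `cause`
    that correlates best. Assumes max_lag is smaller than the series length."""
    best = (0, pearson(cause, effect))
    for lag in range(1, max_lag + 1):
        r = pearson(cause[:-lag], effect[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Toy data: a deploy at 12:00 and an incident at 12:15.
score = temporal_proximity(datetime(2026, 4, 12, 12, 0), datetime(2026, 4, 12, 12, 15))

# Toy series: a spike in `errors` two steps after a spike in `deploys`.
deploys = [0, 0, 1, 0, 0, 0, 0, 0]
errors = [0, 0, 0, 0, 1, 0, 0, 0]
lag, correlation = best_lag(deploys, errors, max_lag=3)
print(score, lag, round(correlation, 3))
```

&lt;p&gt;Both functions are deterministic: the same inputs always produce the same score, which is what makes the resulting links auditable.&lt;/p&gt;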




&lt;h1&gt;
  
  
  Built for Local-First Engineering
&lt;/h1&gt;

&lt;p&gt;Everything runs locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Neo4j&lt;/li&gt;
&lt;li&gt;Qdrant&lt;/li&gt;
&lt;li&gt;Postgres&lt;/li&gt;
&lt;li&gt;Redis&lt;/li&gt;
&lt;li&gt;Prometheus&lt;/li&gt;
&lt;li&gt;Loki&lt;/li&gt;
&lt;li&gt;Grafana&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No central server.&lt;/p&gt;

&lt;p&gt;No telemetry.&lt;/p&gt;

&lt;p&gt;Your graph stays yours.&lt;/p&gt;

&lt;p&gt;The only network dependency is the reasoning LLM, and even that is optional.&lt;/p&gt;




&lt;h1&gt;
  
  
  Why I’m Building It Open Source
&lt;/h1&gt;

&lt;p&gt;Most AI tooling is racing toward:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“make the agent better at editing code”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I think the more important problem is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“give agents persistent engineering intelligence”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory&lt;/li&gt;
&lt;li&gt;causality&lt;/li&gt;
&lt;li&gt;runtime awareness&lt;/li&gt;
&lt;li&gt;architecture understanding&lt;/li&gt;
&lt;li&gt;confidence-weighted reasoning&lt;/li&gt;
&lt;li&gt;reproducible evidence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s the layer ASIL is trying to build.&lt;/p&gt;

&lt;p&gt;Built solo over 6 months with:&lt;br&gt;
Python, FastAPI, Neo4j, Qdrant, Postgres, Tree-sitter, Next.js, Tailwind, ReactFlow, and MCP tooling.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>python</category>
    </item>
  </channel>
</rss>
