<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Deepesh Maravi</title>
    <description>The latest articles on DEV Community by Deepesh Maravi (@deepesh_maravi_40f46b9855).</description>
    <link>https://dev.to/deepesh_maravi_40f46b9855</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3875127%2F60aaa797-8454-4b77-ad8d-5b3c897efb9b.jpg</url>
      <title>DEV Community: Deepesh Maravi</title>
      <link>https://dev.to/deepesh_maravi_40f46b9855</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/deepesh_maravi_40f46b9855"/>
    <language>en</language>
    <item>
      <title>Building a Code Review Agent That Learns From Every Decision</title>
      <dc:creator>Deepesh Maravi</dc:creator>
      <pubDate>Sun, 12 Apr 2026 16:16:48 +0000</pubDate>
      <link>https://dev.to/deepesh_maravi_40f46b9855/building-a-code-review-agent-that-learns-from-every-decision-4oli</link>
      <guid>https://dev.to/deepesh_maravi_40f46b9855/building-a-code-review-agent-that-learns-from-every-decision-4oli</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh70cmwuo0gb4ow0t24h1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh70cmwuo0gb4ow0t24h1.jpeg" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flikzug5ab1uaufugky0a.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flikzug5ab1uaufugky0a.jpeg" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gsvgcz0gway394afei1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gsvgcz0gway394afei1.jpeg" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjp9vmsw9978y5pj63lhr.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjp9vmsw9978y5pj63lhr.jpeg" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;br&gt;
Building a Code Review Agent That Actually Learns&lt;/p&gt;

&lt;p&gt;#agents #ai #codequality #machinelearning&lt;/p&gt;

&lt;p&gt;AI code reviewers are becoming common in modern development workflows. However, most of them share a critical limitation: they don’t improve over time.&lt;br&gt;
You can use the same tool across multiple pull requests, reject the same irrelevant suggestions repeatedly, and it will still produce the same output again. There is no accumulation of context, no adjustment, and no memory of past decisions.&lt;br&gt;
This limits how useful these tools can be: every review starts from zero, so the same noise comes back on every pull request.&lt;br&gt;
Instead of building another stateless reviewer, I focused on a different question:&lt;br&gt;
What would a code review agent look like if it could continuously learn from developer feedback?&lt;br&gt;
The Shift: Reviews as a Feedback System&lt;br&gt;
Traditional code review tools operate like simple functions:&lt;br&gt;
• Input: diff &lt;br&gt;
• Output: comments &lt;br&gt;
• No retained state &lt;br&gt;
The system I built behaves more like a feedback-driven process:&lt;br&gt;
• Observe past decisions &lt;br&gt;
• Adapt future outputs &lt;br&gt;
• Align with team patterns &lt;br&gt;
This shift transforms code reviews from static outputs into evolving systems.&lt;br&gt;
System Overview&lt;br&gt;
At a high level, the agent works through three steps:&lt;br&gt;
1. Recall: retrieve past review patterns and team conventions&lt;br&gt;
2. Review: analyze the current pull request and generate structured feedback&lt;br&gt;
3. Retain: store developer decisions (accept/reject) for future learning&lt;br&gt;
Each pull request contributes to a continuous improvement loop.&lt;br&gt;
Memory as a Core Component&lt;br&gt;
The key differentiator of this system is the memory layer.&lt;br&gt;
Two simple operations drive it:&lt;br&gt;
• retain() -&amp;gt; stores feedback decisions &lt;br&gt;
• recall() -&amp;gt; retrieves past patterns &lt;br&gt;
Instead of relying on complex structured storage, the system saves feedback as plain-language notes, for example:&lt;br&gt;
"Developer rejected this suggestion in a previous review."&lt;br&gt;
Because each note is already natural language, the reviewer can use recalled notes directly as context, with no additional processing step.&lt;br&gt;
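The two operations can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the ReviewMemory class, the in-memory list, and the word-overlap matching in recall() are all assumptions made for the example, and a real system would likely use a persistent store with better retrieval.&lt;br&gt;

```python
from dataclasses import dataclass, field


@dataclass
class ReviewMemory:
    """Hypothetical in-memory store of plain-language feedback notes."""

    notes: list = field(default_factory=list)

    def retain(self, note):
        """Store one feedback decision as a plain-language sentence."""
        self.notes.append(note)

    def recall(self, query, limit=5):
        """Return up to `limit` notes that share at least one word with the query."""
        query_words = set(query.lower().split())
        matches = [
            note for note in self.notes
            if query_words.intersection(note.lower().split())
        ]
        return matches[:limit]


# Example notes are invented for illustration.
memory = ReviewMemory()
memory.retain("Developer rejected the suggestion to add docstrings to test helpers.")
memory.retain("Developer accepted the suggestion to replace bare except clauses.")
recalled = memory.recall("docstrings")
```

Here recalled holds only the note about docstrings, which could then be passed to the reviewer as context for the next pull request.&lt;br&gt;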
Review Pipeline&lt;br&gt;
The backend follows a straightforward pipeline:&lt;br&gt;
1. Fetch PR data&lt;br&gt;
2. Parse diff&lt;br&gt;
3. Generate review&lt;br&gt;
4. Return structured output&lt;br&gt;
Each generated comment includes:&lt;br&gt;
• File reference &lt;br&gt;
• Line number &lt;br&gt;
• Severity &lt;br&gt;
• Category &lt;br&gt;
• Suggested improvement (if applicable) &lt;br&gt;
This structure keeps each comment specific and actionable: a developer can see where the issue is, how serious it is, and what to change.&lt;br&gt;
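One way to represent such a comment is a small data class. The sketch below is an assumption about shape only: the field names, the allowed severity values, and the sample comment are hypothetical and chosen to mirror the list above.&lt;br&gt;

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed severity levels; the real system may use a different set.
SEVERITIES = ("info", "warning", "error")


@dataclass
class ReviewComment:
    """Hypothetical structured review comment mirroring the fields listed above."""

    file: str
    line: int
    severity: str
    category: str
    suggestion: Optional[str] = None  # Only set when an improvement applies.

    def __post_init__(self):
        # Reject unknown severities early so downstream consumers can rely on them.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")


comment = ReviewComment(
    file="app/db.py",
    line=42,
    severity="warning",
    category="error-handling",
    suggestion="Catch sqlite3.OperationalError instead of using a bare except.",
)
payload = asdict(comment)  # Plain dict, ready to serialize as JSON.
```

Keeping the suggestion optional matches the "if applicable" wording above: a comment can flag a problem without proposing a fix.&lt;br&gt;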
What Changes Over Time&lt;br&gt;
At the beginning, the system behaves like a standard reviewer.&lt;br&gt;
After multiple iterations:&lt;br&gt;
• Suggestions that developers repeatedly reject appear less often &lt;br&gt;
• Patterns that developers accept are reinforced &lt;br&gt;
• Feedback becomes more relevant to the codebase under review &lt;br&gt;
The system gradually adapts to how a team actually works.&lt;br&gt;
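A simple way to picture this adaptation is a running score per kind of suggestion. The sketch below is illustrative only: it assumes each suggestion carries a hypothetical rule identifier and that decisions are recorded as "accept" or "reject", which is a simplification of the plain-language memory described earlier.&lt;br&gt;

```python
from collections import Counter


def score_rules(decisions):
    """Net score per rule: +1 for each accept, -1 for any other decision."""
    scores = Counter()
    for rule, decision in decisions:
        scores[rule] += 1 if decision == "accept" else -1
    return scores


def prioritize(candidate_rules, scores, limit):
    """Order candidates by net score, highest first, and keep the top `limit`."""
    ranked = sorted(candidate_rules, key=lambda rule: scores[rule], reverse=True)
    return ranked[:limit]


# Invented decision history for illustration.
history = [
    ("missing-docstring", "reject"),
    ("missing-docstring", "reject"),
    ("bare-except", "accept"),
    ("bare-except", "accept"),
    ("unused-import", "accept"),
]
scores = score_rules(history)
kept = prioritize(["missing-docstring", "unused-import", "bare-except"], scores, limit=2)
```

With this history, the twice-rejected docstring rule falls to the bottom of the ranking and is dropped, while the accepted rules are kept.&lt;br&gt;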
Challenges&lt;br&gt;
Building this system introduced several challenges:&lt;br&gt;
• Handling inconsistent diff formats &lt;br&gt;
• Maintaining low response latency &lt;br&gt;
• Interpreting feedback signals correctly &lt;br&gt;
These factors are critical for real-world usability.&lt;br&gt;
Future Improvements&lt;br&gt;
Possible extensions include:&lt;br&gt;
• Integration with live pull request systems &lt;br&gt;
• Team-specific memory segmentation &lt;br&gt;
• Improved feedback weighting mechanisms &lt;br&gt;
Conclusion&lt;br&gt;
Most AI tools operate as stateless systems: they respond, then reset.&lt;br&gt;
Adding memory changes this behavior.&lt;br&gt;
Each accept or reject decision becomes a signal. Over time, these signals build a system that aligns with real development practices.&lt;br&gt;
This is what transforms a generic reviewer into a system that actually learns.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>codequality</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
