<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: shubham yewale</title>
    <description>The latest articles on DEV Community by shubham yewale (@shubham_yewale_0a90a43ea5).</description>
    <link>https://dev.to/shubham_yewale_0a90a43ea5</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3875050%2Fbd8bbb32-2ab2-4922-91f8-8791aef381c0.png</url>
      <title>DEV Community: shubham yewale</title>
      <link>https://dev.to/shubham_yewale_0a90a43ea5</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shubham_yewale_0a90a43ea5"/>
    <language>en</language>
    <item>
      <title>From Stateless to Adaptive: Designing a Code Review Agent That Actually Learns</title>
      <dc:creator>shubham yewale</dc:creator>
      <pubDate>Sun, 12 Apr 2026 15:07:47 +0000</pubDate>
      <link>https://dev.to/shubham_yewale_0a90a43ea5/from-stateless-to-adaptive-designing-a-code-review-agent-that-actually-learns-2ee5</link>
      <guid>https://dev.to/shubham_yewale_0a90a43ea5/from-stateless-to-adaptive-designing-a-code-review-agent-that-actually-learns-2ee5</guid>
      <description>&lt;p&gt;AI code reviewers are everywhere now—but most of them share a critical flaw: they don’t improve.&lt;br&gt;
You can run the same tool across dozens of pull requests, reject the same irrelevant suggestions repeatedly, and it will still make those exact suggestions again tomorrow. There’s no accumulation of context, no adjustment, no memory.&lt;br&gt;
That limitation isn’t just inconvenient—it fundamentally caps how useful these systems can become.&lt;br&gt;
So instead of building another reviewer, I focused on a different question:&lt;br&gt;
What would a code review agent look like if it could learn continuously from developer feedback?&lt;br&gt;
The Shift: Reviews as a Feedback System&lt;br&gt;
Traditional code review tools operate like functions:&lt;br&gt;
Input: diff&lt;br&gt;
Output: comments&lt;br&gt;
No retained state&lt;br&gt;
What I built instead behaves more like a system with feedback control:&lt;br&gt;
It observes past decisions&lt;br&gt;
It adapts future outputs&lt;br&gt;
It converges toward team-specific norms&lt;br&gt;
Every time a developer accepts or rejects a suggestion, the system updates its internal understanding of what “good feedback” looks like for that team.&lt;br&gt;
Over time, this creates a meaningful shift:&lt;br&gt;
Fewer irrelevant suggestions&lt;br&gt;
Better alignment with team conventions&lt;br&gt;
More trust in the output&lt;br&gt;
The Core Loop&lt;br&gt;
The architecture is intentionally simple but powerful. Each review goes through three stages:&lt;/p&gt;
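The stateless-function versus feedback-control contrast above can be sketched in a few lines of Python. All names here are hypothetical illustrations, not the post's actual implementation:

```python
# A stateless reviewer is a pure function: same diff in, same comments out.
# An adaptive reviewer carries feedback state forward between reviews.

def stateless_review(diff):
    # Nothing is retained; rejected suggestions come back every time.
    return ["Use parameterized queries", "Add a docstring"]

class AdaptiveReviewer:
    def __init__(self):
        self.rejected = set()  # suggestions this team has rejected before

    def review(self, diff):
        # Filter out suggestions the team has already rejected.
        candidates = ["Use parameterized queries", "Add a docstring"]
        return [c for c in candidates if c not in self.rejected]

    def record_feedback(self, suggestion, accepted):
        # Every Accept/Reject decision changes future behavior.
        if not accepted:
            self.rejected.add(suggestion)
```

After `record_feedback("Add a docstring", accepted=False)`, subsequent reviews omit that suggestion, which is the adaptive behavior the rest of the post builds on.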

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Context Retrieval&lt;/strong&gt;&lt;br&gt;
Before analyzing a pull request, the system pulls in historical signals:
&lt;ul&gt;
&lt;li&gt;Previously accepted suggestions&lt;/li&gt;
&lt;li&gt;Previously rejected suggestions&lt;/li&gt;
&lt;li&gt;Implicit team preferences&lt;/li&gt;
&lt;/ul&gt;
This is not static configuration; it is learned behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Diff Analysis and Review Generation&lt;/strong&gt;&lt;br&gt;
The pull request diff is parsed into structured chunks and passed, along with historical context, into a large language model. The model is constrained to produce structured output:
&lt;ul&gt;
&lt;li&gt;File and line references&lt;/li&gt;
&lt;li&gt;Severity levels&lt;/li&gt;
&lt;li&gt;Categorized issues&lt;/li&gt;
&lt;li&gt;Optional fixes&lt;/li&gt;
&lt;/ul&gt;
This ensures the output is not just readable, but actionable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feedback Capture&lt;/strong&gt;&lt;br&gt;
Every interaction, Accept or Reject, is captured and persisted. This is the most important step. Without it, the system cannot evolve. With it, every review becomes training data.
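A minimal sketch of that capture step, assuming a simple append-only file as the backend (the storage mechanism and function names are my illustration, not the post's actual code):

```python
# Persist each Accept/Reject as one plain-language line per event.
# A missing history file is not an error: the system simply starts cold.

def capture_feedback(path, pr, file, comment, accepted):
    action = "accepted" if accepted else "rejected"
    line = f"PR #{pr} | File: {file} | Comment: {comment} | Developer {action} this suggestion."
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")

def load_feedback(path):
    # Returns past decisions for context retrieval on the next review.
    try:
        with open(path, encoding="utf-8") as fh:
            return [ln.strip() for ln in fh if ln.strip()]
    except FileNotFoundError:
        return []
```

The append-only shape matters: every review adds records, and retrieval on the next review closes the loop.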
&lt;p&gt;&lt;strong&gt;Memory Design: Simplicity Over Structure&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;One of the more surprising design decisions was how to store feedback. The obvious approach is structured data, e.g. JSON:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  "file": "auth.py",
  "category": "security",
  "action": "rejected"
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Instead, I chose plain language:&lt;/p&gt;
&lt;p&gt;“PR #42 | File: auth.py | Comment: Use parameterized queries | Developer rejected this suggestion.”&lt;/p&gt;
&lt;p&gt;This works better for a simple reason: the model consumes it directly. There is no need for transformation or interpretation; the same representation used for storage is used for reasoning. This reduces system complexity and improves alignment with how LLMs process information.&lt;/p&gt;
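Because records are already sentences, building model context is plain concatenation. A sketch of what that looks like in practice (`build_review_prompt` is a hypothetical name, not the actual API):

```python
# Plain-language memory means no schema-translation layer between
# storage and reasoning: prompt assembly is just string joining.

def build_review_prompt(diff_chunk, feedback_lines):
    history = "\n".join(feedback_lines)
    return (
        "You are a code review assistant.\n"
        "Past developer decisions for this team:\n"
        f"{history}\n\n"
        "Review the following diff:\n"
        f"{diff_chunk}"
    )
```

With a JSON store, this function would instead need to deserialize, select fields, and re-render them as text before the model could use them.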
&lt;p&gt;&lt;strong&gt;Making Reviews More Precise&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A key technical challenge was ensuring that generated comments map cleanly to the diff.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Diff Chunking.&lt;/em&gt; Rather than sending the entire diff as a monolithic block, it is split into file-level segments with metadata (additions, deletions, headers). This improves:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Line reference accuracy&lt;/li&gt;
&lt;li&gt;Contextual grounding&lt;/li&gt;
&lt;li&gt;Output consistency&lt;/li&gt;
&lt;/ul&gt;
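A minimal sketch of file-level chunking, assuming unified diffs with standard `diff --git` headers (the field names are illustrative):

```python
# Split a unified diff on "diff --git" headers into per-file segments,
# counting additions and deletions as the metadata for each chunk.

def chunk_diff(diff_text):
    chunks = []
    current = None
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            current = {"header": line, "lines": [], "additions": 0, "deletions": 0}
            chunks.append(current)
        elif current is not None:
            current["lines"].append(line)
            # Skip the "+++" / "---" file-name lines when counting changes.
            if line.startswith("+") and not line.startswith("+++"):
                current["additions"] += 1
            elif line.startswith("-") and not line.startswith("---"):
                current["deletions"] += 1
    return chunks
```

Each chunk is then small enough for the model to ground its line references in a single file's context.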
&lt;p&gt;&lt;em&gt;Handling Imperfect Inputs.&lt;/em&gt; Diffs in the real world are messy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Missing headers&lt;/li&gt;
&lt;li&gt;Partial hunks&lt;/li&gt;
&lt;li&gt;Irregular formatting&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The system includes fallback logic to treat ambiguous input as a single chunk, prioritizing resilience over strict correctness.&lt;/p&gt;
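That fallback can be sketched as a guard in front of the chunker: if no recognizable file headers are found, the whole input becomes one chunk instead of an error. This is a self-contained illustration under the same `diff --git` header assumption as above:

```python
# Resilience over strictness: headerless or malformed diffs are still
# reviewable as a single chunk rather than being rejected outright.

def chunk_or_fallback(diff_text):
    lines = diff_text.splitlines()
    if not any(ln.startswith("diff --git") for ln in lines):
        # Ambiguous input: treat everything as one chunk.
        return [{"header": None, "lines": lines}]
    chunks = []
    for ln in lines:
        if ln.startswith("diff --git"):
            chunks.append({"header": ln, "lines": []})
        elif chunks:
            chunks[-1]["lines"].append(ln)
    return chunks
```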
&lt;p&gt;&lt;strong&gt;Output That Developers Actually Engage With&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A typical review might include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Security vulnerabilities (e.g., unsafe SQL usage)&lt;/li&gt;
&lt;li&gt;Cryptographic issues (e.g., weak hashing)&lt;/li&gt;
&lt;li&gt;Resource management bugs&lt;/li&gt;
&lt;li&gt;Positive reinforcement (well-written documentation, clean abstractions)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last category, praise, is not accidental. If every comment is negative, developers disengage. Balanced feedback makes the system feel less like a tool and more like a collaborator.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Performance Considerations&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Speed matters more than it seems. The full pipeline of context retrieval, diff parsing, and model inference completes in a few seconds. That is fast enough to feel interactive, which is critical for adoption: if the system lags, it won’t be used, regardless of how good the feedback is.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Designing for Real-World Usage&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Two practical decisions made a significant difference:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Graceful Degradation&lt;/strong&gt;&lt;br&gt;
The system works even without external dependencies:
&lt;ul&gt;
&lt;li&gt;No memory service → fallback context&lt;/li&gt;
&lt;li&gt;No model access → mock responses&lt;/li&gt;
&lt;/ul&gt;
This enables local development, reliable demos, and reduced friction for onboarding.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feedback as a First-Class Interaction&lt;/strong&gt;&lt;br&gt;
The Accept/Reject mechanism is not UI decoration; it is the engine of improvement. Without it, the system stagnates. With it, the system compounds value over time.
&lt;p&gt;&lt;strong&gt;Where This Becomes Interesting&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Once you introduce memory, new possibilities emerge.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Team-Specific Behavior.&lt;/em&gt; Different teams within the same repository often have different standards. Segmenting memory by team allows the agent to adapt at a finer granularity.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Real Repository Integration.&lt;/em&gt; Hooking into live pull requests via APIs is straightforward. The challenge isn’t data access; it’s maintaining responsiveness and reliability at scale.&lt;/p&gt;
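Team segmentation is mostly a matter of keying the same store by team. A minimal sketch (class and method names are illustrative):

```python
# Segment feedback memory by team so each team's Accept/Reject history
# shapes only that team's reviews, not the whole repository's.
from collections import defaultdict

class TeamMemory:
    def __init__(self):
        self.by_team = defaultdict(list)

    def record(self, team, feedback_line):
        self.by_team[team].append(feedback_line)

    def context_for(self, team):
        # Only this team's history feeds its prompts.
        return list(self.by_team[team])
```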
&lt;p&gt;&lt;strong&gt;The Larger Takeaway&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The real innovation here isn’t in model choice, API design, or UI layout. It’s in treating feedback as persistent signal. Most AI systems today are transactional: they answer and forget. This system is incremental: it learns and adjusts. That shift changes the trajectory of the product:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From static tool → adaptive assistant&lt;/li&gt;
&lt;li&gt;From generic output → team-aligned insight&lt;/li&gt;
&lt;li&gt;From isolated interactions → compounding intelligence&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Over enough iterations, those small improvements stop being incremental. They become identity. And that is when the system stops feeling like AI, and starts feeling like part of the team.&lt;/p&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnvv43iantbt11xhpqyc0.jpeg" alt=" " width="800" height="518"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F11xxtk957xptieunom9d.jpeg" alt=" " width="800" height="518"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8xm3lphln6cn4efltyb5.jpeg" alt=" " width="800" height="518"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqg56hw8orogtodej5trr.jpeg" alt=" " width="800" height="518"&gt;
&lt;/li&gt;
&lt;/ol&gt;
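The graceful-degradation item in the list above amounts to layered fallbacks at each external boundary. A sketch, where the client objects and their methods are hypothetical stand-ins for whatever memory service and model API are in use:

```python
# Layered fallbacks keep the pipeline runnable with zero external
# dependencies: no memory service yields empty context, and no model
# access yields a canned mock review for local development and demos.

def get_context(memory_client):
    if memory_client is None:
        return []  # fallback context: start with no history
    return memory_client.recent()  # hypothetical client method

def get_review(model_client, prompt):
    if model_client is None:
        # Mock response so the UI and feedback loop still work offline.
        return [{"severity": "info", "comment": "mock review (no model configured)"}]
    return model_client.complete(prompt)  # hypothetical client method
```

Because both fallbacks return the same shapes as the real services, downstream code never needs to know whether it is running degraded.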

</description>
      <category>agents</category>
      <category>ai</category>
      <category>codequality</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
