<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jerin</title>
    <description>The latest articles on DEV Community by Jerin (@jerinmathewvinu).</description>
    <link>https://dev.to/jerinmathewvinu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3874774%2F14a752fc-2eb2-433b-8235-b08950e3db61.jpg</url>
      <title>DEV Community: Jerin</title>
      <link>https://dev.to/jerinmathewvinu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jerinmathewvinu"/>
    <language>en</language>
    <item>
      <title>How I Built a Code Review Agent That Gets Smarter Every Time a Developer Hits Accept</title>
      <dc:creator>Jerin</dc:creator>
      <pubDate>Sun, 12 Apr 2026 11:05:02 +0000</pubDate>
      <link>https://dev.to/jerinmathewvinu/how-i-built-a-code-review-agent-that-gets-smarter-every-time-a-developer-hits-accept-43jb</link>
      <guid>https://dev.to/jerinmathewvinu/how-i-built-a-code-review-agent-that-gets-smarter-every-time-a-developer-hits-accept-43jb</guid>
      <description>&lt;p&gt;&lt;strong&gt;Most AI tools forget everything the moment you close the tab. I wanted to build one that didn't.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The idea was straightforward: a code review agent that reads your pull request diff, flags real issues, and — crucially — remembers which suggestions your team actually liked and which ones they ignored. After a few reviews, it stops nagging you about trailing whitespace if you've been rejecting those comments all along. It starts sounding less like a linter and more like a senior engineer who's worked with your team for six months.&lt;/p&gt;

&lt;p&gt;That's the system I built with &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight agent memory&lt;/a&gt;, Groq, and FastAPI. Here's how it works, where it got interesting, and what I learned.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What the system does&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent sits in front of your pull requests. A developer opens a PR, clicks "Run AI Review," and the system does three things in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Recall&lt;/strong&gt; — it fetches team standards and past review patterns from &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt;, the memory layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review&lt;/strong&gt; — it sends the diff plus that memory context to Groq's &lt;code&gt;qwen/qwen3-32b&lt;/code&gt; model, which returns structured JSON comments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retain&lt;/strong&gt; — when the developer clicks Accept or Reject on a comment, that decision gets stored back into Hindsight&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The frontend is a three-panel layout: PR sidebar on the left, syntax-highlighted diff in the middle, review comments with Accept/Reject buttons on the right. Each comment shows severity (critical, warning, suggestion, praise), the file and line it refers to, and an optional code fix.&lt;/p&gt;

&lt;p&gt;The loop closes when a developer rejects "use logging instead of print" for the third time. The next review won't bring it up.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4lwbuwo6polwpvygkn9z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4lwbuwo6polwpvygkn9z.png" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5clqiexrmr88ql8ydj2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5clqiexrmr88ql8ydj2m.png" alt=" " width="800" height="518"&gt;&lt;/a&gt; &lt;strong&gt;The memory layer: retain() and recall()&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The part I found most interesting to build was the memory integration. Most agents I've seen treat each request as stateless — the LLM gets a system prompt, does its job, and forgets. Hindsight changes that with two primitives: &lt;code&gt;retain()&lt;/code&gt; and &lt;code&gt;recall()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here's the actual wrapper I wrote:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retain_feedback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;comment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;collection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reviews:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PR #&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | File: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | Comment: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;comment&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | Developer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; this suggestion.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pr_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/retain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And recall:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;recall_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;patterns_resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/recall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;collection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reviews:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accepted and rejected review comments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;past_patterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patterns_resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;past_patterns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;past_patterns&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No past patterns yet.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key design decision was storing human-readable strings rather than embeddings or structured records. The content field reads like a sentence — "PR #42 | File: auth.py | Comment: Use parameterized queries | Developer accepted this suggestion." — because that's what gets injected into the LLM prompt later. The model can read it and understand it without any additional parsing.&lt;/p&gt;




&lt;h2&gt;
  
  
  The review pipeline
&lt;/h2&gt;

&lt;p&gt;The backend is a FastAPI app with three endpoints that matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GET /prs&lt;/code&gt; — serves the pull request list to the frontend&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /review&lt;/code&gt; — runs the full pipeline: recall → parse diff → call Groq → return structured comments&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /feedback&lt;/code&gt; — called on each Accept/Reject click, stores the decision via retain()&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;/review&lt;/code&gt; endpoint is where everything connects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;review_pr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ReviewRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;recall_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;comments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generate_review&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;team_standards&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;team_standards&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;past_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;past_patterns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pr_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pr_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;comments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory_used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The diff parser splits a unified diff into per-file chunks with addition and deletion counts. This matters because the LLM prompt includes each file's diff separately with a header — it makes the model's references to specific line numbers more reliable.&lt;/p&gt;

&lt;p&gt;The Groq call uses &lt;code&gt;qwen/qwen3-32b&lt;/code&gt; with a system prompt that instructs the model to return only a JSON array. Each element has a file, line number, severity, category, comment, and optional suggestion field. I added a fallback that wraps raw text in a single comment object if the JSON parse fails — which happened more often than I expected during early testing before I tightened the system prompt.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a real review looks like
&lt;/h2&gt;

&lt;p&gt;Here's what the agent returned on a PR adding a user authentication endpoint:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical — auth.py:9 — security&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;User input is interpolated directly into the SQL query string — this is a textbook SQL injection vulnerability.&lt;br&gt;
&lt;em&gt;Fix: &lt;code&gt;cursor.execute('SELECT * FROM users WHERE username = ? AND password = ?', (username, password))&lt;/code&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Critical — auth.py:12 — security&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;MD5 is a broken hashing algorithm and must not be used for tokens or passwords. It is trivially reversible with rainbow tables.&lt;br&gt;
&lt;em&gt;Fix: Use &lt;code&gt;secrets.token_hex(32)&lt;/code&gt; for session tokens, bcrypt or argon2 for passwords.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Warning — auth.py:7 — bug&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The database connection is never closed. On high traffic this exhausts the SQLite file handle limit.&lt;br&gt;
&lt;em&gt;Fix: Use a context manager: &lt;code&gt;with sqlite3.connect('users.db') as conn:&lt;/code&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Praise — auth.py:5 — documentation&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Good docstring — clearly describes what the function returns. This makes it easy to use correctly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The praise comment matters. A reviewer that only flags problems is easy to tune out. Including something genuine when the code does something well makes the rest of the review land differently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Human-readable memory beats structured records — for LLMs.&lt;/strong&gt;&lt;br&gt;
I initially considered storing feedback as JSON objects with typed fields. I switched to plain sentences because the LLM uses the recalled context directly in the prompt. It doesn't need to deserialize anything; it just reads "Developer rejected this suggestion" and adjusts accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The feedback loop is the product.&lt;/strong&gt;&lt;br&gt;
The first review an agent produces is mediocre. The tenth is genuinely useful. If you're building an agent like this, the Accept/Reject UI is not a nice-to-have — it is the core mechanism that makes everything else worth building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Diff parsing is harder than it looks.&lt;/strong&gt;&lt;br&gt;
Unified diffs have edge cases everywhere — files with no standard headers, diffs that start mid-hunk, filenames with spaces. I ended up with a fallback that treats the entire input as a single chunk if no file headers are detected. It is not elegant but it is resilient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Groq is genuinely fast.&lt;/strong&gt;&lt;br&gt;
The full pipeline — recall from Hindsight, parse diff, call Groq, return JSON — completes in about 2–3 seconds end to end. For a code review tool that needs to feel interactive, that matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Design for the no-API-key case from the start.&lt;/strong&gt;&lt;br&gt;
Both Hindsight and Groq integrations have graceful fallbacks — hardcoded team standards and mock comments respectively. This made local development and demos dramatically easier. The app works end to end without any keys, which meant I could build and test the frontend without ever blocking on API setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;The obvious next step is connecting real GitHub PRs via the API instead of the static fake PR data. The diff format is identical — it's just a matter of fetching it from &lt;code&gt;api.github.com/repos/{repo}/pulls/{number}&lt;/code&gt; with an &lt;code&gt;Accept: application/vnd.github.v3.diff&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;The more interesting extension is per-team memory segmentation. Right now, all feedback goes into a single &lt;code&gt;reviews:{repo}&lt;/code&gt; collection. With multiple teams working on the same repo, you could store feedback in team-specific collections and recall from the right one based on who opened the PR.&lt;/p&gt;

&lt;p&gt;The core insight though is simpler: &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;agent memory&lt;/a&gt; turns a one-shot tool into something that compounds. Every accept or reject makes the next review marginally more accurate. Across hundreds of reviews and thousands of decisions, that compounding adds up to an agent that actually understands how your team writes code.&lt;/p&gt;

&lt;p&gt;That's worth building for.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
