<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Marcin Brzozka</title>
    <description>The latest articles on DEV Community by Marcin Brzozka (@marcin_brzozka_ff45b1ccb6).</description>
    <link>https://dev.to/marcin_brzozka_ff45b1ccb6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4004062%2F284890b0-1393-4adb-b556-0ec8e3e2e48a.png</url>
      <title>DEV Community: Marcin Brzozka</title>
      <link>https://dev.to/marcin_brzozka_ff45b1ccb6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/marcin_brzozka_ff45b1ccb6"/>
    <language>en</language>
    <item>
      <title>6 checks before merging AI-agent generated code</title>
      <dc:creator>Marcin Brzozka</dc:creator>
      <pubDate>Fri, 26 Jun 2026 16:36:07 +0000</pubDate>
      <link>https://dev.to/marcin_brzozka_ff45b1ccb6/6-checks-before-merging-ai-agent-generated-code-dmg</link>
      <guid>https://dev.to/marcin_brzozka_ff45b1ccb6/6-checks-before-merging-ai-agent-generated-code-dmg</guid>
      <description>&lt;p&gt;AI coding agents are useful because they can make large changes quickly.&lt;/p&gt;

&lt;p&gt;That is also the reason I do not want to merge their patches just because the final answer says “done”.&lt;/p&gt;

&lt;p&gt;The usual failure mode is not obviously broken code. It is a plausible-looking patch that quietly touches a sensitive area.&lt;/p&gt;

&lt;p&gt;Here is the checklist I use before merging AI-agent generated diffs.&lt;/p&gt;

&lt;h2&gt;1. Did dependencies change?&lt;/h2&gt;

&lt;p&gt;Look for package files and lockfiles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;package.json&lt;/code&gt;&lt;/li&gt;
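&lt;li&gt;lockfiles such as &lt;code&gt;package-lock.json&lt;/code&gt;, &lt;code&gt;yarn.lock&lt;/code&gt;, or &lt;code&gt;poetry.lock&lt;/code&gt;&lt;/li&gt;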
&lt;li&gt;&lt;code&gt;requirements.txt&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pyproject.toml&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;go.mod&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Docker base images&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dependency changes should get explicit review. A tiny source diff plus a large dependency change is not tiny.&lt;/p&gt;
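
&lt;p&gt;A minimal sketch of this check, assuming you already have the list of changed paths. The file-name sets and the function name are hypothetical, chosen for illustration rather than taken from any particular tool.&lt;/p&gt;

```python
from pathlib import PurePosixPath

# Hypothetical name sets for illustration; extend them for your own stack.
DEPENDENCY_FILES = {"package.json", "requirements.txt", "pyproject.toml", "go.mod", "Dockerfile"}
LOCKFILES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml", "poetry.lock"}


def dependency_changes(changed_paths):
    """Return the changed paths whose file name is a known manifest or lockfile."""
    watched = DEPENDENCY_FILES | LOCKFILES
    return [p for p in changed_paths if PurePosixPath(p).name in watched]


changed = ["src/app.py", "package.json", "services/api/go.mod", "README.md"]
print(dependency_changes(changed))
```

&lt;p&gt;The list of changed paths could come from &lt;code&gt;git diff --name-only&lt;/code&gt;. Matching on the file name alone means nested manifests in a monorepo are caught as well.&lt;/p&gt;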

&lt;h2&gt;2. Did auth, payment, security, or config files change?&lt;/h2&gt;

&lt;p&gt;Slow down if the patch touches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;authentication middleware,&lt;/li&gt;
&lt;li&gt;session/token handling,&lt;/li&gt;
&lt;li&gt;payment or checkout code,&lt;/li&gt;
&lt;li&gt;webhook handlers,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.env&lt;/code&gt; parsing,&lt;/li&gt;
&lt;li&gt;deployment config,&lt;/li&gt;
&lt;li&gt;CI workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are exactly the areas where “it builds” is not enough.&lt;/p&gt;
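
&lt;p&gt;One rough way to surface these paths is a keyword match on each changed path. This is only a sketch: the keyword list and the function name are assumptions, and a real repository will need its own list.&lt;/p&gt;

```python
# Hypothetical keyword list; tune it to your repository layout.
SENSITIVE_KEYWORDS = (
    "auth", "session", "token", "payment", "checkout",
    "webhook", ".env", "deploy", ".github/workflows",
)


def sensitive_paths(changed_paths):
    """Return (path, keyword) pairs for paths that contain a sensitive keyword."""
    hits = []
    for path in changed_paths:
        lowered = path.lower()
        for keyword in SENSITIVE_KEYWORDS:
            if keyword in lowered:
                hits.append((path, keyword))
                break  # one hit per path is enough to flag it
    return hits


changed = ["src/middleware/auth.py", "docs/guide.md", ".github/workflows/ci.yml"]
for path, keyword in sensitive_paths(changed):
    print(path, "matched", keyword)
```

&lt;p&gt;Substring matching is deliberately noisy (a file named &lt;code&gt;author.py&lt;/code&gt; would match &lt;code&gt;auth&lt;/code&gt;). For a “slow down” signal, a false positive costs a few seconds of review, while a miss can cost much more.&lt;/p&gt;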

&lt;h2&gt;3. Did source change without tests changing?&lt;/h2&gt;

&lt;p&gt;Not every patch needs new tests, but source changes with zero test changes should be visible in review.&lt;/p&gt;

&lt;p&gt;At minimum, the author/agent should provide real command output showing what was run.&lt;/p&gt;
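
&lt;p&gt;A sketch of how the “source changed, tests did not” signal could be computed from changed paths. The test-path heuristic and the suffix list are assumptions for illustration and will not fit every project layout.&lt;/p&gt;

```python
# Hypothetical suffix list; adjust per project.
SOURCE_SUFFIXES = (".py", ".js", ".ts", ".go")


def is_test_path(path):
    """Heuristic: a path is a test if it sits in a tests directory or has a test-style name."""
    parts = path.lower().split("/")
    name = parts[-1]
    in_test_dir = any(part in ("test", "tests", "__tests__") for part in parts[:-1])
    return in_test_dir or name.startswith("test_") or "_test." in name or ".test." in name


def source_changed_without_tests(changed_paths):
    """True when at least one source file changed and no test file changed."""
    source = [p for p in changed_paths if p.endswith(SOURCE_SUFFIXES) and not is_test_path(p)]
    tests = [p for p in changed_paths if is_test_path(p)]
    return bool(source) and not tests


print(source_changed_without_tests(["src/app.py"]))
print(source_changed_without_tests(["src/app.py", "tests/test_app.py"]))
```

&lt;p&gt;The result is a prompt for the reviewer, not a verdict: a pure rename or comment fix can legitimately change source without touching tests.&lt;/p&gt;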

&lt;h2&gt;4. Did generated or bundled files change?&lt;/h2&gt;

&lt;p&gt;Large generated files can bury important edits.&lt;/p&gt;

&lt;p&gt;If a patch changes a minified file, lockfile, generated client, or build artifact, review the source of that generated output too.&lt;/p&gt;
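
&lt;p&gt;Generated or bundled files can often be spotted by name. The glob patterns below are assumptions for illustration; every build setup has its own output locations.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for generated or bundled output.
GENERATED_PATTERNS = (
    "*.min.js", "*.min.css", "*.lock", "*-lock.json",
    "*_pb2.py", "dist/*", "build/*",
)


def generated_paths(changed_paths):
    """Return changed paths that match a generated-file pattern."""
    return [
        path for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in GENERATED_PATTERNS)
    ]


changed = ["static/app.min.js", "src/app.js", "dist/bundle.js", "package-lock.json"]
print(generated_paths(changed))
```

&lt;p&gt;Once a path is flagged, the useful follow-up question is which source change should have produced that output, and whether that source change is in the same patch.&lt;/p&gt;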

&lt;h2&gt;5. Are there secret-like literals?&lt;/h2&gt;

&lt;p&gt;Search for suspicious strings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;api_key&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;token&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;secret&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;password&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;private keys&lt;/li&gt;
&lt;li&gt;webhook secrets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even test fixtures deserve a second look, because real credentials are sometimes pasted into them.&lt;/p&gt;
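
&lt;p&gt;A simple version of this search can run over the added lines of a unified diff. The regular expression below is a hypothetical starting point, not a complete secret scanner: it only looks for a sensitive-looking name assigned a quoted literal, plus private-key headers.&lt;/p&gt;

```python
import re

# Hypothetical pattern: a sensitive-looking name assigned a quoted literal of 8+ characters.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|token|secret|password)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]",
    re.IGNORECASE,
)
PRIVATE_KEY_MARKER = "-----BEGIN"


def secret_like_lines(diff_text):
    """Return added diff lines that look like they contain a hard-coded secret."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect added lines, and skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if SECRET_PATTERN.search(line) or PRIVATE_KEY_MARKER in line:
            hits.append(line[1:].strip())
    return hits


diff_text = "\n".join([
    "+++ b/config.py",
    "+API_KEY = 'sk_test_1234567890'",
    "+timeout = 30",
    "-password = 'old-password-value'",
])
print(secret_like_lines(diff_text))
```

&lt;p&gt;Removed lines are ignored here on purpose, since the question before merge is what the patch introduces. A literal that was already committed is a separate problem and still needs rotating.&lt;/p&gt;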

&lt;h2&gt;6. Is “tests passed” backed by actual output?&lt;/h2&gt;

&lt;p&gt;I want to see the command and real result, not just a summary.&lt;/p&gt;

&lt;p&gt;Good:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm test
18 passed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Weak:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tests should pass.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
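
&lt;p&gt;One way to get that evidence is to have the command run by a small wrapper that records the exact command, exit code, and output. This is a sketch under assumptions: the function name is hypothetical, and the command shown is a stand-in that just prints a line, where a real project would pass its own test command.&lt;/p&gt;

```python
import subprocess
import sys


def run_and_record(command):
    """Run a command and return the exact command line, exit code, and captured output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": " ".join(command),
        "exit_code": result.returncode,
        "output": result.stdout + result.stderr,
    }


# Stand-in for a real test command such as ["npm", "test"].
report = run_and_record([sys.executable, "-c", "print('18 passed')"])
print(report["exit_code"])
print(report["output"].strip())
```

&lt;p&gt;Attaching this record to the pull request lets a reviewer see what was run and what came back, instead of trusting a summary.&lt;/p&gt;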



&lt;h2&gt;Turning the checklist into a local gate&lt;/h2&gt;

&lt;p&gt;I packaged this workflow as a small local Python CLI that scores a unified diff before merge.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git diff &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; change.patch
python src/agent_change_risk_auditor.py audit &lt;span class="nt"&gt;--diff&lt;/span&gt; change.patch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It flags dependency changes, sensitive paths, source-without-tests, large/generated changes, and secret-like literals.&lt;/p&gt;

&lt;p&gt;The point is not to replace human review. The point is to make “slow down and inspect this patch” visible before merge.&lt;/p&gt;

&lt;h2&gt;Related resources&lt;/h2&gt;

&lt;p&gt;I put the checklist and example report here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blog/checklist page: &lt;a href="http://152.239.117.170/blog/ai-agent-code-review-checklist-before-merge.html" rel="noopener noreferrer"&gt;http://152.239.117.170/blog/ai-agent-code-review-checklist-before-merge.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sample report: &lt;a href="http://152.239.117.170/sample-audit-report.html" rel="noopener noreferrer"&gt;http://152.239.117.170/sample-audit-report.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Product page: &lt;a href="http://152.239.117.170/" rel="noopener noreferrer"&gt;http://152.239.117.170/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is also a small paid Gumroad kit for teams that want the source, CI template, and Pro workflow pack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic Kit: &lt;a href="https://marcnova48.gumroad.com/l/cakkb" rel="noopener noreferrer"&gt;https://marcnova48.gumroad.com/l/cakkb&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pro Pack: &lt;a href="https://marcnova48.gumroad.com/l/bdyklr" rel="noopener noreferrer"&gt;https://marcnova48.gumroad.com/l/bdyklr&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Question: what risk category would you add to this checklist for AI-generated patches?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>codequality</category>
      <category>devops</category>
      <category>security</category>
    </item>
    <item>
      <title>I built a local risk gate for AI-agent code changes</title>
      <dc:creator>Marcin Brzozka</dc:creator>
      <pubDate>Fri, 26 Jun 2026 15:19:20 +0000</pubDate>
      <link>https://dev.to/marcin_brzozka_ff45b1ccb6/i-built-a-local-risk-gate-for-ai-agent-code-changes-1o7f</link>
      <guid>https://dev.to/marcin_brzozka_ff45b1ccb6/i-built-a-local-risk-gate-for-ai-agent-code-changes-1o7f</guid>
      <description>&lt;p&gt;AI coding agents are now good enough to create a lot of code quickly. That also means they are good enough to create a lot of risky changes quickly.&lt;/p&gt;

&lt;p&gt;The failure mode I keep seeing is not “the agent writes obviously broken code.” The harder problem is quieter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a dependency or lockfile changes without anyone noticing,&lt;/li&gt;
&lt;li&gt;auth/payment/config files are touched in a broad refactor,&lt;/li&gt;
&lt;li&gt;source files change but tests do not,&lt;/li&gt;
&lt;li&gt;generated-looking rewrites bury a small important change,&lt;/li&gt;
&lt;li&gt;a secret-like literal appears in a diff,&lt;/li&gt;
&lt;li&gt;the final report says “tests passed” but the evidence is thin.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I built a small local CLI workflow: &lt;strong&gt;AI Agent Change Risk Auditor&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It reads a unified diff/patch file and returns a risk score before merge.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git diff &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; change.patch
python src/agent_change_risk_auditor.py audit &lt;span class="nt"&gt;--diff&lt;/span&gt; change.patch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AI Agent Change Risk Audit
Risk level: high
Risk score: 63/100
Files changed: 2
Lines: +3 / -1

Flags:
- DEPENDENCY_CHANGE:package.json
- SOURCE_CHANGED_WITHOUT_TEST_CHANGE
- POSSIBLE_SECRET_LITERAL_IN_DIFF

Recommendations:
- Add or update tests for changed source files before merge.
- Remove secret-like literals and rotate exposed credentials if real.
- Review dependency changes manually and run lockfile/security checks.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
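
&lt;p&gt;The “Files changed” and “Lines” figures in a report like this can be derived from the diff itself. The sketch below is not the tool's implementation. It assumes git-style unified diff input (each file section starts with a &lt;code&gt;diff --git&lt;/code&gt; line), uses a hypothetical function name, and runs on a small diff invented for illustration.&lt;/p&gt;

```python
def diff_stats(diff_text):
    """Count changed files, added lines, and removed lines in a git-style unified diff."""
    files, added, removed = set(), 0, 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_hunk = False  # a new file section starts; header lines follow
        elif line.startswith("@@"):
            in_hunk = True  # hunk header; content lines follow
        elif not in_hunk and line.startswith(("--- ", "+++ ")):
            path = line[4:].strip()
            if path != "/dev/null":
                # Drop the "a/" or "b/" prefix that git adds to each side.
                files.add(path[2:] if path.startswith(("a/", "b/")) else path)
        elif in_hunk and line.startswith("+"):
            added += 1
        elif in_hunk and line.startswith("-"):
            removed += 1
    return {"files": len(files), "added": added, "removed": removed}


sample = "\n".join([
    "diff --git a/package.json b/package.json",
    "--- a/package.json",
    "+++ b/package.json",
    "@@ -1,3 +1,4 @@",
    " {",
    '+  "left-pad": "1.3.0",',
    " }",
    "diff --git a/src/app.py b/src/app.py",
    "--- a/src/app.py",
    "+++ b/src/app.py",
    "@@ -1,2 +1,3 @@",
    "-x = 1",
    "+x = 2",
    "+y = 3",
])
print(diff_stats(sample))
```

&lt;p&gt;Tracking whether the parser is inside a hunk keeps a removed line that happens to begin with dashes from being mistaken for a file header.&lt;/p&gt;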



&lt;h2&gt;What it checks&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;dependency and lockfile changes,&lt;/li&gt;
&lt;li&gt;auth/payment/security/config paths,&lt;/li&gt;
&lt;li&gt;source changes without test changes,&lt;/li&gt;
&lt;li&gt;large or generated-looking rewrites,&lt;/li&gt;
&lt;li&gt;secret-like literals in the diff.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What it does not do&lt;/h2&gt;

&lt;p&gt;It is not a security guarantee, and it does not replace tests, code review, or proper software composition analysis (SCA).&lt;/p&gt;

&lt;p&gt;It is just a cheap local guardrail that catches “slow down and review this” signals before an AI-agent patch gets merged.&lt;/p&gt;

&lt;h2&gt;Why local-first?&lt;/h2&gt;

&lt;p&gt;Teams often do not want to upload private diffs to a third-party service just to get a basic risk score.&lt;/p&gt;

&lt;p&gt;A local script is easy to inspect, easy to modify, and easy to run in private CI.&lt;/p&gt;

&lt;h2&gt;Who it is for&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;founders using AI coding tools,&lt;/li&gt;
&lt;li&gt;agencies reviewing AI-generated client changes,&lt;/li&gt;
&lt;li&gt;small teams that need a pre-merge safety checklist,&lt;/li&gt;
&lt;li&gt;anyone who wants a simple local guardrail before a human review.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The minimal checklist&lt;/h2&gt;

&lt;p&gt;If you want the idea without any tool, start here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Did dependencies change?&lt;/li&gt;
&lt;li&gt;Did auth/payment/security/config files change?&lt;/li&gt;
&lt;li&gt;Did source change without tests?&lt;/li&gt;
&lt;li&gt;Did a large generated file change?&lt;/li&gt;
&lt;li&gt;Are there secret-like literals in the diff?&lt;/li&gt;
&lt;li&gt;Can the agent’s “tests passed” claim be tied to actual output?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That checklist alone catches a surprising number of risky changes.&lt;/p&gt;

&lt;h2&gt;Demo assets&lt;/h2&gt;

&lt;p&gt;I made two public demo pages for the workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sample audit report: &lt;a href="http://152.239.117.170/sample-audit-report.html" rel="noopener noreferrer"&gt;http://152.239.117.170/sample-audit-report.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;ROI calculator: &lt;a href="http://152.239.117.170/roi-calculator.html" rel="noopener noreferrer"&gt;http://152.239.117.170/roi-calculator.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Commercial note&lt;/h2&gt;

&lt;p&gt;I packaged this as a small paid starter kit with source, tests, docs, CI templates, and a commercial-use summary.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic Kit: $5 one-time — &lt;a href="https://marcnova48.gumroad.com/l/cakkb" rel="noopener noreferrer"&gt;https://marcnova48.gumroad.com/l/cakkb&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pro Pack: $19 one-time — &lt;a href="https://marcnova48.gumroad.com/l/bdyklr" rel="noopener noreferrer"&gt;https://marcnova48.gumroad.com/l/bdyklr&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Product page and demos: &lt;a href="http://152.239.117.170/" rel="noopener noreferrer"&gt;http://152.239.117.170/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The goal is not to promise perfect security. The goal is to save reviewer attention and reduce obvious AI-agent change risk.&lt;/p&gt;

&lt;p&gt;Question for other teams using coding agents: what additional risk category would you add?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>codequality</category>
      <category>devops</category>
      <category>security</category>
    </item>
  </channel>
</rss>
