<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rafael Costa</title>
    <description>The latest articles on DEV Community by Rafael Costa (@devanomaly).</description>
    <link>https://dev.to/devanomaly</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3641305%2F56bcff32-8972-461c-8d25-a687e4adee96.jpeg</url>
      <title>DEV Community: Rafael Costa</title>
      <link>https://dev.to/devanomaly</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/devanomaly"/>
    <language>en</language>
    <item>
      <title>The Privacy Is the Architecture: Building an Instagram Bulk Unfollower Under MV3 Constraints</title>
      <dc:creator>Rafael Costa</dc:creator>
      <pubDate>Wed, 15 Apr 2026 15:01:12 +0000</pubDate>
      <link>https://dev.to/devanomaly/the-privacy-is-the-architecture-building-an-instagram-bulk-unfollower-under-mv3-constraints-e79</link>
      <guid>https://dev.to/devanomaly/the-privacy-is-the-architecture-building-an-instagram-bulk-unfollower-under-mv3-constraints-e79</guid>
      <description>&lt;p&gt;The Instagram follower-tool ecosystem has a malware problem. In January 2026, the &lt;a href="https://www.bleepingcomputer.com/news/security/malicious-ghostposter-browser-extensions-found-with-840-000-installs/" rel="noopener noreferrer"&gt;GhostPoster campaign&lt;/a&gt; was found to have spread across 17 extensions with 840,000+ cumulative installs across Chrome, Firefox, and Edge — hiding JavaScript malware inside PNG icon files using steganography. In March, &lt;a href="https://thehackernews.com/2026/03/chrome-extension-turns-malicious-after.html" rel="noopener noreferrer"&gt;two Chrome extensions were reported to have turned malicious after ownership transfer&lt;/a&gt;; in ShotBird's case, researchers documented fake Chrome update prompts used to deliver credential-theft malware. The "privacy policy" is a legal checkbox. The permissions list is where the real policy actually lives.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://chromewebstore.google.com/detail/reciprocity/micnkndhjhajkhpbgjijfihcendcoebm" rel="noopener noreferrer"&gt;Reciprocity&lt;/a&gt;, a Manifest V3 Chrome extension that computes the set difference between who you follow and who follows you back on Instagram, and automates the unfollows. Zero servers, no external dependencies, and only two permissions: &lt;code&gt;tabs&lt;/code&gt; and &lt;code&gt;storage&lt;/code&gt;. Host permissions locked to &lt;code&gt;www.instagram.com&lt;/code&gt; and &lt;code&gt;instagram.com&lt;/code&gt;. That's it.&lt;/p&gt;

&lt;p&gt;Privacy isn't a policy or a promise. It's an architecture with no server-side collection path and no third-party exfiltration endpoint.&lt;/p&gt;

&lt;p&gt;In Chrome MV3, building this safely forces you into a specific, often painful set of constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  The MV3 Two-World Problem
&lt;/h2&gt;

&lt;p&gt;Chrome extensions run content scripts in an "isolated world." You share the DOM with the page, but not the JavaScript execution environment (&lt;code&gt;window&lt;/code&gt;). Great for security, but fatal if you need to intercept the page's own network requests before they happen.&lt;/p&gt;

&lt;p&gt;To parse a user's Instagram following list without forcing them to scroll a modal for an hour, you have to hook into &lt;code&gt;fetch()&lt;/code&gt; and &lt;code&gt;XMLHttpRequest&lt;/code&gt;. To do that before Instagram's minified React bundle mounts, your code must run at &lt;code&gt;document_start&lt;/code&gt; in the &lt;code&gt;MAIN&lt;/code&gt; world.&lt;/p&gt;

&lt;p&gt;The catch: &lt;code&gt;MAIN&lt;/code&gt; world scripts have zero access to &lt;code&gt;chrome.runtime.*&lt;/code&gt; APIs. They can't talk to your background service worker. Ergo, they can't read your extension storage.&lt;/p&gt;
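&lt;p&gt;In manifest terms, the split looks roughly like this (a sketch of the two registrations, not Reciprocity's actual &lt;code&gt;manifest.json&lt;/code&gt;; the &lt;code&gt;world&lt;/code&gt; key requires Chrome 111+):&lt;/p&gt;

```json
{
  "content_scripts": [
    {
      "matches": ["https://www.instagram.com/*", "https://instagram.com/*"],
      "js": ["content-main.js"],
      "run_at": "document_start",
      "world": "MAIN"
    },
    {
      "matches": ["https://www.instagram.com/*", "https://instagram.com/*"],
      "js": ["content.js"],
      "run_at": "document_start"
    }
  ]
}
```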

&lt;p&gt;So you build a two-world bridge:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;content-main.js&lt;/code&gt; (MAIN world):&lt;/strong&gt; Hooks &lt;code&gt;fetch&lt;/code&gt;, parses GraphQL responses, drives direct API pagination.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;content.js&lt;/code&gt; (Isolated world):&lt;/strong&gt; Orchestrates the state machine, talks to the background service worker, manages the execution queue.&lt;/li&gt;
&lt;/ol&gt;
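&lt;p&gt;The MAIN-world half can be sketched like this (hypothetical names; the real &lt;code&gt;content-main.js&lt;/code&gt; also hooks &lt;code&gt;XMLHttpRequest&lt;/code&gt; and filters endpoints far more carefully):&lt;/p&gt;

```javascript
// Sketch of a MAIN-world fetch hook. It wraps the global fetch so
// GraphQL follower responses can be observed and forwarded across the
// bridge, without altering what the page itself receives.
function installFetchHook(onFollowerBatch) {
  const originalFetch = globalThis.fetch;
  globalThis.fetch = async function (...args) {
    const response = await originalFetch.apply(this, args);
    const url = typeof args[0] === 'string' ? args[0] : args[0].url;
    // Only inspect endpoints we care about; clone() so the page can
    // still consume the original body.
    if (url.includes('/graphql')) {
      response.clone().json()
        .then((data) => onFollowerBatch(url, data))
        .catch(() => {}); // non-JSON bodies are ignored
    }
    return response; // the page sees the untouched response
  };
  return function uninstall() { globalThis.fetch = originalFetch; };
}
```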

&lt;p&gt;They communicate via &lt;code&gt;window.postMessage&lt;/code&gt;. But throwing messages across the &lt;code&gt;window&lt;/code&gt; boundary on a public site is fundamentally insecure. Page JS can see them. Page JS can &lt;em&gt;forge&lt;/em&gt; them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build the Bridge, Then Mistrust It
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;window.postMessage&lt;/code&gt; is a broadcast channel in hostile territory. The &lt;a href="https://github.com/w3c/webextensions/issues/77" rel="noopener noreferrer"&gt;W3C WebExtensions working group has had an open proposal since 2021&lt;/a&gt; for a secure replacement — acknowledging that the current approach is fundamentally broken — and no solution has shipped. &lt;a href="https://thehackernews.com/2018/02/grammar-checking-software.html" rel="noopener noreferrer"&gt;Grammarly's extension&lt;/a&gt; exposed authentication tokens to every website a user visited through an unvalidated postMessage bridge. Twenty-two million users, and JavaScript on any visited page could abuse the bridge to access a session.&lt;/p&gt;

&lt;p&gt;So the rule in Reciprocity is simple: the MAIN world is &lt;strong&gt;scan-only&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It can contribute &lt;em&gt;observations&lt;/em&gt;, but not authority.&lt;/p&gt;

&lt;p&gt;When the MAIN world intercepts a GraphQL batch of followers, it sends scan data across the bridge tagged with a &lt;code&gt;__RECIPROCITY__&lt;/code&gt; prefix, a per-scan &lt;code&gt;scanId&lt;/code&gt; and a per-pagination-run &lt;code&gt;requestId&lt;/code&gt;, plus a rotated 32-char hex bridge token negotiated at session start. The isolated world validates every incoming payload. In the real code, that means checking the bridge source marker, token, scan correlation, phase, and request correlation before accepting scan data. Simplified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;__RECIPROCITY__&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;currentBridgeToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scanId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;activeScanId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// validated — process scan data only&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the token doesn't match, or if the &lt;code&gt;scanId&lt;/code&gt; is stale, the message is silently dropped. The token is not a secret from page JavaScript once the bridge is established; its job is correlation, freshness, and stale-session rejection, not authentication against a page that's already listening. The real security boundary is narrower and stronger: even if a hostile page can forge scan traffic, it still cannot cross the isolated-world/runtime boundary into the unfollow path.&lt;/p&gt;

&lt;p&gt;Is this overengineered for a tool that most people will run once a month? Maybe. But the threat model isn't "normal user on a clean page." It's "normal user on a page where &lt;em&gt;any&lt;/em&gt; third-party script — Instagram's own ad SDK, a browser toolbar, an injected A/B test — can &lt;code&gt;postMessage&lt;/code&gt; into your bridge." That very bridge has exactly one job: make sure that even in that environment, a hostile page can at worst corrupt or spam scan-only data, not cross into the unfollow path.&lt;/p&gt;

&lt;p&gt;Destructive actions never originate from the MAIN world. The background service worker accumulates the lists, computes the set difference, and holds the state. When the user clicks "Execute", the background script talks &lt;em&gt;only&lt;/em&gt; to the isolated world via &lt;code&gt;chrome.runtime.sendMessage&lt;/code&gt;. This is the part of the architecture I'm proudest of — not because it's clever, but because the guarantee is structural. It doesn't depend on discipline, or code review, or "we'd never route unfollows through MAIN." It depends on the fact that the MAIN world &lt;em&gt;physically cannot reach&lt;/em&gt; the unfollow path.&lt;/p&gt;

&lt;p&gt;The page JavaScript cannot trigger, spoof, or even perceive an unfollow command.&lt;/p&gt;
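&lt;p&gt;The scan-only guarantee can be modeled as a routing rule (a simplified sketch with hypothetical message names, not the extension's actual handlers):&lt;/p&gt;

```javascript
// Sketch of the command-channel separation. Scan observations arrive
// via window 'message' events; destructive commands arrive ONLY via
// chrome.runtime.onMessage, a channel page JavaScript cannot reach.
function createRouter(executeUnfollow, acceptScanData) {
  return {
    // chrome.runtime.onMessage handler: extension-internal channel.
    onRuntimeMessage(msg) {
      if (!msg) return;
      if (msg.type === 'EXECUTE_UNFOLLOW') executeUnfollow(msg.username);
    },
    // window 'message' handler: hostile channel, scan-only by construction.
    onWindowMessage(msg) {
      if (!msg) return;
      if (msg.type === 'SCAN_BATCH') acceptScanData(msg.users);
      // An EXECUTE_UNFOLLOW arriving here is a forgery attempt; it is
      // simply never routed to executeUnfollow.
    },
  };
}
```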

&lt;h2&gt;
  
  
  Stop Puppeteering the DOM
&lt;/h2&gt;

&lt;p&gt;The standard approach to scraping a single-page app is DOM puppeteering — scrolling the viewport to trigger lazy loading. I tried this first. It's brittle in ways that compound: (i) if the tab loses focus, Chrome throttles &lt;code&gt;requestAnimationFrame&lt;/code&gt; and the scrolling stalls; (ii) if Instagram changes a modal's CSS class, the scroller breaks. You're simulating a human to trick a UI into loading data that already has an API.&lt;/p&gt;

&lt;p&gt;That is backwards.&lt;/p&gt;

&lt;p&gt;Reciprocity captures the endpoint shape Instagram is already using, then takes over: &lt;code&gt;content-main.js&lt;/code&gt; makes direct &lt;code&gt;fetch()&lt;/code&gt; calls to Instagram's endpoints with cursor-based pagination for list extraction, falling back to the well-known REST endpoint when needed — no scroll simulation, no dependency on Instagram's UI rendering pipeline.&lt;/p&gt;

&lt;p&gt;Because we aren't relying on UI rendering, the background script spawns a dedicated, unfocused Chrome window for the scan. The user clicks "Scan" and goes back to whatever they were doing. The extension pages through up to 50,000 users per list silently — 800–2000ms between API calls, with a 5-second pause every 50 requests.&lt;/p&gt;
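&lt;p&gt;The pagination loop with that pacing might look like this (a sketch; &lt;code&gt;fetchPage&lt;/code&gt; and its signature are hypothetical):&lt;/p&gt;

```javascript
// Cursor-based pagination with the pacing described above.
// fetchPage(cursor) resolves to { users, nextCursor }; pagination stops
// when nextCursor is null or maxUsers is reached. Delays: 800-2000 ms
// jitter per call, plus a 5-second pause every 50 requests.
async function paginateList(fetchPage, { maxUsers = 50000, sleep = defaultSleep } = {}) {
  const users = [];
  let cursor = null;
  let requests = 0;
  while (maxUsers > users.length) {
    const page = await fetchPage(cursor);
    users.push(...page.users);
    requests += 1;
    if (!page.nextCursor) break;
    cursor = page.nextCursor;
    await sleep(800 + Math.random() * 1200);    // jitter between calls
    if (requests % 50 === 0) await sleep(5000); // long pause every 50
  }
  return users.slice(0, maxUsers);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```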

&lt;h2&gt;
  
  
  Rate Limits Are Part of the Product
&lt;/h2&gt;

&lt;p&gt;Instagram will shadowban you for velocity. Unfollow 300 people in two minutes and your account goes quiet for weeks. The architecture has to absorb that constraint, not defer it to user discipline.&lt;/p&gt;

&lt;p&gt;Reciprocity enforces: 20 unfollows per rolling 60-minute window, 100/day hard cap, 3–8 seconds of randomized delay between each request.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Rolling&lt;/em&gt; window, not clock-hour buckets. If I execute 20 unfollows at 2:55 PM, I shouldn't get 20 more at 3:00 PM. Both limits derive from a single &lt;code&gt;unfollowSuccessTimestamps&lt;/code&gt; array in &lt;code&gt;chrome.storage.local&lt;/code&gt; — epoch-millisecond entries, continually pruned to retain only same-day and last-hour entries. One data structure, two constraints, zero drift.&lt;/p&gt;
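&lt;p&gt;A minimal sketch of that single-array design (function names are hypothetical; the real array lives in &lt;code&gt;chrome.storage.local&lt;/code&gt; as &lt;code&gt;unfollowSuccessTimestamps&lt;/code&gt;):&lt;/p&gt;

```javascript
// Both rate limits derive from one list of epoch-millisecond success
// timestamps: prune, then count against the rolling hour and the day.
const HOUR_MS = 60 * 60 * 1000;
const HOURLY_CAP = 20;
const DAILY_CAP = 100;

function startOfDay(now) {
  return new Date(now).setHours(0, 0, 0, 0); // returns epoch ms
}

// Keep only entries still relevant to either constraint:
// same-day (daily cap) or last-hour (hourly cap).
function pruneTimestamps(timestamps, now) {
  const dayStart = startOfDay(now);
  const hourStart = now - HOUR_MS;
  return timestamps.filter((t) => t >= dayStart || t >= hourStart);
}

// One data structure, two constraints, zero drift.
function canUnfollow(timestamps, now) {
  const pruned = pruneTimestamps(timestamps, now);
  const lastHour = pruned.filter((t) => t >= now - HOUR_MS).length;
  if (lastHour >= HOURLY_CAP) return false;
  const today = pruned.filter((t) => t >= startOfDay(now)).length;
  if (today >= DAILY_CAP) return false;
  return true;
}
```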

&lt;p&gt;The execution lock is persistent. If the user closes the browser mid-unfollow, the &lt;code&gt;unfollowExecutionState&lt;/code&gt; snapshot in storage prevents concurrent bulk runs when they reopen. There's a 90-second stale-lock reconciliation to handle bad exits. The background state machine (&lt;code&gt;idle → scanning_following → scanning_followers → processing → done&lt;/code&gt;, plus error and interrupted recovery paths) is the sole truthbearer.&lt;/p&gt;
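&lt;p&gt;The stale-lock reconciliation reduces to a freshness check on a persisted timestamp (a sketch; the field names are hypothetical, assuming the stored snapshot carries a last-heartbeat time):&lt;/p&gt;

```javascript
// A lock is honored only if its heartbeat is fresh; a crash or bad
// exit leaves a stale lock that the next session reclaims.
const STALE_AFTER_MS = 90 * 1000; // 90-second reconciliation window

function tryAcquireLock(stored, now) {
  if (stored) {
    if (stored.locked) {
      const age = now - stored.heartbeat;
      // Fresh heartbeat: a genuine concurrent run is in progress.
      if (STALE_AFTER_MS >= age) {
        return { acquired: false, state: stored };
      }
      // Stale lock from a bad exit: fall through and reclaim it.
    }
  }
  return { acquired: true, state: { locked: true, heartbeat: now } };
}
```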

&lt;h2&gt;
  
  
  Testing Without a Build System
&lt;/h2&gt;

&lt;p&gt;The suite runs 337 tests across five files, again with no external dependencies.&lt;/p&gt;

&lt;p&gt;The tests use Node.js built-in &lt;code&gt;node:test&lt;/code&gt; and &lt;code&gt;node:assert&lt;/code&gt;. No Vitest, no Jest, no test runner that needs its own config file. But this created a real constraint: the extension's source files are vanilla scripts, i.e., no &lt;code&gt;module.exports&lt;/code&gt; or ESM exports. They're designed to run in a browser, not in Node.&lt;/p&gt;

&lt;p&gt;The solution is ugly and deliberate. Each test file copies the pure functions it needs to test inline. &lt;code&gt;validateBatchUser()&lt;/code&gt;, &lt;code&gt;normalizeTimestamps()&lt;/code&gt;, the message normalization logic — they exist as duplicated source in the test files, extracted by hand from the extension code.&lt;/p&gt;

&lt;p&gt;This is a maintenance cost I chose to pay. The alternative was introducing a build step — a bundler that could tree-shake exports for the browser while making them available to Node. For a four-file extension with no external dependencies, a bundler is not simplification. It's a new failure mode wearing a productivity costume.&lt;/p&gt;

&lt;p&gt;The same zero-dependency principle that keeps the permission surface minimal keeps the toolchain auditable. There's no &lt;code&gt;package-lock.json&lt;/code&gt; because there are no packages to lock in the first place. The npm/package-manager supply-chain surface here is &lt;em&gt;zero&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Constraints Produced
&lt;/h2&gt;

&lt;p&gt;Every MV3 constraint I fought against turned into a structural property I'd now defend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The two-world split forced the MAIN world to be scan-only. Destructive actions are architecturally unreachable from page JS.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;window.postMessage&lt;/code&gt; is insecure by design, so every payload goes through token rotation and validation. A compromised page can corrupt scan data, but it cannot cross into the unfollow path.&lt;/li&gt;
&lt;li&gt;With only &lt;code&gt;tabs&lt;/code&gt; and &lt;code&gt;storage&lt;/code&gt; in the permission set, there's no cookie-store permission, no &lt;code&gt;webRequest&lt;/code&gt;, and no broad host access.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is invisible to the user. The &lt;code&gt;manifest.json&lt;/code&gt; is under 60 lines. A skeptical developer can read it in a few minutes and verify that the trust model matches the claim. That verifiability is the entire product thesis.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.chrome.com/docs/extensions/develop/migrate/mv2-deprecation-timeline" rel="noopener noreferrer"&gt;Chrome disabled Manifest V2 for all users in Chrome 138 on July 24, 2025; Chrome 139 removed the enterprise-policy escape hatch.&lt;/a&gt; MV3 is the environment extensions actually live in now. You can spend your time trying to smuggle the old model forward, or you can let the constraints shape the architecture.&lt;/p&gt;

&lt;p&gt;The constraints here &lt;em&gt;were&lt;/em&gt; the architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://chromewebstore.google.com/detail/reciprocity/micnkndhjhajkhpbgjijfihcendcoebm" rel="noopener noreferrer"&gt;Reciprocity is on the Chrome Web Store.&lt;/a&gt; Install it, right-click, inspect the source. Don't believe me, verify.&lt;/p&gt;

</description>
      <category>privacy</category>
      <category>security</category>
      <category>javascript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Five Corrections: What an AI Agent Didn't Know About My Production Database</title>
      <dc:creator>Rafael Costa</dc:creator>
      <pubDate>Wed, 08 Apr 2026 14:08:54 +0000</pubDate>
      <link>https://dev.to/devanomaly/five-corrections-what-an-ai-agent-didnt-know-about-my-production-database-5815</link>
      <guid>https://dev.to/devanomaly/five-corrections-what-an-ai-agent-didnt-know-about-my-production-database-5815</guid>
      <description>&lt;h3&gt;
  
  
  And why "just write a better prompt" is the wrong lesson
&lt;/h3&gt;

&lt;p&gt;The AI agent had just pulled 30 days of CloudWatch metrics, parsed them correctly, built a table, and pivoted the entire recommendation based on what it found.&lt;br&gt;
I typed four words: "that's shared cluster data."&lt;br&gt;
The deadlock counts, the connection peaks, the daily patterns — all real numbers from a real database cluster. All useless for the decision we were making. Our application shares that cluster with other services. The agent had processed the data flawlessly and drawn a conclusion from someone else's workload.&lt;br&gt;
The data was flawless. The conclusion was someone else's. That gap has a structure worth looking at.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;Multi-tenant Django application on Aurora MySQL. Low user count, background workers for bulk operations, primary-replica topology. No transaction isolation level configured anywhere — MySQL's default &lt;code&gt;REPEATABLE READ&lt;/code&gt; running unchallenged. Sporadic deadlocks in background tasks, stale reads on the replica.&lt;br&gt;
I pointed an AI agent at the codebase and asked it to devise a plan for choosing the right isolation level. Not "switch to READ COMMITTED" — &lt;em&gt;decide what it should be, with evidence.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What the agent did well
&lt;/h2&gt;

&lt;p&gt;Within minutes it had explored the entire codebase, mapped six categories of concurrent access patterns, and evaluated all four MySQL isolation levels against each. Found that no isolation level was configured anywhere — not in Django settings, not in middleware, not in &lt;code&gt;init_command&lt;/code&gt;. Identified the specific background tasks doing bulk write cycles, the views with missing transaction boundaries, the delete-then-recreate services structurally vulnerable to race conditions.&lt;br&gt;
This would have taken me a full day of methodical grep-and-read. The agent produced a concurrency map that didn't exist before the session started.&lt;br&gt;
Then it started getting things wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three corrections, same shape
&lt;/h2&gt;

&lt;p&gt;Over the course of the investigation, I interrupted the agent five times. Three of those reveal the same structural pattern. The other two are different in kind — I'll get to those.&lt;br&gt;
&lt;strong&gt;"That's shared cluster data."&lt;/strong&gt; CloudWatch showed hundreds of peak connections and sporadic deadlocks across 30 days. The agent built a careful analysis from these numbers, but those numbers belonged to the Aurora cluster, not to our application — low user count, handful of background workers, a fraction of those connections. The deadlocks could be entirely from other tenants on the same cluster.&lt;br&gt;
Could a better prompt have prevented this? Sure. "The Aurora cluster is shared; CloudWatch metrics are cluster-wide." I knew that before the session. It's promptable.&lt;br&gt;
But I &lt;em&gt;expected&lt;/em&gt; the agent to catch it. It had the user count, the worker count, the CLAUDE.md. There was enough signal to at least question whether those connection peaks belonged to us. It didn't question anything — processed the numbers as ours and kept going.&lt;br&gt;
&lt;strong&gt;"Did you check production only, or were staging and dev mixed?"&lt;/strong&gt; The agent queried SigNoz for database metrics — millions of reads, tens of thousands of writes, clean latency percentiles. SigNoz had separate service entries per environment, and the agent hadn't verified which ones it was aggregating. The CLAUDE.md, the project docs — the namespace separation was right there.&lt;br&gt;
&lt;strong&gt;"What about the slow queue workers?"&lt;/strong&gt; The agent pulled error data for the main API service and the primary background worker. Missed the slow-queue workers entirely — the ones handling the heaviest bulk operations, the exact workloads where deadlocks would actually manifest. I know which queues carry what because I designed the routing. The agent queried the obvious service names and stopped.&lt;br&gt;
&lt;strong&gt;The pattern.&lt;/strong&gt; Each time, the agent had directional context: the codebase, the project setup, the CLAUDE.md scoping the investigation. The shared cluster, the environment boundaries, the queue routing were all inferrable from material it could see.&lt;br&gt;
Fair objection: "So the context was there and the agent missed it. That's a tooling problem." Not a bad objection — better agents will get better at this.&lt;br&gt;
But notice what's actually happening. The agent treats observability data as self-describing, and observability data inherits the topology of the infrastructure that produces it — structure that doesn't announce itself in the query results. CloudWatch doesn't label which connections belong to which tenant. SigNoz doesn't flag cross-environment aggregation. The data just arrives, looking authoritative.&lt;br&gt;
The total context space includes cluster layout, environment naming conventions, service routing, monitoring configuration, team history with past incidents. The hard part isn't supplying that context. It's knowing which piece applies at the exact moment of inference — and that's a judgment problem wearing a context mask.&lt;br&gt;
Better agent protocols will close some of this gap: verification steps before ingesting observability data, environment boundary checks, worker enumeration. Adversarial loops and self-correction prompts — "before drawing conclusions from this data, verify its scope" — would probably have caught all three. But someone has to write that checklist, and it's the engineer who already knows where the agent will drift. You're pre-encoding judgment into process. The human steering doesn't disappear; it moves earlier in the pipeline.&lt;br&gt;
That objection only holds for these three. The next two resist it entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two different species
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Premature convergence.&lt;/strong&gt; The agent's first plan evaluated &lt;code&gt;REPEATABLE READ&lt;/code&gt; versus &lt;code&gt;READ COMMITTED&lt;/code&gt; and recommended switching. MySQL has four isolation levels. The plan hadn't touched &lt;code&gt;READ UNCOMMITTED&lt;/code&gt; or &lt;code&gt;SERIALIZABLE&lt;/code&gt;, hadn't considered per-connection versus per-transaction strategies, hadn't looked at Aurora-specific features. It optimized before it explored. Not all of those gaps matter equally — per-transaction strategy is the one that actually bites — but the point is the agent never even mapped the decision space before narrowing it. Probably because the REPEATABLE READ vs. READ COMMITTED binary dominates MySQL docs and Stack Overflow, so the training data funnels toward it. Strong early evidence triggers convergence, and decision-space awareness is a judgment skill, not an instruction-following one.&lt;br&gt;
&lt;strong&gt;Adversarial peer review.&lt;/strong&gt; After the agent recommended &lt;code&gt;READ COMMITTED&lt;/code&gt;, I dropped in a document I'd prepared separately: a fact-check showing this is a defensible but contested position. Dimitri Kravtchuk, Oracle's MySQL performance architect, has shown that &lt;code&gt;READ COMMITTED&lt;/code&gt; creates per-statement ReadView overhead causing &lt;code&gt;trx_sys&lt;/code&gt; mutex contention at scale — up to 2x performance degradation for short transactions. No prompt can pre-load a rebuttal to a conclusion that hasn't been reached yet. That's the human acting as adversarial reviewer, not correcting the trajectory but stress-testing the destination.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this generalizes
&lt;/h2&gt;

&lt;p&gt;The isolation level was the vehicle. The pattern applies anywhere the codebase is an incomplete map of the system — capacity planning, migration strategies, incident response — anywhere the agent can analyze what's visible but can't weigh what's implicit.&lt;br&gt;
Providing context and having the agent &lt;em&gt;apply&lt;/em&gt; it at the right inferential moment are different problems, and the second one requires holding the system in your head, which is exactly the kind of knowledge an agent is supposed to help you leverage, not replace.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually happened
&lt;/h2&gt;

&lt;p&gt;The agent shipped a solid architecture decision record, a team-facing report, a concurrency architecture document, half a dozen issues for the gaps it found, and a PR with the implementation. &lt;code&gt;READ COMMITTED&lt;/code&gt; via &lt;code&gt;init_command&lt;/code&gt;, with an escape hatch for the two operations that genuinely need snapshot isolation. Days of senior engineering work compressed into hours.&lt;br&gt;
But the final recommendation was correct because a human who understood the system — the infrastructure, the observability configuration, the workload routing, the external literature — kept redirecting the analysis every time it drifted.&lt;br&gt;
&lt;em&gt;Who in the room holds the topology?&lt;/em&gt;&lt;br&gt;
That's the &lt;a href="https://dev.to/devanomaly/the-mental-model-problem-of-ai-generated-code-2dle"&gt;mental model problem&lt;/a&gt; applied to infrastructure. The agent is a force multiplier — but a multiplier needs something to multiply.&lt;br&gt;
The faulty inferences — despite context that should have been more than enough — are mental-model shadows: business rules, operations structure, and infrastructure constraints are all projections of the same multidimensional system, and whether it protects you or bites you depends on who is holding the full shape in their head.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>database</category>
      <category>architecture</category>
    </item>
    <item>
      <title>GitHub Told Me I Had Merge Conflicts. Git Told Me I Didn't. They Were Both Right.</title>
      <dc:creator>Rafael Costa</dc:creator>
      <pubDate>Wed, 01 Apr 2026 14:37:36 +0000</pubDate>
      <link>https://dev.to/devanomaly/github-told-me-i-had-merge-conflicts-git-told-me-i-didnt-they-were-both-right-412m</link>
      <guid>https://dev.to/devanomaly/github-told-me-i-had-merge-conflicts-git-told-me-i-didnt-they-were-both-right-412m</guid>
      <description>&lt;p&gt;Last month I tried to merge &lt;code&gt;main&lt;/code&gt; into &lt;code&gt;stg&lt;/code&gt;. Routine sync. GitHub said: "Can't automatically merge". So I ran the same merge locally... and got a clean merge. Zero conflicts.&lt;/p&gt;

&lt;p&gt;Same branches. Same commits. Different answer. I've been writing software for years and I genuinely did not know this could happen.&lt;/p&gt;

&lt;p&gt;What followed was the kind of debugging session I recognize from physics more than from software: tracing a failure back through layers of structure until you hit the actual constraint that's doing the damage. Except the system wasn't a quantum lattice. It was git's commit graph. And the constraint wasn't an obvious one.&lt;/p&gt;

&lt;p&gt;If you've ever been surprised by a merge conflict, or wondered how git's merge-base works, or just want to understand how branch flow design can create weird graph topologies, welcome to the story of twelve merge bases, a diamond-shaped DAG, and the fix that shouldn't have worked.&lt;/p&gt;

&lt;h2&gt;
  
  
  A quick crash course (the parts that matter)
&lt;/h2&gt;

&lt;p&gt;If you already think in terms of DAGs and merge bases, skip to "Twelve Ancestors." If not, we'll cover three core concepts, and the rest of this article follows from them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concept 1: commits are snapshots with parent pointers.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A ← B ← C ← D
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each commit stores a complete snapshot of your files and a pointer back to its parent. That's it. The whole history is a chain of these. Computer scientists call the resulting structure a &lt;strong&gt;directed acyclic graph&lt;/strong&gt; - a DAG. Directed because pointers go one way. Acyclic because you can never follow them in a circle. &lt;em&gt;Every problem I'm about to describe is a property of this graph&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;A branch is just a sticky note pointing to a commit. &lt;code&gt;main&lt;/code&gt; points to D. Branches are labels, not containers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concept 2: merges need a common ancestor.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Consider what happens when you branch off and both sides get new commits. Say we're creating a feature branch from main:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       ← E ← F     ← feature
      /
A ← B ← C ← D       ← main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;B is the last commit reachable by walking back parent pointers from &lt;em&gt;both&lt;/em&gt; branch tips. Git calls this the &lt;strong&gt;merge base&lt;/strong&gt; — "common" always means "common to the two branches being merged." That definition is load-bearing for everything below.&lt;/p&gt;

&lt;p&gt;To merge, git diffs base → main and base → feature, then combines both diffs; this is a 3-way merge of two branch tips and one common ancestor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        ← E ← F ──╮
       /           M  ← main
A ← B ← C ← D ────╯
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If both diffs touch the same line differently, that's a conflict. Everything else merges automatically.&lt;/p&gt;

&lt;p&gt;Notice that M, our merge commit, has &lt;strong&gt;two parents&lt;/strong&gt;: D (main's old tip) and F (feature's tip), unlike a regular commit, which has &lt;strong&gt;a single parent&lt;/strong&gt;. That means M is a descendant of &lt;em&gt;both&lt;/em&gt; branches that were merged. This seems innocuous, but it's the single property that makes &lt;del&gt;me writing this piece possible&lt;/del&gt; everything below work. It's how ancestry flows between branches, why sequential merges stay clean, and why the diamond requires concurrency. One mechanism, three consequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concept 3: git needs the &lt;em&gt;newest&lt;/em&gt; common ancestor.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If git picks an old ancestor as the base, both diffs include changes the branches already agree on. False conflicts everywhere. The newest ancestor minimizes the diff — only what actually diverged shows up.&lt;/p&gt;

&lt;p&gt;But what happens when there &lt;em&gt;isn't&lt;/em&gt; a single newest?&lt;/p&gt;

&lt;h2&gt;
  
  
  Twelve Ancestors
&lt;/h2&gt;

&lt;p&gt;Back to my problem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git merge-base &lt;span class="nt"&gt;--all&lt;/span&gt; origin/main origin/stg | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;
12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Twelve merge bases, not one. &lt;/p&gt;

&lt;p&gt;When git finds multiple incomparable bases - none descending from any other - it can't just pick one. Its fallback: recursively merge them together into a synthetic virtual ancestor, then use that as the single base for the real merge. Not a real commit in your history, though. It's a temporary in-memory artifact. With twelve bases, that's a cascade of merges-within-merges before the intended one even starts.&lt;/p&gt;

&lt;p&gt;Local git merged it cleanly. GitHub's server-side mergeability check didn't.&lt;/p&gt;

&lt;p&gt;That was the first surprise. "GitHub runs the same merge I run locally" is close enough for everyday work, but not literally true. GitHub computes PR mergeability in the background using a test merge commit, and historically its server-side behavior diverged from local git often enough that GitHub &lt;a href="https://github.blog/engineering/infrastructure/scaling-merge-ort-across-github/" rel="noopener noreferrer"&gt;migrated merges and rebases to merge-ort in 2023&lt;/a&gt;. In our case, the important fact wasn't the exact internal path — it was that the server-side check surfaced a graph-topology problem that local git could still resolve.&lt;/p&gt;

&lt;p&gt;But twelve merge bases is not normal. Where did they come from?&lt;/p&gt;

&lt;h2&gt;
  
  
  How twelve diamonds form
&lt;/h2&gt;

&lt;p&gt;Our branch flow had recently evolved into this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;main → working-branch           (branches always fork from main)
       working-branch → stg     (QA)
       working-branch → releases → main   (deploy)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;stg&lt;/code&gt; is a dead end, a parallel validation lane with nothing flowing out of it. Clean, one-directional pipeline. Working branches are always born from main, so their fork points sit on main's history.&lt;/p&gt;

&lt;p&gt;Except two things used to happen that broke the one-directional rule.&lt;/p&gt;

&lt;p&gt;First: we merged &lt;code&gt;main&lt;/code&gt; back into &lt;code&gt;stg&lt;/code&gt; to "stay in sync." Second - and this was the invisible one - developers (not me, pff, of course, &lt;em&gt;cough cough&lt;/em&gt;) occasionally ran &lt;code&gt;git merge origin/stg&lt;/code&gt; on their working branches to grab something from staging. That branch then shipped through releases into main, carrying stg-only ancestry with it.&lt;/p&gt;

&lt;p&gt;The first puts main-only commits into stg's history. The second puts stg-only commits into main's ancestry. Bidirectional flow from intermediate branch states. When two branches each absorb the other's history like that, git calls it a &lt;strong&gt;criss-cross merge&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The minimal version
&lt;/h3&gt;

&lt;p&gt;Strip away the branch flow. Two branches, two merges, one diamond:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Start:   main at M, stg at S (diverged from ancestor A)

Person 1: git checkout main &amp;amp;&amp;amp; git merge origin/stg
          → M' (parents: M, S)

Person 2: git checkout stg &amp;amp;&amp;amp; git merge origin/main
          → S' (parents: S, M)

(Both fetched before either pushed.)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now trace the common ancestors of M' and S':&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;M&lt;/strong&gt; is reachable from M' (direct parent). Reachable from S' too (S' has M as its second parent, a cross-merge). Common ancestor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S&lt;/strong&gt; is reachable from S' (direct parent). Reachable from M' as well (M' has S as its second parent, the other cross-merge). Common ancestor.&lt;/li&gt;
&lt;li&gt;M does not descend from S. S does not descend from M.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       M
      ╱ ╲
   S'    M'
      ╲ ╱
       S
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two bases → diamond. That's it (;&lt;/p&gt;

&lt;p&gt;The key: both merges see the other branch's &lt;strong&gt;pre-merge&lt;/strong&gt; state. If Person 2 had fetched &lt;em&gt;after&lt;/em&gt; Person 1 pushed, S' would descend from M' — single base, no diamond. Concurrency is the crucial ingredient.&lt;/p&gt;

&lt;p&gt;Your first instinct might be: can't you create this with sequential direct merges? main→stg, then stg→main, then main→stg? No, because of the two-parent property from the crash course. Each merge commit descends from both tips. So when you merge stg→main, main's new tip descends from stg's current state. The next merge (main→stg) sees that result (a commit that already contains stg's history) and there's a single dominant ancestor. &lt;strong&gt;Always&lt;/strong&gt;, no matter how many times you alternate.&lt;/p&gt;
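&lt;p&gt;You don't need two people to reproduce the toy diamond. With plumbing you can fabricate the two "concurrent" cross-merges directly, giving each merge commit the &lt;em&gt;other&lt;/em&gt; branch's pre-merge tip as its second parent (a sketch; repo and branch names are hypothetical):&lt;/p&gt;

```shell
# Fabricate the criss-cross: M' merges (M, S), S' merges (S, M).
# git commit-tree lets us hand-pick parents, simulating the concurrency.
git init -q -b main criss-demo
cd criss-demo
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "A"      # common ancestor
git branch stg
git commit -q --allow-empty -m "M"      # main diverges
git checkout -q stg
git commit -q --allow-empty -m "S"      # stg diverges
M=$(git rev-parse main)
S=$(git rev-parse stg)
T=$(git rev-parse "main^{tree}")
# Each fabricated merge sees the OTHER branch's pre-merge tip:
git update-ref refs/heads/main "$(git commit-tree -p "$M" -p "$S" -m "Mp" "$T")"
git update-ref refs/heads/stg  "$(git commit-tree -p "$S" -p "$M" -m "Sp" "$T")"
git merge-base --all main stg | wc -l   # two bases: M and S both survive
```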

&lt;h3&gt;
  
  
  How our branch flow produced this concurrency
&lt;/h3&gt;

&lt;p&gt;In practice, nobody on our team was simultaneously merging in both directions. The working branch indirection turned sequential actions, spread across days or weeks, into graph-concurrent events. Here's the mechanism.&lt;/p&gt;

&lt;p&gt;Start with stg at state &lt;strong&gt;S&lt;/strong&gt; and main at state &lt;strong&gt;M&lt;/strong&gt;. Both have diverged from their last common point; that is, neither descends from the other in recent history. For all practical purposes, they sit on parallel paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The contamination.&lt;/strong&gt; A developer on branch-B runs &lt;code&gt;git merge origin/stg&lt;/code&gt; to pull something from staging. That merge commit has two parents: branch-B's old tip and S. The second parent is the door: stg's entire history is now reachable from branch-B by walking that parent pointer. Branch-B now carries &lt;strong&gt;S&lt;/strong&gt; in its ancestry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    stg:   ─────── S ──────────────
                    │
              (git merge origin/stg)
                    │
    main:  ─── M ────── branch-B ──
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The backflow.&lt;/strong&gt; Before branch-B ships, someone merges main → stg to "stay current." Same mechanism, opposite direction: the resulting merge commit BF has two parents, stg's old tip and M. Once more, the second parent paves the way to disaster: main's history is now reachable from stg.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    stg:   ─── S ──────── BF
                          ╱
                   (main → stg)
                        ╱
    main:  ─── M ────────── branch-B (still developing)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The carrier ships.&lt;/strong&gt; Branch-B goes through releases into main.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    stg:   ─── S ──────── BF ─────── ...
                │          ╱
          (via branch-B)  (backflow)
                │        ╱
    main:  ─── M ────── MX ─────── ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MX descends from both M and S (through branch-B's merge of stg). The two-parent property created the bidirectional flow, and it's about to create the diamond too.&lt;/p&gt;

&lt;p&gt;Now trace the common ancestors of stg and main:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;M&lt;/strong&gt; is reachable from stg. How? BF is a merge commit, its two parents are S and M. Walk stg's ancestry back to BF, then follow BF's second parent to M. That's the backflow's two-parent link doing the work. M is also reachable from main directly. Common ancestor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S&lt;/strong&gt; is reachable from main. How? MX descends from branch-B, and branch-B merged stg, again two parents, one of which is S. Walk main's ancestry back to MX → branch-B → S. That's the contamination's two-parent link. S is also reachable from stg directly. Common ancestor.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But M does not descend from S. And S does not descend from M. They're on parallel paths: M on main's history, S on stg's history.&lt;/p&gt;

&lt;p&gt;None of this matters until someone tries to merge stg and main. That's the triggering event — git needs a single merge base, runs &lt;code&gt;merge-base&lt;/code&gt;, and hits both S and M. Remember the filtering rule from the crash course: git only keeps the &lt;em&gt;youngest&lt;/em&gt; common ancestors, dropping any that have a younger common ancestor descending from them. But neither S nor M can filter the other out, because neither descends from the other. So both survive, leaving git stuck with two incomparable bases. It now has to fall back and recursively synthesize a virtual ancestor from S and M.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        M (reached via backflow)
       ╱  ╲
    stg    main
       ╲  ╱
        S (reached via branch-B)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two paths diverge and reconverge. Same diamond as the toy example, different wiring. The working branch captured stg's state weeks before the backflow captured main's state. Wall-clock sequential. But in the &lt;em&gt;graph&lt;/em&gt;, each cross-merge referenced the other branch's pre-merge state. The branch indirection turned sequential actions into the same concurrent topology as two people merging "at the same time".&lt;/p&gt;

&lt;p&gt;Each cycle, of course, adds more diamonds: a new backflow that references main before the latest carrier ships, or some new working branch that merged stg before the latest backflow... With six backflows over a few months, you can get twelve merge bases with none dominating the others.&lt;/p&gt;

&lt;p&gt;In physics you'd call this a degeneracy — multiple states at the same energy level, no symmetry-breaking mechanism to select one. The DAG had the same problem: twelve ancestors at the same "depth," no &lt;strong&gt;easy&lt;/strong&gt; descendancy relationship to break the tie.&lt;/p&gt;

&lt;p&gt;A natural question if your team has a similar branch flow: does merging many feature branches into both stg and main create this problem? No. If branches fork from main and merge into both sides &lt;em&gt;without pulling from stg first&lt;/em&gt;, the common ancestors are all fork points on main's linear history. Linear means each one descends from the last. Clear ordering, always one merge base.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    stg:   ── M₁ ───── M₂ ───── M₃ ───── M₄
             ╱         ╱         ╱         ╱
          feat-A    feat-B    feat-C    feat-D
           ╱         ╱         ╱         ╱
    main:  F₁ ── M₅ ── F₂ ── M₆ ── F₃ ── M₇ ── F₄ ── M₈

    Every feature forks from main and merges into both stg and main.
    Fork points are on main's history — each descends from the last.
    merge-base(stg, main) always has a single youngest. No diamonds.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The diamond requires &lt;strong&gt;two&lt;/strong&gt; ingredients: stg-only ancestry entering main (via a working branch that merged stg) &lt;strong&gt;plus&lt;/strong&gt; main-only ancestry entering stg (via a backflow). And critically, each has to happen before the other's result is visible, so they reference mutual past states instead of the merged result. No bidirectional flow, no diamonds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "just pick the newest" doesn't work
&lt;/h2&gt;

&lt;p&gt;My first instinct: why doesn't git use timestamps to pick the most recent?&lt;/p&gt;

&lt;p&gt;Because "newest" requires a total ordering, and the diamond creates commits that are &lt;em&gt;incomparable&lt;/em&gt;. M1 was created February 25. M2 was created February 28. Neither descends from the other. They sit on parallel paths connected at the top and bottom of the diamond, but not to each other.&lt;/p&gt;

&lt;p&gt;Git can only order commits along parent-child chains. Across parallel paths, there's no ordering. Asking "which is newer?" is like asking which is taller, the color blue or a Tuesday.&lt;/p&gt;

&lt;h3&gt;
  
  
  What git actually does
&lt;/h3&gt;

&lt;p&gt;It doesn't pick a winner. It &lt;em&gt;synthesizes&lt;/em&gt; one.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;merge-base&lt;/code&gt; returns S and M as incomparable youngest ancestors, git's &lt;code&gt;ort&lt;/code&gt; strategy merges S and M into a virtual commit V, and to do &lt;em&gt;that&lt;/em&gt;, it needs their common ancestor. Remember, S and M do share ancestry - the old fork point where stg was born from main, the one that was filtered out of the top-level search because S and M are younger. That older ancestor comes back one level down as the base for merging S against M.&lt;/p&gt;

&lt;p&gt;So the cascade is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Top-level merge&lt;/strong&gt; (stg into main): finds twelve youngest common ancestors, all incomparable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recursive resolution&lt;/strong&gt;: merge them pairwise. Each pairwise merge needs a base, and that base is an older ancestor that was "too old" for the top-level merge but exactly right for resolving the conflict between two of the twelve.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At that lower level&lt;/strong&gt;, the older ancestor is usually unambiguous: it predates the diamond-creating merges, so there's a single youngest. The recursive merge succeeds and produces a virtual result.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;V becomes the synthetic base&lt;/strong&gt; for the real merge.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With twelve bases, this is why the operation is expensive! It's not twelve straightforward comparisons, but a cascade of merges-within-merges, each needing its own base resolution. Local git's &lt;code&gt;ort&lt;/code&gt; handled that depth. GitHub's server-side path didn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix that shouldn't work
&lt;/h2&gt;

&lt;p&gt;Here's where it got counterintuitive. The fix for twelve merge bases is... yet another merge!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout stg
git merge origin/main
git push origin stg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait: wasn't a &lt;code&gt;main → stg&lt;/code&gt; merge the thing that &lt;em&gt;caused&lt;/em&gt; this? Shouldn't this make it worse?&lt;/p&gt;

&lt;p&gt;That was my reaction too. But look at the graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEFORE: 12 merge bases, none dominating the others

    stg ──────────── ...          main ──────────── ...
         ╲        ╱                    ╲        ╱
         MB₁    MB₂                  MB₃    MB₄  ... MB₁₂
         (none is a descendant of any other — all 12 are "latest")


AFTER: git merge origin/main

               stg (pointer moves here)
                ↓
    ── ── ──── M (new merge commit)
              ╱ ╲
    (stg's      main-tip  ← this is now THE single merge base
   old head)       │
                (descendant of all 12 old MBs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Remember: a branch is just a pointer to a commit. The merge created M and moved &lt;code&gt;stg&lt;/code&gt; to point at it. M has two parents: the commit stg used to point at, and main's tip. That makes main's tip reachable from both branches (a new common ancestor). And main's tip descends from all twelve old bases, because main's history contains all of them.&lt;/p&gt;

&lt;p&gt;Here's the mechanism. &lt;code&gt;git merge-base&lt;/code&gt; only returns the &lt;em&gt;youngest&lt;/em&gt; common ancestors of the two branches. The rule: if a common ancestor has a descendant that's &lt;em&gt;also&lt;/em&gt; a common ancestor of both branches, the older one is redundant - the younger one already contains everything the older one had - so it gets dropped.&lt;/p&gt;

&lt;p&gt;The twelve old bases survived before because none descended from any other. Same depth, parallel paths, no way to filter. But the merge created main's tip as a new common ancestor that descends from all twelve. Now every one of them has a younger common ancestor below it. All twelve filtered out. Only main's tip survives.&lt;/p&gt;

&lt;p&gt;It doesn't untangle the diamonds. It buries them.&lt;/p&gt;
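&lt;p&gt;The burial is cheap to verify end-to-end (a sketch; hypothetical throwaway repo): fabricate a two-base criss-cross with plumbing, then watch a single merge collapse the count:&lt;/p&gt;

```shell
# Build a minimal criss-cross (two incomparable merge bases), then bury it.
git init -q -b main fix-demo
cd fix-demo
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "A"
git branch stg
git commit -q --allow-empty -m "M"
git checkout -q stg
git commit -q --allow-empty -m "S"
M=$(git rev-parse main)
S=$(git rev-parse stg)
T=$(git rev-parse "main^{tree}")
git update-ref refs/heads/main "$(git commit-tree -p "$M" -p "$S" -m "Mp" "$T")"
git update-ref refs/heads/stg  "$(git commit-tree -p "$S" -p "$M" -m "Sp" "$T")"
git merge-base --all main stg | wc -l   # before: 2 bases
# The fix: one more merge makes main's tip a common ancestor
# that descends from both old bases, filtering them out.
git checkout -q stg
git merge -q --no-edit main
git merge-base --all main stg | wc -l   # after: 1 base (main's old tip)
```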

&lt;p&gt;And nothing is lost, which is where the first concept pays off: &lt;em&gt;"Commits are snapshots, not diffs"&lt;/em&gt;. Diffs are computed on the fly from whatever base git picks. Main's tip already contains all the content from the twelve old bases - it descends from all of them, so their content is baked into its snapshot. The burial doesn't discard data — it moves the comparison point forward, shrinking the diff to only the real divergence.&lt;/p&gt;

&lt;p&gt;After pushing that merge, GitHub's "Can't automatically merge" disappeared.&lt;/p&gt;

&lt;p&gt;The fix works because the criss-cross requires a &lt;em&gt;cycle&lt;/em&gt;, content flowing through both directions. A single final merge without subsequent backflows creates a new dominant ancestor and stops. No cycle, no new diamond. It's a one-time symmetry-breaking intervention: you introduce a commit that's unambiguously "more recent" than all twelve, and the degeneracy lifts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cherry-pick vs merge
&lt;/h2&gt;

&lt;p&gt;One thing clicked during this that I'd never really appreciated enough.&lt;/p&gt;

&lt;p&gt;A merge connects two histories; that's the two-parent property again. Every future merge-base calculation has to account for that connection. A cherry-pick copies a diff as a new, independent commit. No parent link. Git doesn't know the two commits are related (because, fundamentally, they aren't). The graphs stay disconnected, with no impact on ancestry and no diamonds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MERGE: main → stg (to get hotfix H)

    main:  A ── B ── H
                     │
    stg:   C ── D ── M (merge commit, H is a parent)
                    ╱
              parent link created
              → graphs are now connected
              → affects all future merge-base calculations
              → potential diamond


CHERRY-PICK: cherry-pick H onto stg

    main:  A ── B ── H

    stg:   C ── D ── H' (new commit, same diff, NO parent link)

              H and H' have identical content
              but git doesn't know they're related
              → graphs stay independent
              → no impact on ancestry
              → no diamonds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A note on received wisdom: Raymond Chen's excellent &lt;a href="https://devblogs.microsoft.com/oldnewthing/20180323-01/?p=98325" rel="noopener noreferrer"&gt;"Stop cherry-picking, start merging"&lt;/a&gt; series documents how cherry-picks between branches that &lt;em&gt;will eventually merge&lt;/em&gt; create time bombs — spurious conflicts, silent reversions, the works. But in a &lt;a href="https://devblogs.microsoft.com/oldnewthing/20180709-00/?p=99195" rel="noopener noreferrer"&gt;follow-up&lt;/a&gt;, he's explicit: "if the two branches never merge, then there's no need to get all fancy with your cherry-picking." Our stg is a dead end. Nothing flows out of it. Cherry-pick is the right tool precisely &lt;em&gt;because&lt;/em&gt; the graphs should stay disconnected.&lt;/p&gt;

&lt;p&gt;Rebase avoids merge commits, but it doesn't avoid ancestry changes. It replays your branch onto a chosen upstream. In this workflow, rebasing a working branch onto &lt;code&gt;main&lt;/code&gt; is fine — your branch stays rooted in main's history. Rebasing onto &lt;code&gt;stg&lt;/code&gt; would pull staging ancestry into the branch tip, which is exactly what we wanted to avoid. You also pay the price of rewriting history, but that's a separate tradeoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;One merge to collapse existing damage. One flow change to prevent new damage.&lt;br&gt;
A guiding principle to unite them all: stop backflowing with merges.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Working flow:     main → branch (fork)
                  branch → stg (QA, dead end)
                  branch → releases → main (deploy)

If stg needs a hotfix:  cherry-pick from main
Never:                  main → stg, stg → main, releases → stg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The deeper realization was simpler and more uncomfortable: the bidirectional merges that created this mess weren't accidents. They were our process. "Merge main into stg to stay current" was something we did &lt;em&gt;on purpose&lt;/em&gt;, routinely, because it seemed like good hygiene. The diamonds accumulated silently for months and nobody noticed... until GitHub's less-tolerant merge path surfaced what local git had been quietly papering over.&lt;/p&gt;

&lt;p&gt;GitHub wasn't wrong. It was simply less forgiving. And that turned out to be &lt;strong&gt;substantially&lt;/strong&gt; useful, since it forced us to see a graph topology problem that &lt;code&gt;ort&lt;/code&gt; had been abstracting away!&lt;/p&gt;

&lt;p&gt;When GitHub says "can't merge" and local git says clean... the question isn't who's right. Both are. They're evaluating the same graph in different contexts - local git running a direct merge in your repo, GitHub running a server-side mergeability check. Knowing &lt;em&gt;that&lt;/em&gt; is the difference between debugging for ten minutes and debugging for a day.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The cheat sheet I wish I'd had:&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;You want to...&lt;/th&gt;
&lt;th&gt;Do this&lt;/th&gt;
&lt;th&gt;Not this&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Update your branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git rebase origin/main&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;git merge origin/stg&lt;/code&gt; or &lt;code&gt;git rebase origin/stg&lt;/code&gt; (imports stg ancestry)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test a feature&lt;/td&gt;
&lt;td&gt;branch → stg&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ship a feature&lt;/td&gt;
&lt;td&gt;branch → releases → main&lt;/td&gt;
&lt;td&gt;stg → releases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Get a hotfix into stg&lt;/td&gt;
&lt;td&gt;cherry-pick from main&lt;/td&gt;
&lt;td&gt;merge main → stg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub says "can't merge"&lt;/td&gt;
&lt;td&gt;Test locally first&lt;/td&gt;
&lt;td&gt;Trust the UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check merge base health&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git merge-base --all A B | wc -l&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assume it's 1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
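&lt;p&gt;If you want the last row as a guard rather than a habit, here's a hedged sketch (the function and its warning text are mine, not a standard tool) you could drop into CI or a pre-push hook:&lt;/p&gt;

```shell
# Hedged sketch: warn when two long-lived branches accumulate
# more than one merge base. Pass branch names as arguments.
check_merge_bases() {
  n=$(git merge-base --all "$1" "$2" | wc -l)
  if [ "$n" -gt 1 ]; then
    echo "WARNING: $n merge bases between $1 and $2 (criss-cross?)"
    return 1
  fi
  echo "OK: single merge base between $1 and $2"
}
```

Called as, e.g., `check_merge_bases origin/main origin/stg`, it exits nonzero on a criss-cross so a pipeline step can fail loudly.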

</description>
      <category>devjournal</category>
      <category>git</category>
      <category>github</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>The Mental Model Problem of AI-Generated Code</title>
      <dc:creator>Rafael Costa</dc:creator>
      <pubDate>Wed, 25 Mar 2026 15:00:37 +0000</pubDate>
      <link>https://dev.to/devanomaly/the-mental-model-problem-of-ai-generated-code-2dle</link>
      <guid>https://dev.to/devanomaly/the-mental-model-problem-of-ai-generated-code-2dle</guid>
      <description>&lt;h1&gt;
  
  
  The Mental Model Problem: Why AI-Generated Code Is More Expensive Than It Looks
&lt;/h1&gt;

&lt;p&gt;In physics, you never trust a result just because the math produced it.&lt;/p&gt;

&lt;p&gt;You take the output and attack it, check limiting cases — does the equation reduce to something known when you push a parameter to zero or infinity? You plug in extreme values, look for dimensional inconsistencies, and compare it against independent derivations. The computation is merely a tool; the verification is the methodology. &lt;br&gt;
Then, if and only if you can't break the result, you can &lt;em&gt;start&lt;/em&gt; to believe it. And it's win-win because, even if you do break it, that means you learned something specific about where the original reasoning went wrong — which is, sometimes, equally or more valuable than the result itself.&lt;/p&gt;

&lt;p&gt;I trained as a physicist — years of condensed-matter theory, all the way through a PhD. Now I build and ship software products. The career changed; the verification instinct didn't. And somewhere along the way, I noticed that the discipline that's second nature in physics is almost perfectly inverted in how most developers use AI coding tools.&lt;/p&gt;

&lt;p&gt;Much of the industry is converging on one workflow: AI generates code, you review it (as a matter of fact, even the review process is often automated). And the response to the quality problems this creates is to bolt guardrails on top — better review tools, AI-on-AI review chains, automated quality gates. All of that addresses a real problem. But it's addressing it from the wrong end.&lt;/p&gt;

&lt;p&gt;There's a different workflow that I've found consistently more effective for nontrivial work, and far fewer people center it as their default:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You write the code. AI tries to break it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn't about who types the first draft, but a cognitive fact: for nontrivial work, AI is often more useful as a &lt;em&gt;critic&lt;/em&gt; than as a first author. And once you internalize that, your entire workflow changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI Often Works Better as Critic Than First Author
&lt;/h2&gt;

&lt;p&gt;When an AI model generates code, it's predicting plausible token sequences given your prompt. It doesn't have intent. Hell, it frequently doesn't even know your system's history, nor does it understand why you picked one data structure over another six months ago, or what edge case took your team a week to discover. It produces something that &lt;em&gt;looks like&lt;/em&gt; a solution. Sometimes it is one, but often it's a sophisticated guess that drifts from your constraints in ways that are expensive to find.&lt;/p&gt;

&lt;p&gt;When the same model &lt;em&gt;critiques&lt;/em&gt; code, the dynamic is fundamentally different. You've given it a concrete artifact to reason about. Now, it can trace logic paths, check boundary conditions, ask "what happens if this input is null, or negative, or enormous?" and even compare your implementation against known patterns and spot deviations. What's central is this: critique is a constrained task — the model is operating within the boundaries of something that already exists. Generation is more-or-less an unconstrained task — the model is making architectural decisions it can have no basis for.&lt;/p&gt;

&lt;p&gt;This isn't just a practical observation. There's a cognitive mechanism underneath it that explains why the difference is so large.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mental Model Problem
&lt;/h2&gt;

&lt;p&gt;In nontrivial systems, the most expensive bottleneck is usually the mental model someone holds of how the system works.&lt;/p&gt;

&lt;p&gt;When you write code yourself — even rough, incomplete, first-draft code — you're building that mental model as you go. Every decision, even the ones you make quickly, leaves a trace in your understanding. You know &lt;em&gt;why&lt;/em&gt; the function is structured this way. You know which constraints you're encoding and which you're deferring. You know where you cut corners and where you were careful.&lt;/p&gt;

&lt;p&gt;When AI generates code and you review it, nobody holds the mental model. The AI never had one in the first place. And you're trying to &lt;em&gt;reconstruct&lt;/em&gt; that by reading the output — reverse-engineering intent from an artifact that was produced without any. This is possible for trivial code. For nontrivial systems, it's where time goes to die. Even just-a-little seasoned developers know how expensive this is. It's why code reviews are so much more draining than pair programming — in the latter, the mental model is shared in real time; in review, particularly of code you're somehow "far away from," it has to be reconstructed from the artifact alone.&lt;/p&gt;

&lt;p&gt;I think of it as the difference between &lt;em&gt;navigating a city you've walked through&lt;/em&gt; and &lt;em&gt;navigating a city from a map someone else drew&lt;/em&gt;. Both get you places. But when something unexpected happens — a road closure, a detour, a constraint that wasn't on the map — the person who walked the city knows six alternatives. The person with someone else's map is lost.&lt;/p&gt;

&lt;p&gt;This is why AI-generated code that "works" can be more dangerous than AI-generated code that breaks. Broken code surfaces the gap immediately - at least it should. Working code that you don't fully understand creates what I call &lt;strong&gt;orphaned architecture&lt;/strong&gt; — a system with no mental model owner. A couple of months later, when something downstream fails, you'll debug a design whose rationale exists only in a conversation history you've long since closed.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "You Generate" Actually Means
&lt;/h2&gt;

&lt;p&gt;I don't mean "you handwrite every line."&lt;/p&gt;

&lt;p&gt;I mean you author the first &lt;strong&gt;intent-bearing artifact&lt;/strong&gt;. This is the thing that gives the system its center of gravity before AI starts expanding it.&lt;/p&gt;

&lt;p&gt;That might be the domain model, the core function, whatever invariants must hold, or a couple of key test cases that define correct behavior. It could be the state machine, the architectural skeleton, or a short ADR explaining which tradeoff you're accepting and why.&lt;/p&gt;

&lt;p&gt;I'm not coming from a manual purity perspective. The crucial detail is that someone — you — has made the decisions that carry judgment, and those decisions exist in a form the model can now reason about. Once that exists, AI becomes dramatically more powerful, because it's critiquing &lt;em&gt;your&lt;/em&gt; structure instead of silently inventing one.&lt;/p&gt;

&lt;h2&gt;
  
  
  False Velocity and the Missing Mental Model
&lt;/h2&gt;

&lt;p&gt;The evidence is piling up, and it's converging on a pattern that should worry anyone paying attention.&lt;/p&gt;

&lt;p&gt;A CMU study accepted at &lt;a href="https://arxiv.org/abs/2511.04427" rel="noopener noreferrer"&gt;MSR '26&lt;/a&gt;, analyzing 807 Cursor-adopting repositories against matched controls, found that velocity gains were real but &lt;em&gt;transient&lt;/em&gt; — they faded within months — while code complexity increases were &lt;em&gt;persistent&lt;/em&gt;, creating a self-reinforcing debt cycle. An IEEE Spectrum &lt;a href="https://spectrum.ieee.org/ai-coding-degrades" rel="noopener noreferrer"&gt;piece from January&lt;/a&gt; documented something worse: newer models producing code that doesn't crash but silently fails to do what was intended — avoiding errors by removing safety checks or generating fake output that matches the expected format. And METR's own &lt;a href="https://metr.org/blog/2026-02-24-uplift-update/" rel="noopener noreferrer"&gt;follow-up&lt;/a&gt; revealed that they had to &lt;em&gt;redesign the study&lt;/em&gt; because developers increasingly refused to participate if it meant working without AI on half their tasks. The tool that makes you slower has become the tool you can't imagine working without.&lt;/p&gt;

&lt;p&gt;The industry's reaction is reasonable: add more review layers. AI reviewers reviewing AI-generated code, quality gates, automated scanning.&lt;/p&gt;

&lt;p&gt;But this treats a symptom, not the disease: nobody holds the mental model.&lt;/p&gt;

&lt;p&gt;When AI generates code and a human reviews it, the human is doing the most cognitively expensive possible version of review: building a mental model from scratch by reading someone else's output. There's no reasoning to reconstruct. There's no intent to discover. There's just an artifact that looks plausible, and you have to determine whether plausible is correct.&lt;/p&gt;

&lt;p&gt;When AI generates code and another AI reviews it, you may catch surface defects — style violations, common security patterns, obvious bugs. But you still haven't solved the real problem: nobody owns the reasoning that gave the system its shape. That's fine for boilerplate, but potentially fatal for code that encodes judgment.&lt;/p&gt;

&lt;h2&gt;
  
  
  "But My AI Has Full Repo Context Now"
&lt;/h2&gt;

&lt;p&gt;The obvious counterargument: tools have gotten better. Cursor indexes your repo. Claude Code reads your file tree and does the beautiful &lt;code&gt;/init&lt;/code&gt; thing. You can inject conventions via agents.md or .cursorrules. Copilot has repo-wide context. Some teams — mine included — have experimented with architectures where a large-context model ingests the entire codebase and compresses it for downstream agents. If AI can see your system, doesn't the mental model problem go away?&lt;/p&gt;

&lt;p&gt;That narrows the issue, but doesn't quite close it.&lt;/p&gt;

&lt;p&gt;Context-aware tools can see &lt;em&gt;what your code looks like&lt;/em&gt;. They can match conventions, follow existing patterns, stay stylistically consistent. That's a real upgrade over a blank-slate chat prompt, and I'm not pretending otherwise. Generated code from a context-aware tool is substantially better than what's produced by a model that's never seen your repo.&lt;/p&gt;

&lt;p&gt;But context is not intent. The tool can see that you use a specific pattern for error handling across your codebase. What it can't see is whether that pattern represents a deliberate architectural choice or legacy debt you haven't cleaned up yet. It can see your data model, but not which constraints are load-bearing and which are accidental — which fields exist because of a product decision and which because of a migration you never finished. It can see &lt;em&gt;what&lt;/em&gt; you decided. It can't see &lt;em&gt;what you considered and rejected&lt;/em&gt;, which is often the more important half of understanding a system. Context windows capture artifacts, not decision trees.&lt;/p&gt;

&lt;p&gt;And here's the part that actually strengthens the inverted workflow: context-aware AI is an even better &lt;em&gt;critic&lt;/em&gt; than it is a generator. A model that can see your full codebase, your conventions, your patterns — and then reviews &lt;em&gt;your&lt;/em&gt; new code against all of that — catches things a context-free critic never would. "This function doesn't follow the error handling pattern you use everywhere else." "This data flow is inconsistent with how the rest of the system handles state." "Your naming here deviates from the convention in these twelve other files."&lt;/p&gt;

&lt;p&gt;Context makes AI-as-critic dramatically more powerful. It makes AI-as-generator incrementally better. That asymmetry is exactly the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  When AI-First Generation Is the Right Call
&lt;/h2&gt;

&lt;p&gt;I want to be precise about the boundary, because overselling the inverted workflow would be exactly the kind of false clarity I'm arguing against.&lt;/p&gt;

&lt;p&gt;AI generation is the right default when the mental model doesn't need an owner, or the path is so well-trodden that the decisions are obvious. Config files, project scaffolding, CORS setup, CI pipeline boilerplate — nobody needs to deeply understand why the YAML looks the way it does. The code doesn't encode intent; it encodes convention. Let the model handle that.&lt;/p&gt;

&lt;p&gt;AI generation is also great as a research tool: "show me three different approaches to X" is not asking the model to build your system. It's asking it to widen your field of view before you make a decision. Same with translation ("rewrite this Python function in Go") — intent is fully specified; the generation is mechanical.&lt;/p&gt;

&lt;p&gt;The workflow flips when the code should embody your actual product decisions. Core logic, business rules, architectural boundaries. Anything where the &lt;em&gt;reason&lt;/em&gt; for a design choice is as important as the choice itself. Anything where, if someone asked you "why is it structured this way?", the answer matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Workflow
&lt;/h2&gt;

&lt;p&gt;Here's the concrete version, if you want to try it on one real feature:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Write the skeleton.&lt;/strong&gt; Not the whole feature — just the parts that carry intent. Module boundaries, data model, core function, invariants, key test cases. Don't optimize for completeness. Optimize for &lt;em&gt;decisions&lt;/em&gt;. Every line should reflect a choice you made for a reason you could articulate if pressed. There's no problem with using AI to refine this, brainstorm alternatives, or even generate a thoroughly guided first draft — as long as you understand that the mental model is yours, and the AI is just a tool to help you build it.&lt;/p&gt;
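&lt;p&gt;The skeleton can be tiny. Here's a hypothetical sketch — the domain (account transfers) and every name in it are illustrative, not from any real codebase — showing what "decisions, not completeness" means: a data model, one core function with explicit invariants, and a single test case that defines correct behavior.&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical domain: money transfers. Every name here is illustrative.

@dataclass(frozen=True)
class Transfer:
    source: str
    target: str
    amount_cents: int  # decision: integer cents, never floats

def apply_transfer(balances: dict[str, int], t: Transfer) -> dict[str, int]:
    """Core function. Invariants: positive amount, no overdraft,
    total balance is conserved."""
    if 0 >= t.amount_cents:
        raise ValueError("amount must be positive")
    if t.amount_cents > balances[t.source]:
        raise ValueError("insufficient funds")
    new = dict(balances)  # decision: pure function, no in-place mutation
    new[t.source] -= t.amount_cents
    new[t.target] += t.amount_cents
    return new

# The key test case: the invariant that defines correctness.
def test_conservation():
    before = {"a": 1000, "b": 0}
    after = apply_transfer(before, Transfer("a", "b", 250))
    assert sum(after.values()) == sum(before.values())
    assert after == {"a": 750, "b": 250}
```

&lt;p&gt;Maybe thirty lines, but each one is a choice you can defend — which is exactly what gives the AI critique in the next step something real to push against.&lt;/p&gt;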

&lt;p&gt;&lt;strong&gt;Have AI attack it.&lt;/strong&gt; Not "review this" — that's too passive. Ask for adversarial input: "What inputs would break this? What assumption am I making that might not hold? Write tests that target the riskiest parts of this design. Argue against my architectural choice — under what conditions is it the wrong call? Am I overlooking any established patterns that would solve this more robustly?" The goal is to find the holes in your design, not just surface-level defects.&lt;/p&gt;
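&lt;p&gt;Concretely, a good adversarial pass produces tests aimed at your &lt;em&gt;assumptions&lt;/em&gt;, not the happy path. A hypothetical example (the rate limiter and its inputs are invented for illustration): the design quietly assumes event timestamps arrive sorted and never sit in the future, so the attack targets exactly those assumptions.&lt;/p&gt;

```python
def within_rate_limit(timestamps, now, limit=5, window=60.0):
    """Allow a request if fewer than `limit` events fall in the
    last `window` seconds. Deliberately avoids assuming the
    history is sorted or free of future timestamps."""
    recent = [t for t in timestamps if window >= now - t >= 0]
    return limit > len(recent)

# Adversarial inputs a critic should propose:
assert within_rate_limit([], now=100.0)                          # empty history
assert within_rate_limit([99.0, 10.0, 95.0], now=100.0)          # unsorted history
assert not within_rate_limit([96, 97, 98, 99, 99.5], now=100.0)  # exactly at the limit
assert within_rate_limit([150.0], now=100.0)                     # future event ignored
```

&lt;p&gt;Each failing case here maps to a hidden assumption — and because you designed the thing, you know immediately whether the assumption was deliberate or an oversight.&lt;/p&gt;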

&lt;p&gt;&lt;strong&gt;Fix what the critique reveals.&lt;/strong&gt; Because you designed the system, you'll know exactly where each fix goes. No reverse-engineering required. This is where the convergence advantage is most obvious.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Then let AI expand.&lt;/strong&gt; Once the core is solid and yours, hand AI the periphery: documentation, error messages, logging, additional test cases, boilerplate around the edges. This code is easy to verify because you have a clear architectural spine to compare it against.&lt;/p&gt;

&lt;p&gt;The first time you try this, it'll feel slower. You'll miss the rush of watching code appear.&lt;br&gt;
Give it one full feature cycle, though. &lt;br&gt;
Then compare not just time, but &lt;em&gt;confidence in what you shipped&lt;/em&gt; and &lt;em&gt;speed of the next change in that module&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Discipline
&lt;/h2&gt;

&lt;p&gt;Every conversation about AI coding eventually arrives at the same question: how much can AI do?&lt;/p&gt;

&lt;p&gt;I think the more useful question is: where is the mental model, and who owns it?&lt;/p&gt;

&lt;p&gt;For boilerplate, nobody needs to own the mental model. Let AI generate. For the core of your system — the logic that encodes why your product exists — the mental model is the most valuable artifact you produce. More valuable than the code itself, because code can be rewritten but understanding can't be downloaded that easily.&lt;/p&gt;

&lt;p&gt;Physics taught me this before software did. You don't trust a result because the computation produced it. You trust it because you attacked it and it survived. The computation is cheap. The verification is where understanding lives.&lt;/p&gt;

&lt;p&gt;The question is not how much code AI can write. The question is whether your workflow preserves a human owner of the system's mental model.&lt;/p&gt;

&lt;p&gt;Write the structure, let AI break it, and even use it to explore alternatives or cut corners... but understand the core of what you deliver. &lt;br&gt;
The mental model is the most expensive thing in your system. Don't let it become an orphan.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>software</category>
      <category>productivity</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
