<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joshua Brackin</title>
    <description>The latest articles on DEV Community by Joshua Brackin (@jbrackin).</description>
    <link>https://dev.to/jbrackin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3997280%2F60f524c7-0dc3-4ecf-8f47-054f023c16d0.png</url>
      <title>DEV Community: Joshua Brackin</title>
      <link>https://dev.to/jbrackin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jbrackin"/>
    <language>en</language>
    <item>
      <title>Coding agents are good at writing Swift. They're bad at finishing it.</title>
      <dc:creator>Joshua Brackin</dc:creator>
      <pubDate>Mon, 22 Jun 2026 16:52:48 +0000</pubDate>
      <link>https://dev.to/jbrackin/coding-agents-are-good-at-writing-swift-theyre-bad-at-finishing-it-md3</link>
      <guid>https://dev.to/jbrackin/coding-agents-are-good-at-writing-swift-theyre-bad-at-finishing-it-md3</guid>
      <description>&lt;p&gt;I've spent the last few months pointing AI coding agents at real Swift and Xcode work and watching where they come apart. Not "write me a login screen" demos. Tasks with a build, a test target, and a finish line the agent has to reach on its own.&lt;/p&gt;

&lt;p&gt;Start with the part that surprised me: the first draft is usually fine.&lt;/p&gt;

&lt;p&gt;Give a capable model a reasonable Swift task and the code it writes on the first pass is often correct, or close. The view is sensible. The types line up. If writing Swift were the bottleneck, these tools would already be done.&lt;/p&gt;

&lt;p&gt;A certain kind of post likes to claim the models can't write Swift. They can. They're good at it and getting better. So the interesting question is what happens after that first draft, in the gap between "looks finished" and "is actually right."&lt;/p&gt;

&lt;p&gt;The loud complaint about agents stuck in an endless compile-fix loop is also out of date: on a modern harness, the pure "won't compile" loop is mostly handled. Claude Code and Codex won't accept their own work while the build is red. They churn on a compile error quietly and hand you something that builds. If your agent still ships you red builds, that's a harness problem with a known fix.&lt;/p&gt;

&lt;p&gt;The failures that survive are the ones the compiler can't see. Those are the ones that cost me time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It builds fine and isn't what I asked for.&lt;/strong&gt; The most common one now. The code compiles, the tests it wrote pass, and the behavior is subtly or completely wrong against what I actually wanted. The agent has no way to check intent. Green is not the same as correct, and green is the only thing the agent knows how to chase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It compiles and races.&lt;/strong&gt; Concurrency is the sharp version of this. Swift's compiler catches a lot, but you can still get code that builds clean and has a data race that only shows up under certain timing. The agent reads the green build as success and moves on. When the failure does surface, it usually wants a small redesign rather than a one-line fix, and the redesign is exactly the move the agent won't reach for.&lt;/p&gt;
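
&lt;p&gt;Here is a minimal sketch of one way that happens. It isn't taken from a real agent run, and &lt;code&gt;UnsafeCounter&lt;/code&gt;, &lt;code&gt;LockedCounter&lt;/code&gt;, and &lt;code&gt;runDemo&lt;/code&gt; are names made up for illustration. The assumption is a class marked &lt;code&gt;@unchecked Sendable&lt;/code&gt;, which asks the compiler to skip its checks for that type, so the unsynchronized write builds without a diagnostic.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import Foundation

// Hypothetical type. @unchecked Sendable asks the compiler to trust us,
// so the unsynchronized mutation below builds without a diagnostic.
final class UnsafeCounter: @unchecked Sendable {
    var value = 0

    func increment() {
        value += 1 // read-modify-write with no synchronization: a data race
    }
}

// Same shape, with every access to the stored value guarded by a lock.
final class LockedCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var storage = 0

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return storage
    }

    func increment() {
        lock.lock()
        defer { lock.unlock() }
        storage += 1
    }
}

func runDemo() {
    let iterations = 100_000
    let unsafeCounter = UnsafeCounter()
    let lockedCounter = LockedCounter()

    // concurrentPerform may run the closure on several threads at once
    // and returns only after every iteration has finished.
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        unsafeCounter.increment()
        lockedCounter.increment()
    }

    print("locked:", lockedCounter.value) // equals iterations
    print("unsafe:", unsafeCounter.value) // can come up short, and can vary between runs
}

runDemo()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both counters look the same to the build. Only the locked one is guaranteed to land on the full count. Running under Thread Sanitizer is one way to surface the difference, but only if something in the loop actually runs it.&lt;/p&gt;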

&lt;p&gt;&lt;strong&gt;It fixes one thing and quietly breaks another, then loops.&lt;/strong&gt; This is the one that eats the most of my afternoon. The agent lands a real fix, hits a different problem, and while chasing the second problem it undoes the first. A few turns later it's back to something it already solved. Left running long enough I've watched it oscillate between two broken states: A breaks B, the fix for B brings back A, around and around. It has no durable sense of "I tried that and it didn't work."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It writes code I wouldn't ship.&lt;/strong&gt; Compiles, runs, still the wrong shape. A pattern that fights the framework. A structure that ignores how the rest of the app is built. Fine for a throwaway. Not fine in code I have to live with.&lt;/p&gt;

&lt;p&gt;None of these are language problems. The model knows Swift. What it's missing is everything the compiler can't tell it: whether the result matches what I meant, whether it holds up at runtime, and whether it's the kind of code an experienced developer would keep.&lt;/p&gt;

&lt;p&gt;That last gap is where the real expense lives, and it's not the expense people reach for first. The talk is usually about token cost. What I actually feel is attention. An agent that writes good Swift and then needs me watching it, stepping in every few turns to keep it from circling, hasn't saved me the work. It's converted writing into supervising. Some days that's a fine trade. A lot of days it means the thing only really runs when I'm sitting next to it.&lt;/p&gt;

&lt;p&gt;I don't have this solved. What I can say is that the failure modes are consistent enough to name, which is more than I expected when I started measuring them. The progress I've made has come from changing the loop around the model, what it checks and what it remembers between turns, more than from swapping in a smarter model.&lt;/p&gt;

&lt;p&gt;So I want to know whether this matches your experience. For those of you running agents against real Apple work: where does it actually break for you now that the build mostly takes care of itself? The intent mismatches, the races that compile, or something I'm not watching for yet?&lt;/p&gt;

</description>
      <category>swift</category>
      <category>ios</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
