<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: WaveAssist</title>
    <description>The latest articles on DEV Community by WaveAssist (@waveassist).</description>
    <link>https://dev.to/waveassist</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3552599%2F1df0f745-587c-4e57-9f64-879ffef8edf7.png</url>
      <title>DEV Community: WaveAssist</title>
      <link>https://dev.to/waveassist</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/waveassist"/>
    <language>en</language>
    <item>
      <title>Coding Didn't Die. Prompting Became Coding.</title>
      <dc:creator>WaveAssist</dc:creator>
      <pubDate>Sat, 02 May 2026 09:26:05 +0000</pubDate>
      <link>https://dev.to/waveassist/coding-didnt-die-prompting-became-coding-2jcm</link>
      <guid>https://dev.to/waveassist/coding-didnt-die-prompting-became-coding-2jcm</guid>
<description>&lt;p&gt;The story everyone's telling is that AI killed the coder. That "anyone can build software now" and that the four-decade run of programming as a craft is over.&lt;/p&gt;

&lt;p&gt;The story is backwards.&lt;/p&gt;

&lt;p&gt;What actually happened is the opposite. The act of &lt;strong&gt;making AI do something useful has quietly turned into coding&lt;/strong&gt;. The skills didn't die. They migrated.&lt;/p&gt;




&lt;h2&gt;The Fake Dichotomy&lt;/h2&gt;

&lt;p&gt;The pitch was: people who don't code will leapfrog those who do, because you can "just ask the AI."&lt;/p&gt;

&lt;p&gt;What happened in practice: those people hit a wall the moment they need the same output &lt;strong&gt;twice&lt;/strong&gt;. The coders didn't.&lt;/p&gt;

&lt;p&gt;Why? Because every instinct a coder has is exactly what it takes to make an LLM produce something reliable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Decompose the problem.&lt;/li&gt;
&lt;li&gt;Name the variables.&lt;/li&gt;
&lt;li&gt;Version the change.&lt;/li&gt;
&lt;li&gt;Test the edge case.&lt;/li&gt;
&lt;li&gt;Refactor when it gets ugly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prompting without those instincts is vibes. Prompting with them is programming.&lt;/p&gt;




&lt;h2&gt;Structure Is the Coder's Native Language, and LLMs Reward It&lt;/h2&gt;

&lt;p&gt;The prompts that work aren't the eloquent ones. They're the ones with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explicit inputs&lt;/li&gt;
&lt;li&gt;Explicit outputs&lt;/li&gt;
&lt;li&gt;Explicit constraints&lt;/li&gt;
&lt;li&gt;Explicit failure modes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Schemas beat paragraphs. Step-by-step beats "think carefully." Typed JSON beats "return the answer." Contracts beat vibes.&lt;/p&gt;

&lt;p&gt;Every one of those is a habit a coder already has. The people who write the best prompts in 2026 are the ones who've been writing function signatures for a decade.&lt;/p&gt;
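&lt;p&gt;As an illustration of that contract mindset, here's a minimal sketch of a schema-first prompt, written the way you'd write a function signature. The task and field names are invented for the example, not any real API:&lt;/p&gt;

```python
import json

# A schema-first prompt: explicit input, explicit output shape,
# explicit failure mode. All names here are illustrative.
OUTPUT_SCHEMA = {
    "sentiment": "one of: positive, neutral, negative",
    "summary": "string, max 2 sentences",
    "needs_escalation": "boolean",
}

def build_prompt(ticket_text):
    # The prompt is a contract, not an essay.
    return (
        "You are one triage step in a pipeline.\n"
        "Input ticket:\n" + ticket_text + "\n\n"
        "Return ONLY a JSON object with exactly these keys:\n"
        + json.dumps(OUTPUT_SCHEMA, indent=2) + "\n"
        'If the ticket is unreadable, return {"sentiment": "neutral", '
        '"summary": "", "needs_escalation": false}.'
    )

print(build_prompt("App crashes on login since v2.3"))
```

&lt;p&gt;Nothing eloquent about it. It's a function signature wearing a prompt's clothes.&lt;/p&gt;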




&lt;h2&gt;Vibe Coding Doesn't Scale, and Coders Already Know Why&lt;/h2&gt;

&lt;p&gt;Prompts that "just work" on day 1 break on day 10. The model gets updated. An input gets weird. A downstream tool changes format. Suddenly the magic stops.&lt;/p&gt;

&lt;p&gt;Coders have lived this exact cycle with untyped code, global variables, flaky tests, and no version control. They know the fix: &lt;strong&gt;structure, types, tests, version pinning, observability.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now they're applying it to prompts. And the prompts that survive contact with production are the ones that got the coder treatment.&lt;/p&gt;

&lt;p&gt;Forty years of software engineering lessons didn't expire when the LLM showed up. They got a new domain to conquer.&lt;/p&gt;




&lt;h2&gt;The Industry's Own Framing Confirms It&lt;/h2&gt;

&lt;p&gt;Listen to how the people actually doing this work are talking about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Andrej Karpathy.&lt;/strong&gt; His term for the new workflow isn't "prompting." It's &lt;strong&gt;agentic engineering&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"You are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight… easily the biggest change to my basic coding workflow in 2 decades."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Thomas Ptacek (fly.io, June 2025):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;LLMs "devour schlep, and clear a path to the important stuff, where your judgement and values really matter."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Kent Beck, 52 years in:&lt;/strong&gt; augmented coding&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"changes programming but doesn't eliminate it. Developers make more consequential decisions per hour."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;None of them say the skill is obsolete. They say the skill &lt;strong&gt;compounded&lt;/strong&gt;. The boilerplate got cheaper. The judgment got more valuable.&lt;/p&gt;




&lt;h2&gt;The Real Inversion&lt;/h2&gt;

&lt;p&gt;The people most threatened by "AI writes the code now" aren't senior engineers.&lt;/p&gt;

&lt;p&gt;They're the people who thought &lt;strong&gt;prompting was a shortcut around engineering discipline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Those people are finding out that getting an AI to work reliably requires exactly what getting anything to work reliably has always required: decomposition, contracts, tests, iteration, structure.&lt;/p&gt;

&lt;p&gt;The moat didn't disappear. It moved one level up.&lt;/p&gt;




&lt;h2&gt;WaveAssist's Bet on This&lt;/h2&gt;

&lt;p&gt;This is the architecture we bet on.&lt;/p&gt;

&lt;p&gt;We don't ship you a prompt box and wish you luck. We also don't compile your intent down into a wall of generated code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/"&gt;WaveAssist&lt;/a&gt; &lt;strong&gt;embeds AI calls inside deterministic code&lt;/strong&gt;. The orchestration is a typed pipeline you can test, version, and observe. The intelligence is the LLM call sitting at exactly the right step, with structured JSON going in and structured JSON coming out.&lt;/p&gt;

&lt;p&gt;Still AI. Still intelligent. Just not vibes.&lt;/p&gt;
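&lt;p&gt;A minimal sketch of that shape, with a stubbed model call standing in for a real client (none of these names are WaveAssist's actual API):&lt;/p&gt;

```python
import json

# "AI call inside deterministic code": the orchestration is plain code,
# the model is called at exactly one scoped step. call_llm is a stub.
def call_llm(prompt):
    # A real version would hit a chat-completion endpoint.
    return json.dumps({"summary": "Fixes token refresh on expiry", "risk": "low"})

def review_step(diff_text):
    # Structured JSON goes in, structured JSON comes out,
    # or the step fails loudly instead of drifting silently.
    prompt = "Summarize this diff. Return JSON with keys summary and risk:\n" + diff_text
    data = json.loads(call_llm(prompt))
    assert set(data) == {"summary", "risk"}, "contract violation"
    return data

print(review_step("auth.py: refresh token before expiry check"))
```

&lt;p&gt;The pipeline is testable and versionable because everything around the model call is ordinary code.&lt;/p&gt;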

&lt;p&gt;These are the disciplines that make software work. They worked before LLMs. They work with LLMs. They will work when LLMs are ten times smarter.&lt;/p&gt;

&lt;p&gt;Coders didn't lose. The rest of the world just signed up for their job.&lt;/p&gt;




&lt;h2&gt;The Bottom Line&lt;/h2&gt;

&lt;p&gt;Coding was never about typing curly braces.&lt;/p&gt;

&lt;p&gt;It was about thinking in &lt;strong&gt;structure&lt;/strong&gt;. About turning a fuzzy intent into something that runs the same way twice.&lt;/p&gt;

&lt;p&gt;That skill just became the single most valuable skill in the AI era.&lt;/p&gt;

&lt;p&gt;The craft isn't dying. It's eating prompting whole.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>You Can't Nudge a Painting: The Two Shapes of AI Output</title>
      <dc:creator>WaveAssist</dc:creator>
      <pubDate>Sat, 02 May 2026 09:23:56 +0000</pubDate>
      <link>https://dev.to/waveassist/you-cant-nudge-a-painting-the-two-shapes-of-ai-output-1pj9</link>
      <guid>https://dev.to/waveassist/you-cant-nudge-a-painting-the-two-shapes-of-ai-output-1pj9</guid>
      <description>&lt;p&gt;There are two shapes of AI output: &lt;strong&gt;paintings&lt;/strong&gt; and &lt;strong&gt;blueprints&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Paintings are finished. To change one, you regenerate it.&lt;/p&gt;

&lt;p&gt;Blueprints are structured. To change one, you edit a part.&lt;/p&gt;

&lt;p&gt;Both shapes are useful. They just do different jobs.&lt;/p&gt;




&lt;h2&gt;The Painting Analogy&lt;/h2&gt;

&lt;p&gt;Commission a painter. Get a painting.&lt;/p&gt;

&lt;p&gt;Now ask the painter for the same painting, but with the tree slightly to the left.&lt;/p&gt;

&lt;p&gt;What you get back is a &lt;strong&gt;different painting&lt;/strong&gt;. The sky is a different blue. The horizon shifted. The brushstrokes don't match. You didn't edit the first painting. You commissioned a new one and hoped the artist remembered.&lt;/p&gt;

&lt;p&gt;You can't nudge a painting.&lt;/p&gt;

&lt;p&gt;That's one shape AI output can take, and a lot of valuable AI output has it.&lt;/p&gt;




&lt;h2&gt;The Other Shape: Blueprints&lt;/h2&gt;

&lt;p&gt;A blueprint is structured. It's measured, labeled, and broken into parts you can change one at a time.&lt;/p&gt;

&lt;p&gt;Code is a blueprint. HTML is a blueprint. A Figma file is a blueprint. A typed function with input and output schemas is a blueprint. A spreadsheet with formulas is a blueprint.&lt;/p&gt;

&lt;p&gt;You can move one wall on a blueprint without redrawing the house. That's the property software needs to be useful.&lt;/p&gt;
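&lt;p&gt;A toy sketch of the property in code (the slide format and render function are invented for the example):&lt;/p&gt;

```python
# Blueprint shape: structured data you can edit one part at a time.
deck = [
    {"title": "Q3 Results", "body": "Revenue up 12%"},
    {"title": "Roadmap", "body": "Ship v2 in October"},
]

def render(slides):
    # Deterministic rendering: same data in, same output out.
    return "\n\n".join(s["title"].upper() + "\n" + s["body"] for s in slides)

before = render(deck)
deck[1]["body"] = "Ship v2 in November"  # nudge one part
after = render(deck)

# Only the edited slide changed; the first slide is byte-identical.
assert before.split("\n\n")[0] == after.split("\n\n")[0]
```

&lt;p&gt;Try that with a generated image and you get a new painting instead.&lt;/p&gt;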




&lt;h2&gt;When You Want a Painting&lt;/h2&gt;

&lt;p&gt;Painting shape is the right answer when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The output is consumed once and never iterated on.&lt;/li&gt;
&lt;li&gt;Uniqueness is the value: a unique image, a fresh draft, a creative spark.&lt;/li&gt;
&lt;li&gt;The cost of regenerating is low and the cost of building structure is high.&lt;/li&gt;
&lt;li&gt;You're using AI as a starting point, not a system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Generate an image of a person walking in the mountains." Image generators ship paintings on purpose. You either like it or you regenerate. That's the job.&lt;/li&gt;
&lt;li&gt;"Draft three taglines for the launch." You're picking, not editing.&lt;/li&gt;
&lt;li&gt;"Write a poem about my dog." Uniqueness &lt;strong&gt;is&lt;/strong&gt; the deliverable.&lt;/li&gt;
&lt;li&gt;"One off Python script to clean this CSV." Throwaway code, not a maintained artifact.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's nothing wrong with paintings. A huge amount of valuable creative work is one shot. Most photographs are paintings. Most logo concepts, brand directions, and moodboards start as paintings.&lt;/p&gt;

&lt;p&gt;If you tried to make every AI output editable, you'd be building infrastructure for things that don't need it.&lt;/p&gt;




&lt;h2&gt;When You Want a Blueprint&lt;/h2&gt;

&lt;p&gt;Blueprint shape is the right answer when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You'll iterate on it (most product work).&lt;/li&gt;
&lt;li&gt;It composes with other systems: pipelines, builds, deployments.&lt;/li&gt;
&lt;li&gt;It needs to be auditable, versionable, diffable.&lt;/li&gt;
&lt;li&gt;It runs many times and needs to behave predictably.&lt;/li&gt;
&lt;li&gt;Multiple people collaborate on it over time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what &lt;strong&gt;editable&lt;/strong&gt; means. It's the quiet reason software works at all.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When Figma ships a design, you can move a box. The rest stays.&lt;/li&gt;
&lt;li&gt;When a spreadsheet ships a number, you can change a cell and dependents recalculate deterministically.&lt;/li&gt;
&lt;li&gt;When an engineer ships code, you can change a function and the rest keeps working.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Painting shape AI fails this. You can't reliably move a box on a generated image. You can't regenerate a freeform email to change one sentence and keep the rest untouched. That's not a flaw of paintings. It's the wrong shape for jobs that need editability.&lt;/p&gt;




&lt;h2&gt;Claude Design Made the Bet Public&lt;/h2&gt;

&lt;p&gt;You can date the moment the blueprint side became unmissable. &lt;strong&gt;April 17, 2026.&lt;/strong&gt; Anthropic launched &lt;strong&gt;Claude Design&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It looked like a slide deck tool. It wasn't. It was a public statement of an architectural bet. Claude Design's trick wasn't a bigger model. It was a different artifact: the LLM writes &lt;strong&gt;HTML, CSS, and JavaScript&lt;/strong&gt;, not images. The output renders into slides, documents, and interfaces, but what you're shipping is &lt;strong&gt;code&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Want to change a word? Change a string. Re-render. Everything else stays identical. Same logo. Same fonts. Same colors. Editable, reviewable, diffable, version-controllable. Because it &lt;strong&gt;is&lt;/strong&gt; code.&lt;/p&gt;

&lt;p&gt;That's blueprint shape, applied to a job the whole industry had quietly assumed was a painting job.&lt;/p&gt;




&lt;h2&gt;The Industry Ships Both&lt;/h2&gt;

&lt;p&gt;Claude Design didn't invent the pattern. It named it. The blueprint side has been stacking wins for a while:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code, Cursor, GitHub Copilot&lt;/strong&gt; edit source code in place. Typed, linted, diffable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel's v0&lt;/strong&gt; turns text or screenshots into React + Tailwind components, not pixels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tldraw's "Make Real"&lt;/strong&gt; turns a canvas sketch into HTML + CSS + JS you can iterate on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic's Skills&lt;/strong&gt; (December 2025) codify whole capabilities as folders of instructions and scripts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The painting side is alive and well in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DALL-E, Imagen, Midjourney&lt;/strong&gt; for images.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Freeform Claude / GPT / Gemini&lt;/strong&gt; for prose, ideation, exploration.&lt;/li&gt;
&lt;li&gt;Most "give me a quick draft" interactions across every chat product.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Frontier labs aren't picking sides. They're shipping both shapes on purpose. The bet got sharper. &lt;strong&gt;When the work needs editing, ship a blueprint. When the work needs uniqueness, ship a painting.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;Now Apply the Lens to AI Agents&lt;/h2&gt;

&lt;p&gt;The same choice exists for agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Painting shape agent.&lt;/strong&gt; A chatbot that "helps you review PRs." Each response is unique. The output looks finished but can't be diffed against last week's, can't be audited, can't be composed into a larger system. Fine for a one-off question. Wrong shape for production review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blueprint shape agent.&lt;/strong&gt; &lt;a href="https://dev.to/assistants/gitzoid"&gt;GitZoid&lt;/a&gt;, running on every PR webhook with a deterministic schema. Editable. Composable. Predictable. You can open the pipeline, change a prompt, re-run, and get a predictable delta. You can point it at a new repo. You can fork it.&lt;/p&gt;

&lt;p&gt;Same intelligence. Different shape.&lt;/p&gt;

&lt;p&gt;The choice depends on what you're using the agent for. If you want a one-off creative second opinion on a tricky PR, painting shape is fine. If you want a reliable review on every PR for the next year, you need blueprint shape.&lt;/p&gt;

&lt;p&gt;This is what &lt;a href="https://dev.to/"&gt;WaveAssist&lt;/a&gt; is built for. Agents like GitZoid, GitDigest, WavePredict, WaveContent, and the rest are blueprint shaped on purpose. Not because painting shape is bad. Because the work we ship is the kind that needs blueprint shape.&lt;/p&gt;




&lt;h2&gt;The Bottom Line&lt;/h2&gt;

&lt;p&gt;The difference between AI that feels magical and AI that's load-bearing isn't model quality. It's whether the artifact is a painting or a blueprint, and which one the job actually needed.&lt;/p&gt;

&lt;p&gt;Use paintings where uniqueness is the point.&lt;br&gt;
Build blueprints where structure is the point.&lt;br&gt;
Don't confuse the two.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Deterministic vs Agentic: The Quiet Architectural Bet Every AI Agent Company Is Making</title>
      <dc:creator>WaveAssist</dc:creator>
      <pubDate>Sat, 02 May 2026 09:21:13 +0000</pubDate>
      <link>https://dev.to/waveassist/deterministic-vs-agentic-the-quiet-architectural-bet-every-ai-agent-company-is-making-33p</link>
      <guid>https://dev.to/waveassist/deterministic-vs-agentic-the-quiet-architectural-bet-every-ai-agent-company-is-making-33p</guid>
      <description>&lt;p&gt;Every "AI agent" product on the market is making one of two architectural bets, and the founders usually can't articulate which. The bet decides whether your agent costs &lt;strong&gt;cents or dollars per run&lt;/strong&gt;, whether it works the same way twice, and whether it will still be running a year from now.&lt;/p&gt;

&lt;p&gt;It's worth naming.&lt;/p&gt;




&lt;h2&gt;The Two Camps&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fat harness, agentic.&lt;/strong&gt; The LLM decides every step at runtime. Every run is a fresh plan. Every plan costs tokens. Every step is an opportunity for the model to go somewhere new.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code, Cursor, Copilot agents (open-ended coding work)&lt;/li&gt;
&lt;li&gt;LangGraph (reasoning over a graph)&lt;/li&gt;
&lt;li&gt;CrewAI (agents organized by role, +280% adoption in 2025)&lt;/li&gt;
&lt;li&gt;AutoGPT (autonomous loops)&lt;/li&gt;
&lt;li&gt;OpenAI's AgentKit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Thin harness, deterministic.&lt;/strong&gt; The LLM designs the pipeline once, at build time. Then &lt;strong&gt;code runs forever&lt;/strong&gt;. The model gets called only for the specific steps that actually need judgment. The trigger is deterministic. The data sources are scoped. The output goes somewhere predefined.&lt;/p&gt;

&lt;p&gt;Examples (and this list is nowhere near complete):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HubSpot Breeze&lt;/strong&gt; (Customer Agent on ticket creation, Prospecting Agent on deal stage, Data Agent in workflows)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack&lt;/strong&gt; (daily channel recaps, thread summaries, scheduled digests)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linear Triage Intelligence&lt;/strong&gt; (auto triage, auto assign, duplicate detection, label suggestions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asana Smart Fields&lt;/strong&gt; (AI generated custom field values on task creation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notion AI&lt;/strong&gt; (summarize a page, "Ask Notion" Q&amp;amp;A across the workspace, meeting note formatting)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ClickUp&lt;/strong&gt; (Project Manager Agent auto assigning tasks, Meeting Notetaker, Auto Prioritize)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zoom AI Companion&lt;/strong&gt; (post meeting summaries, action items, decisions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft Teams Intelligent Recap&lt;/strong&gt; (AI notes, chapters, speaker timelines, follow ups)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WaveAssist&lt;/strong&gt; (GitZoid, GitDigest, WavePredict, WaveContent, and the rest of the assistant catalog, each running on a defined trigger and a defined job)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Look at almost any major SaaS shipping in 2026. The "AI features" tab is, with rare exception, a list of thin harness agents. Each one has a defined trigger, a scoped input, a predictable output, and a job small enough that the model can't wander out of it.&lt;/p&gt;

&lt;p&gt;One approach puts the intelligence &lt;strong&gt;inside&lt;/strong&gt; the loop. The other puts the intelligence &lt;strong&gt;behind&lt;/strong&gt; the loop.&lt;/p&gt;

&lt;p&gt;Neither is wrong. They're built for different jobs. The fat harness list above is short because the work is hard. The thin harness list is long because the work is everywhere.&lt;/p&gt;




&lt;h2&gt;When Fat Harness Wins&lt;/h2&gt;

&lt;p&gt;Fat harness is the right answer when the work is novel, open-ended, or impossible to script in advance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coding is the canonical example.&lt;/strong&gt; Every bug is different. Every codebase has its own conventions. Every fix requires reading files, running tests, reasoning about the output, and changing course. You can't write a deterministic pipeline for "fix the bug." You need an agent that re-reads the situation at every step. That's why Claude Code, Cursor, and the Copilot agents are all fat harness, and that's why they work as well as they do.&lt;/p&gt;

&lt;p&gt;The same applies to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open-ended research and analysis ("dig into this question and tell me what you find")&lt;/li&gt;
&lt;li&gt;Investigative tasks where the next step depends on the last step's output&lt;/li&gt;
&lt;li&gt;Tasks where the human doesn't fully know what they want until they see what's possible&lt;/li&gt;
&lt;li&gt;One-off jobs where the cost of building a pipeline exceeds the cost of running the agent twice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you tried to make Claude Code thin harness, you'd be writing a pipeline for a problem you can't predict. The whole point is that the model gets to plan as it goes, react to what it finds, and re-plan when reality doesn't match.&lt;/p&gt;

&lt;p&gt;Fat harness is genuinely awesome at this kind of work. It's not a worse architecture. It's a different one.&lt;/p&gt;




&lt;h2&gt;When Thin Harness Wins&lt;/h2&gt;

&lt;p&gt;Thin harness is the right answer when the work is &lt;strong&gt;scoped, repeated, and triggered&lt;/strong&gt;. A defined job, a defined trigger, a defined output. Run it a thousand times this year.&lt;/p&gt;

&lt;p&gt;Almost every AI agent shipping inside production SaaS today is thin harness. Once you know the shape, you start seeing it everywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HubSpot Breeze (CRM).&lt;/strong&gt; When a ticket is created, the &lt;strong&gt;Customer Agent&lt;/strong&gt; runs. When a deal hits a stage, the &lt;strong&gt;Prospecting Agent&lt;/strong&gt; runs. The &lt;strong&gt;Data Agent&lt;/strong&gt; runs as a workflow step on a schedule. HubSpot's workflow engine handles the trigger, the data, and the routing. The LLM is called at the one step that needs judgment ("draft a reply to this ticket using these knowledge base articles"). Same shape, every fire. Charged per result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slack summaries.&lt;/strong&gt; Daily recaps. Channel summaries. Thread catch-ups. Each one is a fixed function: defined input (this channel, this date range), defined output (a structured summary), defined schedule (every morning). Slack's published number for time saved is over a million working hours. None of that comes from a fat harness loop replanning what to do each morning. It comes from the same pipeline running, reliably, on millions of channels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ClickUp task agents.&lt;/strong&gt; The Project Manager Agent auto assigns tasks based on owner expertise when a task is created. Meeting Notetaker turns a transcript into action items. Auto Prioritize sorts a backlog. Each agent has one job, scoped to one trigger. ClickUp's own docs draw the line clearly: "automations are deterministic and fast. AI agents are flexible but come with token costs that add up at scale." That cost arithmetic is why most production work in their stack runs deterministic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WaveAssist agents.&lt;/strong&gt; Every Monday at 9am, &lt;strong&gt;GitDigest&lt;/strong&gt; reads the week's diffs and writes five role-specific summaries. On every PR webhook, &lt;strong&gt;GitZoid&lt;/strong&gt; reads the diff and posts a review. &lt;strong&gt;WavePredict&lt;/strong&gt; runs forecasts on a schedule. &lt;strong&gt;WaveContent&lt;/strong&gt; drafts on a brief. The full catalog (these are examples, not the whole list) follows the same shape: a defined trigger, a scoped job, the same pipeline running every fire. The expensive thinking happened once, when each pipeline was designed. Every run after is structured.&lt;/p&gt;

&lt;p&gt;The pattern is the same across all four:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The trigger is deterministic (event, schedule, webhook).&lt;/li&gt;
&lt;li&gt;The data sources are scoped, not "anything the agent can find."&lt;/li&gt;
&lt;li&gt;The LLM is called at one or two specific steps where judgment is genuinely needed.&lt;/li&gt;
&lt;li&gt;The orchestration, the routing, the validation, and the side effects are code.&lt;/li&gt;
&lt;li&gt;The agent runs thousands of times per day with predictable cost and predictable shape.&lt;/li&gt;
&lt;/ul&gt;
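&lt;p&gt;The pattern above can be sketched in a few lines. The helper names and the stubbed model call are stand-ins for real integrations, not actual APIs:&lt;/p&gt;

```python
import json

# Thin harness: deterministic trigger and routing, scoped data,
# one model call where judgment is genuinely needed.
def fetch_diffs(repo):
    # Scoped input, not "anything the agent can find".
    return "auth.py: refactor session handling"

def call_llm(prompt):
    # Stubbed so the sketch runs; a real client hits a model endpoint.
    return json.dumps({"highlights": ["session handling refactor"]})

def weekly_digest(repo):
    diffs = fetch_diffs(repo)                           # 1. scoped data source
    prompt = 'Return JSON {"highlights": [...]} for these diffs:\n' + diffs
    data = json.loads(call_llm(prompt))                 # 2. the one judgment step
    return {"repo": repo, "highlights": data["highlights"]}  # 3. routing is code

# A cron job or webhook calls this the same way on every fire:
print(weekly_digest("example/repo"))
```

&lt;p&gt;Everything except one function call is ordinary, testable code.&lt;/p&gt;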




&lt;h2&gt;The Reliability Argument&lt;/h2&gt;

&lt;p&gt;Thin harness isn't a stylistic preference for repeated work. The reliability data forces it.&lt;/p&gt;

&lt;p&gt;The top coding models (GPT-5, Claude Opus 4.1) score about &lt;strong&gt;70% on SWE-bench Verified&lt;/strong&gt; and collapse to &lt;strong&gt;23% Pass@1 on SWE-Bench Pro&lt;/strong&gt; (Sept 2025, arXiv:2509.16941), the long-horizon, multi-file variant. On commercial subsets the score drops under 20%.&lt;/p&gt;

&lt;p&gt;Agentic loops on long, repeated tasks fail most of the time.&lt;/p&gt;

&lt;p&gt;You cannot run a business on 23%. You can, however, run a business on a deterministic pipeline that calls a 70% reliable model for &lt;strong&gt;one well scoped step&lt;/strong&gt;, validates the output, and retries. The surface area where the model can fail is smaller by construction.&lt;/p&gt;
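&lt;p&gt;That validate-and-retry wrapper is a few lines of deterministic code. A hedged sketch, with a stub standing in for the model client (a real one is nondeterministic, which is exactly why the loop exists):&lt;/p&gt;

```python
import json

# Shrink the failure surface: one scoped call, validated, retried,
# with a deterministic fallback. call_llm is a placeholder.
VALID_LABELS = {"bug", "feature", "question"}

def call_llm(prompt):
    return '{"label": "bug"}'  # stubbed model output

def classify(issue_text, retries=3):
    prompt = ('Label this issue. Return JSON '
              '{"label": "bug" | "feature" | "question"}:\n' + issue_text)
    for _ in range(retries):
        try:
            data = json.loads(call_llm(prompt))
            if data.get("label") in VALID_LABELS:
                return data["label"]  # schema-valid on this attempt
        except json.JSONDecodeError:
            pass  # malformed output counts as a failed attempt
    return "needs_human_review"  # deterministic fallback, never a silent failure

print(classify("App crashes on login"))
```

&lt;p&gt;The model only ever sees one small, checkable job. The pipeline absorbs the rest.&lt;/p&gt;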

&lt;p&gt;That's why scoped agents (Breeze's Customer Agent, Slack's Summarizer, GitZoid's PR review) hit reliability numbers a fat harness loop never will. They ask the model to do exactly one thing it's good at, in a context it can't wander out of.&lt;/p&gt;




&lt;h2&gt;The Skeptical Read&lt;/h2&gt;

&lt;p&gt;The people with no stock in the outcome are waving the same flag for repeated production work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simon Willison&lt;/strong&gt; (Jan 2025):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I think we are going to see a lot more froth about agents in 2025, but I expect the results will be a great disappointment to most of the people who are excited about this term."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Hamel Husain:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Be deeply skeptical of features that promise full automation without human validation… this stacking of abstractions often hides flaws behind a high score."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Neither is saying agents are useless. They're saying that betting your &lt;strong&gt;production reliability&lt;/strong&gt; on a fat harness loop is, today, a bad trade.&lt;/p&gt;




&lt;h2&gt;The Anthropic Signal: Structure Over Slop&lt;/h2&gt;

&lt;p&gt;Watch what the labs &lt;strong&gt;build&lt;/strong&gt;, not what they say.&lt;/p&gt;

&lt;p&gt;Anthropic, the lab most associated with "agents" in the public imagination, keeps quietly making the same architectural choice: &lt;strong&gt;prefer structured, editable artifacts over freeform generation&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; is fat harness, but it emits &lt;strong&gt;code&lt;/strong&gt;. Not vibes, not pseudocode. A real diff you can run, test, and revert.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Design&lt;/strong&gt; generates production-ready &lt;strong&gt;HTML, CSS, and JavaScript&lt;/strong&gt;, not images. The output is an editable artifact you can deploy to Vercel, hand to Claude Code, or open in a browser. The lab made an explicit bet that structured code beats generated pixels for design work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Skills&lt;/strong&gt; (December 2025) are folders of &lt;strong&gt;instructions, scripts, and resources&lt;/strong&gt; that agents load dynamically, shipped as a cross industry standard with Atlassian, Canva, Cloudflare, Figma, Notion, Ramp, and Sentry. Not a smarter agent. Codified, file backed building blocks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is consistent. Even when Anthropic ships fat harness, the &lt;strong&gt;artifact&lt;/strong&gt; is structured. Even when they ship a creative tool, the output is code. The bet is that production AI runs on structure, not on the model's mood.&lt;/p&gt;

&lt;p&gt;Thin harness is one expression of the same bet, applied to the agent itself: make the workflow a structured artifact, and call the LLM only where you need its judgment.&lt;/p&gt;




&lt;h2&gt;The WaveAssist Bet&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/"&gt;WaveAssist&lt;/a&gt; picked thin harness, on purpose, because the work we're built for is the work that runs every day, every Monday, every webhook, every commit.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run the intelligence once.&lt;/strong&gt; The model helps you design the pipeline at build time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run code forever.&lt;/strong&gt; The pipeline itself is compiled, versioned, and cheap to execute.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable cost.&lt;/strong&gt; You're not paying the LLM to replan every Monday at 9am.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable behavior.&lt;/strong&gt; Same inputs, same outputs. No drift because the model woke up feeling creative.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable uptime.&lt;/strong&gt; Code doesn't change its mind. Nodes run. Schedules fire. Webhooks hit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every agent we ship (GitZoid, GitDigest, WavePredict, WaveContent, SentimentRadar, PatternAnalyser, and the rest) is a &lt;strong&gt;compiled pipeline, not a runtime loop&lt;/strong&gt;. The expensive part happened once, at the start. Everything after is deterministic.&lt;/p&gt;

&lt;p&gt;We didn't pick thin because fat is bad. We picked thin because the work we're shipping is &lt;strong&gt;repeated production&lt;/strong&gt;, and that's the same architecture HubSpot, Slack, ClickUp, and Anthropic's Skills team picked for the same reason.&lt;/p&gt;




&lt;h2&gt;The Bottom Line&lt;/h2&gt;

&lt;p&gt;The agent space isn't splitting into winners and losers on model quality. It's splitting on architecture.&lt;/p&gt;

&lt;p&gt;Pick by job:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Open-ended, novel, exploratory.&lt;/strong&gt; Fat harness. Claude Code is the canonical example.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repeated, scheduled, scoped.&lt;/strong&gt; Thin harness. Almost every production agent inside SaaS today.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mistake isn't picking one camp. It's using a fat harness loop where a thin harness pipeline would do, or shipping a thin harness for a job that genuinely needs a model in the loop.&lt;/p&gt;

&lt;p&gt;Pick your bet. Then ask your vendor which one they made, and whether it matches the work you're paying them to do. If they can't answer clearly, that &lt;strong&gt;is&lt;/strong&gt; the answer.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
