<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: CPDForge</title>
    <description>The latest articles on DEV Community by CPDForge (@cpdforge).</description>
    <link>https://dev.to/cpdforge</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3842717%2F5ac02ff9-3b89-4e5d-aa3d-a3dcf501db52.png</url>
      <title>DEV Community: CPDForge</title>
      <link>https://dev.to/cpdforge</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cpdforge"/>
    <language>en</language>
    <item>
      <title>Why Building Software Is Like Leading an MMO Raid</title>
      <dc:creator>CPDForge</dc:creator>
      <pubDate>Thu, 11 Jun 2026 12:38:07 +0000</pubDate>
      <link>https://dev.to/cpdforge/why-building-software-is-like-leading-an-mmo-raid-1cjk</link>
      <guid>https://dev.to/cpdforge/why-building-software-is-like-leading-an-mmo-raid-1cjk</guid>
      <description>&lt;h1&gt;
  
  
  Why Building Software Is Like Leading an MMO Raid
&lt;/h1&gt;

&lt;p&gt;A few years ago, if you'd told me that thousands of hours leading MMO raids would end up helping me build software products, I'd have laughed.&lt;/p&gt;

&lt;p&gt;Today I'm not so sure.&lt;/p&gt;

&lt;p&gt;I've spent years leading raids in games like &lt;strong&gt;Star Wars Galaxies&lt;/strong&gt; and &lt;strong&gt;Star Wars: The Old Republic&lt;/strong&gt;. What surprised me is how often the same lessons show up when building software.&lt;/p&gt;

&lt;p&gt;The technology changes.&lt;/p&gt;

&lt;p&gt;The tools change.&lt;/p&gt;

&lt;p&gt;The jargon changes.&lt;/p&gt;

&lt;p&gt;But the principles?&lt;/p&gt;

&lt;p&gt;Not so much.&lt;/p&gt;

&lt;p&gt;After building products, leading teams, surviving deployments, and spending more hours than I'd care to admit staring at architecture diagrams, I've realised that software development and MMO raid leadership have far more in common than most people think.&lt;/p&gt;




&lt;h2&gt;
  
  
  Most Wipes Are Self-Inflicted
&lt;/h2&gt;

&lt;p&gt;One of the first lessons every raid leader learns is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Most wipes are self-inflicted.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not every wipe.&lt;/p&gt;

&lt;p&gt;But most of them.&lt;/p&gt;

&lt;p&gt;The boss usually isn't the problem.&lt;/p&gt;

&lt;p&gt;The raid wipes because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Somebody got greedy&lt;/li&gt;
&lt;li&gt;Somebody ignored the mechanic&lt;/li&gt;
&lt;li&gt;Somebody panicked&lt;/li&gt;
&lt;li&gt;Somebody thought they could improvise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Software projects are exactly the same.&lt;/p&gt;

&lt;p&gt;Most projects don't fail because the technology was impossible.&lt;/p&gt;

&lt;p&gt;They fail because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requirements weren't clear&lt;/li&gt;
&lt;li&gt;Priorities kept changing&lt;/li&gt;
&lt;li&gt;Validation was skipped&lt;/li&gt;
&lt;li&gt;Scope grew uncontrollably&lt;/li&gt;
&lt;li&gt;Assumptions went unchallenged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The technology rarely kills the project.&lt;/p&gt;

&lt;p&gt;The team usually does.&lt;/p&gt;




&lt;h2&gt;
  
  
  Don't Pull Extra Mobs
&lt;/h2&gt;

&lt;p&gt;This one has become a genuine software development philosophy for me.&lt;/p&gt;

&lt;p&gt;Every MMO player knows this moment.&lt;/p&gt;

&lt;p&gt;The group is making steady progress.&lt;/p&gt;

&lt;p&gt;Everything is under control.&lt;/p&gt;

&lt;p&gt;Then somebody says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"While we're here..."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Five minutes later you're fighting three extra packs, the healer is out of resources, and half the raid is dead.&lt;/p&gt;

&lt;p&gt;Software development has exactly the same trap.&lt;/p&gt;

&lt;p&gt;You're implementing a reporting feature.&lt;/p&gt;

&lt;p&gt;Someone says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"While we're here, we could also..."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then suddenly you're discussing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New dashboards&lt;/li&gt;
&lt;li&gt;New permissions&lt;/li&gt;
&lt;li&gt;Notifications&lt;/li&gt;
&lt;li&gt;Analytics&lt;/li&gt;
&lt;li&gt;Exports&lt;/li&gt;
&lt;li&gt;AI summaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Congratulations.&lt;/p&gt;

&lt;p&gt;You just pulled extra mobs.&lt;/p&gt;

&lt;p&gt;One of the most useful questions I've learned to ask is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Does this solve the problem we're trying to solve right now?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the answer is no, it probably belongs in the backlog.&lt;/p&gt;

&lt;p&gt;Finish the current fight first.&lt;/p&gt;




&lt;h2&gt;
  
  
  Do The Mechanic
&lt;/h2&gt;

&lt;p&gt;Every raid eventually has that player.&lt;/p&gt;

&lt;p&gt;The one topping the damage charts.&lt;/p&gt;

&lt;p&gt;The one doing incredible numbers.&lt;/p&gt;

&lt;p&gt;The one who dies first because they ignored the mechanic.&lt;/p&gt;

&lt;p&gt;The mechanic doesn't care how talented you are.&lt;/p&gt;

&lt;p&gt;It doesn't care how much damage you're doing.&lt;/p&gt;

&lt;p&gt;If you don't do the mechanic, you're dead.&lt;/p&gt;

&lt;p&gt;Software projects have mechanics too.&lt;/p&gt;

&lt;p&gt;Things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Architecture reviews&lt;/li&gt;
&lt;li&gt;Testing&lt;/li&gt;
&lt;li&gt;QA&lt;/li&gt;
&lt;li&gt;Security checks&lt;/li&gt;
&lt;li&gt;Deployment validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nobody gets excited about them.&lt;/p&gt;

&lt;p&gt;Everybody wants to write code.&lt;/p&gt;

&lt;p&gt;But the mechanic still needs to be done.&lt;/p&gt;

&lt;p&gt;The teams that survive long term are usually the ones that respect the boring parts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Don't Stand In Fire
&lt;/h2&gt;

&lt;p&gt;Another timeless lesson.&lt;/p&gt;

&lt;p&gt;There is always fire.&lt;/p&gt;

&lt;p&gt;Sometimes it's literal fire.&lt;/p&gt;

&lt;p&gt;Sometimes it's:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hardcoded secrets&lt;/li&gt;
&lt;li&gt;Direct production database edits&lt;/li&gt;
&lt;li&gt;Skipping tests&lt;/li&gt;
&lt;li&gt;Ignoring warnings&lt;/li&gt;
&lt;li&gt;Undocumented architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everybody knows it's dangerous.&lt;/p&gt;

&lt;p&gt;People still stand in it.&lt;/p&gt;

&lt;p&gt;Repeatedly.&lt;/p&gt;

&lt;p&gt;The lesson is simple:&lt;/p&gt;

&lt;p&gt;If something has already burned you three times, stop standing there.&lt;/p&gt;




&lt;h2&gt;
  
  
  The DPS Meter Lies
&lt;/h2&gt;

&lt;p&gt;One of the most underrated lessons from raid leadership.&lt;/p&gt;

&lt;p&gt;The highest DPS player is not always the most valuable player.&lt;/p&gt;

&lt;p&gt;Sometimes the most valuable person is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The healer who prevented a wipe&lt;/li&gt;
&lt;li&gt;The player handling mechanics&lt;/li&gt;
&lt;li&gt;The person explaining strategy&lt;/li&gt;
&lt;li&gt;The one spotting problems before they happen&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Software teams work the same way.&lt;/p&gt;

&lt;p&gt;The person with the most commits doesn't necessarily create the most value.&lt;/p&gt;

&lt;p&gt;The engineer who prevents a production outage may contribute more than the person who writes thousands of lines of code.&lt;/p&gt;

&lt;p&gt;The person who simplifies a system may create more value than the person who adds five new features.&lt;/p&gt;

&lt;p&gt;Not all contributions show up on the meter.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wait For Loot
&lt;/h2&gt;

&lt;p&gt;This might be my favourite lesson.&lt;/p&gt;

&lt;p&gt;The boss dies.&lt;/p&gt;

&lt;p&gt;Everyone gets excited.&lt;/p&gt;

&lt;p&gt;Then the raid leader says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Wait. Don't loot yet."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Usually because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loot rules aren't set&lt;/li&gt;
&lt;li&gt;Someone is still running back&lt;/li&gt;
&lt;li&gt;A screenshot is needed&lt;/li&gt;
&lt;li&gt;Something still needs checking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good raid leaders don't sprint to the next boss the moment the current one dies.&lt;/p&gt;

&lt;p&gt;They stop.&lt;/p&gt;

&lt;p&gt;Review.&lt;/p&gt;

&lt;p&gt;Recover.&lt;/p&gt;

&lt;p&gt;Then move on.&lt;/p&gt;

&lt;p&gt;Software projects should do exactly the same.&lt;/p&gt;

&lt;p&gt;A feature ships.&lt;/p&gt;

&lt;p&gt;Great.&lt;/p&gt;

&lt;p&gt;Now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test it&lt;/li&gt;
&lt;li&gt;Monitor it&lt;/li&gt;
&lt;li&gt;Validate it&lt;/li&gt;
&lt;li&gt;Learn from it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only then start the next thing.&lt;/p&gt;

&lt;p&gt;Too many teams treat deployment as the finish line.&lt;/p&gt;

&lt;p&gt;It's usually the start of the learning phase.&lt;/p&gt;




&lt;h2&gt;
  
  
  The DPS Hero Is Usually The First To Die
&lt;/h2&gt;

&lt;p&gt;Every raid has one.&lt;/p&gt;

&lt;p&gt;The player convinced they're the main character.&lt;/p&gt;

&lt;p&gt;The one who thinks mechanics are for everyone else.&lt;/p&gt;

&lt;p&gt;The one who says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I've got this."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thirty seconds later they're face down on the floor.&lt;/p&gt;

&lt;p&gt;Software projects have these moments too.&lt;/p&gt;

&lt;p&gt;The engineer who insists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Documentation is unnecessary&lt;/li&gt;
&lt;li&gt;Testing is optional&lt;/li&gt;
&lt;li&gt;Architecture reviews are bureaucracy&lt;/li&gt;
&lt;li&gt;Deployment checklists are for other people&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That engineer often ends up discovering why those things existed in the first place.&lt;/p&gt;

&lt;p&gt;Usually in production.&lt;/p&gt;

&lt;p&gt;Usually on a Friday.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Boss Usually Isn't The Problem
&lt;/h2&gt;

&lt;p&gt;This is probably the biggest lesson of all.&lt;/p&gt;

&lt;p&gt;Most of the time, the boss isn't what kills you.&lt;/p&gt;

&lt;p&gt;It's everything around the boss.&lt;/p&gt;

&lt;p&gt;The lack of planning.&lt;/p&gt;

&lt;p&gt;The lack of coordination.&lt;/p&gt;

&lt;p&gt;The avoidable mistakes.&lt;/p&gt;

&lt;p&gt;The unnecessary complexity.&lt;/p&gt;

&lt;p&gt;Software projects are no different.&lt;/p&gt;

&lt;p&gt;The technology challenge is often only a small part of the problem.&lt;/p&gt;

&lt;p&gt;The bigger challenge is usually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Focus&lt;/li&gt;
&lt;li&gt;Discipline&lt;/li&gt;
&lt;li&gt;Prioritisation&lt;/li&gt;
&lt;li&gt;Communication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The boring stuff.&lt;/p&gt;

&lt;p&gt;The raid-leader stuff.&lt;/p&gt;




&lt;h2&gt;
  
  
  Don't Touch Anything, I Forgot To Set Random
&lt;/h2&gt;

&lt;p&gt;If you've ever led a raid, you've probably said something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Don't loot yet. I forgot to set random."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Everyone freezes.&lt;/p&gt;

&lt;p&gt;Nobody touches anything.&lt;/p&gt;

&lt;p&gt;Because everyone understands that a tiny process mistake can create a much larger problem.&lt;/p&gt;

&lt;p&gt;Software has these moments too.&lt;/p&gt;

&lt;p&gt;They're usually disguised as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Don't deploy yet."&lt;/p&gt;

&lt;p&gt;"Don't restart that."&lt;/p&gt;

&lt;p&gt;"Don't touch production."&lt;/p&gt;

&lt;p&gt;"Wait, I forgot to update the environment variables."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The principle is the same.&lt;/p&gt;

&lt;p&gt;Slow down.&lt;/p&gt;

&lt;p&gt;Verify.&lt;/p&gt;

&lt;p&gt;Then proceed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Best Raid Leaders Aren't The Best Players
&lt;/h2&gt;

&lt;p&gt;This took me years to understand.&lt;/p&gt;

&lt;p&gt;The best raid leaders aren't necessarily:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The most skilled&lt;/li&gt;
&lt;li&gt;The best geared&lt;/li&gt;
&lt;li&gt;The highest DPS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They're usually the people who:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stay calm&lt;/li&gt;
&lt;li&gt;Keep everyone focused&lt;/li&gt;
&lt;li&gt;Prioritise correctly&lt;/li&gt;
&lt;li&gt;Reduce unnecessary chaos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same is true in software.&lt;/p&gt;

&lt;p&gt;The best technical leaders aren't always the smartest people in the room.&lt;/p&gt;

&lt;p&gt;They're often the people who stop the team from creating problems for themselves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;After years of leading raids and building software, I've ended up with a surprisingly simple development framework:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Do the mechanic.&lt;/p&gt;

&lt;p&gt;Don't stand in fire.&lt;/p&gt;

&lt;p&gt;Wait for loot.&lt;/p&gt;

&lt;p&gt;Don't pull extra mobs.&lt;/p&gt;

&lt;p&gt;Most wipes are self-inflicted.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It's not a bad framework for software development.&lt;/p&gt;

&lt;p&gt;And honestly, it's probably saved me more projects than some of the formal methodologies I've used over the years.&lt;/p&gt;

&lt;p&gt;Although it's admittedly harder to explain during architecture reviews.&lt;/p&gt;

&lt;p&gt;Then again...&lt;/p&gt;

&lt;p&gt;The longer I build software, the more I think good architecture and good raid leadership are really the same thing.&lt;/p&gt;

&lt;p&gt;Reduce unnecessary chaos.&lt;/p&gt;

&lt;p&gt;Focus on the current objective.&lt;/p&gt;

&lt;p&gt;And for the love of all that is holy...&lt;/p&gt;

&lt;p&gt;Don't pull extra mobs.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Beyond the Prompt: Building a Proposal-Based Workflow for AI Content Updates</title>
      <dc:creator>CPDForge</dc:creator>
      <pubDate>Tue, 09 Jun 2026 15:30:55 +0000</pubDate>
      <link>https://dev.to/cpdforge/beyond-the-prompt-building-a-proposal-based-workflow-for-ai-content-updates-bp3</link>
      <guid>https://dev.to/cpdforge/beyond-the-prompt-building-a-proposal-based-workflow-for-ai-content-updates-bp3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;AI should propose changes. Systems should decide whether those changes become reality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most AI systems are designed to generate content.&lt;/p&gt;

&lt;p&gt;The problem is that production systems rarely need content generation.&lt;/p&gt;

&lt;p&gt;They need content maintenance.&lt;/p&gt;

&lt;p&gt;And maintenance is where AI becomes dangerous.&lt;/p&gt;

&lt;p&gt;When you're building software for regulated environments, the challenge isn't creating version one.&lt;/p&gt;

&lt;p&gt;It's safely updating version one after it's already been approved, deployed, audited, referenced, and relied upon.&lt;/p&gt;

&lt;p&gt;That's where we started rethinking how AI should interact with content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With Regenerate
&lt;/h2&gt;

&lt;p&gt;Imagine you have a compliance training course containing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Regulatory references&lt;/li&gt;
&lt;li&gt;Knowledge checks&lt;/li&gt;
&lt;li&gt;Real-world scenarios&lt;/li&gt;
&lt;li&gt;Internal procedures&lt;/li&gt;
&lt;li&gt;Approval history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A regulation changes.&lt;/p&gt;

&lt;p&gt;You need to update a single section.&lt;/p&gt;

&lt;p&gt;A common approach is to send the whole lesson to an LLM and ask it to regenerate the content.&lt;/p&gt;

&lt;p&gt;Technically, it works.&lt;/p&gt;

&lt;p&gt;Operationally, it creates a new problem.&lt;/p&gt;

&lt;p&gt;The AI might:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rewrite surrounding content unnecessarily&lt;/li&gt;
&lt;li&gt;Change instructional tone&lt;/li&gt;
&lt;li&gt;Remove important references&lt;/li&gt;
&lt;li&gt;Alter lesson structure&lt;/li&gt;
&lt;li&gt;Introduce subtle inaccuracies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The update itself might be correct.&lt;/p&gt;

&lt;p&gt;The rest of the lesson might no longer be.&lt;/p&gt;

&lt;p&gt;In high-stakes environments, that's not acceptable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Content Is Not A String
&lt;/h2&gt;

&lt;p&gt;One of the design decisions we made while building CPDForge AI was to stop thinking about content as large blocks of text.&lt;/p&gt;

&lt;p&gt;Instead, we treat content as structured documents.&lt;/p&gt;

&lt;p&gt;A simplified model looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Course
 └── Module
      └── Lesson
           └── Section
                └── Content Block
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That structure allows updates to be targeted precisely.&lt;/p&gt;

&lt;p&gt;Instead of telling AI:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Rewrite this lesson&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We can tell it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Update this section&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That distinction turns out to be surprisingly important.&lt;/p&gt;

&lt;p&gt;Because once content becomes structured, AI no longer needs to touch everything.&lt;/p&gt;

&lt;p&gt;It only needs to touch the thing that changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Generation To Proposals
&lt;/h2&gt;

&lt;p&gt;The next decision was even more important.&lt;/p&gt;

&lt;p&gt;The AI never directly edits production content.&lt;/p&gt;

&lt;p&gt;Instead, it generates a proposal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Production Content
        ↓
AI Analysis
        ↓
Proposed Change
        ↓
Validation
        ↓
Human Review
        ↓
Approved Update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model suggests.&lt;/p&gt;

&lt;p&gt;Humans decide.&lt;/p&gt;

&lt;p&gt;That simple shift dramatically improves trust.&lt;/p&gt;

&lt;p&gt;Instead of treating AI as an author, we treat it as a contributor.&lt;/p&gt;
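&lt;p&gt;The flow above can be sketched as a small state machine. This is a minimal illustration in Python, not CPDForge's actual implementation: the &lt;code&gt;Proposal&lt;/code&gt; class, its field names, and the status values are all assumptions made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical lifecycle states, mirroring the diagram above.
    PROPOSED = "proposed"
    VALIDATED = "validated"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Proposal:
    """A suggested change that never touches production content by itself."""
    path: str
    proposed_text: str
    status: Status = Status.PROPOSED
    history: list = field(default_factory=list)

    def _move(self, allowed_from, new_status, actor):
        # Refuse any transition that skips a step in the workflow.
        if self.status not in allowed_from:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.history.append((self.status.value, new_status.value, actor))
        self.status = new_status

    def mark_validated(self):
        self._move((Status.PROPOSED,), Status.VALIDATED, "validator")

    def approve(self, reviewer):
        # Only a validated proposal can be approved, and only by a named reviewer.
        self._move((Status.VALIDATED,), Status.APPROVED, reviewer)

    def reject(self, actor):
        self._move((Status.PROPOSED, Status.VALIDATED), Status.REJECTED, actor)
```

&lt;p&gt;The useful property is that approval is unreachable without validation first, and every transition records who made it.&lt;/p&gt;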

&lt;h2&gt;
  
  
  Granular Path Targeting
&lt;/h2&gt;

&lt;p&gt;Because content is structured, updates can be scoped to specific locations rather than entire lessons.&lt;/p&gt;

&lt;p&gt;Conceptually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modules[2].lessons[0].sections[4]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"update"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"regulatory_change"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI receives the affected section rather than the entire document.&lt;/p&gt;

&lt;p&gt;This reduces content drift, limits unintended changes, and makes updates easier to review.&lt;/p&gt;

&lt;p&gt;More importantly, it keeps the blast radius small.&lt;/p&gt;
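&lt;p&gt;As a rough sketch of how a path like that might be resolved, assuming the course is held as nested dictionaries and lists: the helper names &lt;code&gt;parse_path&lt;/code&gt;, &lt;code&gt;get_at&lt;/code&gt;, and &lt;code&gt;apply_at&lt;/code&gt; are hypothetical, and the path grammar is inferred only from the example above.&lt;/p&gt;

```python
import copy
import re

# Assumed path grammar: dot-separated "name[index]" segments,
# matching the shape of the example path.
_SEGMENT = re.compile(r"^(\w+)\[(\d+)\]$")


def parse_path(path):
    """Turn 'modules[2].lessons[0]' into [('modules', 2), ('lessons', 0)]."""
    steps = []
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        steps.append((match.group(1), int(match.group(2))))
    return steps


def get_at(course, path):
    """Return the node the path points at: the only part sent to the model."""
    node = course
    for key, index in parse_path(path):
        node = node[key][index]
    return node


def apply_at(course, path, new_node):
    """Return a copy of the course with only the targeted node replaced."""
    updated = copy.deepcopy(course)
    steps = parse_path(path)
    parent = updated
    for key, index in steps[:-1]:
        parent = parent[key][index]
    last_key, last_index = steps[-1]
    parent[last_key][last_index] = new_node
    return updated
```

&lt;p&gt;Because &lt;code&gt;apply_at&lt;/code&gt; works on a copy, the original document stays untouched until a change is explicitly accepted, and everything outside the targeted node is carried over unchanged.&lt;/p&gt;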

&lt;h2&gt;
  
  
  The Preservation Problem
&lt;/h2&gt;

&lt;p&gt;Even targeted updates create risk.&lt;/p&gt;

&lt;p&gt;An AI can still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Remove required knowledge checks&lt;/li&gt;
&lt;li&gt;Drop regulatory references&lt;/li&gt;
&lt;li&gt;Break expected structure&lt;/li&gt;
&lt;li&gt;Change instructional intent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So every proposal must be validated before it reaches a reviewer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deterministic Validation
&lt;/h2&gt;

&lt;p&gt;LLMs are non-deterministic.&lt;/p&gt;

&lt;p&gt;Compliance systems shouldn't be.&lt;/p&gt;

&lt;p&gt;Before a proposal can be approved, it must pass validation checks such as:&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema Validation
&lt;/h3&gt;

&lt;p&gt;Does the proposal still conform to the expected structure?&lt;/p&gt;

&lt;h3&gt;
  
  
  Required Component Validation
&lt;/h3&gt;

&lt;p&gt;Are mandatory elements still present?&lt;/p&gt;

&lt;h3&gt;
  
  
  Citation Validation
&lt;/h3&gt;

&lt;p&gt;Have required references been preserved?&lt;/p&gt;

&lt;h3&gt;
  
  
  Structural Integrity Validation
&lt;/h3&gt;

&lt;p&gt;Does the update still fit within the expected hierarchy?&lt;/p&gt;

&lt;p&gt;The objective is simple:&lt;/p&gt;

&lt;p&gt;Prevent the AI from damaging content while attempting to improve it.&lt;/p&gt;
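&lt;p&gt;Checks like these can be plain, rule-based code with no model involved. The sketch below is one possible shape, under assumptions: the section layout, the block type names, and the &lt;code&gt;validate_proposal&lt;/code&gt; function are invented for illustration and are not taken from the platform.&lt;/p&gt;

```python
# Hypothetical section shape, assumed for illustration:
# {"title": str, "blocks": [{"type": str, "text": str}], "citations": [str]}
REQUIRED_FIELDS = {"title": str, "blocks": list, "citations": list}
ALLOWED_BLOCK_TYPES = {"paragraph", "scenario", "knowledge_check"}
MANDATORY_BLOCK_TYPES = {"knowledge_check"}


def validate_proposal(original, proposed):
    """Return a list of rule violations; an empty list means the proposal passes."""
    errors = []

    # Schema validation: required fields exist and have the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(proposed.get(name), expected_type):
            errors.append(
                f"schema: '{name}' is missing or not a {expected_type.__name__}"
            )
    if not errors and not all(isinstance(b, dict) for b in proposed["blocks"]):
        errors.append("schema: every block must be an object")
    if errors:
        return errors  # the remaining checks assume a well-formed section

    original_types = {block["type"] for block in original["blocks"]}
    proposed_types = {block.get("type") for block in proposed["blocks"]}

    # Required component validation: mandatory blocks that existed must survive.
    dropped = MANDATORY_BLOCK_TYPES.intersection(original_types) - proposed_types
    for block_type in sorted(dropped):
        errors.append(f"required: '{block_type}' block was removed")

    # Citation validation: every original reference must still be present.
    for citation in sorted(set(original["citations"]) - set(proposed["citations"])):
        errors.append(f"citation: '{citation}' was dropped")

    # Structural integrity validation: only known block types are allowed.
    for block_type in sorted(proposed_types - ALLOWED_BLOCK_TYPES, key=str):
        errors.append(f"structure: unknown block type {block_type!r}")

    return errors
```

&lt;p&gt;The same inputs always produce the same list of violations, which is the point: the reviewer only sees proposals that have already cleared a repeatable gate.&lt;/p&gt;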

&lt;h2&gt;
  
  
  Human Review Still Matters
&lt;/h2&gt;

&lt;p&gt;There is a temptation to automate everything.&lt;/p&gt;

&lt;p&gt;We've found that keeping a human in the loop works better.&lt;/p&gt;

&lt;p&gt;The AI identifies potential improvements.&lt;/p&gt;

&lt;p&gt;The platform validates the proposal.&lt;/p&gt;

&lt;p&gt;Humans make the final decision.&lt;/p&gt;

&lt;p&gt;For regulated content, that distinction matters.&lt;/p&gt;

&lt;p&gt;Not because humans are perfect.&lt;/p&gt;

&lt;p&gt;But because accountability still matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI As A Pull Request
&lt;/h2&gt;

&lt;p&gt;The more we developed this workflow, the more it started to resemble modern software development.&lt;/p&gt;

&lt;p&gt;Developers don't usually push unreviewed code directly into production.&lt;/p&gt;

&lt;p&gt;They create pull requests.&lt;/p&gt;

&lt;p&gt;Those pull requests are reviewed, validated, tested, and approved before being merged.&lt;/p&gt;

&lt;p&gt;We're increasingly treating AI-generated content updates the same way.&lt;/p&gt;

&lt;p&gt;The AI creates the equivalent of a pull request.&lt;/p&gt;

&lt;p&gt;The platform validates it.&lt;/p&gt;

&lt;p&gt;Humans decide whether it should be merged.&lt;/p&gt;

&lt;p&gt;That model feels significantly safer than allowing direct mutation of production content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Lesson
&lt;/h2&gt;

&lt;p&gt;This pattern extends far beyond compliance training.&lt;/p&gt;

&lt;p&gt;The same principle applies to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Documentation systems&lt;/li&gt;
&lt;li&gt;Knowledge bases&lt;/li&gt;
&lt;li&gt;Legal content&lt;/li&gt;
&lt;li&gt;Internal policies&lt;/li&gt;
&lt;li&gt;CMS platforms&lt;/li&gt;
&lt;li&gt;Enterprise workflows&lt;/li&gt;
&lt;li&gt;Any system where correctness matters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most valuable AI systems won't necessarily be the ones that generate the most content.&lt;/p&gt;

&lt;p&gt;They'll be the ones that help organisations maintain complex information safely.&lt;/p&gt;

&lt;p&gt;Content generation is rapidly becoming commoditised.&lt;/p&gt;

&lt;p&gt;Content governance is not.&lt;/p&gt;

&lt;p&gt;As builders, we spend a lot of time thinking about generation.&lt;/p&gt;

&lt;p&gt;Increasingly, I think we should be thinking about maintenance instead.&lt;/p&gt;

&lt;p&gt;Because once version one exists, the real challenge begins.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions For Other Builders
&lt;/h2&gt;

&lt;p&gt;If you're building AI into a production system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are you allowing direct mutation of live content?&lt;/li&gt;
&lt;li&gt;How are you validating AI-generated changes?&lt;/li&gt;
&lt;li&gt;What safeguards exist when the model gets it wrong?&lt;/li&gt;
&lt;li&gt;Are you treating AI as an author or as a contributor?&lt;/li&gt;
&lt;li&gt;Have you adopted a proposal-based workflow similar to pull requests?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'd be interested to hear how others are approaching this problem.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Most Dangerous AI Output Isn’t Wrong — It’s “Almost Right”</title>
      <dc:creator>CPDForge</dc:creator>
      <pubDate>Sat, 04 Apr 2026 09:19:47 +0000</pubDate>
      <link>https://dev.to/cpdforge/-the-most-dangerous-ai-output-isnt-wrong-its-almost-right-kpn</link>
      <guid>https://dev.to/cpdforge/-the-most-dangerous-ai-output-isnt-wrong-its-almost-right-kpn</guid>
      <description>&lt;p&gt;Most people think the biggest risk with AI is hallucination.&lt;/p&gt;

&lt;p&gt;Completely wrong answers.&lt;br&gt;&lt;br&gt;
Obvious mistakes.&lt;br&gt;&lt;br&gt;
Stuff you can spot instantly.&lt;/p&gt;

&lt;p&gt;That’s not what caused problems for us.&lt;/p&gt;

&lt;p&gt;The real issue showed up later — once things &lt;em&gt;looked&lt;/em&gt; like they were working.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The outputs weren’t wrong.&lt;br&gt;&lt;br&gt;
They were almost right.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And that’s a much harder problem to deal with.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why “Almost Right” Is Worse Than Wrong
&lt;/h2&gt;

&lt;p&gt;If something is clearly wrong, you catch it.&lt;/p&gt;

&lt;p&gt;You fix it.&lt;br&gt;&lt;br&gt;
You move on.&lt;/p&gt;

&lt;p&gt;But when something is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;90% correct
&lt;/li&gt;
&lt;li&gt;Well structured
&lt;/li&gt;
&lt;li&gt;Confidently written
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…it passes through unnoticed.&lt;/p&gt;

&lt;p&gt;And that’s where systems start to break.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Looks Like in Practice
&lt;/h2&gt;

&lt;p&gt;These weren’t big failures.&lt;/p&gt;

&lt;p&gt;They were small, subtle ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A field slightly misclassified
&lt;/li&gt;
&lt;li&gt;A rule applied in the wrong context
&lt;/li&gt;
&lt;li&gt;A structure that looks valid but doesn’t align with the system
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Individually, they don’t matter.&lt;/p&gt;

&lt;p&gt;At scale, they compound.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Problem: AI Stabilises Its Own Mistakes
&lt;/h2&gt;

&lt;p&gt;Here’s what we realised:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AI doesn’t just generate errors — it reinforces them.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once a slightly incorrect pattern appears, the model tends to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repeat it
&lt;/li&gt;
&lt;li&gt;Expand on it
&lt;/li&gt;
&lt;li&gt;Make it look more consistent over time
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of random errors, you get:&lt;/p&gt;

&lt;p&gt;Clean, consistent, wrong outputs.&lt;/p&gt;

&lt;p&gt;Which are much harder to detect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Happens
&lt;/h2&gt;

&lt;p&gt;AI isn’t reasoning in the way we expect.&lt;/p&gt;

&lt;p&gt;It’s optimising for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coherence
&lt;/li&gt;
&lt;li&gt;Pattern completion
&lt;/li&gt;
&lt;li&gt;Internal consistency
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not correctness.&lt;/p&gt;

&lt;p&gt;So if an early assumption is slightly off, the model will build a very convincing version of reality around it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where This Breaks Real Systems
&lt;/h2&gt;

&lt;p&gt;This becomes critical when AI is used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured content generation
&lt;/li&gt;
&lt;li&gt;Compliance or policy outputs
&lt;/li&gt;
&lt;li&gt;Anything reused or scaled
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because now you don’t just have an error.&lt;/p&gt;

&lt;p&gt;You have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A repeatable error
&lt;/li&gt;
&lt;li&gt;A scalable error
&lt;/li&gt;
&lt;li&gt;A system-level error
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What We Changed
&lt;/h2&gt;

&lt;p&gt;We stopped trusting “good-looking outputs.”&lt;/p&gt;

&lt;p&gt;Instead, we built around one principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every output is suspect until proven stable.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  1. Pattern Detection Over Single Output Review
&lt;/h3&gt;

&lt;p&gt;Instead of asking:&lt;br&gt;
“Is this output correct?”&lt;/p&gt;

&lt;p&gt;We ask:&lt;br&gt;
“Is this pattern consistently correct across outputs?”&lt;/p&gt;

&lt;p&gt;This exposes hidden drift fast.&lt;/p&gt;
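&lt;p&gt;One way to picture this, as a sketch rather than a description of our tooling: run the same named checks over a batch of outputs and look at the failure rate per check. The function names and the idea of a tolerance threshold are assumptions for the example.&lt;/p&gt;

```python
def failure_rates(outputs, checks):
    """Fraction of outputs that fail each named check.

    `checks` maps a check name to a function that takes one output
    and returns True when the output is acceptable.
    """
    if not outputs:
        raise ValueError("need at least one output")
    rates = {}
    for name, check in checks.items():
        failures = sum(1 for output in outputs if not check(output))
        rates[name] = failures / len(outputs)
    return rates


def drifting_checks(outputs, checks, tolerance=0.0):
    """Names of checks whose failure rate across the batch exceeds the tolerance."""
    rates = failure_rates(outputs, checks)
    return sorted(name for name, rate in rates.items() if rate > tolerance)
```

&lt;p&gt;A single output that fails a check is easy to dismiss as noise. The same check failing across a quarter of a batch is a pattern, and that is the thing worth investigating.&lt;/p&gt;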




&lt;h3&gt;
  
  
  2. Intent vs Output Validation
&lt;/h3&gt;

&lt;p&gt;We separate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What the system is supposed to do
&lt;/li&gt;
&lt;li&gt;What the AI actually produced
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then compare them explicitly.&lt;/p&gt;

&lt;p&gt;If they don’t align, it fails — even if it looks right.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Breaking the Feedback Loop
&lt;/h3&gt;

&lt;p&gt;We avoid feeding AI its own outputs without checks.&lt;/p&gt;

&lt;p&gt;Because that’s how:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Small errors become reinforced patterns
&lt;/li&gt;
&lt;li&gt;Reinforced patterns become system behaviour
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Counterintuitive Bit
&lt;/h2&gt;

&lt;p&gt;Making outputs more polished made the problem worse.&lt;/p&gt;

&lt;p&gt;Cleaner language increases trust.&lt;br&gt;&lt;br&gt;
More trust reduces scrutiny.&lt;/p&gt;

&lt;p&gt;Which allows bad patterns to survive longer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters Right Now
&lt;/h2&gt;

&lt;p&gt;A lot of AI tooling is focused on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Making outputs better
&lt;/li&gt;
&lt;li&gt;Making them more human
&lt;/li&gt;
&lt;li&gt;Making them more polished
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But that increases risk if you’re not validating underneath.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;If your AI outputs look great but your system still feels unreliable:&lt;/p&gt;

&lt;p&gt;You’re probably dealing with “almost right” errors.&lt;/p&gt;

&lt;p&gt;And those are much harder to catch than obvious failures.&lt;/p&gt;




&lt;h2&gt;
  
  
  Question for Anyone Building with AI
&lt;/h2&gt;

&lt;p&gt;If you’re using AI in production workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What breaks first when you scale?&lt;/li&gt;
&lt;li&gt;Do you validate outputs, or just trust them if they look good?&lt;/li&gt;
&lt;li&gt;Have you run into “clean but wrong” behaviour?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Genuinely curious how others are handling this.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>From Prompts to Systems: Fixing AI Agent Drift in Production</title>
      <dc:creator>CPDForge</dc:creator>
      <pubDate>Mon, 30 Mar 2026 13:50:55 +0000</pubDate>
      <link>https://dev.to/cpdforge/from-prompts-to-systems-fixing-ai-agent-drift-in-production-pcm</link>
      <guid>https://dev.to/cpdforge/from-prompts-to-systems-fixing-ai-agent-drift-in-production-pcm</guid>
      <description>&lt;h2&gt;
  
  
  Why My AI Agent Kept Getting Things Wrong (And What Actually Fixed It)
&lt;/h2&gt;

&lt;p&gt;At first, it worked.&lt;/p&gt;

&lt;p&gt;I gave the AI a clear prompt. It responded well. Structured, relevant, even a bit impressive.&lt;/p&gt;

&lt;p&gt;Then I tried again.&lt;/p&gt;

&lt;p&gt;Same prompt. Slightly different output.&lt;br&gt;&lt;br&gt;
Then again — and something felt off.&lt;br&gt;&lt;br&gt;
Not completely wrong… just inconsistent.&lt;/p&gt;

&lt;p&gt;That’s when it became a problem.&lt;/p&gt;

&lt;p&gt;Because I wasn’t building a demo. I was building a product.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Problem: “Almost Right” Is Not Good Enough
&lt;/h2&gt;

&lt;p&gt;When you’re working with LLMs in isolation, variability is fine. Even interesting.&lt;/p&gt;

&lt;p&gt;When you’re building something people rely on — it isn’t.&lt;/p&gt;

&lt;p&gt;I started seeing patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Outputs drifting in structure
&lt;/li&gt;
&lt;li&gt;Key instructions being ignored
&lt;/li&gt;
&lt;li&gt;Tone and formatting changing between runs
&lt;/li&gt;
&lt;li&gt;Occasionally, details that were simply made up
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nothing catastrophic. Just unreliable.&lt;/p&gt;

&lt;p&gt;And that’s worse.&lt;/p&gt;

&lt;p&gt;Because you can’t trust it.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Context: This Wasn’t Just a Chatbot
&lt;/h2&gt;

&lt;p&gt;One important detail — this wasn’t an internal tool or a sandbox experiment.&lt;/p&gt;

&lt;p&gt;This was a &lt;strong&gt;user-facing AI agent&lt;/strong&gt;, interacting with both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;logged-in users (with context, data, and history)
&lt;/li&gt;
&lt;li&gt;prospective users (with no context at all)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which meant I effectively needed &lt;strong&gt;two behaviours&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one that could operate with structured internal data and constraints
&lt;/li&gt;
&lt;li&gt;one that could explain, guide, and respond more openly without access to that context
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trying to handle both with the same prompt quickly broke down.&lt;/p&gt;

&lt;p&gt;The agent would:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;assume context that didn’t exist
&lt;/li&gt;
&lt;li&gt;overreach when it should stay generic
&lt;/li&gt;
&lt;li&gt;or lose structure when switching between modes
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s when it became clear the issue wasn’t just prompting — it was &lt;strong&gt;context control and behavioural separation&lt;/strong&gt;.&lt;/p&gt;
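
&lt;p&gt;As a rough sketch of that separation (not the actual implementation), the choice between the two behaviours can be made in code before the model is called. The &lt;code&gt;UserContext&lt;/code&gt; class, both rule strings, and &lt;code&gt;build_system_prompt&lt;/code&gt; are hypothetical names introduced for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserContext:
    """Hypothetical container for what the product knows about a logged-in user."""
    user_id: str
    history: list[str]


# Two separate instruction sets, one per behaviour (wording is illustrative).
AUTHENTICATED_RULES = (
    "Answer using only the supplied user data. "
    "If the data does not cover the question, say so."
)
PUBLIC_RULES = (
    "You have no user data. Give general guidance only "
    "and never refer to account details or history."
)


def build_system_prompt(context: Optional[UserContext]) -> str:
    """Pick the instruction set in code instead of asking one prompt to cover both modes."""
    if context is None:
        return PUBLIC_RULES
    # Only the three most recent history entries are passed along.
    recent = "; ".join(context.history[-3:]) or "none"
    return f"{AUTHENTICATED_RULES}\nUser: {context.user_id}\nRecent history: {recent}"
```

&lt;p&gt;Because the branch happens outside the model, a prospective user can never receive the instruction set that assumes account data exists.&lt;/p&gt;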


&lt;h2&gt;
  
  
  Why This Happens (and Why It’s Not a Bug)
&lt;/h2&gt;

&lt;p&gt;It took a bit of stepping back to realise:&lt;/p&gt;

&lt;p&gt;The model wasn’t failing — I was asking it to behave like something it isn’t.&lt;/p&gt;

&lt;p&gt;LLMs are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stateless (unless you resend the context on every request)
&lt;/li&gt;
&lt;li&gt;Probabilistic (not deterministic)
&lt;/li&gt;
&lt;li&gt;Context-sensitive (and earlier instructions tend to lose influence as the context grows)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I was treating as “rules” were really just:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Suggestions with good intentions&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even system prompts didn’t fully solve it.&lt;/p&gt;

&lt;p&gt;They help — but they don’t enforce behaviour.&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Tried First (and Why It Didn’t Work)
&lt;/h2&gt;

&lt;p&gt;Like most people, I went through the usual iterations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Making prompts longer
&lt;/li&gt;
&lt;li&gt;Repeating instructions
&lt;/li&gt;
&lt;li&gt;Adding “IMPORTANT:” everywhere
&lt;/li&gt;
&lt;li&gt;Trying to be hyper-specific
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It improved things slightly… but not enough.&lt;/p&gt;

&lt;p&gt;The problem wasn’t clarity.&lt;/p&gt;

&lt;p&gt;The problem was &lt;strong&gt;control&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Shift: From Prompts to Systems
&lt;/h2&gt;

&lt;p&gt;The breakthrough came when I stopped thinking in terms of prompts and started thinking in terms of &lt;strong&gt;structure&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Tell the model what to do”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I moved to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Define how the model is allowed to behave”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s a completely different mindset.&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Built: A Structured Instruction Layer
&lt;/h2&gt;

&lt;p&gt;I ended up creating what I originally called an “instruction bible”.&lt;/p&gt;

&lt;p&gt;In reality, it’s closer to a &lt;strong&gt;structured instruction system&lt;/strong&gt; layered on top of the model, built from four parts.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Persistent rules (not buried in prompts)
&lt;/h3&gt;

&lt;p&gt;Instead of mixing everything into one prompt, I separated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Role definition
&lt;/li&gt;
&lt;li&gt;Behaviour rules
&lt;/li&gt;
&lt;li&gt;Output constraints
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"compliance_ai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Do not invent regulations"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Flag uncertainty explicitly"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Prioritise clarity over completeness"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"output_format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"structured_sections"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This becomes the source of truth, not just part of the conversation.&lt;/p&gt;
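
&lt;p&gt;One way to treat that rule set as a source of truth, sketched under assumptions: keep it as data, check it when it is loaded, and render the same system message from it on every request. The function names and the rendered message layout below are illustrative, not part of any particular framework.&lt;/p&gt;

```python
import json

# The rule set from the example above, stored as data rather than prompt text.
RULESET_JSON = """
{
  "role": "compliance_ai",
  "rules": [
    "Do not invent regulations",
    "Flag uncertainty explicitly",
    "Prioritise clarity over completeness"
  ],
  "output_format": "structured_sections"
}
"""

REQUIRED_KEYS = {"role", "rules", "output_format"}


def load_ruleset(raw: str) -> dict:
    """Parse the rule set and fail loudly if a required part is missing."""
    ruleset = json.loads(raw)
    missing = REQUIRED_KEYS - set(ruleset)
    if missing:
        raise ValueError(f"rule set is missing: {sorted(missing)}")
    return ruleset


def render_system_message(ruleset: dict) -> str:
    """Turn the rule set into the same system message on every request."""
    lines = [f"Role: {ruleset['role']}", "Rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(ruleset["rules"], start=1)]
    lines.append(f"Output format: {ruleset['output_format']}")
    return "\n".join(lines)
```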




&lt;h3&gt;
  
  
  2. Modular instructions
&lt;/h3&gt;

&lt;p&gt;Different tasks = different instruction sets.&lt;/p&gt;

&lt;p&gt;Instead of one giant prompt, I used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generation mode
&lt;/li&gt;
&lt;li&gt;Review mode
&lt;/li&gt;
&lt;li&gt;Analysis mode
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each with its own constraints.&lt;/p&gt;

&lt;p&gt;This reduced cross-contamination between behaviours.&lt;/p&gt;
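
&lt;p&gt;A minimal sketch of the idea, assuming each mode maps to its own list of constraints. The three mode names come from the list above; the constraint wording and the &lt;code&gt;instructions_for&lt;/code&gt; helper are made up for illustration.&lt;/p&gt;

```python
from enum import Enum


class Mode(Enum):
    GENERATION = "generation"
    REVIEW = "review"
    ANALYSIS = "analysis"


# One instruction set per mode; the constraint wording is illustrative only.
INSTRUCTIONS = {
    Mode.GENERATION: ["Write new content", "Follow the section template"],
    Mode.REVIEW: ["Do not rewrite content", "List issues with a location"],
    Mode.ANALYSIS: ["Do not add new content", "Summarise findings as bullet points"],
}


def instructions_for(mode: Mode) -> str:
    """Return only the constraints for the requested mode, never a merged prompt."""
    return "\n".join(f"- {rule}" for rule in INSTRUCTIONS[mode])
```

&lt;p&gt;A review request built this way never sees the generation rules, which is the separation the modes are meant to provide.&lt;/p&gt;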




&lt;h3&gt;
  
  
  3. Controlled outputs
&lt;/h3&gt;

&lt;p&gt;I stopped accepting “natural” responses.&lt;/p&gt;

&lt;p&gt;Everything had to follow a structure.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sections must exist
&lt;/li&gt;
&lt;li&gt;Headings must match
&lt;/li&gt;
&lt;li&gt;Lists must be formatted consistently
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the output didn’t comply, it was rejected or reprocessed.&lt;/p&gt;
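
&lt;p&gt;The reject-or-reprocess loop might look roughly like this. It is a sketch under stated assumptions: responses are expected to use markdown-style &lt;code&gt;## &lt;/code&gt; headings, the required heading names are hypothetical, and &lt;code&gt;generate&lt;/code&gt; stands in for whatever function calls the model.&lt;/p&gt;

```python
from typing import Callable

# Hypothetical required headings; the real set depends on the product.
REQUIRED_HEADINGS = ["Summary", "Details", "Next steps"]


def find_violations(output: str) -> list[str]:
    """Return the structural problems found in a model response."""
    headings = [line[3:].strip() for line in output.splitlines() if line.startswith("## ")]
    problems = [f"missing section: {h}" for h in REQUIRED_HEADINGS if h not in headings]
    if not problems and headings != REQUIRED_HEADINGS:
        problems.append("headings are out of order or unexpected")
    return problems


def generate_validated(generate: Callable[[str], str], prompt: str, max_attempts: int = 3) -> str:
    """Call the model until its output passes the checks, or give up."""
    problems: list[str] = []
    for _ in range(max_attempts):
        output = generate(prompt)
        problems = find_violations(output)
        if not problems:
            return output
        # Reprocess: feed the concrete failures back into the next attempt.
        prompt = f"{prompt}\n\nFix these problems: {'; '.join(problems)}"
    raise ValueError(f"output rejected after {max_attempts} attempts: {problems}")
```

&lt;p&gt;The point of the design is that nothing reaches the user unless it has passed the structural check; a response that only looks right is still rejected.&lt;/p&gt;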




&lt;h3&gt;
  
  
  4. Reduced ambiguity
&lt;/h3&gt;

&lt;p&gt;I removed anything vague.&lt;/p&gt;

&lt;p&gt;No:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“be helpful”
&lt;/li&gt;
&lt;li&gt;“be clear”
&lt;/li&gt;
&lt;li&gt;“be concise”
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define structure
&lt;/li&gt;
&lt;li&gt;Define constraints
&lt;/li&gt;
&lt;li&gt;Define boundaries
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model performs much better when it has less room to interpret.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;Once this layer was in place, the difference was immediate.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Outputs became consistent
&lt;/li&gt;
&lt;li&gt;Structure stabilised
&lt;/li&gt;
&lt;li&gt;Hallucination dropped significantly
&lt;/li&gt;
&lt;li&gt;Reuse became possible
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most importantly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I could actually trust the output in a product setting&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not perfect — but predictable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Realisation
&lt;/h2&gt;

&lt;p&gt;The real lesson wasn’t about prompts.&lt;/p&gt;

&lt;p&gt;It was this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Prompt engineering doesn’t scale. Systems do.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can get good results with clever prompts.&lt;/p&gt;

&lt;p&gt;But if you want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reliability
&lt;/li&gt;
&lt;li&gt;repeatability
&lt;/li&gt;
&lt;li&gt;product-grade output
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where This Fits in the Bigger Picture
&lt;/h2&gt;

&lt;p&gt;This lines up with a broader shift happening right now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;From chatbots → agents
&lt;/li&gt;
&lt;li&gt;From prompts → orchestration
&lt;/li&gt;
&lt;li&gt;From “AI responses” → controlled systems
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re moving away from:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Ask the model something”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Toward:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Design how the model operates”&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;LLMs are powerful — but they’re not plug-and-play components.&lt;/p&gt;

&lt;p&gt;If you want to build something real with them, you have to accept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You’re not just writing prompts
&lt;/li&gt;
&lt;li&gt;You’re designing behaviour
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And once you start treating it that way, everything changes.&lt;/p&gt;




&lt;p&gt;If you’re building with AI and hitting similar issues, I’d be interested to hear how you’re handling it — especially where things break.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>We tried to generate a compliance course with AI. It didn’t go well.</title>
      <dc:creator>CPDForge</dc:creator>
      <pubDate>Wed, 25 Mar 2026 08:07:10 +0000</pubDate>
      <link>https://dev.to/cpdforge/we-tried-to-generate-a-compliance-course-with-ai-it-didnt-go-well-56n7</link>
      <guid>https://dev.to/cpdforge/we-tried-to-generate-a-compliance-course-with-ai-it-didnt-go-well-56n7</guid>
      <description>&lt;p&gt;&lt;strong&gt;We started off trying to build a compliance course.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We ended up building the system required to trust one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It turns out they’re not the same thing.&lt;/p&gt;

&lt;p&gt;Realising that changed everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧪 The First Version (Looked Fine… Until It Didn’t)
&lt;/h2&gt;

&lt;p&gt;The initial idea was simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use AI to generate a compliance training course.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pick a topic like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;risk assessment
&lt;/li&gt;
&lt;li&gt;workplace safety
&lt;/li&gt;
&lt;li&gt;ESG fundamentals
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feed it into a model, get a structured course out.&lt;/p&gt;

&lt;p&gt;And technically — that worked.&lt;/p&gt;

&lt;p&gt;We got:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;modules
&lt;/li&gt;
&lt;li&gt;lessons
&lt;/li&gt;
&lt;li&gt;headings
&lt;/li&gt;
&lt;li&gt;even quizzes
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;On the surface, it looked decent.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But once you actually read it properly…&lt;/p&gt;




&lt;h2&gt;
  
  
  ❌ What Was Broken
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Shallow Content&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It explained things, but didn’t really teach anything.&lt;br&gt;&lt;br&gt;
No depth. No real-world context. No edge cases.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Inconsistent Structure&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Some lessons were detailed. Others felt like placeholders.&lt;br&gt;&lt;br&gt;
No consistency across the course.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;No Instructional Flow&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It wasn’t designed — it was assembled.&lt;br&gt;&lt;br&gt;
Content chunks, not a learning journey.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;And the Big One: Reliability&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In compliance training, “almost correct” isn’t acceptable.&lt;br&gt;&lt;br&gt;
It’s a risk.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚠️ The Realisation
&lt;/h2&gt;

&lt;p&gt;We assumed the problem was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How do we generate better content?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It wasn’t.&lt;/p&gt;

&lt;p&gt;The real problem was:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do we make that content consistent, reliable, and safe to use?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI was doing exactly what it’s good at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;producing plausible output
&lt;/li&gt;
&lt;li&gt;filling gaps convincingly
&lt;/li&gt;
&lt;li&gt;sounding right
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But that’s not the same as being trustworthy.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔧 What Broke First
&lt;/h2&gt;

&lt;p&gt;Our original pipeline looked something like:&lt;/p&gt;

&lt;p&gt;Prompt → LLM → Output course&lt;/p&gt;

&lt;p&gt;And for a moment, that felt like enough.&lt;/p&gt;

&lt;p&gt;Until we started testing it properly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sections contradicted each other
&lt;/li&gt;
&lt;li&gt;Concepts repeated in different ways
&lt;/li&gt;
&lt;li&gt;Terminology drifted across lessons
&lt;/li&gt;
&lt;li&gt;Some parts were strong, others clearly weak
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;You could generate a course.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You just couldn’t rely on it.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧱 What We Had to Build Instead
&lt;/h2&gt;

&lt;p&gt;The moment things changed was when we stopped treating this as a generation problem.&lt;/p&gt;

&lt;p&gt;We started treating it as a system problem.&lt;/p&gt;

&lt;p&gt;The pipeline evolved into something more like:&lt;/p&gt;

&lt;p&gt;Input&lt;br&gt;
→ Structured Generation&lt;br&gt;
→ Validation Layer&lt;br&gt;
→ Targeted Rewriting&lt;br&gt;
→ Enrichment (quizzes, scenarios, examples)&lt;br&gt;
→ Compliance Checks&lt;br&gt;
→ Output&lt;/p&gt;

&lt;p&gt;Each layer existed for a reason.&lt;/p&gt;

&lt;p&gt;Because every time we skipped one — something failed.&lt;/p&gt;
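
&lt;p&gt;To make the layering concrete, here is a simplified sketch of a pipeline as an ordered list of stage functions. Only two of the stages are shown, the model call is replaced by a placeholder, and the &lt;code&gt;Course&lt;/code&gt; dictionary shape and all function names are assumptions made for the example.&lt;/p&gt;

```python
from typing import Callable

Course = dict  # hypothetical shape: {"topic": str, "lessons": list[str], "log": list[str]}
Stage = Callable[[Course], Course]


def structured_generation(course: Course) -> Course:
    # Placeholder for the model call that fills a fixed lesson framework.
    return {**course, "lessons": [f"Lesson on {course['topic']}"]}


def validation_layer(course: Course) -> Course:
    # Stop the run early instead of passing bad content downstream.
    if not course["lessons"]:
        raise ValueError("validation failed: course has no lessons")
    return course


def run_pipeline(course: Course, stages: list[Stage]) -> Course:
    """Run each stage in order and record which ones were applied."""
    for stage in stages:
        course = stage(course)
        course = {**course, "log": course.get("log", []) + [stage.__name__]}
    return course
```

&lt;p&gt;Rewriting, enrichment, and compliance checks would be further functions with the same signature, added to the list in the order shown above.&lt;/p&gt;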




&lt;h2&gt;
  
  
  🧩 The Hard Parts (That Don’t Show Up in Demos)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Structure Enforcement&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We had to stop the model from improvising.&lt;/p&gt;

&lt;p&gt;That meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fixed lesson frameworks
&lt;/li&gt;
&lt;li&gt;defined section types
&lt;/li&gt;
&lt;li&gt;controlled outputs
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Targeted Improvement (Not Regeneration)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Regenerating everything just moved the problem around.&lt;/p&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;identify weak sections
&lt;/li&gt;
&lt;li&gt;rewrite only those
&lt;/li&gt;
&lt;li&gt;preserve what already works
&lt;/li&gt;
&lt;/ul&gt;
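
&lt;p&gt;Those three steps can be sketched as a single pass over the sections. The weakness check here is deliberately crude and entirely hypothetical (a word-count threshold and a placeholder marker); &lt;code&gt;rewrite&lt;/code&gt; stands in for the model call that improves one section.&lt;/p&gt;

```python
from typing import Callable

MIN_WORDS = 40  # hypothetical threshold for calling a section "weak"


def is_weak(text: str) -> bool:
    """A deliberately crude check: too short, or still contains a placeholder."""
    return MIN_WORDS > len(text.split()) or "TODO" in text


def rewrite_weak_sections(
    sections: dict[str, str],
    rewrite: Callable[[str, str], str],
) -> dict[str, str]:
    """Rewrite only the sections that fail the check; copy the rest unchanged."""
    return {
        title: rewrite(title, text) if is_weak(text) else text
        for title, text in sections.items()
    }
```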




&lt;h3&gt;
  
  
  &lt;strong&gt;Cross-Course Consistency&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This was harder than expected.&lt;/p&gt;

&lt;p&gt;We needed to deal with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;duplicated concepts
&lt;/li&gt;
&lt;li&gt;mismatched terminology
&lt;/li&gt;
&lt;li&gt;uneven difficulty
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which meant introducing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal rules
&lt;/li&gt;
&lt;li&gt;pattern checks
&lt;/li&gt;
&lt;li&gt;consistency constraints
&lt;/li&gt;
&lt;/ul&gt;
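
&lt;p&gt;As one illustration of a pattern check, mismatched terminology can be flagged with a small glossary of preferred terms and the variants that should not appear. The glossary entries and the &lt;code&gt;find_term_drift&lt;/code&gt; function are invented for this sketch; a real rule set would come from the course's own style guide.&lt;/p&gt;

```python
import re

# Hypothetical glossary: preferred term mapped to variants that should not appear.
GLOSSARY = {
    "risk assessment": ["risk evaluation", "hazard assessment"],
    "control measure": ["mitigation step"],
}


def find_term_drift(lessons: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (lesson, variant, preferred) for every non-preferred term found."""
    findings = []
    for name, text in lessons.items():
        for preferred, variants in GLOSSARY.items():
            for variant in variants:
                # Whole-phrase, case-insensitive match.
                if re.search(rf"\b{re.escape(variant)}\b", text, flags=re.IGNORECASE):
                    findings.append((name, variant, preferred))
    return findings
```

&lt;p&gt;Each finding names the lesson and the preferred replacement, so it can be handed to a targeted rewrite instead of triggering a full regeneration.&lt;/p&gt;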




&lt;h3&gt;
  
  
  &lt;strong&gt;Compliance Awareness&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;This is where most tools fall down.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We needed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;alignment with recognised frameworks
&lt;/li&gt;
&lt;li&gt;the ability to adapt as guidance evolves
&lt;/li&gt;
&lt;li&gt;detection of weak or risky content
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 The Shift
&lt;/h2&gt;

&lt;p&gt;At some point, we stopped thinking in prompts.&lt;/p&gt;

&lt;p&gt;We started thinking in systems.&lt;/p&gt;

&lt;p&gt;AI became one part of the process — not the solution.&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ If You’re Building with AI
&lt;/h2&gt;

&lt;p&gt;It’s very easy to focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;better prompts
&lt;/li&gt;
&lt;li&gt;better outputs
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;But the real leverage is in:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;constraints
&lt;/li&gt;
&lt;li&gt;validation
&lt;/li&gt;
&lt;li&gt;iteration
&lt;/li&gt;
&lt;li&gt;control
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because generation is easy.&lt;/p&gt;

&lt;p&gt;Making it usable is not.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Where This Landed
&lt;/h2&gt;

&lt;p&gt;What started as “generate a course” became:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structure
&lt;/li&gt;
&lt;li&gt;validation
&lt;/li&gt;
&lt;li&gt;rewriting
&lt;/li&gt;
&lt;li&gt;enrichment
&lt;/li&gt;
&lt;li&gt;compliance
&lt;/li&gt;
&lt;li&gt;delivery
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not because we wanted more features —&lt;br&gt;&lt;br&gt;
but because without them, none of it worked.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;That was the real lesson.&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;AI doesn’t remove complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It just hides it — until it matters.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>webdev</category>
      <category>softwareengineering</category>
    </item>
  </channel>
</rss>
