CPDForge

We tried to generate a compliance course with AI. It didn’t go well.

We started off trying to build a compliance course.

We ended up building the system required to trust one.

Turns out they’re not the same thing.

That’s when everything changed.


🧪 The First Version (Looked Fine… Until It Didn’t)

The initial idea was simple:

Use AI to generate a compliance training course.

Pick a topic like:

  • risk assessment
  • workplace safety
  • ESG fundamentals

Feed it into a model, get a structured course out.

And technically — that worked.

We got:

  • modules
  • lessons
  • headings
  • even quizzes

On the surface, it looked decent.

But once you actually read it…


❌ What Was Broken

Shallow Content

It explained things, but didn’t really teach anything.

No depth. No real-world context. No edge cases.


Inconsistent Structure

Some lessons were detailed. Others felt like placeholders.

No consistency across the course.


No Instructional Flow

It wasn’t designed — it was assembled.

Content chunks, not a learning journey.


And the Big One: Reliability

In compliance training, “almost correct” isn’t acceptable.

It’s a risk.


⚠️ The Realisation

We assumed the problem was:

“How do we generate better content?”

It wasn’t.

The real problem was:

“How do we make that content consistent, reliable, and safe to use?”

AI was doing exactly what it’s good at:

  • producing plausible output
  • filling gaps convincingly
  • sounding right

But that’s not the same as being trustworthy.


🔧 What Broke First

Our original pipeline looked something like:

Prompt → LLM → Output course
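
In code, that first version was barely more than a single call. A minimal sketch of the idea (the client and model name here are illustrative, not necessarily what we ran):

```python
# The naive first version: one prompt in, one whole course out.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_course(topic: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": f"Write a complete compliance training course on {topic}, "
                       "with modules, lessons, and quizzes.",
        }],
    )
    # Whatever comes back IS the course. No checks, no schema, no second pass.
    return response.choices[0].message.content
```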

And for a moment, that felt like enough.

Until we started testing it properly.

  • Sections contradicted each other
  • Concepts repeated in different ways
  • Terminology drifted across lessons
  • Some parts were strong, others clearly weak

You could generate a course.

You just couldn’t rely on it.


🧱 What We Had to Build Instead

Things changed the moment we stopped treating this as a generation problem.

We started treating it as a system problem.

The pipeline evolved into something more like:

Input
→ Structured Generation
→ Validation Layer
→ Targeted Rewriting
→ Enrichment (quizzes, scenarios, examples)
→ Compliance Checks
→ Output

Each layer existed for a reason.

Because every time we skipped one — something failed.
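
As a skeleton, that flow looks something like the sketch below. The stage bodies are stubs and the names are illustrative; the point is that each stage is a separate, testable step rather than one giant prompt:

```python
# Skeleton of the staged pipeline. Stage bodies are stubs standing in for
# the real implementations, each of which wraps its own prompts and rules.

def structured_generation(brief: dict) -> dict:
    # Generate lessons against a fixed schema, not free-form text.
    return {"topic": brief["topic"], "lessons": []}

def validate(course: dict) -> list[str]:
    # Return the ids of sections that look weak or inconsistent.
    return []

def rewrite_targets(course: dict, weak_ids: list[str]) -> dict:
    # Rewrite only the flagged sections; leave the rest untouched.
    return course

def enrich(course: dict) -> dict:
    # Add quizzes, scenarios, and worked examples to each lesson.
    return course

def compliance_check(course: dict) -> list[str]:
    # Flag risky, outdated, or unsupported content; empty means pass.
    return []

def build_course(brief: dict) -> dict:
    course = structured_generation(brief)
    course = rewrite_targets(course, validate(course))
    course = enrich(course)
    problems = compliance_check(course)
    if problems:
        raise ValueError(f"Blocked by compliance checks: {problems}")
    return course
```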


🧩 The Hard Parts (That Don’t Show Up in Demos)

Structure Enforcement

We had to stop the model from improvising.

That meant:

  • fixed lesson frameworks
  • defined section types
  • controlled outputs
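
One way to pin that down is a hard schema the model has to fill, rather than free-form text it gets to shape. A sketch using dataclasses (the section types and rules are examples, not our exact framework):

```python
# Fixed lesson framework: the model fills slots, it doesn't invent structure.
from dataclasses import dataclass, field

ALLOWED_SECTION_TYPES = {"concept", "example", "edge_case", "summary"}

@dataclass
class Section:
    kind: str  # must be one of ALLOWED_SECTION_TYPES
    body: str

@dataclass
class Lesson:
    title: str
    sections: list[Section] = field(default_factory=list)

    def check(self) -> list[str]:
        problems = []
        kinds = [s.kind for s in self.sections]
        for k in kinds:
            if k not in ALLOWED_SECTION_TYPES:
                problems.append(f"unknown section type: {k}")
        if "summary" not in kinds:
            problems.append("lesson is missing a summary section")
        return problems
```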

Targeted Improvement (Not Regeneration)

Regenerating everything just moved the problem around.

Instead:

  • identify weak sections
  • rewrite only those
  • preserve what already works
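
In practice that's a loop over scored sections, not one big regenerate call. A rough sketch (both helpers are crude stand-ins for model-backed calls):

```python
# Rewrite only the weak sections; keep everything that already works.

QUALITY_THRESHOLD = 0.7  # illustrative cut-off

def score_section(section: dict) -> float:
    # Stand-in: the real version scores depth, specificity, and coverage.
    words = len(section["body"].split())
    return min(words / 200, 1.0)  # crude proxy: very short sections score low

def rewrite_section(section: dict, context: dict) -> str:
    # Stand-in: the real version re-prompts the model with the rest of the
    # lesson passed in as fixed context, so the rewrite stays consistent.
    return section["body"]

def improve(course: dict) -> dict:
    for lesson in course["lessons"]:
        for section in lesson["sections"]:
            if score_section(section) < QUALITY_THRESHOLD:
                section["body"] = rewrite_section(section, context=lesson)
    return course
```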

Cross-Course Consistency

This was harder than expected.

We needed to deal with:

  • duplicated concepts
  • mismatched terminology
  • uneven difficulty

Which meant introducing:

  • internal rules
  • pattern checks
  • consistency constraints
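
Terminology drift is the most mechanical of these to catch: pick one canonical term per concept and flag every variant. A minimal sketch (the glossary entries are made up):

```python
# Flag terminology drift: one canonical term per concept, everything
# else gets reported. Glossary entries are examples, not our real list.
import re

GLOSSARY = {
    "risk assessment": ["risk analysis", "risk review"],
    "near miss": ["close call", "near-hit"],
}

def find_term_drift(text: str) -> list[str]:
    findings = []
    for canonical, variants in GLOSSARY.items():
        for variant in variants:
            if re.search(rf"\b{re.escape(variant)}\b", text, re.IGNORECASE):
                findings.append(f"use '{canonical}' instead of '{variant}'")
    return findings
```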

Compliance Awareness

This is where most tools fall down.

We needed:

  • alignment with recognised frameworks
  • the ability to adapt as guidance evolves
  • detection of weak or risky content
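
Part of that last point can be mechanised as a first pass: pattern rules that flag risky phrasing for human review. A small sketch (the rules are illustrative; this supplements review against the actual frameworks, it doesn't replace it):

```python
# First line of defence: flag risky phrasing for a human to review.
import re

RISK_PATTERNS = [
    (r"\bguarantee[sd]?\b", "absolute claims rarely survive legal review"),
    (r"\balways\b|\bnever\b", "absolutes invite edge-case exceptions"),
    (r"\bno need to report\b", "reporting duties are jurisdiction-specific"),
]

def flag_risky_content(text: str) -> list[str]:
    flags = []
    for pattern, why in RISK_PATTERNS:
        for match in re.finditer(pattern, text, re.IGNORECASE):
            flags.append(f"'{match.group(0)}': {why}")
    return flags
```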

🧠 The Shift

At some point, we stopped thinking in prompts.

We started thinking in systems.

AI became one part of the process — not the solution.


🛠️ If You’re Building with AI

It’s very easy to focus on:

  • better prompts
  • better outputs

But the real leverage is in:

  • constraints
  • validation
  • iteration
  • control

Because generation is easy.

Making it usable is not.


🚀 Where This Landed

What started as “generate a course” became:

  • structure
  • validation
  • rewriting
  • enrichment
  • compliance
  • delivery

Not because we wanted more features —

but because without them, none of it worked.


That was the real lesson.

AI doesn’t remove complexity.

It just hides it — until it matters.

Top comments (1)

CPDForge

Curious how others are handling this.

If you're using AI to generate content (courses, docs, etc.) — how are you dealing with:

  • consistency across outputs
  • reliability / “almost correct” risk
  • maintaining structure at scale

Feels like most tools focus on generation… but not what happens after.

Would be great to hear how others are solving it.