Shinsuke KAGAWA

Design Integration Checkpoints Before Letting LLMs Code

Once you stop trying to control AI generation and start designing verification, you immediately hit the next problem: integration.
And this is where most AI-generated systems actually break.

Everything works.
Until it doesn't.

Each layer looks correct in isolation.
Tests pass.
Types line up.

And then the system breaks where those layers meet.

Why "Everything Works" Until It Doesn't

This is a verification problem, not an implementation problem.
When you build systems layer by layer, integration happens very late.

Layer-by-layer development
Phase 1: Data layer ────────────────✓
Phase 2: Service layer ─────────────✓
Phase 3: API layer ────────────────✓
Phase 4: UI layer ─────────────────✓
Phase 5: Integration ── 💥 breaks here

Each layer is implemented in isolation.
So you don't actually know if everything connects correctly until the end.

This problem becomes much worse with AI-generated code.

LLMs don't hold the entire system in mind at once.
They optimize locally, based on the current context — and they often miss hidden contracts between layers.

A Painful Integration Bug That "Worked"

One of the most painful bugs I faced didn't involve crashes or errors.

The AI chatbot worked.

It returned responses.
Logs looked normal.
Nothing failed.

But when we tested it in the real environment, the answers were subtly — but consistently — wrong.

What actually went wrong

The root cause wasn't a single mistake, but a combination of issues across layers:

  • Mock implementations silently left in place
  • LLM fallbacks that prioritized "returning something" instead of failing fast (see the sketch after this list)
  • Duplicate logic across layers, created while implementing each layer separately
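
To make that second issue concrete, here's a minimal TypeScript sketch of the pattern. The function names are hypothetical, not from the actual codebase:

// Stand-in for the real LLM client call (hypothetical).
declare function callLlm(question: string): Promise<string>;

// Anti-pattern: the fallback prioritizes "returning something".
async function answerWithSilentFallback(question: string): Promise<string> {
  try {
    return await callLlm(question);
  } catch {
    // Callers can't tell a real answer from a canned one,
    // so the bug "works" and no alert ever fires.
    return "I'm not sure, could you rephrase that?";
  }
}

// Fail fast instead: make the failure visible at the integration point.
async function answerOrFail(question: string): Promise<string> {
  try {
    return await callLlm(question);
  } catch (err) {
    // Re-throw with context so logs, tests, and alerts all see it.
    throw new Error(`LLM call failed for question: ${question}`, { cause: err });
  }
}

The fail-fast version trades a "working" response for an honest error, which is exactly what you want at an integration boundary.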

The common thread? I wasn't tracking what else might break.

Each layer looked correct in isolation.
Tests passed.
No alerts fired.

Because the system always returned some response, it created a false sense of confidence.
We didn't notice the problem immediately — and by the time we did, identifying the real cause across layers was extremely difficult.

Bugs that silently "work" are far more dangerous than bugs that crash.

Make Integration Explicit

I now spend about five minutes defining integration checkpoints.
Not documentation. Just verification.

The goal is simple: define where things must connect, and how I'll know they actually do.

Before implementation, I write a very small design note.

Not a formal design document, just a checklist that answers two questions:

  1. What parts of the system are affected?
  2. Where do things need to integrate — and how do I verify it?

Step 1: List What's Affected

First, I write down what is directly or indirectly impacted.

Change: Add image generation feature

Direct impact:
  - infrastructure/image/functions.ts
  - application/services/queryClassificationService.ts
  - application/services/imageGenerationService.ts

Indirect impact:
  - conversationService.ts (function calling flow)

No impact:
  - existing text generation services
  - other function handlers

This immediately clarifies the blast radius.

I don't aim for perfection —
I just want to avoid being surprised later.

Step 2: Define Integration Checkpoints

Next, I decide where integration must be verified and how.

Integration point 1: Function selection
Location: ConversationService.generateContentWithFunctionCalling

How to verify:
  1. Send a request asking for an image
  2. Confirm query classification returns `image_generation`
  3. Confirm the correct function is selected in logs

Expected result:
  - Log shows: Executing function: generateImage
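
A checkpoint like this maps almost one-to-one onto a small automated smoke test. Here's a sketch using Vitest; the service shapes are assumed for illustration, and the real return type of `generateContentWithFunctionCalling` may differ:

import { it, expect } from "vitest";

// Assumed shapes, standing in for the real services.
declare const queryClassificationService: {
  classify(message: string): Promise<string>;
};
declare const conversationService: {
  generateContentWithFunctionCalling(
    message: string
  ): Promise<{ functionName: string }>;
};

it("integration point 1: image requests select generateImage", async () => {
  const message = "Please draw a cat wearing a hat";

  // Step 2 of the checkpoint: classification returns `image_generation`
  await expect(queryClassificationService.classify(message))
    .resolves.toBe("image_generation");

  // Step 3: the correct function is selected
  const result =
    await conversationService.generateContentWithFunctionCalling(message);
  expect(result.functionName).toBe("generateImage");
});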

And another one:

Integration point 2: Image generation and posting
Location: ImageGenerationService → MessagingClient.uploadFile

How to verify:
  1. Image data is returned from the image client
  2. The file is posted to the chat thread

Expected result:
  - Image appears in the chat
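
This one can be verified the same way, with a test double standing in for the messaging client. The `ImageGenerationService` constructor and `generateAndPost` method below are hypothetical; only `uploadFile` comes from the checkpoint itself:

import { it, expect } from "vitest";

interface MessagingClient {
  uploadFile(threadId: string, data: Buffer, filename: string): Promise<void>;
}

// Assumed service shape for this sketch.
declare class ImageGenerationService {
  constructor(client: MessagingClient);
  generateAndPost(threadId: string, prompt: string): Promise<void>;
}

// Test double that records what would have been posted to the chat thread.
class FakeMessagingClient implements MessagingClient {
  uploads: { threadId: string; filename: string; bytes: number }[] = [];
  async uploadFile(threadId: string, data: Buffer, filename: string) {
    this.uploads.push({ threadId, filename, bytes: data.length });
  }
}

it("integration point 2: generated image is posted to the thread", async () => {
  const client = new FakeMessagingClient();
  const service = new ImageGenerationService(client);

  await service.generateAndPost("thread-1", "a cat wearing a hat");

  // Image data came back from the image client (non-empty)...
  expect(client.uploads[0]?.bytes).toBeGreaterThan(0);
  // ...and the file was posted to the right thread
  expect(client.uploads[0]?.threadId).toBe("thread-1");
});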

Now I know exactly what "working" means.

That's it.

Why This Works (Especially with AI)

When I give this to an LLM, it changes how implementation happens.

Instead of "build this feature," it's more like:
"Connect A to B. Here's how we'll know it works."

This also pairs well with building features end-to-end:

Feature-based development
Feature A: Data → Service → API → UI → Verify
Feature B: Data → Service → API → UI → Verify
Feature C: Data → Service → API → UI → Verify

Each feature is fully integrated before moving on.

The Result

Before this habit, integration bugs often cost me hours.

After introducing these small design notes:

  • AI-generated code still has small issues
  • But features no longer completely break at integration
  • Unexpected behavior is caught much earlier

Five minutes of thinking up front easily saves hours of debugging later.

Who This Is For

This approach works well if you:

  • Use AI coding tools
  • Build layered architectures
  • Want fast feedback instead of perfect design docs

This is not about writing more documentation.
It's just about making integration explicit before code is written.

Final Thoughts

AI tools are incredibly powerful — but they optimize locally.

If we don't define integration points explicitly, we end up debugging systems that look correct but behave incorrectly.

A small design checklist has made a huge difference for me.

Hope this saves you some painful debugging.
