AI coding assistants are generally quite good at producing code.
However, they are less reliable when they have to decide what that code should do.
In other words, they struggle less with syntax than with intent.
Given a clear description of behaviour, an assistant can often produce a reasonable implementation. Given an ambiguous one, it will still produce something — but that “something” may not align with what was actually intended.
That’s a consequence of how we describe systems, not a failure of the model.
The Real Problem Is Ambiguity
In most codebases, behaviour is only partially explicit.
Some of it lives in:
- method names
- comments
- documentation
- tests
- conversations between developers
The rest is assumed.
Those assumptions work reasonably well when the same people are working on the system and context is shared. They break down as soon as:
- new developers join
- features are modified across teams
- systems grow in scope
- or code is generated rather than written
AI coding assistants make the breakdown more visible.
They don’t share your assumptions.
They operate on what is written, not what is implied.
Try this experiment: pick an open-source project and scan its code, PRs, and tickets for the kinds of artifacts mentioned above.
Are you able to write a functional specification that perfectly describes the system’s intended behaviour? Do you think a coding agent would perform any better?
Where BDD Fits
Behaviour-driven development is often discussed as a testing technique.
More accurately, it is a way of making behaviour explicit.
A scenario like:
```gherkin
Given a document is submitted
When it is reviewed
Then it should receive a score between 0 and 100
```
doesn’t describe implementation.
It describes intent, and that distinction matters.
When behaviour is expressed this way:
- there is less room for interpretation
- there are fewer implicit assumptions
- and there exists a clearer boundary between what the system does and how it does it
That clarity certainly benefits humans, but it also benefits systems that generate code.
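To make the distinction concrete, here is a minimal sketch of the scenario above mirrored as a plain test. The `review_document` function and its scoring heuristic are hypothetical stand-ins; only the Given/When/Then structure and the 0–100 contract come from the scenario itself.

```python
def review_document(document: str) -> int:
    """Toy reviewer (hypothetical): scores a submitted document on a 0-100 scale."""
    # Placeholder heuristic: longer documents score higher, capped at 100.
    return min(len(document), 100)


def test_reviewed_document_receives_bounded_score():
    # Given a document is submitted
    document = "A short submitted document."
    # When it is reviewed
    score = review_document(document)
    # Then it should receive a score between 0 and 100
    assert 0 <= score <= 100


test_reviewed_document_receives_bounded_score()
```

Note that nothing in the test says *how* the score is computed. The implementation could change entirely and the test would still express the same intent.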
Why This Matters for AI-Assisted Development
When behaviour is implicit, AI has to infer intent.
That inference is where things start to go wrong.
An assistant may:
- implement the “most likely” interpretation
- generalize beyond what was intended
- introduce edge cases that were never discussed
- or omit constraints that were never stated explicitly
The result often looks reasonable in isolation, but it may not match the actual expectations of the system.
When behaviour is expressed explicitly — for example, through Gherkin-style scenarios — that ambiguity is reduced.
The assistant no longer has to guess what the system should do.
It can focus on how to implement what has already been defined.
This shifts the problem from interpretation to execution: the scenario declares *what* the system should do, leaving the assistant only the imperative *how*.
BDD as a Constraint System
In previous discussions, I’ve described architecture as a constraint system.
Patterns like:
- bounded contexts
- aggregates
- dependency direction
- ubiquitous language
all restrict how a system is allowed to evolve.
Behaviour-driven development introduces another form of constraint:
It constrains behaviour.
A well-defined set of scenarios limits:
- what the system is expected to do
- how it should respond under specific conditions
- and what outcomes are considered valid
These constraints operate at a different level than architectural boundaries, but they serve the same purpose.
They reduce the space in which incorrect changes can occur.
For humans, this improves communication.
For AI-assisted workflows, it reduces guesswork.
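One way to picture a scenario acting as a constraint is to treat its "Then" clause as an invariant and check it across many inputs, property-test style. In this sketch, `review_document` is again a hypothetical reviewer; the point is that the 0–100 bound carves out the space of valid outcomes.

```python
import random
import string


def review_document(document: str) -> int:
    """Toy reviewer (hypothetical): scores a document on a 0-100 scale."""
    return min(len(document), 100)


def outcome_is_valid(score: int) -> bool:
    # The scenario's "Then" clause: a score between 0 and 100 is the
    # only outcome the system is allowed to produce.
    return 0 <= score <= 100


# Check the constraint over many randomly generated documents.
random.seed(0)
for _ in range(1000):
    doc = "".join(random.choices(string.ascii_letters, k=random.randint(0, 300)))
    assert outcome_is_valid(review_document(doc))
```

Any change to the reviewer that violates the bound now fails immediately, which is exactly the "reduced space for incorrect changes" described above.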
Tooling, Not the Point
Frameworks such as Cucumber and other Gherkin-based tools are often used to execute these scenarios.
That’s useful, but it’s not the most important part.
The primary value of BDD in this context is not test execution.
It’s the act of making behaviour explicit.
You can get much of the benefit even without a full BDD toolchain, as long as:
- behaviour is clearly described
- expectations are shared
- and scenarios are treated as part of the system’s definition
The tooling helps enforce that clarity, much as we might use ArchUnit to enforce architectural constraints. But the clarity itself is what makes it work.
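As a sketch of how little machinery "scenarios as part of the system's definition" actually requires, here is a tiny hand-rolled runner that binds Gherkin-style lines to step functions, with no BDD framework involved. All names and the reviewer logic are illustrative, not a real Cucumber API.

```python
STEPS = {}


def step(pattern):
    """Register a step function for an exact Given/When/Then line body."""
    def register(fn):
        STEPS[pattern] = fn
        return fn
    return register


@step("a document is submitted")
def given_document(ctx):
    ctx["document"] = "A submitted document."


@step("it is reviewed")
def when_reviewed(ctx):
    # Hypothetical reviewer: scores on a 0-100 scale.
    ctx["score"] = min(len(ctx["document"]), 100)


@step("it should receive a score between 0 and 100")
def then_bounded(ctx):
    assert 0 <= ctx["score"] <= 100


def run(scenario: str):
    """Strip each line's keyword and dispatch the rest to its step function."""
    ctx = {}
    for line in scenario.strip().splitlines():
        keyword, _, rest = line.strip().partition(" ")
        assert keyword in ("Given", "When", "Then", "And"), keyword
        STEPS[rest](ctx)
    return ctx


run("""
    Given a document is submitted
    When it is reviewed
    Then it should receive a score between 0 and 100
""")
```

A real toolchain adds reporting, parameterised steps, and IDE support, but the essential move is the same: the scenario text is the source of truth, and the code is bound to it.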
Where This Helps — and Where It Doesn’t
Making behaviour explicit improves outcomes, but it does not eliminate the need for discipline.
BDD does not:
- define architectural boundaries
- prevent poor domain modelling
- or replace the need for governance
It complements those things.
It works best when combined with:
- clear architectural constraints
- a well-defined ubiquitous language
- and enforcement mechanisms that keep the system aligned over time
Without those, even well-written scenarios can (and do) drift.
Closing Thoughts
AI coding assistants are not inherently unreliable.
As anyone who's used one knows, they are sensitive to ambiguity.
When intent is implicit, they infer.
When behaviour is explicit, they implement.
Behaviour-driven development is an excellent way to make that intent visible.
Not as a testing technique alone, but as a constraint on what the system is supposed to do.
In systems that evolve quickly — whether through teams, automation, or AI-assisted development — that constraint becomes increasingly valuable.