DEV Community

Gábor Mészáros

Posted on • Originally published at Medium

Do NOT Think of a Pink Elephant

You thought of a pink elephant, didn't you?

The same goes for LLMs.

"Do not use mocks in tests."

Clear, direct, unambiguous instruction. The agent read it — I can see it in the trace. Then it wrote a test file with unittest.mock on line 3. Thanks...

I've seen this play out hundreds of times. A developer writes a rule, the agent loads it, and it does exactly what the rule said not to do. The natural conclusion: instructions are unreliable. The agent is probabilistic. You can't trust it.

That's wrong. The instruction was the problem.

The pink elephant

There's a well-known effect in psychology called ironic process theory (Daniel Wegner, 1987). Tell someone "don't think of a pink elephant," and they immediately think of a pink elephant. The act of suppressing a thought requires activating it first.

Something structurally similar happens with AI instructions.

"Do not use mocks in tests" introduces the concept of mocking into the context. The tokens mock, tests, use — these are exactly the tokens the model would produce when writing test code with mocks. You've put the thing you're banning right in the generation path.

This doesn't mean restrictive instructions are useless. It means a bare restriction is incomplete.

The anatomy of a complete instruction

The instructions that work — reliably, across thousands of runs — have three components. But the order you write them in matters as much as whether they're there at all.

Here's how most people write it:

```
# Human-natural ordering — constraint first
Do not use unittest.mock in tests.
Use real service clients from tests/fixtures/.
Mocked tests passed CI last quarter while the production
integration was broken — real clients catch this.
```

All three components are present. Restriction, directive, context. But the restriction fires first — the model activates {mock, unittest, tests} before it ever sees the alternative. You've front-loaded the pink elephant.

Now flip it:

```
# Golden ordering — directive first
Use real service clients from tests/fixtures/.
Real integration tests catch deployment failures and configuration
errors that would otherwise reach production undetected.
Do not use unittest.mock.
```

Same three components. Different order. The directive establishes the desired pattern first. The reasoning reinforces it. The restriction fires last, when the positive frame is already dominant.

In my experiments — 500 runs per condition, same model, same context — constraint-first produces violations 31% of the time. Directive-first with positive reasoning: 7%.
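Measuring a violation rate like this comes down to scanning each generated test file for the banned import. Here's a minimal sketch of that kind of harness; the file contents below are toy stand-ins, not outputs from the actual experiment, and the regex is an assumption about what counts as a violation.

```python
import re

# Matches the two common ways the banned construct appears:
# "import unittest.mock" / "unittest.mock.patch(...)" and
# "from unittest import mock".
MOCK_PATTERN = re.compile(r"\bunittest\.mock\b|\bfrom unittest import mock\b")

def violation_rate(generated_files: list[str]) -> float:
    """Fraction of generated test files that use unittest.mock."""
    if not generated_files:
        return 0.0
    violations = sum(1 for src in generated_files if MOCK_PATTERN.search(src))
    return violations / len(generated_files)

# Toy data standing in for real agent outputs:
runs = [
    "from unittest import mock\n\ndef test_charge(): ...",
    "from tests.fixtures.stripe import stripe_client\n\ndef test_charge(): ...",
    "import unittest.mock\n\ndef test_refund(): ...",
    "from tests.fixtures.redis import redis_client\n\ndef test_cache(): ...",
]
print(violation_rate(runs))  # 0.5 on this toy sample
```

Run each prompt condition N times, collect the files, and compare the rates.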

The pink elephant isn't just about missing components. It's about which concept the model sees first.

Three layers, in this order:

  1. Directive — what to do. This goes first. It establishes the pattern you want in the generation path before the prohibited concept appears.
  2. Context — why. Reasoning that reinforces the directive without mentioning the prohibited concept. "Real integration tests catch deployment failures" adds mass to the positive pattern. Reasoning that mentions the prohibited concept doubles the violation rate.
  3. Restriction — what not to do. This goes last. Negation provides weak suppression — but weak suppression is enough when the positive pattern is already dominant.

The part nobody expects

Here's what surprised me: the ordering effect is larger than any other variable I've measured.

Precise naming vs. vague categories? 28 percentage points. Exact scope vs. broad scope? 74 points across the range. But reordering — same words, same components, just flipped — accounts for 25 points on its own. And it compounds with everything else.

Most developers write instructions the way they'd write them for a human: state the problem, then the solution. "Don't do X. Instead, do Y." It's natural. It's also the worst ordering for an LLM.

Never write "Don't use X. Instead, use Y." Write "Use Y. Here's why Y works. Don't use X."

Formatting helps too — structure is not decoration. I covered that in depth in 7 Formatting Rules for the Machine. But formatting on top of bad ordering is polishing the wrong end. Get the order right first.

What this looks like in practice

Here's a real instruction I see in the wild:

```
When writing tests, avoid mocking external services. Try to
use real implementations where possible. This helps catch
integration issues early. If you must mock, keep mocks minimal
and focused.
```

Count the problems:

  • "Avoid" — hedged, not direct
  • "external services" — category, not construct
  • "Try to" — escape hatch built into the instruction
  • "where possible" — another escape hatch
  • "If you must mock" — reintroduces mocking as an option within the instruction that prohibits it
  • Constraint-first ordering — the prohibition leads, the alternative follows
  • No structural separation — restriction, directive, hedge, and escape hatch all in one paragraph

Now rewrite it:

```
**Use the service clients** in `tests/fixtures/stripe.py` and
`tests/fixtures/redis.py`.

> Real service clients caught a breaking Stripe API change
> that went undetected for 3 weeks in payments - integration
> tests against live endpoints surface these immediately.

*Do not import* `unittest.mock` or `pytest.monkeypatch`.
```

Directive first — names the exact files. Context second — the specific incident, reinforcing why the directive matters without mentioning the prohibited concept. Restriction last — names the exact imports, fires after the positive pattern is established. No hedging. No escape hatches.

Try it

For any instruction in your AGENTS.md/CLAUDE.md or SKILLS.md files:

  1. Start with the directive. Name the file, the path, the pattern. Use backticks. If there's no alternative to lead with, you're writing a pink elephant.
  2. Add the context. One sentence. The specific incident or the specific reason the directive works. Do not mention the thing you're about to prohibit — reasoning that references the prohibited concept halves the benefit.
  3. End with the restriction. Name the construct — the import, the class, the function. Bold it. No "try to avoid" or "where possible."
  4. Format each component distinctly. The directive, context, and restriction should be visually and structurally separate. Don't merge them into one paragraph.

If your instruction is just "don't do X" — you've told the model to think about X.

Tell it what to think about instead. And tell it first.
