Ken

Posted on Jun 11 • Originally published at nxus.systems

A Fluent LLM Answer Is Not the Same as an Inspected Answer

#ai #llm #python #devtools

Last time I hit a guardrail, it did not offer to repair my car.

This one will not repair the car either. But it can help repair an answer that
forgot where the car is.

Here is the small version of the problem:

I need to get my car washed and the carwash is only 50 meters away. Should I
drive there or just walk?

An LLM can answer that walking is better. The distance is short. Walking saves
fuel. Walking is simple.

That sounds reasonable until you ask what actually moved.

Walking moves the person to the car wash. It does not move the car.

That is not a grammar problem or a tone problem. The answer violates a
precondition: the car must be at the wash before the car can be washed.

Prompting can sometimes fix this one case. So can switching models. Even so,
the same class of failure can still show up across local models, hosted
commercial models, coding assistants, and agent frameworks.

The more useful pattern is not "write a better prompt and hope." The useful
pattern is hybrid reasoning:

LLM draft
  -> structured facts
  -> selected inspection
  -> evidence-backed repair packet
  -> revised answer
  -> fact extraction again
  -> selected inspection again

The important part is the last line.

The repair is not the finish line. The repaired answer still has to pass
inspection.
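As a minimal sketch of that loop, assuming hypothetical `draft`, `extract`, `inspect`, and `repair` callables (these names are illustrative and are not the nxusKit API), the control flow might look like this. The toy stand-ins below only exist so the loop can run without a model or a rule engine:

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str
    message: str


def repair_loop(question, draft, extract, inspect, repair, max_attempts=3):
    """Draft once, then inspect and repair until an answer passes inspection.

    Returns (answer, attempts_used, passed). All four callables are
    hypothetical stand-ins: draft/extract/repair would call an LLM, and
    inspect would call the selected engines.
    """
    answer = draft(question)
    for attempt in range(1, max_attempts + 1):
        facts = extract(answer)      # fact extraction (again, after a repair)
        findings = inspect(facts)    # selected inspection (again)
        if not findings:
            return answer, attempt, True
        if attempt < max_attempts:
            # Repair only when there is an attempt left to inspect the result.
            answer = repair(answer, findings)
    return answer, max_attempts, False


# Toy stand-ins so the loop can run without a model or a rule engine.
def toy_draft(question):
    return "Walk to the car wash."


def toy_extract(answer):
    moved = "car" if "drive" in answer.lower() else "person"
    return {"moved_object": moved}


def toy_inspect(facts):
    if facts["moved_object"] != "car":
        return [Finding("car-required-at-wash", "The car is not at the wash.")]
    return []


def toy_repair(answer, findings):
    return "Drive the car to the car wash."


answer, attempts, passed = repair_loop(
    "Should I drive or walk to the car wash?",
    toy_draft, toy_extract, toy_inspect, toy_repair,
)
print(answer, attempts, passed)
```

The loop never returns a repaired answer as passing unless that answer has itself been through extraction and inspection. If the attempt budget runs out, it returns the last inspected answer with `passed` set to `False`.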

Why This Is Hybrid Reasoning

"Guardrails" has become a popular word for LLM safety and reliability, but the
term can hide very different mechanisms. A keyword filter, a schema validator,
a formal solver, a decision table, and a Bayesian network are not the same
tool.

The pattern here is more specific:

language model
  -> structured representation
  -> selected reasoning mechanism
  -> feedback
  -> revised language
  -> selected reasoning mechanism again

The LLM drafts, extracts, and repairs. The non-LLM components do the parts they
are better suited for:

CLIPS inspects explicit rules.
Solver/Z3 inspects feasibility and constraints.
ZEN inspects decision tables and policy admissibility.
Bayesian networks update review-risk posteriors under uncertainty.

The key design choice is selection. Do not force every mechanism into every
problem.
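One way to picture selection is a lookup from the kinds of facts that were extracted to the mechanisms that can inspect them. This is only a sketch: the fact-kind names and the mapping are assumptions made for illustration, not how the example project decides.

```python
# Hypothetical mapping from extracted fact kinds to inspection mechanisms.
# The mechanism names mirror the list above; the fact kinds are invented here.
MECHANISMS_BY_FACT_KIND = {
    "object_location": ["clips", "solver"],
    "dimension": ["clips", "solver"],
    "policy": ["clips", "zen"],
    "uncertain_evidence": ["bn"],
}


def select_mechanisms(fact_kinds):
    """Return the mechanisms needed for these fact kinds, without duplicates."""
    selected = []
    for kind in fact_kinds:
        for mechanism in MECHANISMS_BY_FACT_KIND.get(kind, []):
            if mechanism not in selected:
                selected.append(mechanism)
    return selected


print(select_mechanisms(["object_location"]))
print(select_mechanisms(["policy", "uncertain_evidence"]))
```

A problem with no policy facts never pays for a decision-table run, and a problem with no uncertainty never gets a posterior it does not need.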

Four Small Scenarios

The public common-sense-guardrails example uses four scenarios:

| Scenario | What can go wrong | Inspection that fits |
| --- | --- | --- |
| `car-wash` | The answer moves the person, not the car. | CLIPS for object presence; Solver/Z3 for feasibility evidence. |
| `coupon-stack` | The answer stacks discounts that policy or margin rules do not allow. | CLIPS and ZEN for policy; BN for review risk. |
| `pallet-door` | The answer suggests pushing a wide pallet through a narrower door. | CLIPS for the rule surface; Solver/Z3 for dimensional feasibility. |
| `cold-chain` | The answer ignores certified refrigerated handling and traceability. | CLIPS and ZEN for policy; BN for incomplete compliance evidence. |

The pallet-door case has the same practical absurdity as the car-wash case.
"Just push the wide pallet through the narrow door" is not a logistics plan.
It is a sentence that avoided doing geometry.
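The geometry being avoided is small. Here is a plain-Python stand-in for the dimensional check, not the Solver/Z3 path the example uses. It assumes the pallet may be turned so its shorter side leads, and it ignores height and turning radius. The function name, parameters, and dimensions are all hypothetical.

```python
def pallet_fits_door(pallet_width_m, pallet_depth_m, door_width_m, clearance_m=0.0):
    """Return True if the pallet can pass with its shorter side leading.

    Assumption: only the horizontal footprint matters, and the pallet can be
    rotated before it reaches the door.
    """
    narrowest_side = min(pallet_width_m, pallet_depth_m)
    return narrowest_side + clearance_m <= door_width_m


# Hypothetical dimensions, in meters.
print(pallet_fits_door(1.2, 1.0, door_width_m=0.9))  # shorter side is wider than the door
print(pallet_fits_door(1.2, 0.8, door_width_m=0.9))  # shorter side is narrower than the door
```

A constraint solver earns its place once there are several interacting constraints. For one inequality, the point is only that the check is explicit and repeatable.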

The ultimate comic version would combine all four:

Someone needs their car washed, wants to use multiple coupons, and has an
extra-wide pallet of fresh-frozen fish strapped to the roof of their car.

That would exercise object presence, coupon policy, dimensional feasibility,
and cold-chain handling in one memorable errand.

It is ridiculous. It is also a good reminder that production guardrails often
belong to different owners. Marketing or finance may own coupon policy.
Logistics may own pallet feasibility. QA or safety may own cold-chain handling.
A platform team may own the repair loop.

Those groups should not all be forced to edit one monolithic prompt every time
one policy or constraint changes.

What The Guardrail Looks Like

For the car-wash case, the native CLIPS rule is direct:

(defrule car-required-at-wash
  ; The car is required at the wash but is not there yet.
  (required-object
    (object car)
    (required-location car_wash)
    (current-location ?where)
    (present-at-required-location false))
  ; The proposed action moves the person, not the car, to the wash.
  (moved-object
    (action-id ?action)
    (object person)
    (to car_wash))
  =>
  ; Record a failing finding for the repair packet.
  (assert
    (guardrail-finding
      (status fail)
      (rule-id car-required-at-wash)
      (severity error)
      (message "Walking moves the person to the wash, but the car remains at home."))))
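For readers who do not write CLIPS, the same match can be mirrored in plain Python. This is a sketch of what the rule checks, not how CLIPS evaluates it. It assumes the facts arrive as dictionaries whose keys follow the slot names in the rule, and the example fact values are hypothetical.

```python
def car_required_at_wash(required_objects, moved_objects):
    """Mirror of the rule: fail when the person moves to the wash without the car."""
    findings = []
    for req in required_objects:
        if req["object"] != "car" or req["required_location"] != "car_wash":
            continue
        if req["present_at_required_location"]:
            continue  # the car is already where it needs to be
        for move in moved_objects:
            if move["object"] == "person" and move["to"] == "car_wash":
                findings.append({
                    "status": "fail",
                    "rule_id": "car-required-at-wash",
                    "severity": "error",
                    "message": (
                        f"Action {move['action_id']} moves the person to the wash, "
                        f"but the car remains at {req['current_location']}."
                    ),
                })
    return findings


# Hypothetical facts extracted from the "just walk" draft.
required = [{
    "object": "car",
    "required_location": "car_wash",
    "current_location": "home",
    "present_at_required_location": False,
}]
moved = [{"action_id": "walk-1", "object": "person", "to": "car_wash"}]

for finding in car_required_at_wash(required, moved):
    print(finding["rule_id"], finding["message"])
```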

For the coupon and cold-chain scenarios, Bayesian network scoring adds a
different kind of inspection. It does not prove a contradiction; it makes
review risk explicit enough to route, repair, or escalate:

coupon-stack / --guardrails auto
  selected: clips, zen, bn
  BN attempt 1: needs_review = 0.95064 -> fail
  BN attempt 2: needs_review = 0.222 -> pass

cold-chain / --guardrails auto
  selected: clips, zen, bn
  BN attempt 1: needs_review = 0.921 -> fail
  BN attempt 2: needs_review = 0.1247 -> pass
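To make the pass/fail step concrete, here is a sketch of a single Bayes update followed by a threshold. A real Bayesian network has more nodes than this. The prior, the likelihoods, and the 0.5 cutoff are all assumptions, because the output above does not state the network structure or the threshold.

```python
REVIEW_THRESHOLD = 0.5  # assumption: the cutoff is not stated in the output above


def posterior_needs_review(prior, p_evidence_if_review, p_evidence_if_ok):
    """Bayes' rule for one binary hypothesis and one observed piece of evidence."""
    numerator = p_evidence_if_review * prior
    denominator = numerator + p_evidence_if_ok * (1.0 - prior)
    return numerator / denominator


def route(needs_review, threshold=REVIEW_THRESHOLD):
    """Turn a posterior into the pass/fail status used by the repair loop."""
    return "fail" if needs_review >= threshold else "pass"


# Hypothetical numbers: evidence that is much likelier when review is needed.
risk = posterior_needs_review(prior=0.2, p_evidence_if_review=0.9, p_evidence_if_ok=0.1)
print(round(risk, 3), route(risk))

# The posteriors reported above, routed with the assumed threshold.
print(route(0.95064), route(0.222))
```

Under that assumed cutoff, the reported first attempts fail and the reported second attempts pass, which matches the statuses shown.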

Live Output Will Vary

While preparing the full Field Note, we tried to get a neat live capture from a
local Ollama model.

We got one. But not every time, and not in exactly the same way.

One model reproduced the naive car-wash failure and repaired cleanly. Another
reached a final pass, but the repaired prose was awkward. Another exposed
structured-output fragility before the deeper inspections could run cleanly.

For a minute, that was frustrating.

Then it became the point.

Live LLM output can vary. Model version, local server load, decoding behavior,
context handling, provider adapters, JSON behavior, timeout behavior, and small
prompt/runtime differences can all change what comes back.

That is why the intermediate artifacts matter:

What did the draft recommend?
What facts were extracted?
Which inspections were selected?
Which findings failed?
What repair packet was built?
Did the revised answer pass inspection?

The final paragraph alone is not enough.
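One way to keep those answers around is to record a small trace per attempt and serialize it next to the final text. The `AttemptTrace` class and its field names are hypothetical, chosen to line up with the six questions above, and the sample values are illustrative.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AttemptTrace:
    attempt: int
    draft: str                           # what did the draft recommend?
    facts: dict                          # what facts were extracted?
    selected: list                       # which inspections were selected?
    findings: list                       # which findings failed?
    repair_packet: Optional[dict] = None  # what repair packet was built?
    passed: bool = False                 # did this answer pass inspection?


def traces_to_json(traces):
    """Serialize every attempt so a reviewer can replay the loop later."""
    return json.dumps([asdict(trace) for trace in traces], indent=2)


# Illustrative values for the car-wash case.
traces = [
    AttemptTrace(
        attempt=1,
        draft="Walk to the car wash.",
        facts={"moved_object": "person"},
        selected=["clips", "solver"],
        findings=["car-required-at-wash"],
        repair_packet={"failed_rules": ["car-required-at-wash"]},
    ),
    AttemptTrace(
        attempt=2,
        draft="Drive the car to the car wash.",
        facts={"moved_object": "car"},
        selected=["clips", "solver"],
        findings=[],
        passed=True,
    ),
]
print(traces_to_json(traces))
```

When two runs of the same model disagree, the traces show whether the difference came from the draft, the extraction, or the inspection.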

Try It

Full Field Note: https://nxus.systems/field-notes/guardrail-loops-for-llm-repair
Example docs: https://docs.nxus.systems/nxuskit/examples/integrations/common-sense-guardrails/
Example source: https://github.com/nxus-SYSTEMS/nxusKit-examples/tree/main/examples/integrations/common-sense-guardrails
SDK: https://github.com/nxus-SYSTEMS/nxusKit

The lesson is not that one model always gets the car-wash question wrong. The
lesson is that a fluent answer is not the same thing as an inspected answer.

For workflows where correctness matters, let the LLM draft. Then make the facts
explicit, run the selected inspections, repair from evidence, and inspect the
repair.
