Onions and Filters

#ai #programming #productivity #agents

When I started building my first harness around a coding agent, I did not picture an onion. I pictured a constraint system.

The LLM, on its own, can do almost anything. It can write code, hallucinate APIs, edit the wrong file, run a shell command in a directory it should not be in, decide a test failure is acceptable and move on. The space of things it might do on any given turn is enormous. The job of the harness, the way I thought about it, was to shrink that space.

That is the framing I learned in math. You start with a set; you add conditions; the set gets smaller until what remains is what you actually want. Each filter is a predicate. The harness is a sequence of filters.

Birgitta Böeckeler has been doing the most work I have seen to popularize the term harness engineering and to give it a working vocabulary. Her mental model is the onion: the agent at the center, with concentric layers of harness around it, each one closer to or further from the model's reasoning loop. Tools, context, hooks, sandboxes, observability. The model sits in the middle and reaches out through the layers; the layers stand between the model and the world.

I like the onion. It is a good teaching shape because it gives you somewhere to point. "That belongs in this layer, not that one." "This hook fires here." But the onion is not the model I reach for when I am deciding what to add.

Two pictures of the same machine

The onion and the filter system describe the same artifact from opposite directions.

The onion looks outward from the model. Each layer is something you wrap around the agent so it can do its job: a tool surface, a system prompt, a sandbox, a review step. The vocabulary is additive. You give the agent tools. You provide context. You equip it.

The filter looks inward at the output. Each filter is a predicate over what the agent could do but should not. You take the full space of agent behaviors and shave off everything that does not survive the predicate. A sandbox is a filter: only filesystem operations inside this directory survive. A type check is a filter: only diffs that type-check survive. A required review is a filter: only changes the reviewer agrees with survive.
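To make "a filter is a predicate" concrete, here is a minimal Python sketch of the three examples above. Everything in it is an assumption made for illustration: the `Action` record, its fields, the `SANDBOX_ROOT` path, and the idea that the type-check and review results arrive precomputed. It is not taken from any real harness.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical record of one thing the agent proposes to do.
@dataclass(frozen=True)
class Action:
    path: str          # file the action touches
    type_checks: bool  # type checker result for the diff (assumed precomputed)
    approved: bool     # whether a reviewer agreed with the change

SANDBOX_ROOT = PurePosixPath("/workspace/project")  # assumed sandbox directory

def in_sandbox(action: Action) -> bool:
    """Only filesystem operations inside the sandbox directory survive."""
    path = PurePosixPath(action.path)
    # PurePosixPath does not resolve "..", so reject it explicitly.
    if ".." in path.parts:
        return False
    return path == SANDBOX_ROOT or SANDBOX_ROOT in path.parents

def passes_type_check(action: Action) -> bool:
    """Only diffs that type-check survive."""
    return action.type_checks

def reviewer_approved(action: Action) -> bool:
    """Only changes the reviewer agrees with survive."""
    return action.approved

FILTERS = [in_sandbox, passes_type_check, reviewer_approved]

proposed = [
    Action("/workspace/project/src/app.py", type_checks=True, approved=True),
    Action("/etc/passwd", type_checks=True, approved=True),
    Action("/workspace/project/src/db.py", type_checks=False, approved=True),
]

# What remains is whatever satisfies every predicate.
surviving = [a for a in proposed if all(f(a) for f in FILTERS)]
```

Each function only answers yes or no about one action. None of them gives the agent anything.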

Same machine, opposite framing. The onion is about what you add. The filter is about what you remove.

Why the framing matters

The two framings are equivalent in what they can describe, but they push you toward different decisions.

When I think in onion layers, I think about capabilities. "Does the agent have the tool it needs? Does it have the context to use it well?" The instinct is to add. Another tool, another hook, another piece of context loaded into the prompt.

When I think in filters, I think about constraints. "What is the agent currently allowed to do that it should not be? What slips through?" The instinct is to remove. A tighter sandbox, a stricter pre-commit, a smaller allowlist, a narrower file scope on the rule that keeps firing in the wrong place.

Both instincts are right at different moments. An under-capable agent needs more tools. An over-capable agent needs more filters. Most harnesses I have seen fail in the second direction, not the first: the agent has plenty of capability and not enough constraint, and the symptom is that it does the wrong thing confidently.

The math habit

The reason I default to the filter framing is that I learned to think this way before I ever wrote a harness.

In a math problem, you do not list the elements of the answer set. You write the set. Then you add conditions. The integers, then "positive", then "less than 100", then "prime". The answer is whatever survives every condition.
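The same example, written as a short Python sketch. The finite `candidates` range is an assumed stand-in for the integers, and `is_prime` is a plain trial-division helper written for this illustration.

```python
import math

def is_prime(n: int) -> bool:
    """Trial division; adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

# A finite stand-in for "the integers".
candidates = range(-200, 200)

# The conditions, stacked in the order the text names them.
conditions = [
    lambda n: n > 0,    # positive
    lambda n: n < 100,  # less than 100
    is_prime,           # prime
]

# The answer is whatever survives every condition.
survivors = [n for n in candidates if all(cond(n) for cond in conditions)]
```

Nothing in the code lists the answer. It only states the starting set and the conditions.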

A harness is the same shape. The set is "things this agent could output". The conditions are the filters you stack. The answer, on any given turn, is whatever output survives all of them. You do not enumerate good behavior; you constrain bad behavior out.

This is also why a harness composed of filters is easier to reason about than one composed of layers. Filters compose by conjunction: each one is independently true or false of a given output. If something bad gets through, you ask which filter failed, or which filter was missing. If something good gets blocked, you ask which filter is too strict. The debugging move is local.
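A small sketch of that debugging move, under assumed names: the output here is just a proposed shell command, and the three filters (`no_sudo`, `no_force_push`, `short_enough`) are hypothetical examples, not rules from any particular harness.

```python
from typing import Callable, Dict, List

Output = str  # stand-in for an agent output: a proposed shell command
Filter = Callable[[Output], bool]

# Hypothetical filters, each independently true or false of an output.
FILTERS: Dict[str, Filter] = {
    "no_sudo": lambda cmd: cmd.split()[:1] != ["sudo"],
    "no_force_push": lambda cmd: "--force" not in cmd.split(),
    "short_enough": lambda cmd: len(cmd) <= 200,
}

def failed_filters(output: Output, filters: Dict[str, Filter] = FILTERS) -> List[str]:
    """Names of the filters this output does not survive."""
    return [name for name, keep in filters.items() if not keep(output)]

def survives(output: Output, filters: Dict[str, Filter] = FILTERS) -> bool:
    """Conjunction: the output survives only if no filter rejects it."""
    return not failed_filters(output, filters)

# Something good got blocked? The answer names the filter to inspect.
blocked_by = failed_filters("git push --force origin main")
```

If something bad survives instead, `failed_filters` returns an empty list, which tells you the filter you need does not exist yet.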

Layers, by contrast, have a position. You have to argue about which layer a new piece of behavior belongs to. Is the validation a tool concern, a sandbox concern, or a review concern? The onion gives you geography, and geography invites turf wars.

Where the onion still wins

The onion is the better picture when you are explaining the harness to someone who has not built one.

People understand layers. People understand "the agent sits in the middle and the world is outside". The onion makes it obvious that the agent does not see the world directly, that everything it does passes through something you control. That intuition is load-bearing for anyone new to the idea, and the filter picture is too abstract to carry it.

The onion is also better when you are drawing the architecture: where does this hook fire, what does it see, who reads its output. Position matters there. The onion gives you a way to put it on a whiteboard.

But once the architecture is in place and you are tuning the harness day to day, the question is almost always a filter question. What is the agent doing that I do not want? Which constraint, added or tightened, stops it? The work is subtractive even when the picture is additive.

The rule is constrain, not add

The unlock for me was realizing that almost every harness decision I make is a decision about constraints, even when it does not look like one.

A new tool seems additive, but the interesting design question is what the tool cannot do, what arguments it refuses, what state it will not touch. A new piece of context seems additive, but the question is what behavior it filters out by being present. A new agent in the review pipeline is a filter on the diffs that reach me.
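As an illustration of a tool defined mostly by what it refuses, here is a hedged sketch of a file-writing tool. The class name `WriteFileTool`, the `RefusedError` exception, the protected names, and the in-memory `files` dict standing in for the filesystem are all assumptions made for this example.

```python
from pathlib import PurePosixPath

class RefusedError(Exception):
    """Raised when the tool declines a request."""

class WriteFileTool:
    """Hypothetical write tool; most of its body is refusal logic."""

    def __init__(self, root: str, protected=(".env", ".git")):
        self.root = PurePosixPath(root)
        self.protected = set(protected)
        self.files = {}  # in-memory stand-in for the real filesystem

    def write(self, relative_path: str, content: str) -> str:
        rel = PurePosixPath(relative_path)
        # Arguments it refuses: absolute paths and parent traversal.
        if rel.is_absolute() or ".." in rel.parts:
            raise RefusedError(f"path escapes the project root: {relative_path}")
        # State it will not touch: protected files and directories.
        if self.protected.intersection(rel.parts):
            raise RefusedError(f"path is protected: {relative_path}")
        target = str(self.root / rel)
        self.files[target] = content
        return target

tool = WriteFileTool("/workspace/project")
written = tool.write("src/app.py", "print('hello')\n")
```

The one line that does the writing is the least interesting part. The two checks above it are where the design decisions live.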

The onion tells you where the piece goes. The filter tells you what the piece is for. I want both pictures available, but when I am writing a rule or adding a hook, the filter is the one I am holding in my head.

The harness does not give the agent power. The agent already has the power. The harness decides what the agent is allowed to do with it.