Florent C

What if LLMs needed a spine, not a bigger brain?

I’ve been building something for the past few months, and I’m still trying to figure out whether I’m hitting a real problem or just over-structuring something that better prompting would already solve.

My starting intuition is simple: LLMs are very good at generating, but much less reliable when you expect continuity from them. As soon as you want an agent that can hold a line, remember things cleanly, recover after tension in a conversation, and stay coherent over time, you start seeing the limits of the model on its own. Not necessarily because it lacks intelligence, but because it lacks a kind of skeleton.

In many systems, the LLM does everything at once: it speaks, it decides, it improvises its own memory and its own frame. And that works, until it starts to drift. Prompting can take you pretty far, but it still feels fragile.

That’s the space I’m exploring. The idea is to move governance outside the model: the LLM generates, but it does not decide on its own. An explicit policy layer handles decisions, state and memory carry continuity, and a timeline keeps an inspectable trace of what happened.
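To make the split concrete, here is a minimal sketch of what "the LLM generates, but does not decide" could look like. All the names here (`Agent`, `Timeline`, the topic-based policy) are hypothetical illustrations I'm inventing for this post, not the author's actual implementation; the `generate` callable stands in for any real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Timeline:
    """Append-only, inspectable trace of what happened."""
    events: list = field(default_factory=list)

    def record(self, kind: str, payload: str) -> None:
        self.events.append((kind, payload))

@dataclass
class Agent:
    # The "muscle": produces a candidate response. Stubbed here.
    generate: Callable[[str], str]
    # Explicit policy, held outside the model (a toy topic allowlist).
    allowed_topics: set
    timeline: Timeline = field(default_factory=Timeline)

    def step(self, user_input: str) -> str:
        candidate = self.generate(user_input)
        self.timeline.record("proposed", candidate)
        # Policy layer decides; the model does not accept its own output.
        if any(topic in candidate for topic in self.allowed_topics):
            self.timeline.record("accepted", candidate)
            return candidate
        self.timeline.record("rejected", candidate)
        return "I can't help with that."

# Usage: a stub generator so the sketch runs without a real model.
agent = Agent(generate=lambda p: f"answer about {p}",
              allowed_topics={"weather"})
print(agent.step("weather"))   # accepted by the policy layer
print(agent.step("secrets"))   # rejected; the refusal is still logged
```

The point of the sketch is only the shape: generation, decision, and the trace live in three separate places, so you can inspect or replay the timeline without re-running the model.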

What I’m seeing so far is mostly more stability: less invention around internal state, better constraint-following, firmer boundaries under prompt injection, and less drift over long sequences.

That said, I don’t want to oversell it: I haven’t formally verified state causality, measured the actual impact of governed memory, or demonstrated deterministic replay yet. What I have are strong signals, not hard proof.

I’m posting this mainly to test the framing. Does this way of thinking resonate? Are there better or earlier projects working on the same problem? Or am I just adding structure to something that better prompting will eventually absorb?

At the core, the question I keep coming back to is simple: if the LLM is the muscle, what does the skeleton look like?
