DEV Community

yuer

Why RAG and Agent Systems Are Unstable — A Minimal Deterministic Planner POC

RAG and Agent frameworks promise a lot:
“retrieval-augmented reasoning”, “tool execution”, “autonomous planning”.

But if you’ve actually tried deploying them into finance, legal, compliance, operations, or automation, you’ve probably noticed the same thing I did:

They’re structurally unstable.
Same input → different output.
Same data → different execution path.

This is not a hallucination issue.
It’s an architecture issue.

Let’s break it down.

🧩 1. Retrieval is inherently non-deterministic

ANN search (HNSW, IVF, ScaNN) is approximate by design, so the top-k is not guaranteed to be stable.
In practice:

index rebuilds change the top-k

embedding drift changes neighbors

adding documents shifts similarity space

internal randomness (for example, random level assignment during HNSW index construction) changes ranking

If the retrieval set changes,
the entire RAG chain changes.
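
One way to take retrieval out of the equation, at least for small corpora, is to skip ANN and do an exact search over a pinned snapshot with an explicit tie-break. This is a minimal sketch, not what the POC does: `exact_top_k`, `SNAPSHOT`, and the doc ids are names I made up for illustration, and rounding scores to 6 decimals is an assumption to blunt floating-point noise.

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def exact_top_k(query, docs, k):
    """docs: dict mapping doc_id -> embedding vector (a pinned snapshot)."""
    scored = [(round(cosine(query, vec), 6), doc_id) for doc_id, vec in docs.items()]
    # Descending score, then ascending doc_id, so ties always resolve the same way.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical snapshot; in a real system this would be a versioned export.
SNAPSHOT = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.0, 1.0],
    "doc-c": [1.0, 0.0],  # same score as doc-a for the query below
    "doc-d": [0.7, 0.7],
}

result = exact_top_k([1.0, 0.0], SNAPSHOT, k=3)
print(result)
```

Exact search is O(n) per query, so this only makes sense when the corpus is small or when reproducibility matters more than latency.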

🧩 2. Context construction is unstable

LLMs don’t treat all chunks equally.

They’re sensitive to:

order of chunks

length differences

truncation behavior

position in the prompt

subtle formatting shifts

Same chunks ≠ same output.
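
The sensitivities above can be narrowed by building the context through one canonical function instead of ad hoc string concatenation. The sketch below is an assumption on my part, not code from the repo: `build_context`, the `[chunk_id]` header format, and the `max_chars` budget are all hypothetical. It sorts chunks by id, normalizes whitespace, drops whole chunks instead of truncating mid-chunk, and returns a hash you can log.

```python
import hashlib

def build_context(chunks, max_chars=2000):
    """chunks: list of (chunk_id, text) pairs in any order."""
    ordered = sorted(chunks, key=lambda c: c[0])  # fixed order by chunk_id
    parts = []
    used = 0
    for chunk_id, text in ordered:
        normalized = " ".join(text.split())  # collapse whitespace differences
        block = f"[{chunk_id}]\n{normalized}\n"
        if used + len(block) > max_chars:
            break  # drop whole chunks rather than cutting one in half
        parts.append(block)
        used += len(block)
    context = "\n".join(parts)
    digest = hashlib.sha256(context.encode("utf-8")).hexdigest()
    return context, digest

# Same chunks, different arrival order and spacing.
a = build_context([("c2", "beta  text"), ("c1", "alpha\ntext")])
b = build_context([("c1", "alpha text"), ("c2", "beta text")])
print(a[1] == b[1])
```

This does not make the model itself deterministic, but it guarantees that the same chunk set always produces byte-identical prompt context.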

🧩 3. LLM planners amplify randomness

Most Agent frameworks run a loop like this:
LLM → plan → execute → re-plan → execute → ...

This creates a butterfly effect:

tiny differences in intermediate results

→ different plan

→ different execution

→ completely different final output

Agents “improvise”, not “execute”.

🧩 4. No explicit state machine

Most Agent frameworks store “state” inside the prompt.
This means:

not reproducible

not auditable

cannot be replayed

impossible to certify for enterprise use

For regulated environments, this is a showstopper.
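
For contrast, here is what an explicit state machine looks like when state lives outside the prompt. It is a minimal sketch under my own assumptions: the state and event names in `TRANSITIONS` are hypothetical and not taken from any framework.

```python
# Hypothetical transition table: (current_state, event) -> next_state.
TRANSITIONS = {
    ("received", "validate_ok"): "validated",
    ("received", "validate_fail"): "rejected",
    ("validated", "retrieve_done"): "retrieved",
    ("retrieved", "answer_done"): "completed",
}

def step(state, event):
    """Apply one event; anything not in the table is rejected outright."""
    key = (state, event)
    if key not in TRANSITIONS:
        raise ValueError(f"illegal transition: {state} on {event}")
    return TRANSITIONS[key]

def run(events, start="received"):
    """Replay a list of events and return every state visited."""
    state = start
    log = [state]
    for event in events:
        state = step(state, event)
        log.append(state)
    return log

events = ["validate_ok", "retrieve_done", "answer_done"]
first = run(events)
replayed = run(events)  # replaying the same event log gives the same states
print(first)
```

Because the event list is plain data, it can be stored, diffed, and replayed later, which is exactly what a prompt-embedded "state" cannot offer.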

✅ A Minimal Deterministic Planner POC

To illustrate a different approach,
I built a small deterministic planner inside Amazon Bedrock.

Repo:
👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc

It’s intentionally tiny, but demonstrates the core idea:

✔ 1. Parse input → stable task nodes

No free-form reasoning to decide the steps.
The task graph is structural, not probabilistic.
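
To make "structural, not probabilistic" concrete, here is a rough sketch of a rule-based parser. It is not the parser from the repo: the semicolon-separated request format, the `OPS` table, and the `TaskNode` fields are assumptions I picked for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskNode:
    node_id: str
    op: str
    arg: str

# Hypothetical verb -> operation table; unknown verbs are rejected, not guessed.
OPS = {
    "fetch": "fetch",
    "summarize": "summarize",
    "email": "send_email",
}

def parse(request):
    """Turn 'verb arg; verb arg; ...' into an ordered list of task nodes."""
    clauses = [c.strip() for c in request.split(";") if c.strip()]
    nodes = []
    for i, clause in enumerate(clauses, start=1):
        verb, _, arg = clause.partition(" ")
        op = OPS.get(verb.lower())
        if op is None:
            raise ValueError(f"unknown verb: {verb}")
        nodes.append(TaskNode(f"n{i}", op, arg.strip()))
    return nodes

nodes = parse("fetch q3-report; summarize; email ops-team")
for node in nodes:
    print(node)
```

The trade-off is obvious: the parser only understands what the table covers. That is the point, since anything outside the table fails loudly instead of being improvised.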

✔ 2. Compile → deterministic execution graph

Same input → same path
Every. Single. Time.

This alone eliminates a huge class of RAG/Agent instability.
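
One simple way to get "same input → same path" is a topological sort with a fixed tie-break. The sketch below is my own illustration, not the POC's compiler: `compile_plan`, `GRAPH`, and the node names are hypothetical, and it assumes every dependency also appears as a key in the graph.

```python
import heapq

def compile_plan(deps):
    """deps: dict mapping node -> set of nodes it depends on.

    Assumes every dependency is itself a key in deps.
    """
    indegree = {node: len(parents) for node, parents in deps.items()}
    dependents = {node: [] for node in deps}
    for node, parents in deps.items():
        for parent in parents:
            dependents[parent].append(node)

    # A min-heap means that among all ready nodes, the smallest name runs first.
    ready = [node for node, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        node = heapq.heappop(ready)
        order.append(node)
        for child in dependents[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, child)
    if len(order) != len(deps):
        raise ValueError("cycle detected")
    return order

# Hypothetical task graph.
GRAPH = {
    "answer": {"retrieve", "validate"},
    "retrieve": {"parse"},
    "validate": {"parse"},
    "parse": set(),
}

plan = compile_plan(GRAPH)
print(plan)
```

Without the heap, the order of "retrieve" and "validate" would depend on dict and set iteration order. With it, the plan is a pure function of the graph.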

✔ 3. Output → auditable artifact

Instead of a raw LLM answer, the POC emits:

node sequence

decisions

trace_id

execution log

intermediate artifacts

It acts more like a program, less like improvisation.
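
As a rough sketch of what such an artifact could look like: the field names below mirror the list above, but the exact schema and the idea of deriving `trace_id` from a hash of the canonical JSON are my assumptions, not necessarily how the POC generates it.

```python
import hashlib
import json

def build_artifact(request, node_sequence, decisions, execution_log, intermediates):
    """Bundle one run into a JSON-serializable record with a content-derived id."""
    body = {
        "request": request,
        "node_sequence": node_sequence,
        "decisions": decisions,
        "execution_log": execution_log,
        "intermediate_artifacts": intermediates,
    }
    # Canonical JSON (sorted keys, fixed separators) so equal runs hash equally.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    trace_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return {"trace_id": trace_id, **body}

# Hypothetical run data.
run_a = build_artifact(
    "fetch q3-report; summarize",
    ["n1", "n2"],
    {"n1": "cache_miss"},
    ["n1 started", "n1 done", "n2 started", "n2 done"],
    {"n1": "report-text"},
)
run_b = build_artifact(
    "fetch q3-report; summarize",
    ["n1", "n2"],
    {"n1": "cache_miss"},
    ["n1 started", "n1 done", "n2 started", "n2 done"],
    {"n1": "report-text"},
)
print(run_a["trace_id"] == run_b["trace_id"])
```

A content-derived id makes divergence easy to spot: two runs of the same input either share a `trace_id` or something in the path changed.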

🔥 Why Determinism Matters

As LLMs move deeper into:

finance

legal

compliance

operations

automation

enterprise tooling

three capabilities become essential:

Reproducibility

Auditability

Deterministic execution

Dynamic planning alone cannot achieve this.

Future Agent architectures must incorporate:

stable execution graphs

structural planning

versioned data snapshots

explicit state machines

deterministic control layers

Think of it as:

Agents must evolve from improvisers into compilers.

💬 Final Thoughts

RAG and Agents are powerful, but unstable by design.
This POC is a small step toward exploring deterministic alternatives:

👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc

If you’re building RAG pipelines, Agent systems, or enterprise AI infrastructure, I’d love to hear your thoughts. Let’s discuss!
