RAG and Agent frameworks promise a lot:
“retrieval-augmented reasoning”, “tool execution”, “autonomous planning”.
But if you’ve actually tried deploying them into finance, legal, compliance, operations, or automation, you’ve probably noticed the same thing I did:
They’re structurally unstable.
Same input → different output.
Same data → different execution path.
This is not a hallucination issue.
It’s an architecture issue.
Let’s break it down.
🧩 1. Retrieval is inherently non-deterministic
ANN search (HNSW/IVF/ScaNN) is approximate by design.
Meaning:
index rebuilds change the top-k
embedding drift changes neighbors
adding documents shifts similarity space
internal randomness changes ranking
If the retrieval set changes,
the entire RAG chain changes.
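Even with exact cosine similarity, the "adding documents shifts similarity space" bullet is easy to demonstrate. A minimal pure-Python sketch (no real ANN library; real HNSW/IVF indexes add their own approximation on top of this):

```python
import math

def cosine(a, b):
    # plain cosine similarity, no approximation at all
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, corpus, k=2):
    scored = sorted(corpus.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {
    "doc_a": [1.0, 0.1, 0.0],
    "doc_b": [0.8, 0.4, 0.1],
    "doc_c": [0.1, 0.9, 0.2],
}
query = [0.9, 0.3, 0.0]

before = top_k(query, corpus)
corpus["doc_d"] = [0.95, 0.25, 0.05]  # a new document lands near the query
after = top_k(query, corpus)
print(before, after)  # the top-k set changes, so the whole RAG chain changes
```

And that is with *exact* search. Approximate indexes make the top-k unstable even when the corpus does not change.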
🧩 2. Context construction is unstable
LLMs don’t treat all chunks equally.
They’re sensitive to:
order of chunks
length differences
truncation behavior
position in the prompt
subtle formatting shifts
Same chunks ≠ same output.
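The "same chunks, different prompt" problem is mechanical, not mystical. A tiny sketch (hypothetical `build_prompt` helper, standing in for any real context assembler):

```python
import hashlib

chunks = ["Revenue grew 12% in Q3.", "Legal review is pending.", "Ops costs were flat."]

def build_prompt(chunks):
    # a stand-in for real context assembly: order is baked into the bytes
    return "Answer using the context below.\n\n" + "\n---\n".join(chunks)

p1 = build_prompt(chunks)
p2 = build_prompt(list(reversed(chunks)))  # identical chunk set, different retrieval order

print(hashlib.sha256(p1.encode()).hexdigest()[:12])
print(hashlib.sha256(p2.encode()).hexdigest()[:12])
# different hashes: the model literally receives a different input
```

If retrieval order wobbles (see point 1), the prompt wobbles, and the output wobbles with it.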
🧩 3. LLM planners amplify randomness
Most Agent frameworks do:
LLM → plan → execute → re-plan → execute → ...
This creates a butterfly effect:
tiny differences in intermediate results
→ different plan
→ different execution
→ completely different final output
Agents “improvise”, not “execute”.
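The butterfly effect above can be sketched with a toy plan/execute loop. Both functions here are hypothetical stand-ins (a real planner is an LLM call, which only makes the divergence worse):

```python
def plan(observation: str) -> list[str]:
    # stand-in for an LLM planner: a tiny wording change flips the branch
    if "pending" in observation:
        return ["escalate", "wait"]
    return ["summarize", "finish"]

def run(observation: str) -> list[str]:
    steps = []
    for _ in range(3):  # plan → execute → re-plan loop
        next_steps = plan(observation)
        steps.extend(next_steps)
        if "finish" in next_steps:
            break
        observation = "resolved"  # execution mutates the observation
    return steps

print(run("status: pending"))  # one path
print(run("status: done"))     # a tiny input difference, a different trajectory
```

One character of drift in an intermediate result and the trajectories never reconverge.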
🧩 4. No explicit state machine
Most Agent frameworks store “state” inside the prompt.
This means:
not reproducible
not auditable
cannot be replayed
impossible to certify for enterprise use
For regulated environments, this is a showstopper.
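For contrast, here is what an explicit state machine buys you. A minimal sketch (hypothetical workflow and transition table), where every transition is logged and the log can be replayed:

```python
from dataclasses import dataclass, field

# the only legal moves; anything else raises instead of improvising
TRANSITIONS = {
    ("RECEIVED", "validate"): "VALIDATED",
    ("VALIDATED", "execute"): "EXECUTED",
    ("EXECUTED", "archive"): "DONE",
}

@dataclass
class Workflow:
    state: str = "RECEIVED"
    log: list = field(default_factory=list)

    def apply(self, event: str) -> str:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal transition: {key}")
        self.log.append(key)           # audit trail: every transition is recorded
        self.state = TRANSITIONS[key]
        return self.state

wf = Workflow()
for event in ["validate", "execute", "archive"]:
    wf.apply(event)

# replay: feeding the recorded events to a fresh instance reproduces the state
replayed = Workflow()
for _, event in wf.log:
    replayed.apply(event)
assert replayed.state == wf.state == "DONE"
```

Reproducible, auditable, replayable. None of this is possible when "state" lives inside a prompt.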
✅ A Minimal Deterministic Planner POC
To illustrate a different approach,
I built a small deterministic planner POC on AWS Bedrock.
Repo:
👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc
It’s intentionally tiny, but demonstrates the core idea:
✔ 1. Parse input → stable task nodes
No free-form reasoning to decide the steps.
The task graph is structural, not probabilistic.
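To make "structural, not probabilistic" concrete, here is a minimal sketch of rule-based task parsing. This is illustrative only, not the repo's actual code; the rule table and node names are invented:

```python
# hypothetical rule table: intent keyword → fixed task nodes
RULES = [
    ("refund", ["fetch_order", "check_policy", "issue_refund"]),
    ("invoice", ["fetch_order", "generate_invoice"]),
]

def parse_tasks(request: str) -> list[str]:
    # no LLM decides the steps; matching is purely structural
    for keyword, nodes in RULES:
        if keyword in request.lower():
            return nodes
    return ["route_to_human"]  # unknown intents fail closed, deterministically

print(parse_tasks("Please refund order #123"))
```

Same input, same nodes. An LLM can still fill slots *inside* a node, but it never invents the graph.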
✔ 2. Compile → deterministic execution graph
Same input → same path
Every. Single. Time.
This alone eliminates a huge class of RAG/Agent instability.
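"Same input → same path" is cheap to verify once the plan is a data structure. A sketch (hypothetical `compile_graph`, not the repo's implementation): hash the compiled graph and compare runs.

```python
import hashlib
import json

def compile_graph(nodes: list[str]):
    # a trivial "compiler": fixed linear order plus a content hash
    graph = {"nodes": nodes, "edges": [[a, b] for a, b in zip(nodes, nodes[1:])]}
    digest = hashlib.sha256(json.dumps(graph, sort_keys=True).encode()).hexdigest()
    return graph, digest

nodes = ["fetch_order", "check_policy", "issue_refund"]
_, d1 = compile_graph(nodes)
_, d2 = compile_graph(nodes)
assert d1 == d2  # identical execution path, verifiable by hash
```

If two runs ever produce different digests for the same input, that is a bug, not "agent creativity".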
✔ 3. Output → auditable artifact
Instead of a raw LLM answer, the POC emits:
node sequence
decisions
trace_id
execution log
intermediate artifacts
It acts more like a program, less like improvisation.
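The artifact itself can be as simple as a JSON document. A minimal sketch of what such an emitter might look like (hypothetical field names; the actual POC's schema may differ):

```python
import json
import uuid

def execute(nodes: list[str]) -> dict:
    trace = {
        "trace_id": str(uuid.uuid4()),
        "node_sequence": [],
        "decisions": [],
        "artifacts": {},
    }
    for node in nodes:
        trace["node_sequence"].append(node)                    # execution log
        trace["decisions"].append({"node": node, "status": "ok"})
        trace["artifacts"][node] = f"output-of-{node}"         # placeholder intermediate artifact
    return trace

trace = execute(["fetch_order", "check_policy", "issue_refund"])
print(json.dumps(trace, indent=2))  # an auditable artifact, not a raw answer
```

An auditor can diff two traces; nobody can diff two vibes.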
🔥 Why Determinism Matters
As LLMs move deeper into:
finance
legal
compliance
operations
automation
enterprise tooling
three capabilities become essential:
Reproducibility
Auditability
Deterministic execution
Dynamic planning alone cannot achieve this.
Future Agent architectures must incorporate:
stable execution graphs
structural planning
versioned data snapshots
explicit state machines
deterministic control layers
Think of it as:
Agents must evolve from improvisers into compilers.
💬 Final Thoughts
RAG and Agents are powerful — but unstable by design.
This POC is a small step toward exploring deterministic alternatives:
👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc
If you’re building RAG pipelines, Agent systems, or enterprise AI infrastructure, I’d love to hear your thoughts. Let’s discuss!