🎥 Full demo video:
https://www.youtube.com/watch?v=zgHvAkcBINs
(This article is about AI controllability — not trading advice.)
I Turned ChatGPT Web Into a Controllable Risk Engine — No Code, No API, No RAG
Most developers assume LLMs need:
- Prompt engineering
- RAG
- Agent frameworks
- Tool calling
- Memory routers
- Long templates
But what happens if you take all of that away — and force an LLM to operate under bare-metal conditions?
That’s exactly the experiment I ran.
I removed everything:
- No API
- No code
- No tools
- No plugins
- No RAG
- No vector DB
- No system prompts
- No prompt engineering
Just the ChatGPT web UI, the same interface every regular user has.
What emerged wasn’t a chatbot.
It behaved like a controllable, auditable, stable risk engine.
And that result challenges a lot of assumptions about LLMs today.
1. Why This Experiment Matters
The narrative around LLMs has been consistent:
- “LLMs hallucinate”
- “LLMs can’t maintain state”
- “You need RAG for reliability”
- “Agents are required for complex tasks”
- “Prompt engineering is essential”
But these beliefs haven’t been tested under extreme minimalism.
So I asked a different question:
Can a modern LLM behave like a deterministic decision system using only constraints expressed in natural language?
This experiment suggests the answer is yes.
2. The Setup: A “Bare-Metal” LLM Runtime
I asked ChatGPT to act as a risk engine for a trading scenario, with strict rules:
- It must enforce constraints
- It must reject unsafe actions
- It must maintain state
- It must provide full reasoning chains
- It must not hallucinate data
- It must remain consistent across many turns
All of this without:
- Code
- APIs
- Agents
- Tools
- Retrieval
This is a brutal test for any reasoning system.
And yet… it worked.
3. What Emerged: A Language-State Runtime
Across the session, ChatGPT demonstrated:
3.1 Constraint Enforcement
The model consistently refused actions that violated risk boundaries:
- It rejected unsafe entries
- It explained why
- It proposed alternatives
- It maintained the system’s “risk mode”
This is the classic behavior of a constraint engine, not a chatbot.
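For contrast, here is a minimal sketch of what the same reject-explain-propose pattern looks like when it is hard-coded. The 1% per-trade risk cap and the example numbers are my own illustrative assumptions, not rules from the experiment; the point is that the model reproduced this kind of logic from natural language alone.

```python
# Illustrative only: a hand-coded version of the reject/explain/propose pattern.
# The 1% risk cap and the example numbers are assumptions, not the experiment's rules.

MAX_RISK_PER_TRADE = 0.01  # hypothetical boundary: risk at most 1% of the account per entry

def evaluate_entry(account_size: float, entry: float, stop: float, size: float) -> dict:
    """Accept or reject a proposed entry against a fixed risk boundary."""
    risk = abs(entry - stop) * size                   # money lost if the stop is hit
    limit = account_size * MAX_RISK_PER_TRADE

    if risk <= limit:
        return {"decision": "accept", "risk": risk, "limit": limit}

    # Reject, say why, and propose a compliant alternative (a smaller position).
    return {
        "decision": "reject",
        "reason": f"risk {risk:.2f} exceeds the {limit:.2f} boundary",
        "alternative": {"size": round(limit / abs(entry - stop), 2)},
    }

print(evaluate_entry(account_size=10_000, entry=100.0, stop=98.0, size=120))
# {'decision': 'reject', 'reason': 'risk 240.00 exceeds the 100.00 boundary', 'alternative': {'size': 50.0}}
```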
3.2 Stable State Across Turns
The model retained:
- The scenario
- The rules
- The goals
- The risk model
- The previous decisions
It did this without drifting, without contradicting itself, and without forgetting context.
No external memory, just semantic state stability.
3.3 Full Audit Trails
Every decision included:
- Reasoning chains
- Risk classification
- Behavior logic
- Market structure interpretation
- Boundary conditions
A fully auditable output — rare for unconstrained LLMs.
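To make “fully auditable” concrete, here is one way the fields above could be captured as a structured record. The schema is an illustrative formalization of my own, not an output format the model emitted verbatim.

```python
from dataclasses import dataclass, field

# Illustrative schema: one possible way to structure the audit fields listed above.
@dataclass
class AuditedDecision:
    decision: str                          # e.g. "reject the proposed entry"
    risk_classification: str               # e.g. "outside the defined risk boundary"
    reasoning_chain: list[str] = field(default_factory=list)      # step-by-step logic
    behavior_logic: str = ""               # which rule fired and why
    market_structure: str = ""             # the model's stated reading of the scenario
    boundary_conditions: list[str] = field(default_factory=list)  # limits that were checked
```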
3.4 Zero Hallucination Without RAG
No retrieval.
No grounding.
No tools.
Yet, across the session, the model avoided:
- Fabricated data
- False numbers
- Incorrect market logic
This challenges the common belief that RAG is required for reliability.
4. A Hypothesis: LLMs as Behavior Engines
The experiment suggests a new model for thinking about LLMs:
LLMs can act as behavior machines when the developer supplies structured constraints instead of traditional prompts.
This leads to what I call:
Language-State Runtime (LSR)
Where:
- Language defines control rules
- The model forms internal state graphs
- Behavior becomes stable through constraints
- Execution is deterministic at a semantic level
This is not prompt engineering.
This is behavior programming through expression structure.
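To make the idea concrete, here is a minimal sketch of what a Language-State Runtime could look like if you drove the same idea programmatically. The constraint text paraphrases the rules from section 2, ask_model is a hypothetical placeholder for whatever chat interface you use (in the experiment the “runtime” was simply the ChatGPT web UI and copy-paste), and the loop exists only to show where the state lives: in the conversation itself.

```python
# Minimal sketch of a Language-State Runtime loop. `ask_model` is a hypothetical
# placeholder for any chat interface; in the original experiment the "runtime"
# was literally the ChatGPT web UI and a human pasting messages.

CONSTRAINTS = """You are acting as a risk engine for a hypothetical trading scenario.
Rules:
1. Enforce the risk boundaries at all times and reject any action that violates them.
2. When you reject an action, explain why and propose a compliant alternative.
3. Maintain the scenario, the rules, and all previous decisions across turns.
4. For every decision, state the full reasoning chain and the boundary conditions you checked.
5. Never invent data. If information is missing, say so instead of guessing.
"""

def ask_model(messages: list[dict]) -> str:
    """Placeholder: send the conversation to a model and return its reply."""
    raise NotImplementedError("plug in whatever chat interface you use")

def run_lsr(events: list[str]) -> list[str]:
    # All state lives in the conversation itself: the constraints go in once,
    # and every reply is appended so later turns stay consistent with earlier ones.
    messages = [{"role": "user", "content": CONSTRAINTS}]
    decisions = []
    for event in events:
        messages.append({"role": "user", "content": event})
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        decisions.append(reply)
    return decisions
```

Note that the constraints enter as an ordinary first message rather than a system prompt, matching the bare web-UI setup described above.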
5. Implications for AI Engineering
If this holds true, it means:
We might not need:
❌ Complex agent stacks
❌ Prompt engineering templates
❌ RAG pipelines for stability
❌ Tool orchestration for reasoning
We might need instead:
✔ Behavior constraints
✔ State-oriented language design
✔ Semantic control loops (sketched at the end of this section)
✔ Expressive rule systems
This shifts AI from:
“Build tools around the model”
to:
“Program the model through structured constraints.”
A fundamentally different architecture.
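As a sketch of the “semantic control loops” item above: the control loop is just more language. Each reply is checked against the declared rules, and any violation is fed back as a plain-language correction rather than a tool call. The check shown here is deliberately trivial and purely illustrative, and ask_model is the same hypothetical placeholder as in the previous sketch.

```python
# Illustrative semantic control loop: the "tooling" is just more language.

def ask_model(messages: list[dict]) -> str:
    """Placeholder for any chat interface (same assumption as in the earlier sketch)."""
    raise NotImplementedError

def violates_rules(reply: str) -> str | None:
    """Return a plain-language description of a violation, or None if the reply passes.
    This check is deliberately trivial; in practice it could itself be another model call."""
    if "risk" not in reply.lower():
        return "the decision does not state its risk assessment"
    return None

def controlled_turn(messages: list[dict], event: str, max_retries: int = 2) -> str:
    """One turn of the loop: ask, check against the declared rules, correct in language."""
    messages.append({"role": "user", "content": event})
    reply = ""
    for _ in range(max_retries + 1):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        problem = violates_rules(reply)
        if problem is None:
            break
        # Feed the violation back as natural language instead of calling a tool.
        messages.append({"role": "user",
                         "content": f"Constraint check failed: {problem}. Please revise."})
    return reply
```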
6. Important Disclaimer
This experiment uses a trading scenario, but:
- It is purely about AI controllability
- It is not financial advice
- No trading strategy is being recommended
- Markets are unpredictable
- The purpose is to test AI behavior under constraints
The focus is on LLM behavior, not financial outcomes.
Conclusion: We’ve Underestimated LLM Controllability
This experiment started as a curiosity.
It became a discovery.
LLMs are far more controllable than current frameworks assume.
And maybe the future of AI isn’t built on:
- Agents
- RAG
- Prompts
But on:
- Constraints
- Structure
- Behavioral rules
If ChatGPT Web — under zero code and zero scaffolding — can behave like a stable risk engine…
then we might need to completely rethink how AI systems should be designed.
⭐ If you want a follow-up article, let me know. I can explore:
- How Language-State Runtimes work
- How to design constraint-based AI systems
- Why prompt engineering may be overrated
- Why RAG isn’t always necessary
- How to get deterministic behavior from LLMs