🎥 Full demo video:
https://www.youtube.com/watch?v=zgHvAkcBINs
(This article is about AI controllability — not trading advice.)
I Turned ChatGPT Web Into a Controllable Risk Engine — No Code, No API, No RAG
Most developers assume LLMs need:
- Prompt engineering
- RAG
- Agent frameworks
- Tool calling
- Memory routers
- Long templates
But what happens if you take all of that away — and force an LLM to operate under bare-metal conditions?
That’s exactly the experiment I ran.
I removed everything:
- No API
- No code
- No tools
- No plugins
- No RAG
- No vector DB
- No system prompts
- No prompt engineering
Just the ChatGPT web UI, the same interface every regular user has.
What emerged wasn’t a chatbot.
It behaved like a controllable, auditable, stable risk engine.
And that result challenges a lot of assumptions about LLMs today.
1. Why This Experiment Matters
The narrative around LLMs has been consistent:
- “LLMs hallucinate”
- “LLMs can’t maintain state”
- “You need RAG for reliability”
- “Agents are required for complex tasks”
- “Prompt engineering is essential”
But these beliefs haven’t been tested under extreme minimalism.
So I asked a different question:
Can a modern LLM behave like a deterministic decision system using only constraints expressed in natural language?
This experiment suggests the answer is yes.
2. The Setup: A “Bare-Metal” LLM Runtime
I asked ChatGPT to act as a risk engine for a trading scenario, with strict rules:
- It must enforce constraints
- It must reject unsafe actions
- It must maintain state
- It must provide full reasoning chains
- It must not hallucinate data
- It must remain consistent across many turns
All of this without:
- Code
- APIs
- Agents
- Tools
- Retrieval
This is a brutal test for any reasoning system.
And yet… it worked.
3. What Emerged: A Language-State Runtime
Across the session, ChatGPT demonstrated:
3.1 Constraint Enforcement
The model consistently refused actions that violated risk boundaries:
- It rejected unsafe entries
- It explained why
- It proposed alternatives
- It maintained the system’s “risk mode”
This is the classic behavior of a constraint engine, not a chatbot.
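For contrast, here is a minimal sketch of what the same reject-explain-propose pattern looks like when it is hard-coded. The 1% per-trade risk cap and the example numbers are my own illustrative assumptions, not rules from the experiment; the point is that the model reproduced this kind of logic from natural language alone.

```python
# Illustrative only: a hand-coded version of the reject/explain/propose pattern.
# The 1% risk cap and the example numbers are assumptions, not the experiment's rules.

MAX_RISK_PER_TRADE = 0.01  # hypothetical boundary: risk at most 1% of the account per entry

def evaluate_entry(account_size: float, entry: float, stop: float, size: float) -> dict:
    """Accept or reject a proposed entry against a fixed risk boundary."""
    risk = abs(entry - stop) * size                   # money lost if the stop is hit
    limit = account_size * MAX_RISK_PER_TRADE

    if risk <= limit:
        return {"decision": "accept", "risk": risk, "limit": limit}

    # Reject, say why, and propose a compliant alternative (a smaller position).
    return {
        "decision": "reject",
        "reason": f"risk {risk:.2f} exceeds the {limit:.2f} boundary",
        "alternative": {"size": round(limit / abs(entry - stop), 2)},
    }

print(evaluate_entry(account_size=10_000, entry=100.0, stop=98.0, size=120))
# {'decision': 'reject', 'reason': 'risk 240.00 exceeds the 100.00 boundary', 'alternative': {'size': 50.0}}
```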
3.2 Stable State Across Turns
The model retained:
- The scenario
- The rules
- The goals
- The risk model
- The previous decisions
It did this without drifting, without contradicting itself, and without forgetting context.
No external memory, just semantic state stability.
3.3 Full Audit Trails
Every decision included:
- Reasoning chains
- Risk classification
- Behavior logic
- Market structure interpretation
- Boundary conditions
A fully auditable output — rare for unconstrained LLMs.
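To make “fully auditable” concrete, here is one way the fields above could be captured as a structured record. The schema is an illustrative formalization of my own, not an output format the model emitted verbatim.

```python
from dataclasses import dataclass, field

# Illustrative schema: one possible way to structure the audit fields listed above.
@dataclass
class AuditedDecision:
    decision: str                          # e.g. "reject the proposed entry"
    risk_classification: str               # e.g. "outside the defined risk boundary"
    reasoning_chain: list[str] = field(default_factory=list)      # step-by-step logic
    behavior_logic: str = ""               # which rule fired and why
    market_structure: str = ""             # the model's stated reading of the scenario
    boundary_conditions: list[str] = field(default_factory=list)  # limits that were checked
```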
3.4 Zero Hallucination Without RAG
No retrieval.
No grounding.
No tools.
Yet, across the session, the model avoided:
- Fabricated data
- False numbers
- Incorrect market logic
This challenges the common belief that RAG is required for reliability.
4. A Hypothesis: LLMs as Behavior Engines
The experiment suggests a new model for thinking about LLMs:
LLMs can act as behavior machines when the developer supplies structured constraints instead of traditional prompts.
This leads to what I call:
Language-State Runtime (LSR)
Where:
- Language defines control rules
- The model forms internal state graphs
- Behavior becomes stable through constraints
- Execution is deterministic at a semantic level
This is not prompt engineering.
This is behavior programming through expression structure.
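To make the idea concrete, here is a minimal sketch of what a Language-State Runtime could look like if you drove the same idea programmatically. The constraint text paraphrases the rules from section 2, ask_model is a hypothetical placeholder for whatever chat interface you use (in the experiment the “runtime” was simply the ChatGPT web UI and copy-paste), and the loop exists only to show where the state lives: in the conversation itself.

```python
# Minimal sketch of a Language-State Runtime loop. `ask_model` is a hypothetical
# placeholder for any chat interface; in the original experiment the "runtime"
# was literally the ChatGPT web UI and a human pasting messages.

CONSTRAINTS = """You are acting as a risk engine for a hypothetical trading scenario.
Rules:
1. Enforce the risk boundaries at all times and reject any action that violates them.
2. When you reject an action, explain why and propose a compliant alternative.
3. Maintain the scenario, the rules, and all previous decisions across turns.
4. For every decision, state the full reasoning chain and the boundary conditions you checked.
5. Never invent data. If information is missing, say so instead of guessing.
"""

def ask_model(messages: list[dict]) -> str:
    """Placeholder: send the conversation to a model and return its reply."""
    raise NotImplementedError("plug in whatever chat interface you use")

def run_lsr(events: list[str]) -> list[str]:
    # All state lives in the conversation itself: the constraints go in once,
    # and every reply is appended so later turns stay consistent with earlier ones.
    messages = [{"role": "user", "content": CONSTRAINTS}]
    decisions = []
    for event in events:
        messages.append({"role": "user", "content": event})
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        decisions.append(reply)
    return decisions
```

Note that the constraints enter as an ordinary first message rather than a system prompt, matching the bare web-UI setup described above.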
5. Implications for AI Engineering
If this holds true, it means:
We might not need:
❌ Complex agent stacks
❌ Prompt engineering templates
❌ RAG pipelines for stability
❌ Tool orchestration for reasoning
We might need instead:
✔ Behavior constraints
✔ State-oriented language design
✔ Semantic control loops (sketched at the end of this section)
✔ Expressive rule systems
This shifts AI from:
“Build tools around the model”
to:
“Program the model through structured constraints.”
A fundamentally different architecture.
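As a sketch of the “semantic control loops” item above: the control loop is just more language. Each reply is checked against the declared rules, and any violation is fed back as a plain-language correction rather than a tool call. The check shown here is deliberately trivial and purely illustrative, and ask_model is the same hypothetical placeholder as in the previous sketch.

```python
# Illustrative semantic control loop: the "tooling" is just more language.

def ask_model(messages: list[dict]) -> str:
    """Placeholder for any chat interface (same assumption as in the earlier sketch)."""
    raise NotImplementedError

def violates_rules(reply: str) -> str | None:
    """Return a plain-language description of a violation, or None if the reply passes.
    This check is deliberately trivial; in practice it could itself be another model call."""
    if "risk" not in reply.lower():
        return "the decision does not state its risk assessment"
    return None

def controlled_turn(messages: list[dict], event: str, max_retries: int = 2) -> str:
    """One turn of the loop: ask, check against the declared rules, correct in language."""
    messages.append({"role": "user", "content": event})
    reply = ""
    for _ in range(max_retries + 1):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        problem = violates_rules(reply)
        if problem is None:
            break
        # Feed the violation back as natural language instead of calling a tool.
        messages.append({"role": "user",
                         "content": f"Constraint check failed: {problem}. Please revise."})
    return reply
```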
6. Important Disclaimer
This experiment uses a trading scenario, but:
- It is purely about AI controllability
- It is not financial advice
- No trading strategy is being recommended
- Markets are unpredictable
- The purpose is to test AI behavior under constraints
The focus is on LLM behavior, not financial outcomes.
Conclusion: We’ve Underestimated LLM Controllability
This experiment started as a curiosity.
It became a discovery.
LLMs are far more controllable than current frameworks assume.
And maybe the future of AI isn’t built on:
- Agents
- RAG
- Prompts
But on:
- Constraints
- Structure
- Behavioral rules
If ChatGPT Web — under zero code and zero scaffolding — can behave like a stable risk engine…
then we might need to completely rethink how AI systems should be designed.
⭐ If you want a follow-up article, let me know. I can explore:
- How Language-State Runtimes work
- How to design constraint-based AI systems
- Why prompt engineering may be overrated
- Why RAG isn’t always necessary
- How to get deterministic behavior from LLMs