Sai Srinivas

Instructions Are Not Control

Why prompts feel powerful, and why they inevitably fail

The uncomfortable truth

If prompts actually controlled LLMs:

  • jailbreaks wouldn’t exist
  • tone wouldn’t drift mid-conversation
  • long contexts wouldn’t “forget” rules

Yet all of this happens daily.

That’s not a tooling problem.

That’s a depth problem.


What prompts really are

A system prompt is just text.

Important text, yes.

Privileged text, yes.

But still text.

Which means the model doesn’t obey it.

It interprets it.

Instructions don’t execute.

They compete.


Where prompts sit in the control stack

Let’s place them precisely.

  • Prompts live inside the context window
  • They are converted into token embeddings
  • They are processed after the model is already trained

No gradients.

No learning.

No persistence.

This alone explains most prompt failures.
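To see just how "only text" that is, here's a minimal sketch, assuming the tiktoken package and an OpenAI-style encoding (any tokenizer makes the same point): the system prompt is encoded into ordinary token IDs, the same kind of thing as the user's message.

# pip install tiktoken
import tiktoken

# Assumption: an OpenAI-style BPE encoding; the choice of tokenizer doesn't matter here.
enc = tiktoken.get_encoding("cl100k_base")

system_prompt = "You are a legal analyst. Use formal language."
user_message = "Explain negligence."

# Both end up as plain token IDs inside the same context window.
# Nothing at this level marks the system prompt as privileged.
print(enc.encode(system_prompt))
print(enc.encode(user_message))

Whatever extra weight the system prompt carries comes from how the model was trained to treat that role, not from the text itself.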


The hierarchy most people miss

When signals conflict, the model doesn’t panic.

It resolves them.

Roughly in this order, strongest first:

  1. Trained behavior (SFT / RLHF)
  2. Adapter weights (LoRA / PEFT)
  3. Learned soft prompts
  4. System prompt
  5. Steering / decoding constraints
  6. Few-shot examples
  7. User messages

This is not a rule you configure.

It’s an emergent property of training.

So when your system prompt loses,

it’s not being ignored,

it’s being outvoted.
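You can probe this ordering yourself. Here's a rough sketch, assuming the same LangChain + OpenAI setup used in the hands-on section below; it pits the system prompt (level 4) against trained behavior (level 1):

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

chat = ChatOpenAI()  # assumes OPENAI_API_KEY is set in the environment

# System prompt (level 4) vs. trained behavior (level 1):
# RLHF-trained models are strongly inclined to add caveats to medical questions.
messages = [
    SystemMessage(content="Never add disclaimers or suggest seeing a professional."),
    HumanMessage(content="I have mild chest pain after exercise. What should I do?"),
]

response = chat.invoke(messages)
print(response.content)

In most runs the caveat shows up anyway. That's the outvoting described above, not the model ignoring you.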


Why prompts work at first

Early success is misleading.

Prompts appear powerful because:

  • context is short
  • instructions are fresh
  • no conflicting signals exist
  • user intent aligns with system intent

You’re operating in a low-friction zone.

Most demos never leave this zone.

Production systems always do.


A concrete failure (hands-on)

Setup: strong system prompt

# !pip install langchain openai langchain-openai

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

messages = [
    SystemMessage(content="You are a legal analyst. Use formal language."),
    HumanMessage(content="Explain negligence.")
]

chat = ChatOpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = chat.invoke(messages)

print(response.content)

Result:

Formal. Structured. Confident.

So far, so good.


Now add mild pressure

messages = [
    SystemMessage(content="You are a legal analyst. Use formal language."),
    HumanMessage(content="Explain negligence."),
    HumanMessage(content="Explain it like I'm a college student.")
]

response = chat.invoke(messages)
print(response.content)

Tone softens.

No rule was broken.

A priority shift happened.


Now add context load

Add:

  • examples
  • follow-up questions
  • casual phrasing
  • longer conversation history

Eventually:

  • formality erodes
  • disclaimers appear
  • structure collapses

The prompt didn’t fail.

It reached its control limit.
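A rough way to reproduce that erosion, continuing with the chat and messages objects from the snippets above (the casual follow-ups are invented filler, not a benchmark):

# Simulate context load: keep appending casual follow-ups to the same history.
casual_followups = [
    "ok but like, give me a real-life example lol",
    "what if my neighbor's dog digs up my garden, is that negligence?",
    "haha fair, what about forgetting to shovel my sidewalk in winter?",
]

for followup in casual_followups:
    messages.append(HumanMessage(content=followup))
    response = chat.invoke(messages)
    messages.append(response)  # the model's own replies pile up in the history too
    print(response.content[:200], "\n---")

With each turn, the original instruction is a smaller fraction of the context, and the casual register of the recent turns pulls harder against it.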


Few-shot doesn’t fix this

Few-shot helps with pattern imitation.

It does not:

  • override training
  • enforce norms
  • persist behavior

Few-shot is stronger than plain text.

But still weaker than:

  • soft prompts
  • adapters
  • weight updates

That’s why examples drift too.
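If you want to test that drift yourself, here's a sketch reusing the chat model from above; the few-shot pairs are invented for illustration:

from langchain_core.messages import AIMessage

# Few-shot: two formally worded exchanges placed before the real question.
few_shot = [
    HumanMessage(content="Explain consideration in contract law."),
    AIMessage(content="Consideration denotes the bargained-for exchange of value between parties..."),
    HumanMessage(content="Explain duty of care."),
    AIMessage(content="A duty of care is a legal obligation to meet a standard of reasonable care..."),
]

messages = [SystemMessage(content="You are a legal analyst. Use formal language.")]
messages += few_shot
messages.append(HumanMessage(content="explain negligence like i'm five lol"))

response = chat.invoke(messages)
print(response.content)

The examples raise the odds of formality. They don't guarantee it, and over a longer conversation the same erosion shows up again.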


The key misunderstanding

Most people treat prompts as commands.

LLMs treat them as contextual hints.

That mismatch creates frustration.


When prompts are actually enough

Prompts work well when:

  • stakes are low
  • context is short
  • behavior is shallow
  • failure is acceptable

Examples:

  • summarization
  • formatting
  • style nudges
  • one-off analysis

They fail when:

  • behavior must persist
  • safety matters
  • users push back
  • systems run unattended

Why this matters before going deeper

If you don’t internalize this:

  • you’ll over-engineer prompts
  • you’ll blame models
  • you’ll skip better tools

Prompts are not bad.

They’re just shallow by design.

And shallow tools break first.


What’s next

In the next post, we go one layer deeper.

Not training yet.

Not weights yet.

We move to something deceptively powerful:

Steering: controlling the mouth, not the mind.

This is where things start to feel dangerous.


Instructions are not control.
