
takawasi


Why Your LLM Ignores Detailed Instructions (It's Not a Bug)

You've been there. You write a meticulous 100-step prompt. You stuff it into a 1M-token context. The model ignores half of it.

This isn't a bug. It's the structural ceiling of LLMs — and understanding it will change how you design AI systems.

The "Human Chunk" Problem

LLMs are trained on human-written text. Humans write in natural units: blog posts, emails, functions, conversation turns. I call these human chunks.

The model's probability space is structured around these chunks. When you input fine-grained instructions, the model doesn't process them at your granularity — it elevates them to human-chunk level. A 100-step procedure becomes "do the task."

What This Means for Your System Design

```python
# This won't work as expected:
prompt = """
Step 1: Check if X
Step 2: If X, do Y
Step 3: Verify Y was done
... (97 more steps)
"""
response = llm.call(prompt)
# Model processes this as one big chunk, not 100 steps
```
```python
# This works better:
result_1 = llm.call("Check if X")
result_2 = llm.call(f"Given {result_1}, do Y")
result_3 = llm.call(f"Verify: {result_2}")
# Each call is at human-chunk granularity
```

The Design Principle

Systems that accept this ceiling — stateless chains, tasks split at human-chunk granularity — naturally improve as models get better. Systems that fight this ceiling need re-engineering with every model update.
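As a minimal sketch of what "stateless chain" means here: each step is its own call at human-chunk granularity, with only the previous result passed forward as context. The `run_chain` helper and `stub_llm` below are hypothetical names I'm using for illustration — swap in your actual LLM client.

```python
from typing import Callable, List

def run_chain(llm: Callable[[str], str], steps: List[str]) -> List[str]:
    """Run each step as a separate, human-chunk-sized call.

    No shared conversation state: each call sees only its own step
    plus the previous step's result, explicitly threaded through.
    """
    results: List[str] = []
    previous = ""
    for step in steps:
        prompt = f"{step}\nPrevious result: {previous}" if previous else step
        previous = llm(prompt)
        results.append(previous)
    return results

# Stub LLM for demonstration: echoes the first line of its prompt.
def stub_llm(prompt: str) -> str:
    return f"done: {prompt.splitlines()[0]}"

print(run_chain(stub_llm, ["Check if X", "Do Y", "Verify Y was done"]))
```

Because each call is stateless, a better model slots in by changing only the `llm` callable — the chain structure itself never needs re-engineering.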

Prompt engineering is optimization within the ceiling. It's valuable, but it doesn't change the ceiling itself.

Takeaway

Stop trying to overcome the human-chunk ceiling with more detailed prompts. Design around it instead. Your system will be simpler, more robust, and will scale with model improvements automatically.

What patterns have you found for working with this ceiling instead of against it?
