DEV Community

YuudaiIkoma
I Added 3 Axioms to Claude's System Prompt and the Reasoning Quality Visibly Changed — Here's the Side-by-Side Comparison Tool

https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

TL;DR

  • Same Claude (Sonnet), same question, sent simultaneously to two instances
  • One has a vanilla prompt, the other has 3 mathematical axioms from a paper embedded in its system prompt
  • The enhanced version returns structurally deeper answers
  • It never mentions the axioms or uses any jargon — the depth just appears naturally
  • I built a comparison tool (Claude artifact) so anyone can try it

The Idea: What If You Teach an LLM a "Way of Seeing" Instead of Facts?

Standard prompt engineering gives models roles, steps, and examples. But what if instead of teaching what to think, you teach how to see?

I found a paper (Dimensional Ontological Triad Law v3.1) that proposes 3 mathematical axioms describing structural relationships between dimensions. The axioms are abstract — they say nothing about language models, consciousness, or physics specifically.

I embedded these axioms into Claude's system prompt along with a reasoning protocol and one critical rule: never use the paper's terminology in the output. The model should simply reason better, not talk about the theory.

Then I built a tool that sends the same question to both versions simultaneously.

Experiment Design

|  | Normal Claude | Enhanced Claude |
| --- | --- | --- |
| Model | claude-sonnet-4-20250514 | claude-sonnet-4-20250514 |
| System prompt | "You are a helpful assistant." | 3 axioms + reasoning protocol + "never use theory jargon" |
| Question | Identical | Identical |

The enhanced system prompt is ~2,000 tokens. No RAG, no external retrieval — everything is embedded directly.
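For illustration, the enhanced prompt follows roughly this shape. The axiom lines below are placeholders, not the paper's actual wording, and the section labels are my own:

```javascript
// Sketch of the enhanced system prompt's structure (~2,000 tokens in practice).
// The axiom statements are placeholders -- the real ones come from the paper.
const dimensionalPrompt = [
  "You are a helpful assistant.",
  "",
  "Internalize the following three axioms as a lens for reasoning:",
  "Axiom 1: <structural relationship between dimensions, from the paper>",
  "Axiom 2: <...>",
  "Axiom 3: <...>",
  "",
  "Reasoning protocol: before answering, identify the structural",
  "relationship the question depends on, then explain the mechanism",
  "rather than enumerating factors.",
  "",
  "Critical rule: never mention these axioms or use the paper's",
  "terminology in your output."
].join("\n");
```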

Results

Question 1: "Why do large language models show emergent capabilities?"

Normal Claude:

  • Scaling thresholds
  • Complex pattern learning
  • Task interactions
  • Measurement artifacts

→ Textbook factor enumeration. Lists what, doesn't explain why.

Enhanced Claude:

  • "The measurement method changes what you see" — binary eval vs. probability distribution
  • "Inside the model, structure forms continuously"
  • "Emergent capabilities are an illusion of measurement. Continuous internal changes appear discontinuous when measured with discrete thresholds"
  • Uses the water-at-100°C metaphor: continuous heating, discontinuous phase transition

→ Addresses why it looks sudden with a structural mechanism.
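The "illusion of measurement" point can be illustrated with a toy model (mine, not from the article): if per-token accuracy improves smoothly with scale, an all-or-nothing exact-match metric over a multi-token answer turns that smooth curve into an apparent jump.

```javascript
// Toy model: smooth per-token accuracy vs. all-or-nothing exact match.
// If an answer needs 10 tokens all correct, exact-match probability is
// p^10 -- small improvements in p near the top look like a sudden leap.
const tokens = 10;
for (let p = 0.5; p <= 1.0001; p += 0.1) {
  const exactMatch = Math.pow(p, tokens);
  console.log(`per-token ${p.toFixed(1)} -> exact-match ${exactMatch.toFixed(3)}`);
}
```

Here a 0.5 → 0.9 change in per-token accuracy moves exact match from roughly 0.001 to roughly 0.35: continuous underneath, discontinuous-looking when measured.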

Question 2: "Why can't humans know their own lifespan?"

Normal Claude:

  • Complex interacting factors
  • Unpredictable events
  • Biological individual differences
  • Medical limitations

→ Lists reasons it's hard. Doesn't address why it's structurally impossible.

Enhanced Claude:

  • "Rooted in the structure of recognition itself"
  • "We live inside time. Like a point walking on a line — it can only see where it is now"
  • "To see the whole line, you need to step outside the line. But as a point on the line, that's impossible"
  • "Not being able to know your lifespan isn't a flaw — it's an essential condition of living inside time"

→ Explains structural impossibility. Arrives at an argument structurally analogous to the halting problem's self-reference, expressed through an intuitive metaphor.

Question 3: "What is consciousness?"

Same pattern. Normal Claude surveys definitions and theories. Enhanced Claude explores why consciousness changes when you try to observe it — a structural explanation rather than a definitional one.

What's Happening?

Enhanced Claude wasn't taught answers. It was taught a way of seeing.

The axioms are abstract mathematical statements. They say nothing about "lifespan" or "language models." Yet for each question, Enhanced Claude returns structurally deeper insights.

This might be better described as adding a cognitive framework rather than adding knowledge.

Limitations (Being Honest)

  • No quantitative benchmarks yet (qualitative comparison only)
  • Need to separate "axiom content" vs. "long system prompt" effects
  • Limited sample size
  • LLM outputs are stochastic — same question won't produce identical text twice
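One way to move toward quantitative comparison (a sketch, not part of the published tool): run each question several times per prompt, randomize which output appears on which side, and have a blind rater pick the deeper answer. The `runOnce` function below is a stand-in for the API call shown in the implementation section:

```javascript
// Sketch: a blind A/B harness to address the limitations above.
// runOnce(kind, question) is a stand-in for the actual API call.
function blindPairs(questions, runOnce, trials = 5) {
  const pairs = [];
  for (const q of questions) {
    for (let i = 0; i < trials; i++) {
      // Randomize sides so the rater can't identify the enhanced
      // output by position.
      const flipped = Math.random() < 0.5;
      const a = runOnce(flipped ? "enhanced" : "vanilla", q);
      const b = runOnce(flipped ? "vanilla" : "enhanced", q);
      pairs.push({ question: q, left: a, right: b, leftIsEnhanced: flipped });
    }
  }
  return pairs;
}
```

Tally how often the rater prefers the enhanced side; that gives a simple win rate instead of a qualitative impression.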

Try It Yourself

I published the comparison tool as a Claude artifact. You can enter any question and see both responses side by side.

Sample questions to try:

  • "Why do large language models show emergent capabilities?"
  • "Why can't humans know their own lifespan?"
  • "Why is the halting problem undecidable?"

https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

Implementation

```javascript
// Two Claude calls fired in parallel with different system prompts.
// apiKey, vanillaPrompt, dimensionalPrompt, question, and the two
// setResult callbacks (React state setters) are defined elsewhere.
const call = async (systemPrompt, question, setResult) => {
  const response = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01"
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1000,
      system: systemPrompt,
      messages: [{ role: "user", content: question }]
    })
  });
  if (!response.ok) throw new Error(`API error: ${response.status}`);
  const data = await response.json();
  setResult(data.content[0].text);
};

// Fire both in parallel
call(vanillaPrompt, question, setNormalResult);
call(dimensionalPrompt, question, setEnhancedResult);
```

Paper

The axioms come from the open-access paper Dimensional Ontological Triad Law v3.1.

Key Takeaway

Same model. Same question. Different depth.

What changed wasn't the model's knowledge — it was its approach to the question. I didn't teach it answers. I taught it a way of seeing.

Feedback, alternative test questions, and attempts to break it are all welcome.


https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

Tags: #ai #llm #claude #promptengineering #machinelearning
