DEV Community

YuudaiIkoma
I Added 3 Axioms to Claude's System Prompt and the Reasoning Quality Visibly Changed — Here's the Side-by-Side Comparison Tool

https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

TL;DR

  • Same Claude (Sonnet), same question, sent simultaneously to two instances
  • One has a vanilla prompt, the other has 3 mathematical axioms from a paper embedded in its system prompt
  • The enhanced version returns structurally deeper answers
  • It never mentions the axioms or uses any jargon — the depth just appears naturally
  • I built a comparison tool (Claude artifact) so anyone can try it

The Idea: What If You Teach an LLM a "Way of Seeing" Instead of Facts?

Standard prompt engineering gives models roles, steps, and examples. But what if instead of teaching what to think, you teach how to see?

I found a paper (Dimensional Ontological Triad Law v3.1) that proposes 3 mathematical axioms describing structural relationships between dimensions. The axioms are abstract — they say nothing about language models, consciousness, or physics specifically.

I embedded these axioms into Claude's system prompt along with a reasoning protocol and one critical rule: never use the paper's terminology in the output. The model should simply reason better, not talk about the theory.

Then I built a tool that sends the same question to both versions simultaneously.

Experiment Design

|  | Normal Claude | Enhanced Claude |
| --- | --- | --- |
| Model | claude-sonnet-4-20250514 | claude-sonnet-4-20250514 |
| System prompt | "You are a helpful assistant." | 3 axioms + reasoning protocol + "never use theory jargon" |
| Question | Identical | Identical |

The enhanced system prompt is ~2,000 tokens. No RAG, no external retrieval — everything is embedded directly.
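For illustration, the enhanced prompt follows roughly this shape. The axiom lines below are placeholders, not the paper's actual wording, and the section labels are my own:

```javascript
// Sketch of the enhanced system prompt's structure (~2,000 tokens in practice).
// The axiom statements are placeholders -- the real ones come from the paper.
const dimensionalPrompt = [
  "You are a helpful assistant.",
  "",
  "Internalize the following three axioms as a lens for reasoning:",
  "Axiom 1: <structural relationship between dimensions, from the paper>",
  "Axiom 2: <...>",
  "Axiom 3: <...>",
  "",
  "Reasoning protocol: before answering, identify the structural",
  "relationship the question depends on, then explain the mechanism",
  "rather than enumerating factors.",
  "",
  "Critical rule: never mention these axioms or use the paper's",
  "terminology in your output."
].join("\n");
```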

Results

Question 1: "Why do large language models show emergent capabilities?"

Normal Claude:

  • Scaling thresholds
  • Complex pattern learning
  • Task interactions
  • Measurement artifacts

→ Textbook factor enumeration. Lists what, doesn't explain why.

Enhanced Claude:

  • "The measurement method changes what you see" — binary eval vs. probability distribution
  • "Inside the model, structure forms continuously"
  • "Emergent capabilities are an illusion of measurement. Continuous internal changes appear discontinuous when measured with discrete thresholds"
  • Uses the water-at-100°C metaphor: continuous heating, discontinuous phase transition

→ Addresses why it looks sudden with a structural mechanism.
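The "illusion of measurement" point can be illustrated with a toy model (mine, not from the article): if per-token accuracy improves smoothly with scale, an all-or-nothing exact-match metric over a multi-token answer turns that smooth curve into an apparent jump.

```javascript
// Toy model: smooth per-token accuracy vs. all-or-nothing exact match.
// If an answer needs 10 tokens all correct, exact-match probability is
// p^10 -- small improvements in p near the top look like a sudden leap.
const tokens = 10;
for (let p = 0.5; p <= 1.0001; p += 0.1) {
  const exactMatch = Math.pow(p, tokens);
  console.log(`per-token ${p.toFixed(1)} -> exact-match ${exactMatch.toFixed(3)}`);
}
```

Here a 0.5 → 0.9 change in per-token accuracy moves exact match from roughly 0.001 to roughly 0.35: continuous underneath, discontinuous-looking when measured.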

Question 2: "Why can't humans know their own lifespan?"

Normal Claude:

  • Complex interacting factors
  • Unpredictable events
  • Biological individual differences
  • Medical limitations

→ Lists reasons it's hard. Doesn't address why it's structurally impossible.

Enhanced Claude:

  • "Rooted in the structure of recognition itself"
  • "We live inside time. Like a point walking on a line — it can only see where it is now"
  • "To see the whole line, you need to step outside the line. But as a point on the line, that's impossible"
  • "Not being able to know your lifespan isn't a flaw — it's an essential condition of living inside time"

→ Explains structural impossibility. Arrives at an argument structurally analogous to the halting problem's self-reference, expressed through an intuitive metaphor.

Question 3: "What is consciousness?"

Same pattern. Normal Claude surveys definitions and theories. Enhanced Claude explores why consciousness changes when you try to observe it — a structural explanation rather than a definitional one.

What's Happening?

Enhanced Claude wasn't taught answers. It was taught a way of seeing.

The axioms are abstract mathematical statements. They say nothing about "lifespan" or "language models." Yet for each question, Enhanced Claude returns structurally deeper insights.

This might be better described as adding a cognitive framework rather than adding knowledge.

Limitations (Being Honest)

  • No quantitative benchmarks yet (qualitative comparison only)
  • Need to separate "axiom content" vs. "long system prompt" effects
  • Limited sample size
  • LLM outputs are stochastic — same question won't produce identical text twice
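One way to move toward quantitative comparison (a sketch, not part of the published tool): run each question several times per prompt, randomize which output appears on which side, and have a blind rater pick the deeper answer. The `runOnce` function below is a stand-in for the API call shown in the implementation section:

```javascript
// Sketch: a blind A/B harness to address the limitations above.
// runOnce(kind, question) is a stand-in for the actual API call.
function blindPairs(questions, runOnce, trials = 5) {
  const pairs = [];
  for (const q of questions) {
    for (let i = 0; i < trials; i++) {
      // Randomize sides so the rater can't identify the enhanced
      // output by position.
      const flipped = Math.random() < 0.5;
      const a = runOnce(flipped ? "enhanced" : "vanilla", q);
      const b = runOnce(flipped ? "vanilla" : "enhanced", q);
      pairs.push({ question: q, left: a, right: b, leftIsEnhanced: flipped });
    }
  }
  return pairs;
}
```

Tally how often the rater prefers the enhanced side; that gives a simple win rate instead of a qualitative impression.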

Try It Yourself

I published the comparison tool as a Claude artifact. You can enter any question and see both responses side by side.

Sample questions to try:

  • "Why do large language models show emergent capabilities?"
  • "Why can't humans know their own lifespan?"
  • "Why is the halting problem undecidable?"

https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

Implementation

```javascript
// Two Claude calls fired in parallel with different system prompts.
// apiKey, vanillaPrompt, dimensionalPrompt, question, and the two
// setResult callbacks (React state setters) are defined elsewhere.
const call = async (systemPrompt, question, setResult) => {
  const response = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01"
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1000,
      system: systemPrompt,
      messages: [{ role: "user", content: question }]
    })
  });
  if (!response.ok) throw new Error(`API error: ${response.status}`);
  const data = await response.json();
  setResult(data.content[0].text);
};

// Fire both in parallel
call(vanillaPrompt, question, setNormalResult);
call(dimensionalPrompt, question, setEnhancedResult);
```

Paper

The axioms come from the open-access paper Dimensional Ontological Triad Law v3.1.

Key Takeaway

Same model. Same question. Different depth.

What changed wasn't the model's knowledge — it was its approach to the question. I didn't teach it answers. I taught it a way of seeing.

Feedback, alternative test questions, and attempts to break it are all welcome.


https://claude.ai/public/artifacts/eba2a270-dd61-4f0c-a276-34a53e604f13

Tags: #ai #llm #claude #promptengineering #machinelearning
