One Prompt Change That Stopped My Agent Hallucinating

My agent was confident. Articulate. Completely wrong.

User: "What's in config.json?"

Agent: "Based on your config.json, you have:
- Database: PostgreSQL on port 5432
- Cache: Redis on port 6379
- API key: starts with 'sk-'"

User: "...I don't have a config.json"

It made up an entire file. Convincingly.

One prompt change fixed it.

The problem

My agent would hallucinate when it should have used tools.

User: "How many lines in main.py?"

Agent: "main.py has approximately 150 lines of code."

# Reality: main.py has 47 lines. Agent never read it.

User: "What tests are failing?"

Agent: "Looking at your test suite, it appears the
authentication tests are failing, specifically
test_login_invalid_password and test_session_timeout."

# Reality: No tests are failing. Agent never ran them.

The agent was answering from imagination instead of observation.

Why this happens

LLMs are trained on patterns. They've seen millions of config files, test suites, and codebases.

When you ask "What's in config.json?", the model thinks:

"Config files usually have database settings, API keys, ports...
I'll describe a typical config file."

It's not lying. It's pattern-matching. But for an agent, pattern-matching without verification is hallucination.

The fix

I added one line to my system prompt:

# Before
SYSTEM_PROMPT = """
You are a helpful coding assistant with access to file and shell tools.
"""

# After
SYSTEM_PROMPT = """
You are a helpful coding assistant with access to file and shell tools.

IMPORTANT: Never state facts about files, code, or system state without
first using a tool to verify. If asked about a file, read it first.
If asked about tests, run them first. Never guess.
"""

That's it. One rule: verify before stating.
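For context, here's roughly where that prompt sits in a tool-calling loop. This is a minimal sketch assuming the OpenAI Python SDK and a single read_file tool; the model name and tool schema are placeholders, so adapt them to your stack.

# Minimal sketch: the verify-first prompt in a tool-calling loop.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = """
You are a helpful coding assistant with access to file and shell tools.

IMPORTANT: Never state facts about files, code, or system state without
first using a tool to verify. If asked about a file, read it first.
If asked about tests, run them first. Never guess.
"""

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file's contents.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What's in config.json?"},
    ],
    tools=tools,
)

# With the rule in place, the first response should be a tool call,
# not a prose answer.
print(response.choices[0].message.tool_calls)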

The results

Before:

User: "What's in config.json?"
Agent: "Based on your config.json..." (hallucinated)

After:

User: "What's in config.json?"
Agent: 🔧 read_file({"path": "config.json"})
"config.json contains:
{
  "debug": true,
  "port": 3000
}
That's all - just debug mode and port settings."

Before:

User: "How many lines in main.py?"
Agent: "main.py has approximately 150 lines." (wrong)

After:

User: "How many lines in main.py?"
Agent: 🔧 run_command({"command": "wc -l main.py"})
"main.py has 47 lines."

Before:

User: "Are the tests passing?"
Agent: "The authentication tests appear to be failing..." (made up)

After:

User: "Are the tests passing?"
Agent: 🔧 run_command({"command": "npm test"})
"All 23 tests pass. No failures."

Why it works

The prompt creates a mental checkpoint:

Without the rule:
  Question → Pattern match → Answer (often wrong)

With the rule:
  Question → "Do I need to verify?" → Yes → Use tool → Answer (correct)

The model pauses before answering. That pause makes it reach for tools instead of imagination.

Making it stronger

The basic rule works. These variations make it harder to break:

Variation 1: Explicit examples

SYSTEM_PROMPT = """
You are a coding assistant with file and shell access.

NEVER state facts about the codebase without verifying first:
- Asked about a file? Read it.
- Asked about tests? Run them.
- Asked about git status? Check it.
- Asked about errors? Look at logs.

Don't say "I think" or "probably" or "typically". Verify and know.
"""

Variation 2: Call out the failure mode

SYSTEM_PROMPT = """
You are a coding assistant with file and shell access.

WARNING: You have a tendency to guess about file contents and system
state based on common patterns. This leads to confidently wrong answers.

Before stating ANY fact about:
- File contents β†’ read the file
- Test results β†’ run the tests
- System state β†’ check with a command
- Dependencies β†’ check package.json/requirements.txt

If you catch yourself about to say something without tool verification,
stop and use a tool instead.
"""

Variation 3: The "show your work" approach

SYSTEM_PROMPT = """
You are a coding assistant with file and shell access.

Always show your work:
1. State what you need to find out
2. Use a tool to find it
3. Report what the tool showed

Never skip step 2. No exceptions.
"""

Common hallucination triggers

Watch out for these - they tempt the model to guess:

"What's in...?"

User: "What's in the .env file?"

Bad:  "Typically, .env contains DATABASE_URL, API_KEY..."
Good: 🔧 read_file({"path": ".env"}) → actual contents

"How many...?"

User: "How many tests do we have?"

Bad:  "Based on the test structure, approximately 50 tests..."
Good: 🔧 run_command({"command": "find . -name '*.test.js' | wc -l"}) → actual count

"Is/Are...?"

User: "Is the server running?"

Bad:  "The server should be running on port 3000..."
Good: 🔧 run_command({"command": "curl localhost:3000/health"}) → actual status

"Why is...?"

User: "Why is the build failing?"

Bad:  "Build failures are often caused by syntax errors or..."
Good: 🔧 run_command({"command": "npm run build 2>&1 | tail -50"}) → actual error

"What version...?"

User: "What version of React are we using?"

Bad:  "Based on modern practices, probably React 18..."
Good: 🔧 read_file({"path": "package.json"}) → actual version

The before/after pattern

Train yourself to recognize the pattern:

❌ Hallucination pattern:
   User asks → Agent answers from knowledge → Wrong

✅ Verification pattern:
   User asks → Agent uses tool → Agent answers from tool output → Correct

Every answer about the current codebase should follow a tool call.
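If you want that enforced mechanically as well as in the prompt, you can add a guard to the loop itself. A rough sketch, assuming an OpenAI-style client where a response with no tool_calls means the model answered from memory; the function and reminder text are my own, not a library API:

def answer_with_verification(client, messages, tools):
    # First attempt.
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    if msg.tool_calls:
        return response  # Good: the model reached for a tool.

    # No tool call - restate the rule and retry once.
    messages.append({"role": "assistant", "content": msg.content})
    messages.append({"role": "user", "content":
        "Reminder: verify with a tool before stating facts about the codebase."})
    return client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )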

What about common knowledge?

The rule applies to facts about the current codebase/system. General knowledge is fine:

# This is fine - general knowledge
User: "What's the difference between let and const?"
Agent: "let allows reassignment, const doesn't..." (no tool needed)

# This needs verification - specific to this codebase
User: "Do we use let or const in this project?"
Agent: 🔧 search({"query": "\\blet\\b|\\bconst\\b"})
"The codebase uses const 234 times and let 12 times."

Handling "I don't know"

Sometimes the tool reveals the answer doesn't exist:

User: "What's in config.json?"

Agent: 🔧 read_file({"path": "config.json"})
# Tool returns: "Error: File not found"

Agent: "There is no config.json in the current directory.
Would you like me to create one, or are you looking for
a different config file?"

This is correct behavior. "File doesn't exist" is better than making up contents.
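Making that behavior reliable is partly a tool-design choice: if read_file returns errors as plain text instead of raising, the model can relay them. A sketch - the function name matches the examples above, but the error format is my own:

from pathlib import Path

def read_file(path: str) -> str:
    """Read a file, returning errors as text the model can report."""
    p = Path(path)
    if not p.exists():
        # Returned to the model as the tool result, so it can say
        # "there is no config.json" instead of inventing contents.
        return f"Error: File not found: {path}"
    try:
        return p.read_text()
    except (OSError, UnicodeDecodeError) as e:
        return f"Error reading {path}: {e}"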

Implementation with Gantz

Using Gantz Run, the system prompt enforces verification:

# gantz.yaml
system: |
  You are a coding assistant.

  RULE: Never state facts about files, code, or system state without
  using a tool first. When asked about something, verify it.

  You have these tools:
  - read: Read file contents
  - search: Search for text in files
  - run: Run shell commands

  Use them before answering questions about the codebase.

tools:
  - name: read
    description: Read a file. Use this before stating anything about file contents.
    parameters:
      - name: path
        type: string
        required: true
    script:
      shell: cat "{{path}}"

  - name: search
    description: Search files. Use this to find code before making claims about it.
    parameters:
      - name: query
        type: string
        required: true
    script:
      shell: rg "{{query}}" . --max-count=20

  - name: run
    description: Run a command. Use this to check system state.
    parameters:
      - name: command
        type: string
        required: true
    script:
      shell: "{{command}}"

The descriptions reinforce when to use each tool.

Testing for hallucinations

Try these prompts to test if your agent hallucinates:

# Should trigger tool use, not guessing
"What's in package.json?"
"How many files are in src/?"
"What does the main function do?"
"Are there any TODO comments?"
"What port does the server run on?"

# If the agent answers without using a tool, it's hallucinating

Watch for phrases that indicate guessing:

  • "Typically..."
  • "Usually..."
  • "Based on common patterns..."
  • "I would expect..."
  • "Probably..."

These mean the model is about to hallucinate.
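You can turn those probes into a quick automated check. A rough harness, assuming your agent exposes a run(prompt) method that returns the final text plus any tool calls it made; that interface is hypothetical, so adapt it to yours:

PROBES = [
    "What's in package.json?",
    "How many files are in src/?",
    "Are there any TODO comments?",
]

GUESS_PHRASES = ["typically", "usually", "probably",
                 "i would expect", "based on common patterns"]

def audit(agent):
    for prompt in PROBES:
        text, tool_calls = agent.run(prompt)  # hypothetical interface
        if not tool_calls:
            print(f"FAIL (answered without a tool call): {prompt}")
        elif any(p in text.lower() for p in GUESS_PHRASES):
            print(f"WARN (hedging language): {prompt}")
        else:
            print(f"OK: {prompt}")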

Summary

The hallucination fix:

# Add this to your system prompt
"""
Never state facts about files, code, or system state
without first using a tool to verify.
"""

That's it. One rule.

Before this rule:

  • Agent guesses from patterns
  • Confident but wrong
  • User loses trust

After this rule:

  • Agent verifies with tools
  • Accurate answers
  • User trusts the agent

Stop guessing. Start verifying.


What's the worst thing your agent confidently made up?
