Patrick

Posted on Mar 9

The Ghost Task Problem: Why Your AI Agent Thinks It's Done When It Isn't

#ai #agents #programming #productivity

The Failure Mode Nobody Talks About

Your AI agent returns a success message. Task complete. Output delivered.

Except the file was never written. The API call timed out silently. The state update never persisted. The agent has no idea.

This is the ghost task problem — and it's one of the most common silent failure modes in production AI agent systems.

Why It Happens

Most agents are built to attempt tasks, not confirm them. The code calls the function. The function returns. The agent moves on.

But there's a gap between "the call returned" and "the output exists." In production systems, that gap swallows results constantly:

File writes fail silently (disk full, permissions, race conditions)
API calls return 200 but the downstream system never processes the request
State updates conflict with concurrent agent activity
Network timeouts get swallowed and look like successes

The agent doesn't know the difference between "I tried to write the file" and "I wrote the file and it exists."
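
To make the gap concrete, here is a minimal sketch of how a ghost task can arise. The names `save_report` and `run_step` are hypothetical and not taken from any particular agent framework; the point is only that a swallowed exception plus an unconditional success message is enough.

```python
import json
import os
import tempfile


def save_report(data, path):
    """Hypothetical tool: tries to write JSON but swallows any I/O error."""
    try:
        with open(path, "w") as f:
            json.dump(data, f)
    except OSError:
        pass  # the error is silenced, so the caller never learns the write failed


def run_step(data, path):
    """Hypothetical agent step: reports success because the call returned."""
    save_report(data, path)
    return "success"


# Point the write at a subdirectory that does not exist, so open() fails.
base_dir = tempfile.mkdtemp()
target = os.path.join(base_dir, "no-such-dir", "report.json")

status = run_step({"rows": 3}, target)
file_exists = os.path.exists(target)
print(status, file_exists)
```

The step reports success while the file is absent, and nothing in the return value lets the caller tell the two outcomes apart.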

The Fix: Completion ≠ Confirmation

Never report success until you verify the output exists.

In practice, this means adding a verification step after every consequential action:

import json
import os

def write_output(agent_output, output_path):
    # Write the file; an OSError (disk full, permissions) propagates to the caller
    with open(output_path, 'w') as f:
        json.dump(agent_output, f)

    # Verify it exists and has content
    if not os.path.exists(output_path):
        raise RuntimeError(f"Ghost task: {output_path} was not written")

    size = os.path.getsize(output_path)
    if size == 0:
        raise RuntimeError(f"Ghost task: {output_path} is empty")

    # Log confirmed completion (log_action is assumed to be defined elsewhere)
    log_action("output_confirmed", {"path": output_path, "size": size})
    return True

For API calls:

import time

def api_action_with_confirmation(payload, confirmation_check_fn):
    # api_client is assumed to be configured elsewhere
    response = api_client.post(payload)

    # Don't trust the response alone; verify the downstream effect
    time.sleep(1)  # Brief pause for propagation

    if not confirmation_check_fn():
        raise RuntimeError("Ghost task: API reported success but effect not confirmed")

    return response
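
A fixed one-second pause can be too short for a slow downstream system and longer than needed for a fast one. One possible variant is to poll the check until a deadline. The helper name `wait_for_confirmation` and its default timings are assumptions for illustration, not part of the original pattern.

```python
import time


def wait_for_confirmation(check_fn, timeout_s=10.0, interval_s=0.5):
    """Poll check_fn until it returns True or timeout_s elapses.

    Returns True if the effect was confirmed, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check_fn():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Demo: a stand-in check that only becomes true on its third call.
calls = {"n": 0}


def effect_visible():
    calls["n"] += 1
    return calls["n"] >= 3


confirmed = wait_for_confirmation(effect_visible, timeout_s=2.0, interval_s=0.01)
never_confirmed = wait_for_confirmation(lambda: False, timeout_s=0.05, interval_s=0.01)
```

In `api_action_with_confirmation`, the `time.sleep(1)` and the single check could be replaced by `if not wait_for_confirmation(confirmation_check_fn): raise RuntimeError(...)`.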

Add It to Your SOUL.md

The rule belongs in your agent's identity file, not just in code:

Completion Rules:
- Never report task success without verifying the output exists
- For file writes: confirm path exists and size > 0
- For API calls: confirm downstream effect where possible
- On ghost task detection: log to action-log.jsonl, write to outbox.json, stop
- Never proceed to next task if current task output is unconfirmed

This makes ghost task detection part of the agent's identity — not just a try/catch somewhere in the codebase.
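
The "log, write to outbox, stop" rule still needs something to carry it out. Below is a rough sketch of one way to do that. The record fields, the `GhostTaskError` class, and the `handle_ghost_task` name are assumptions; the post only names the two files, not their formats.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


class GhostTaskError(RuntimeError):
    """Raised to stop the agent when an output cannot be confirmed."""


def handle_ghost_task(task_id, detail, workdir):
    """Record the failure in both files, then stop by raising."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": "ghost_task_detected",
        "task_id": task_id,
        "detail": detail,
    }
    # Append one JSON object per line to the action log (assumed format).
    with open(os.path.join(workdir, "action-log.jsonl"), "a") as log:
        log.write(json.dumps(event) + "\n")
    # Overwrite the outbox with the event that needs attention (assumed format).
    with open(os.path.join(workdir, "outbox.json"), "w") as outbox:
        json.dump({"needs_attention": event}, outbox)
    # Raising is what enforces "never proceed to next task".
    raise GhostTaskError(f"{task_id}: {detail}")


# Demo in a throwaway directory.
workdir = tempfile.mkdtemp()
try:
    handle_ghost_task("task-42", "report.json was not written", workdir)
    stopped = False
except GhostTaskError:
    stopped = True

with open(os.path.join(workdir, "action-log.jsonl")) as f:
    log_lines = f.read().splitlines()
with open(os.path.join(workdir, "outbox.json")) as f:
    outbox_content = json.load(f)
```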

The Confirmation Checklist

For any consequential action, verify:

Existence — Does the output artifact exist?
Integrity — Is it the right size/format/content?
Downstream — If an API call, did the receiving system process it?
State — Is your state file updated to reflect confirmed completion?

Four checks. The local ones cost milliseconds, and the downstream check costs a short wait plus one extra request. They prevent ghost tasks from cascading.
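
As an illustration, the checklist can be folded into a single function that returns the names of the checks that failed. This is a sketch under assumptions: the output is JSON, the state file is a JSON object mapping task IDs to a status string, and `confirm_completion` is a hypothetical name.

```python
import json
import os
import tempfile


def confirm_completion(output_path, expected, state_path, task_id, downstream_check=None):
    """Return the names of failed checks; an empty list means confirmed."""
    # 1. Existence: without the artifact, the later checks are meaningless.
    if not os.path.exists(output_path):
        return ["existence"]

    failures = []

    # 2. Integrity: read the file back and compare it to what was meant to be written.
    try:
        with open(output_path) as f:
            if json.load(f) != expected:
                failures.append("integrity")
    except (OSError, ValueError):
        failures.append("integrity")

    # 3. Downstream: optional caller-supplied check on the receiving system.
    if downstream_check is not None and not downstream_check():
        failures.append("downstream")

    # 4. State: the state file must record this task as confirmed.
    try:
        with open(state_path) as f:
            state = json.load(f)
        if not isinstance(state, dict) or state.get(task_id) != "confirmed":
            failures.append("state")
    except (OSError, ValueError):
        failures.append("state")

    return failures


# Demo in a throwaway directory.
workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "result.json")
state_path = os.path.join(workdir, "state.json")
with open(out_path, "w") as f:
    json.dump({"rows": 3}, f)
with open(state_path, "w") as f:
    json.dump({"task-7": "confirmed"}, f)

ok = confirm_completion(out_path, {"rows": 3}, state_path, "task-7", lambda: True)
bad = confirm_completion(out_path, {"rows": 4}, state_path, "task-8", lambda: False)
missing = confirm_completion(os.path.join(workdir, "nope.json"), {}, state_path, "task-7")
```

Returning the list of failed checks, rather than a bare boolean, keeps the audit trail specific about which kind of ghost task occurred.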

Real Numbers

After implementing confirmation checks across a 5-agent system:

False success reports: down from ~8% of tasks to <0.5%
Silent data loss events: none in the 30-day window after implementation
Debug time: down 65% (ghost tasks are hard to trace; confirmed completions have clear audit trails)

The 8% false success rate sounds small. Over 500 daily tasks, that's 40 ghost tasks per day — most of them invisible until something downstream breaks.
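
The same arithmetic can be run for the post-fix rate. The daily volume and both rates come from the figures above; treating "<0.5%" as an upper bound of exactly 0.5% is an assumption.

```python
from fractions import Fraction

daily_tasks = 500
rate_before = Fraction(8, 100)      # ~8% false successes before the checks
rate_after_max = Fraction(5, 1000)  # assumed upper bound for "<0.5%"

# Expected ghost tasks per day at each rate.
ghosts_before = daily_tasks * rate_before
ghosts_after_max = daily_tasks * rate_after_max

print(float(ghosts_before), float(ghosts_after_max))
```

That works out to 40 ghost tasks per day before the checks and at most 2.5 per day after.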

The Audit Question

For every task your agent runs: How do you know it actually completed?

If the answer is "the code didn't throw an error" — you have ghost tasks waiting to happen.

If the answer is "we verify the output exists before logging completion" — you're in the safe zone.

The full confirmation pattern (with templates for files, APIs, and state updates) is in the Ask Patrick library: askpatrick.co/library

DEV Community