Lars Winstand

Originally published at standardcompute.com

My OpenClaw agent started writing nonsense and the real fix was a kill switch, not a better prompt

I hit a thread on r/openclaw with the perfect title: “How to stop an insane model from openclaw.”

That’s the whole post, honestly.

Because once /abort stops working, you are not doing prompt engineering anymore. You are doing incident response.

The original poster was running OpenClaw with Ollama and Kimi-K2.6:cloud. The agent started dumping gibberish. /abort didn’t help. Typing stop didn’t help. Restarting Ollama didn’t help.

That’s the moment where a lot of people reach for a better system prompt.

I think that’s the wrong instinct.

If a coding agent has shell access and write access, the real question is not:

“How do I make the model behave?”

It’s:

“How do I make a bad run cheap to kill and easy to clean up?”

My take: self-healing agents are only safe if they are easy to contain, supervise, and terminate from outside the chat loop.

So if you’re running OpenClaw, or building similar agent flows in n8n, Make, Zapier, or your own runner, here’s the setup I’d actually trust.

The big mistake: treating prompts like a safety system

Prompts matter.

Good prompts reduce ambiguity. Narrow scopes help. Short tasks are easier to recover from than “go refactor the app.”

But prompts are steering.

They are not brakes.

That distinction gets ignored because the happy path looks great. You point GPT-5.4 or Claude Opus 4.6 at a coding task, it edits files, runs tests, explains the diff, and everyone starts believing the agent is reliable.

Then one run goes feral and you remember what this really is: a probabilistic system with tool access.

If /abort is dead, your prompt is no longer the control plane.

Your architecture is.

What actually helped in the OpenClaw thread

The most useful reply in that thread was also the least glamorous.

A Reddit user basically said: don’t trust that repo path anymore, and run the agent in a git worktree or disposable clone so the blast radius stays contained.

That is exactly right.

Not:

  • add another safety reminder
  • repeat the task more clearly
  • ask the model to be careful

Instead:

  • isolate the workspace
  • gate writes
  • supervise liveness
  • kill fast when the run degrades

That’s how you make agent failures survivable.

Rule #1: never let a coding agent work directly in your main checkout

If an agent can edit files, the first thing you should control is blast radius.

The cheapest way to do that for local dev is usually git worktree.

Create a disposable workspace:

git worktree add -b agent-sandbox ../repo-agent-sandbox

Or base the sandbox branch on origin/main and set that as its upstream:

git worktree add --track -b agent-sandbox ../repo-agent-sandbox origin/main

Now the agent can do weird stuff in a separate working tree instead of vandalizing your main checkout.

That one step changes the whole risk profile.
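Here is a minimal sketch of the whole lifecycle: create the worktree, look at what changed, throw it away. It builds a scratch repo under `mktemp -d` so it is safe to run anywhere. The file name `notes.txt` and the `echo` that writes it are stand-ins for whatever your agent actually does.

```bash
#!/usr/bin/env bash
set -eu

# Throwaway repo so this demo never touches a real checkout.
demo="$(mktemp -d)"
git init -q "$demo/repo"
cd "$demo/repo"
git -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "init"

# Give the agent its own working tree on its own branch.
git worktree add -q -b agent-sandbox ../repo-agent-sandbox

# Stand-in for an agent edit (hypothetical file name).
echo "agent was here" > ../repo-agent-sandbox/notes.txt

# Review what changed without leaving the main checkout.
changes="$(git -C ../repo-agent-sandbox status --porcelain)"
echo "$changes"

# Bad run: throw away the workspace and its branch.
git worktree remove --force ../repo-agent-sandbox
git branch -q -D agent-sandbox
remaining="$(git worktree list | wc -l | tr -d ' ')"
echo "worktrees left: $remaining"
```

The `--force` is needed because the sandbox has uncommitted changes. On a bad run, that is exactly what you want to lose.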

| Execution mode | What happens when the agent goes weird |
| --- | --- |
| Direct repo execution | Highest blast radius. Fastest setup, worst failure mode. |
| Disposable git worktree | Cheap isolation for normal coding tasks. Easy diff review and cleanup. |
| Disposable clone or container | Better isolation, more overhead. Best for long-running or less trusted agents. |

My opinion: git worktree should be the default for coding agents.

Running directly in your main repo should feel sketchy, because it is.

Rule #2: approvals belong outside the model

This is where OpenClaw is actually better than people give it credit for.

OpenClaw’s approvals system gives you host-level policy enforcement. That matters because external policy is real control. Model instructions are not.

Check the current policy:

openclaw exec-policy show

See the cautious preset:

openclaw exec-policy preset cautious --json

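Set approvals from stdin. This example only shows the payload shape, so confirm what each security and ask value permits before you apply it: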

openclaw approvals set --stdin <<'EOF'
{ version: 1, defaults: { security: "full", ask: "off" } }
EOF

If I’m letting an agent write code, I want the host policy to be the source of truth.

Not the prompt.

Not the model’s self-reported plan.

The host.

My practical approval rules

  • Read-only analysis: mostly fine to automate
  • Code edits inside a sandbox worktree: allowed, but review before merge
  • Shell commands that can rewrite state: require approval
  • Anything touching deploys, secrets, infra, or production data: separate environment or no autonomous execution
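To show what "outside the model" can look like, here is a rough sketch of those four rules as a host-side gate. This is not how OpenClaw implements approvals. `classify_command` is a hypothetical helper and the patterns are deliberately crude, so treat it as an illustration of the shape and not as a policy you can ship.

```bash
#!/usr/bin/env bash

# Hypothetical host-side gate: map a proposed command to allow, ask, or deny.
# The patterns are illustrative and incomplete.
classify_command() {
  case "$1" in
    # Deploys, infra, and secrets: never run autonomously.
    *deploy*|*kubectl*|*terraform*|*.env*|*secret*) echo "deny" ;;
    # Commands that can rewrite state: a human approves first.
    rm\ *|mv\ *|chmod\ *|git\ push*|git\ reset*|*\>*) echo "ask" ;;
    # Read-only analysis: fine to automate.
    ls|ls\ *|cat\ *|grep\ *|git\ diff*|git\ status*|git\ log*) echo "allow" ;;
    # Anything unrecognized defaults to asking.
    *) echo "ask" ;;
  esac
}

for cmd in "git status" "rm -rf build" "terraform apply" "npm test"; do
  printf '%-16s -> %s\n' "$cmd" "$(classify_command "$cmd")"
done
```

The order matters: deny rules are checked first, and the fallback is "ask", so a command the gate has never seen cannot slip through as allowed.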

Yes, this is slower than YOLO mode.

That’s the point.

Guardrails are supposed to feel annoying right before they save you.

Rule #3: if /abort fails, you need a kill path outside the chat loop

One of the funniest replies in that thread was just:

ctrl C

Blunt, but correct.

If the in-band abort path is broken, you need out-of-band control.

That means your agent runner should support a hard kill from the supervisor layer, not just from the conversation layer.

At minimum, I want all of these available:

  • keyboard interrupt
  • process kill
  • child process cleanup
  • workspace disposal
  • retry suppression

If your architecture depends on the model cleanly cooperating with shutdown, you do not have a kill switch.

You have a suggestion box.

Rule #4: add a heartbeat, because zombie sub-agents are real

A separate OpenClaw thread about WhatsApp reliability had a reply that jumped out at me:

“It ended up being sub agent that are still running.”

That’s not a prompt problem.

That’s orchestration drift.

Now we’re talking about:

  • hung jobs
  • child workers that never exit
  • retries that stack
  • liveness checks that don’t exist
  • supervisors that never declare failure

This is the same class of problem you get in n8n, Make, Zapier, or any custom agent runner. If the workflow can loop unattended, then you need process supervision.

Minimum viable heartbeat design

  1. Liveness timeout: if no valid event arrives for 30 to 60 seconds, mark the run unhealthy
  2. Sanity check: detect repeated malformed tool calls, repeated identical output, or obvious gibberish loops
  3. Step budget: cap tool invocations and file mutations per task
  4. Retry budget: one retry max
  5. Hard abort: kill the process tree and discard the workspace
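Here is a rough sketch of item 1, the liveness timeout. It assumes the agent's events end up in a log file and treats "the log grew" as a proxy for "a valid event arrived", which is weaker than actually parsing events. `watch_run` and `RUN_STATUS` are hypothetical names, and the demo limit is 2 seconds so it finishes quickly. The real limit would be the 30 to 60 seconds above.

```bash
#!/usr/bin/env bash

# watch_run LOG PID IDLE_LIMIT
# Polls once per second. If LOG has not grown for IDLE_LIMIT consecutive
# checks while PID is still alive, kill PID and mark the run unhealthy.
# Sets RUN_STATUS to "exited" or "unhealthy".
watch_run() {
  local log="$1" pid="$2" idle_limit="$3"
  local last_size=-1 idle=0 size
  RUN_STATUS="exited"
  while kill -0 "$pid" 2>/dev/null; do
    size=$(( $(wc -c < "$log") ))
    if [ "$size" -eq "$last_size" ]; then
      idle=$((idle + 1))
    else
      idle=0
      last_size="$size"
    fi
    if [ "$idle" -ge "$idle_limit" ]; then
      kill -TERM "$pid" 2>/dev/null || true
      RUN_STATUS="unhealthy"
      return 1
    fi
    sleep 1
  done
  return 0
}

# Demo: a stand-in agent that writes one event and then hangs.
log="$(mktemp)"
( echo "event: started"; exec sleep 300 ) > "$log" &
agent_pid=$!
watch_run "$log" "$agent_pid" 2 || true
wait "$agent_pid" 2>/dev/null || true
echo "run status: $RUN_STATUS"
```

This only signals the one PID it was given. A real supervisor would follow the unhealthy verdict with a full process-tree kill and workspace disposal.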

That is what “self-healing” should mean in practice.

Not “retry forever until vibes improve.”

Sometimes the healthy behavior is to declare the run unrecoverable and tear it down.
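The retry budget is the easiest piece to sketch. `run_with_budget` is a hypothetical wrapper that only enforces the count. A real retry should also narrow the task or clean the context, which this does not attempt. `flaky_agent` and `broken_agent` are stand-ins for an agent invocation.

```bash
#!/usr/bin/env bash

# run_with_budget MAX_RETRIES CMD [ARGS...]
# Runs CMD. On failure, retries at most MAX_RETRIES times, then gives up.
run_with_budget() {
  local max_retries="$1" attempt=0
  shift
  while :; do
    if "$@"; then
      echo "ok after $attempt retries"
      return 0
    fi
    if [ "$attempt" -ge "$max_retries" ]; then
      echo "unrecoverable after $attempt retries"
      return 1
    fi
    attempt=$((attempt + 1))
  done
}

# Stand-in agents: one fails once and then succeeds, one always fails.
calls=0
flaky_agent() { calls=$((calls + 1)); [ "$calls" -ge 2 ]; }
broken_agent() { return 1; }

run_with_budget 1 flaky_agent
run_with_budget 1 broken_agent || echo "tearing down workspace"
```

The part worth copying is the non-zero return. The caller gets an explicit "unrecoverable" signal it can act on, instead of a loop that quietly keeps going.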

A simple supervisor pattern

Here’s the rough shape I’d use around an OpenClaw run:

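#!/usr/bin/env bash
set -euo pipefail

RUN_ID="$(date +%s)"   # one ID so the worktree and branch names always match
REPO_ROOT="$(git rev-parse --show-toplevel)"
WORKTREE="$(dirname "$REPO_ROOT")/repo-agent-$RUN_ID"
BRANCH="agent-run-$RUN_ID"
TIMEOUT_SECONDS=45     # wall-clock cap for the whole run; size it to the task

cleanup() {
  local status=$?
  pkill -P $$ || true   # stop any children still attached to this script
  wait || true          # let them exit before touching the workspace
  cd "$REPO_ROOT"       # step out of the worktree before removing it
  if [ "$status" -ne 0 ]; then
    # Failed, timed out, or interrupted: discard the workspace.
    git worktree remove --force "$WORKTREE" || true
  else
    echo "Run finished. Review the diff in $WORKTREE before merging."
  fi
}

trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM

git worktree add -b "$BRANCH" "$WORKTREE"
cd "$WORKTREE"

# Run the agent in the background under an outer timeout (GNU coreutils) and
# wait on it, so Ctrl+C and SIGTERM reach the traps immediately.
timeout --kill-after=5 "$TIMEOUT_SECONDS" openclaw run "Update only src/auth/login.ts and run auth tests" &
wait $!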

This is not fancy. That’s why I like it.

You can make it smarter later with:

  • event stream monitoring
  • output sanity checks
  • child process tracking
  • model fallback routing
  • diff-based rollback rules

But even this basic wrapper is better than trusting /abort to save you.
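As one example of an output sanity check, here is a cheap detector for the "same line over and over" failure. It assumes the agent's output is captured to a log file, and it only catches identical repeated lines, not gibberish in general. `looks_stuck` is a hypothetical helper and the log contents are made up.

```bash
#!/usr/bin/env bash

# looks_stuck LOG [WINDOW]
# Succeeds (exit 0) when the last WINDOW lines of LOG are all identical,
# which is a cheap signal that the agent is looping on the same output.
looks_stuck() {
  local log="$1" window="${2:-5}" total distinct
  total=$(( $(wc -l < "$log") ))
  [ "$total" -ge "$window" ] || return 1
  distinct=$(( $(tail -n "$window" "$log" | sort -u | wc -l) ))
  [ "$distinct" -eq 1 ]
}

# Demo logs (made-up content).
healthy="$(mktemp)"
looping="$(mktemp)"
printf 'read file\nedit file\nrun tests\nreport diff\ndone\n' > "$healthy"
printf 'start\nretrying tool call\nretrying tool call\nretrying tool call\n' > "$looping"

if looks_stuck "$healthy" 3; then echo "healthy log: stuck"; else echo "healthy log: fine"; fi
if looks_stuck "$looping" 3; then echo "looping log: stuck"; else echo "looping log: fine"; fi
```

A supervisor could call this on every poll and treat a positive result the same way as a missed heartbeat.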

Better prompts still help. They just help in a different way.

I don’t want to overcorrect and pretend prompts don’t matter.

They do.

Specific tasks are easier to supervise.

This is bad:

Fix the auth flow, clean up the frontend, and improve error handling.

This is much better:

Edit only src/auth/login.ts.
Do not modify any other files.
Run npm test -- auth/login.test.ts.
Stop after reporting the diff and test result.

That kind of prompt improves success rate.

But the important distinction is this:

prompts improve good runs

guardrails improve bad runs

If I have to choose which one matters more for an unattended coding agent, I’m picking guardrails every time.
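"Edit only src/auth/login.ts" is a request when it lives in the prompt. The guardrail version checks the result from the host after the run. A minimal sketch, assuming you feed it `git status --porcelain` output from the sandbox worktree. `out_of_scope` is a hypothetical helper, and it does not handle renames or quoted paths.

```bash
#!/usr/bin/env bash

# out_of_scope ALLOWED_PATH
# Reads `git status --porcelain` lines on stdin and prints every changed path
# that is not ALLOWED_PATH. Empty output means the run stayed in scope.
out_of_scope() {
  local allowed="$1" line path
  while IFS= read -r line; do
    path="${line#???}"   # drop the two status characters and the space
    [ "$path" = "$allowed" ] || echo "$path"
  done
}

# Demo with simulated status output. In real use, pipe in:
#   git -C "$WORKTREE" status --porcelain
violations="$(printf ' M src/auth/login.ts\n M package.json\n?? scratch.txt\n' \
  | out_of_scope src/auth/login.ts)"

if [ -n "$violations" ]; then
  echo "out-of-scope changes, discard the run:"
  echo "$violations"
fi
```

If the list is non-empty, the model ignored the scope, and the host gets to decide what happens next.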

The cost problem shows up fast when agents fail badly

There’s also a very practical economics angle here.

A looping agent is not just a reliability bug. Under per-token pricing, it becomes a billing bug too.

That’s true whether the loop lives in:

  • OpenClaw
  • n8n
  • Make
  • Zapier
  • a custom worker queue

A broken retry policy can quietly burn money while doing nothing useful.

This is exactly why flat-rate compute is a much better fit for always-on agents and automations. If you can route or retry without doing mental token math every time, you can build safer supervisors.

That’s one of the reasons I like what Standard Compute is doing: it gives you an OpenAI-compatible API with flat monthly pricing instead of per-token billing, and it can dynamically route across models like GPT-5.4, Claude Opus 4.6, and Grok 4.20.

For agent workflows, that changes behavior.

You can:

  • retry once without worrying about a surprise bill
  • route away from a degraded model stack
  • run 24/7 automations without token anxiety
  • keep your existing OpenAI SDK setup

If you’re building unattended automations, cost predictability is not a nice-to-have. It changes how aggressive you can be with supervision and recovery.

The baseline setup I’d use tomorrow

If I were configuring OpenClaw for real work, this would be my default:

1) Start every run in a disposable workspace

Use git worktree for normal coding tasks.

Use a disposable clone or container for higher-risk runs.

2) Put approvals outside the model

Use OpenClaw host approvals as the actual enforcement layer.

Default to cautious.

3) Add a supervisor heartbeat

If the agent or a sub-agent stops making sane progress for 30 to 60 seconds, kill it.

4) Retry once, and narrow the task

Don’t rerun the exact same broken context five times.

Retry once with a smaller scope, cleaner context, or a different model.

5) Normalize hard aborts

Ctrl+C, timeout, pkill, child cleanup, sandbox deletion.

If /abort works, great.

If it doesn’t, your system should still be safe.

The real lesson

What I liked about that OpenClaw thread is that the best replies were not from prompt obsessives.

They were from people thinking like operators.

They asked the right questions:

  • What if sub-agents are still running?
  • What if the workspace is no longer trustworthy?
  • What if the in-chat abort path is fake comfort?

That’s the shift.

Once you build always-on agents, you are not just writing prompts anymore.

You are designing failure boundaries.

And when the model starts writing nonsense, the right move is not to beg it to calm down.

Cut power.

Contain damage.

Review the diff.

Start fresh somewhere disposable.
