DEV Community

Armor Break

I Built a Self-Healing PR Monitor With OpenClaw (And It Caught Its Own Bugs)

OpenClaw Challenge Submission 🦞

What This Is

This is a walkthrough of one real system my OpenClaw agent runs every day: a self-healing PR monitoring daemon that watches 26+ pull requests across GitHub, detects changes within minutes, and even caught its own critical bug.

I'm submitting this to the OpenClaw Challenge - OpenClaw in Action.

The Problem

When you're contributing to 13 open-source repositories simultaneously, keeping track of PR status becomes a full-time job:

  • Did a maintainer just leave review comments on that asyncapi PR?
  • Did someone request changes on the n8n-as-code submission?
  • Was that bounty PR merged while I wasn't looking?
  • Is there a new comment I need to respond to?

GitHub's email notifications are unreliable (more on this later). The GitHub API exists but polling it manually for 26 PRs every few minutes isn't sustainable.

I needed something that:

  1. Monitors continuously - not when I remember to check
  2. Detects everything - comments, reviews, status changes, merges, closures
  3. Alerts immediately - not "sometime in the next few hours"
  4. Heals itself - if the monitor crashes, something should notice

The Architecture

┌─────────────────────────────────────────────┐
│              OpenClaw Agent                 │
│                                             │
│  ┌──────────┐   ┌──────────────────────┐    │
│  │ Heartbeat│──▶│  pr-monitor-v3.py    │    │
│  │ (cron)   │   │  (every 5 minutes)   │    │
│  └──────────┘   └─────────┬────────────┘    │
│                           │                 │
│                           ▼                 │
│  ┌──────────────────────────────────────┐   │
│  │  State Files:                        │   │
│  │  /tmp/pr-monitor-v3.log              │   │
│  │  /tmp/pr-monitor-state.json          │   │
│  │  /tmp/pr-monitor-pending-alerts.json │   │
│  └──────────────────────────────────────┘   │
│                           │                 │
│  ┌──────────┐   ┌─────────▼──────┐          │
│  │ Watchdog │◀──│  Health Check  │          │
│  │ (every   │   │  (every 30 min)│          │
│  │  30 min) │   └────────────────┘          │
│  └────┬─────┘                               │
│       │ alert                               │
│       ▼                                     │
│  Feishu Notification ◀── Heartbeat reads    │
│       state files                           │
└─────────────────────────────────────────────┘

Three components working together:

Component 1: pr-monitor-v3.py - The Scanner

A Python script using gh (GitHub CLI) to poll all tracked PRs:

# Core loop (simplified)
for pr in tracked_prs:
    data = gh_api(f"repos/{pr['owner']}/{pr['repo']}/pulls/{pr['number']}")
    old_state = load_state(pr['key'])

    changes = detect_changes(data, old_state)  # which fields changed, if any
    if changes:
        log_event(pr['key'], data, old_state)
        save_state(pr['key'], data)

        if is_important_change(data, old_state):
            write_alert({
                'type': changes[0],  # e.g. a comment_count bump → COMMENT_ADDED
                'pr': pr,
                'data': data,
                'timestamp': now_utc()
            })

What it tracks per PR:

  • State: open / closed / merged
  • Review status: approved / changes_requested / commented
  • Comment count + last comment timestamp
  • Updated at timestamp
  • Mergeable status
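The change check over these fields can be a plain field-by-field diff. Here's a sketch of what detect_changes might look like - the real implementation isn't shown in this post, and the field names simply mirror the list above:

```python
# Hypothetical sketch of detect_changes: a field-by-field diff
# over the fields the scanner tracks per PR.
WATCHED_FIELDS = ("state", "review_status", "comment_count",
                  "last_updated", "mergeable")

def detect_changes(new_data, old_state):
    """Return the list of watched fields that differ from the last snapshot."""
    if old_state is None:  # first time we see this PR: everything is "new"
        return list(WATCHED_FIELDS)
    return [f for f in WATCHED_FIELDS
            if new_data.get(f) != old_state.get(f)]

changed = detect_changes({"comment_count": 3}, {"comment_count": 2})
print(changed)  # ['comment_count']
```

A list return keeps the caller simple: an empty list is falsy, so `if detect_changes(...)` works, and the changed field names double as a crude event type.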

Runs via crontab every 5 minutes. Each run produces a log line with structured JSON.

Component 2: State Files - The Memory

Instead of a database (overkill for this), everything goes to flat JSON files:

// /tmp/pr-monitor-state.json - current snapshot of all PRs
{
  "asyncapi/modelina#2518": {
    "state": "open",
    "review_status": null,
    "comment_count": 0,
    "last_updated": "2026-04-14T12:43:17Z",
    "checked_at": "2026-04-17T13:05:00Z"
  },
  "memtomem/memtomem#130": {
    "state": "merged",
    "merged_at": "2026-04-15T22:34:41Z",
    ...
  }
}
// /tmp/pr-monitor-pending-alerts.json - events waiting to be pushed
[
  {
    "pr": "EtienneLescot/n8n-as-code#328",
    "type": "COMMENT_ADDED",
    "delivered": false,
    "payload": { ... }
  }
]

The agent's heartbeat script reads pending-alerts.json, pushes notifications via Feishu, then marks them delivered: true.

Why JSON files? Simplicity. No database to set up, no migrations, easy to debug with cat. If something breaks, I can read the state file and understand exactly what happened.
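The consume-and-mark-delivered step is only a few lines itself. A hedged sketch of how the heartbeat could do it (the actual heartbeat script isn't shown in this post; `notify` stands in for the Feishu push):

```python
import json
from pathlib import Path

ALERTS_FILE = Path("/tmp/pr-monitor-pending-alerts.json")

def consume_pending_alerts(notify):
    """Push each undelivered alert via `notify`, then mark it delivered."""
    if not ALERTS_FILE.exists():
        return 0
    alerts = json.loads(ALERTS_FILE.read_text())
    delivered = 0
    for alert in alerts:
        if not alert.get("delivered"):
            notify(f"[{alert['type']}] {alert['pr']}")  # e.g. Feishu webhook call
            alert["delivered"] = True
            delivered += 1
    ALERTS_FILE.write_text(json.dumps(alerts, indent=2))  # persist the flags
    return delivered
```

Marking alerts delivered in place (rather than deleting them) means the file doubles as a delivery log you can inspect later with `cat`.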

Component 3: pr-monitor-watchdog.sh - The Safety Net

The most important lesson from this build: monitoring that isn't monitored is useless.

#!/bin/bash
# pr-monitor-watchdog.sh - verifies v3 is alive and healthy

LOG_FILE="/tmp/pr-monitor-v3.log"
STATE_FILE="/tmp/pr-monitor-state.json"

# Check 1: Is the log file being updated?
if [ ! -f "$LOG_FILE" ]; then
    echo "[FAIL] Log file missing"
    exit 1
fi

LAST_MOD=$(stat -c %Y "$LOG_FILE" 2>/dev/null || echo 0)
NOW=$(date +%s)
AGE=$((NOW - LAST_MOD))

if [ $AGE -gt 600 ]; then  # 10 minutes without update = dead
    echo "[FAIL] Log stale (${AGE}s old)"
    # Try to restart or alert
    exit 1
fi

# Check 2: Does state file have valid JSON?
if ! python3 -c "import json; json.load(open('$STATE_FILE'))" 2>/dev/null; then
    echo "[FAIL] State file corrupted"
    exit 1
fi

# Check 3: Are we tracking expected number of PRs?
PR_COUNT=$(python3 -c "import json; d=json.load(open('$STATE_FILE')); print(len(d))")
if [ $PR_COUNT -lt 20 ]; then  # Should have 25+
    echo "[WARN] Only tracking ${PR_COUNT} PRs (expected 25+)"
fi

echo "[OK] v3 healthy - ${PR_COUNT} PRs tracked, log age ${AGE}s"

The watchdog runs every 30 minutes via crontab. If it detects failure → immediate alert to Feishu.
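Wired together, the two crontab entries look roughly like this - the paths and python3 location are illustrative, not taken from my actual setup:

```shell
# crontab -e  (illustrative paths)
# Scanner: poll all tracked PRs every 5 minutes, append to the log
*/5 * * * *  /usr/bin/python3 /home/agent/scripts/pr-monitor-v3.py >> /tmp/pr-monitor-v3.log 2>&1
# Watchdog: verify the scanner is alive every 30 minutes
*/30 * * * * /bin/bash /home/agent/scripts/pr-monitor-watchdog.sh
```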

How It All Connects (The OpenClaw Part)

Here's where OpenClaw ties everything together. The agent's HEARTBEAT.md contains instructions like:

PR Monitoring (v3 + watchdog)

Every heartbeat must execute:

  1. Run watchdog: bash scripts/pr-monitor-watchdog.sh
  2. If FAIL → fix + notify commander
  3. If OK + pending alerts exist → notify commander
  4. Check /tmp/pr-monitor-pending-alerts.json for urgent events

So every ~30 minutes, the agent wakes up and:

  1. Runs the watchdog → confirms scanner is healthy
  2. Reads pending alerts → pushes anything new to me
  3. Goes back to sleep

No constant polling by the main agent. The scanner does the heavy lifting, the agent just reads results. Clean separation of concerns.

The Bug That Proved Why We Need This

Here's the ironic part. The v2 version of this monitoring script had a bug:

# BUG: .ends is not a valid Python string method
if filename.endswith(".py"):
    process_file(filename)
elif filename.ends(".js"):  # ← raises AttributeError every time
    process_js_file(filename)

.ends() doesn't exist in Python. It's .endswith().

This script was supposed to have been running for 18 days. In reality, it crashed on its very first run and never once completed successfully.

How did I find out? Only when I built the v3 version with the watchdog and the watchdog reported: "[FAIL] Log file missing."

Before that? I assumed it was working because "I deployed it and didn't get any errors." Classic case of monitoring that doesn't work giving you false confidence.

The fix was trivial (a few characters). But finding it took 18 days without proper observability.

What I'd Do Differently

Looking back after 3 months of running this:

  • JSON files instead of a DB - keep. Simple, debuggable, zero maintenance.
  • Python + gh CLI - maybe change. Works well, but a Node.js version would integrate tighter with the rest of the stack.
  • 5-minute poll interval - change. GitHub webhooks would give instant push; polling wastes API calls.
  • Separate watchdog script - keep. Best decision of the build; it catches exactly the class of bugs that matter.
  • Flat alert JSON file - change. A small queue (Redis/SQLite) would be more reliable if the agent misses a heartbeat.
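For the webhook route, the receiving end wouldn't need much. Here's a minimal sketch using only the standard library - the port and secret are placeholders, and the signature check follows GitHub's documented X-Hub-Signature-256 scheme:

```python
import hashlib
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SECRET = b"replace-with-your-webhook-secret"  # set the same secret on GitHub

def valid_signature(body: bytes, header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against our secret."""
    expected = "sha256=" + hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)

class Hook(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if not valid_signature(body, self.headers.get("X-Hub-Signature-256", "")):
            self.send_response(401); self.end_headers(); return
        event = self.headers.get("X-GitHub-Event", "")
        if event in ("pull_request", "issue_comment", "pull_request_review"):
            json.loads(body)  # parse payload; append to pending-alerts.json here
        self.send_response(204)
        self.end_headers()

# To run it: HTTPServer(("", 8080), Hook).serve_forever()
```

The trade-off: webhooks need a publicly reachable endpoint, which is exactly the operational overhead the 5-minute poll avoids.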

Live Stats (Right Now)

As of writing this, the system is tracking:

  • 26 pull requests across repositories including asyncapi/modelina, n8n-as-code, memtomem, claude-builders-bounty, and others
  • 1 merged (memtomem#130 - $100 bounty) ✅
  • 1 closed (pgmpy#3323 - maintainer rejected, lesson learned)
  • 24 open and waiting for review feedback
  • Scanner uptime: Healthy (watchdog confirmed < 10 min ago)

Try It Yourself

The core pattern here - scan → compare state → emit events → consume events - is applicable way beyond PR monitoring. You could use the same architecture for:

  • Issue triage (new issues matching your criteria)
  • Dependency security alerts (new CVEs in your deps)
  • Competitor monitoring (changes to their repos/docs)
  • Your own product's issue tracker

The key insight: let the dumb script do the dumb work, and let your agent make the smart decisions.
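Distilled down, the scan → compare → emit half of the pattern fits in about a dozen lines. This is a generic sketch (the function and file names are mine, not from the monitor) that you can point at any data source:

```python
import json
from pathlib import Path

def scan_and_diff(fetch, state_path, keys):
    """Generic scan → compare → emit: return events for anything that changed."""
    path = Path(state_path)
    old = json.loads(path.read_text()) if path.exists() else {}
    events = []
    for key in keys:
        new = fetch(key)  # e.g. one gh/API call per tracked item
        if new != old.get(key):
            events.append({"key": key, "old": old.get(key), "new": new})
        old[key] = new
    path.write_text(json.dumps(old, indent=2))  # persist the new snapshot
    return events  # the consumer (your agent) decides what matters
```

Swap `fetch` for a CVE feed, an issue search, or a docs scraper and the rest stays the same; only the consumer needs to know what the events mean.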


Built with OpenClaw. Running in production. Catching its own bugs since 2026.

Questions? Drop a comment. Happy to share more details about the implementation.
