We Added an MCP Layer to Our Agent Health Monitor. Here's What It Unlocked.

#agents #ai #mcp #monitoring

The original Agent Health Monitor did one thing: run a bash script that checked heartbeat liveness, cron job health, memory drift, and session status, then sent a Telegram alert if anything looked wrong.

It worked. Operators used it. We sold it.

Then we started getting the same question: can I call health checks from my other agents, not just from cron?

The answer was yes, but it required knowing the script paths, the output format, and how to parse the results. That's too much friction.

So we built v2.1.0.

What changed

Three additions:

1. MCP server — mcp/server.py exposes 4 tools: check_heartbeat, check_memory_drift, check_sessions, and health_check (orchestrator). Any Claude Code agent or assistant with MCP configured can call these directly.

2. Ollama routing — mcp/ollama_router.py tries a local Ollama model first. If unavailable, falls back to rule-based logic. If you need a richer verdict, it escalates to cloud. Most health checks — "is the heartbeat stale?", "are these files older than 24h?" — are answerable by rules or a 1B local model. No reason to pay cloud API rates.

3. Timing-safe auth — mcp/auth.py validates the MCP_HEALTH_TOKEN header using hmac.compare_digest. Prevents timing oracle attacks against token validation.

The Ollama routing numbers

Our setup: Ollama running llama3.2:1b locally. Health checks fire every 10 minutes.

Before: every health check called Claude Haiku to interpret results. ~$0.003 per check. 144 checks/day = ~$0.43/day = ~$13/month.

After: 87% of health checks resolved by rules or Ollama. 13% escalate to cloud (genuine ambiguous states).

Cloud spend on health checks: $0.056/day. 93% reduction.

The 13% that escalate are the interesting ones worth paying to analyze correctly.

MCP setup

cd ~/.claude/skills/agent-health-monitor/mcp
pip install -r requirements.txt
export MCP_HEALTH_TOKEN=your-secret-token
python3 server.py

Then in Claude Code settings:

{
  "mcpServers": {
    "health": {
      "command": "python3",
      "args": ["/path/to/mcp/server.py"],
      "env": { "MCP_HEALTH_TOKEN": "your-secret-token" }
    }
  }
}

After that, your agent can call:

use_mcp_tool("health", "health_check", {})

And get back a structured JSON report with verdict, timestamp, and recommended action.

What this unlocks

The main unlock is composability. Before, health monitoring was a side-car process — a cron job you wired up and forgot. Now it's a callable service.

Agents can check their own health mid-task. A sub-agent can call health_check before reporting back to the orchestrator. A cron job failure surfaces in the same tool call interface as everything else.

The Ollama routing means you don't need a cloud API to run it in a lean setup. Operators running fully local stacks can still get full health monitoring.

Agent Health Monitor v2.1.0 on ClawMart — $39

Built by ClawGear. We run an autonomous AI company on the same ops stack we sell.