Lewis Sawe

I taught Hermes Agent to predict which API changes will break my system

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

Drift Detective is an MCP server that turns Hermes Agent into an API contract mutation tracker. It probes your microservices on a cron schedule, stores response shapes (fields, types, nesting depth), and classifies changes when they happen: additive, breaking, or cosmetic.

The interesting part: it learns. After you mark a few changes as "safe" or "breaking," it starts predicting. In week one it's noisy. By week three it knows that your payments service adding nullable fields is always fine, but your auth service changing any field name will break downstream consumers.

Demo

The demo runs against a local API server with four mutation stages:

Stage 1 (baseline): Agent records the shape of /api/users, /api/payments, /api/health.

Stage 2 (additive change): New fields appear. Agent flags them as low-urgency additive drift. I mark them "safe."

Stage 3 (breaking change): Fields get renamed and removed. Agent flags these as high-urgency breaking drift. I mark them "breaking."

Stage 4 (prediction fires): More fields get removed. This time the agent predicts "likely breaking" before I say anything. It recognized the removal pattern from my stage 3 feedback.

After a few interactions the alert quality is visibly different from probe 1. That's the whole point.

Code

Drift Detective

API contract mutation tracker that learns what breaks things.

Drift Detective probes your APIs on a schedule, stores response shapes, detects structural changes, and classifies them. It learns YOUR system's patterns from your feedback. Alerts get smarter, not noisier.

What It Does

  1. Probes API endpoints, extracts JSON response shape (field names, types, nesting)
  2. Detects shape changes between probes
  3. Classifies changes: additive (new field) or breaking (removed/renamed/type-changed)
  4. Learns from your feedback. Mark changes as "safe" or "breaking" and it remembers.
  5. Predicts future changes using accumulated knowledge

Hermes Features Used

| Feature | How |
| --- | --- |
| MCP Server | Custom stdio server providing probe/classify/learn tools |
| Cron Scheduler | Periodic endpoint probing, no manual intervention |
| Persistent Memory | Endpoints, shapes, and learned patterns survive across sessions |
| Learning Loop / Skills | Writes skill docs about your system's change patterns |
| AGENTS.md Context | Defines alert behavior and classification rules |

Demo Walkthrough

1. Start the demo API

python demo/api_server.py

Local API with…

My Tech Stack

Drift Detective's stack:

  • Python 3.11+ (runtime)
  • MCP SDK (mcp>=1.0.0) for the stdio server protocol
  • httpx for probing API endpoints
  • SQLite for persistence (shapes, history, learned patterns)
  • Hermes Agent as the orchestrator (MCP client, cron, memory)
  • Demo API: stdlib http.server (no dependencies)

How I Used Hermes Agent

This isn't a wrapper that calls the LLM once. Five Hermes capabilities do actual work here:

MCP Server (custom stdio): The core engine. Five tools: probe_endpoint, list_endpoints, get_drift_history, record_verdict, get_learned_patterns. All state lives in SQLite. The agent reasons about when and how to call them.

Cron Scheduler: Fires every 30 minutes (configurable). The agent probes all registered endpoints, compares shapes, and delivers a report to Telegram/Discord/wherever you talk to it. No human in the loop for routine checks.

Persistent Memory: Endpoint registry, shape history, and learned patterns survive across sessions. The agent picks up where it left off even after a restart.

Learning Loop / Skills: When the agent accumulates enough feedback, it writes a skill document describing your system's change patterns. That skill loads into future sessions, giving the agent prior context before it even runs a probe.

AGENTS.md Context: Defines classification rules, alert urgency levels, and when to include predictions vs. ask for feedback. Shapes the agent's behavior without touching code.

How It Works (Technical)

The MCP server extracts a structural "shape" from any JSON response:

{"users": [{"id": 1, "name": "Alice"}], "total": 1}

Becomes:

$.total         → integer
$.users[]       → array
$.users[].id    → integer  
$.users[].name  → string
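For readers who want to see how a response can be flattened into paths like these, here is a minimal sketch. It is not the code from the repo: the function names `json_type` and `extract_shape` are hypothetical, and it assumes that arrays are recorded at `path[]`, that objects are not recorded themselves (only their fields), and that scalar array items are skipped so they don't overwrite the array entry.

```python
import json


def json_type(value):
    """Map a parsed JSON scalar to a type name (names are an assumption)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"not a JSON scalar: {value!r}")


def extract_shape(value, path="$"):
    """Flatten parsed JSON into a {path: type} dict."""
    shape = {}
    if isinstance(value, dict):
        # Objects contribute only their fields, not an entry of their own.
        for key, child in value.items():
            shape.update(extract_shape(child, f"{path}.{key}"))
    elif isinstance(value, list):
        item_path = f"{path}[]"
        shape[item_path] = "array"
        # Merge the fields of every item so optional keys are not missed.
        for item in value:
            if isinstance(item, (dict, list)):
                shape.update(extract_shape(item, item_path))
    else:
        shape[path] = json_type(value)
    return shape


payload = json.loads('{"users": [{"id": 1, "name": "Alice"}], "total": 1}')
for field_path, field_type in sorted(extract_shape(payload).items()):
    print(f"{field_path} → {field_type}")
```

Run against the example response, this prints the same four paths listed above. Values are discarded on purpose: only structure is compared between probes.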

When a shape changes, the diff engine classifies each field-level change:

  • New field added → additive
  • Field removed or renamed → breaking
  • Type changed (string→integer) → breaking
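The rules above can be sketched as a set comparison between two `{path: type}` dicts. This is an illustration under assumptions, not the repo's diff engine: `Change` and `diff_shapes` are hypothetical names, and a rename is not detected as such here. It simply shows up as one removal (breaking) plus one addition (additive).

```python
from typing import NamedTuple


class Change(NamedTuple):
    path: str
    kind: str      # "added", "removed", or "type_changed"
    category: str  # "additive" or "breaking"


def diff_shapes(old, new):
    """Compare two {path: type} dicts and classify each field-level change."""
    changes = []
    # Paths that vanished: removed (or the old half of a rename).
    for path in sorted(old.keys() - new.keys()):
        changes.append(Change(path, "removed", "breaking"))
    # Paths that appeared: new fields.
    for path in sorted(new.keys() - old.keys()):
        changes.append(Change(path, "added", "additive"))
    # Paths present in both probes whose type differs.
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            changes.append(Change(path, "type_changed", "breaking"))
    return changes


before = {"$.total": "integer", "$.users[].id": "integer", "$.users[].name": "string"}
after = {"$.total": "string", "$.users[].id": "integer", "$.users[].email": "string"}
for change in diff_shapes(before, after):
    print(change)
```

An empty result means the two probes have the same shape, so no alert is needed.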

The learning system stores verdicts keyed by endpoint + change category. After one verdict for a pattern, predictions fire on the next similar change. It generalizes: if "removed field: name" was breaking, then "removed field: email" on the same endpoint gets the same prediction. The patterns are simple and domain-specific, so one data point is enough to be useful.
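A rough sketch of that lookup, assuming a single SQLite table keyed by endpoint and change kind. The table layout, `open_store`, and `predict` are hypothetical; `record_verdict` borrows the name of the MCP tool, but this is not its actual implementation. Because the field name is left out of the key, a verdict on one removed field carries over to any other removed field on the same endpoint.

```python
import sqlite3

VALID_VERDICTS = ("safe", "breaking")


def open_store(path=":memory:"):
    """Open (or create) the verdict store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS verdicts (
            endpoint    TEXT NOT NULL,
            change_kind TEXT NOT NULL,
            verdict     TEXT NOT NULL,
            PRIMARY KEY (endpoint, change_kind)
        )
        """
    )
    return conn


def record_verdict(conn, endpoint, change_kind, verdict):
    """Store the latest human verdict for this endpoint + change kind."""
    if verdict not in VALID_VERDICTS:
        raise ValueError(f"verdict must be one of {VALID_VERDICTS}")
    conn.execute(
        "INSERT OR REPLACE INTO verdicts (endpoint, change_kind, verdict) "
        "VALUES (?, ?, ?)",
        (endpoint, change_kind, verdict),
    )
    conn.commit()


def predict(conn, endpoint, change_kind):
    """Return 'likely safe' / 'likely breaking', or None without a verdict."""
    row = conn.execute(
        "SELECT verdict FROM verdicts WHERE endpoint = ? AND change_kind = ?",
        (endpoint, change_kind),
    ).fetchone()
    return f"likely {row[0]}" if row else None


conn = open_store()
# Feedback on "removed field: name" ...
record_verdict(conn, "/api/users", "removed", "breaking")
# ... is reused when "removed field: email" shows up on the same endpoint.
print(predict(conn, "/api/users", "removed"))
print(predict(conn, "/api/payments", "removed"))
```

The second lookup returns nothing because no verdict exists for that endpoint yet, which is the case where the agent would ask for feedback instead of predicting.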

What I'd Build Next

  • Webhook mode: listen for deploy events from CI/CD, probe immediately after deploys
  • Consumer registry: know which downstream services depend on which fields, route alerts accordingly
  • Schema diffing beyond JSON: gRPC protobuf changes, GraphQL schema introspection
  • Multi-endpoint correlation: "every time auth-service changes, payments-service breaks 2 hours later"

Source Code

GitHub link

drift-detective/
├── mcp_server/server.py          # MCP server with probe/classify/learn tools
├── demo/api_server.py            # Mutable demo API
├── skills/drift-detective-patterns.md
├── AGENTS.md
└── pyproject.toml

Install: run pip install -e . from the repo root, then add the MCP config to ~/.hermes/config.yaml.
