What I Built
I built Hermes Sentry-Core, an autonomous, self-healing Site Reliability Engineering (SRE) and infrastructure optimization engine.
Traditional monitoring relies on passive alerting: when something breaks, a system pages the on-call engineer, who then has to wake up, pull logs, diagnose the issue, and manually restart services. Hermes Sentry-Core flips this model by acting as an automated first responder. Running as a headless background daemon, it continuously monitors system health. When a failure occurs, it intercepts container error states, uses an LLM to parse the logs and diagnose the likely root cause, executes targeted, low-risk remediation scripts (like restarting Docker containers), and finally sends a concise operational triage report straight to a Telegram channel.
It targets alert fatigue and aims to reduce mean time to recovery (MTTR) for known or transient infrastructure failures.
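To make the flow above concrete, here is a minimal sketch of a single diagnose, remediate, report pass. It is not the project's actual code: `Incident`, `handle_incident`, and the three callables are hypothetical names chosen for illustration, and the LLM call, the container restart, and the Telegram delivery are passed in as plain functions so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Incident:
    """Hypothetical record of a failed container and its recent log output."""
    container: str
    log_tail: str


def handle_incident(
    incident: Incident,
    diagnose: Callable[[str], str],
    remediate: Callable[[str], bool],
    notify: Callable[[str], None],
) -> str:
    """Run one diagnose -> remediate -> report pass and return the report text."""
    # Assumption: diagnose() wraps the LLM call and returns a short root-cause summary.
    diagnosis = diagnose(incident.log_tail)
    # Assumption: remediate() runs a low-risk action (such as a container restart)
    # and returns True if the service came back.
    recovered = remediate(incident.container)
    status = "recovered" if recovered else "needs human attention"
    report = (
        f"Container: {incident.container}\n"
        f"Diagnosis: {diagnosis}\n"
        f"Status: {status}"
    )
    # Assumption: notify() delivers the report to the team's channel.
    notify(report)
    return report


if __name__ == "__main__":
    # Stub callables stand in for the LLM, the restart, and the messaging gateway.
    handle_incident(
        Incident(container="web", log_tail="OOMKilled"),
        diagnose=lambda log: "container ran out of memory",
        remediate=lambda name: True,
        notify=print,
    )
```

Passing the three steps in as callables keeps the control flow testable without a live Docker host or an API key, and it makes the escalation path explicit: when remediation fails, the report says so instead of retrying silently.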
Demo
Code
https://github.com/dueprincipati/hermes-sentry-core
My Tech Stack
Agent Framework: Hermes Agent Framework
LLM Core: Anthropic Claude 3.5 Sonnet (Fallback: Claude 3 Haiku)
Infrastructure Management: Docker, Kubernetes (via kubectl)
Communication Gateway: Telegram Bot API
System Tools: standard POSIX utilities (grep, awk, cat, systemctl)
How I Used Hermes Agent
Hermes Agent was foundational in designing the secure, autonomous loop of this project. I leaned heavily on the following agentic capabilities:
- Natural Language Cron Trigger Subsystem: I used the Hermes abstraction layer to create continuous monitoring loops that watch system health metrics, without writing complex bash watchdogs.
- Deterministic Guardrails & Security: This is where Hermes truly shines for an SRE tool. I configured the agent with strict conservative_auto boundaries in config.json. By allowlisting only a small set of commands (docker, kubectl, grep, systemctl) and limiting log parsing to the last 200 lines, I could grant an LLM terminal access without fear of it executing dangerous operations (like rm or database wipes).
- Headless Gateways: I used the Hermes gateway system to bypass traditional web UIs entirely, routing the agent's diagnostic Markdown reports directly into a Telegram control feed, mirroring how SRE teams already communicate.

Hermes provided the balance of AI autonomy and rigid security boundaries necessary to build a self-healing infrastructure tool.
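The guardrail idea described in the list above (a command allowlist plus a 200-line log cap) can be sketched in a few lines. This is an illustration of the general technique, not Hermes Agent's implementation or its config.json schema: `ALLOWED_COMMANDS`, `MAX_LOG_LINES`, `is_allowed`, and `tail_lines` are hypothetical names, and the command set is simply copied from the post.

```python
import shlex
from collections import deque
from typing import Iterable, List

# Assumption: mirrors the commands named in the post, not a real Hermes setting.
ALLOWED_COMMANDS = {"docker", "kubectl", "grep", "systemctl"}
MAX_LOG_LINES = 200
# Characters that would let one command line chain or redirect into another.
SHELL_METACHARACTERS = ";|&`$><\n"


def is_allowed(command_line: str) -> bool:
    """Return True only for a single allowlisted program with no shell operators."""
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes: refuse rather than guess at the intent.
        return False
    if not tokens:
        return False
    return tokens[0] in ALLOWED_COMMANDS


def tail_lines(lines: Iterable[str], limit: int = MAX_LOG_LINES) -> List[str]:
    """Keep only the last `limit` lines so the model never sees unbounded logs."""
    return list(deque(lines, maxlen=limit))
```

Checking only the program name is a deliberately small example. Since docker and kubectl also have destructive subcommands, a production guardrail would probably need to allowlist specific subcommands and arguments as well.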
