nujovich

Posted on May 31

A plugin for Observability + Budget Guardrails built with Hermes Agent

#hermesagentchallenge #devchallenge #agents

Hermes Agent Challenge Submission: Build With Hermes Agent

Challenge Entry for the Hermes Agent Challenge

🚀 What Problem Does This Solve?

AI agent deployments often suffer from two critical blind spots:

Cost visibility — you discover a $500 OpenAI bill at the end of the month with no clue which cron jobs or sessions caused it
Budget control — runaway loops or expensive model choices can drain your account before you notice

hermes-telemetry solves both by giving you real-time observability and automatic budget enforcement for Hermes Agent.

🎯 Why This Plugin Matters

Every production AI system needs observability and cost control. This isn't just a nice-to-have — it's essential infrastructure.

Before this plugin, Hermes users had no way to:

Track spending per cron job or messaging platform
Set budget limits that actually pause runaway processes
Compare cost efficiency between different models
Get real-time cost alerts before hitting billing limits

Now they can manage their AI spend like a modern SaaS — with dashboards, alerts, and automatic circuit breakers.

✨ Key Features

Real Usage Data (Not Estimates)

Captures actual token counts and costs returned by providers like OpenRouter, OpenAI, and Anthropic. No guesswork.

Multi-Level Budget Enforcement

Soft warnings at 80% of budget
Hard tool blocks at 100% (prevents new API calls)
Cron job pauses for automated workflows
Scope-specific limits (global, per-cron-job, per-platform)
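
The thresholds above can be sketched as a small classifier. This is a minimal illustration and not the plugin's actual code: the names `BudgetState` and `classify_spend` are hypothetical, and the behavior for a non-positive limit is an assumption of this sketch.

```python
from enum import Enum


class BudgetState(Enum):
    OK = "ok"
    SOFT_WARNING = "soft_warning"  # at or above 80% of the limit
    HARD_BLOCK = "hard_block"      # at or above 100% of the limit


def classify_spend(spent_usd: float, limit_usd: float,
                   soft_ratio: float = 0.80) -> BudgetState:
    """Map current spend against a limit to an enforcement state."""
    if limit_usd <= 0:
        # Assumption: a zero or negative limit leaves no headroom, so block.
        return BudgetState.HARD_BLOCK
    ratio = spent_usd / limit_usd
    if ratio >= 1.0:
        return BudgetState.HARD_BLOCK
    if ratio >= soft_ratio:
        return BudgetState.SOFT_WARNING
    return BudgetState.OK
```

With a $5.00 daily limit, `classify_spend(4.00, 5.00)` lands in the soft-warning band and `classify_spend(5.00, 5.00)` in the hard-block band.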

Rich Analytics via Slash Commands

/stats — session performance, tool usage, cost breakdowns
/stats cron week — cron job cost comparison across time
/stats providers — which providers return real vs estimated data
/budget — current spending vs limits with visual indicators

Zero Model Awareness

Pure observability layer: it captures everything through hooks without changing model behavior, and with negligible added latency.

📊 Screenshots

Session Analytics (`/stats`)

Budget Status (`/budget`)

Cron Job Cost Comparison (`/stats cron week`)

Provider Analysis (`/stats providers`)

🧪 Proof of Concept: Real Data

I tested the plugin with three different models to validate pricing accuracy and budget enforcement:

| Model | Cost per Test Run | Budget Behavior |
| --- | --- | --- |
| `owl-alpha` (free) | $0.00 | No limits triggered |
| `claude-sonnet-4-6` | $0.31 | Soft warning at $0.001 limit |
| `claude-opus-4-7` | $2.23 | Hard pause enforced ✅ |

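Budget enforcement works. When I set a $0.001 daily limit and ran a cron job, the job was correctly paused, with $0.18 already spent. Spend is only recorded after the provider returns usage, so a limit this small is exceeded by the first call. When I raised the limit to $2.00, jobs resumed normally.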

Real provider data. OpenRouter returned actual token counts (Est% = 0%), not estimates. The plugin correctly captured and priced these.
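
To make the pricing step concrete, here is a minimal sketch of turning reported token counts into a cost and computing an estimated share. The names `CallUsage`, `call_cost`, and `estimated_percent` are hypothetical, the per-million-token rates in the usage note are made up, and reading Est% as "share of calls whose usage was estimated" is my assumption about that column.

```python
from dataclasses import dataclass


@dataclass
class CallUsage:
    prompt_tokens: int
    completion_tokens: int
    estimated: bool  # True when the provider returned no usage data


def call_cost(usage: CallUsage, input_per_mtok: float,
              output_per_mtok: float) -> float:
    """Cost in USD, given rates in USD per million tokens."""
    total = (usage.prompt_tokens * input_per_mtok
             + usage.completion_tokens * output_per_mtok)
    return total / 1_000_000


def estimated_percent(calls: list[CallUsage]) -> float:
    """Share of calls (0 to 100) whose token counts were estimated."""
    if not calls:
        return 0.0
    return 100.0 * sum(c.estimated for c in calls) / len(calls)
```

For example, 1,000 prompt tokens and 500 completion tokens at illustrative rates of $3 and $15 per million tokens come to about $0.0105.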

🏗️ Technical Implementation

Hook Pipeline Architecture

on_session_start → pre_api_request → ★ post_api_request → post_tool_call
                                     │
                                     ▼
                               [capture usage]
                                     │
                                     ▼
pre_llm_call (budget check) → pre_tool_call (tool gate) → SQLite storage
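
The flow in the diagram can be sketched in a few lines. The hook names come from the diagram, but the class, the method signatures, and the return values are assumptions made for illustration. They are not the Hermes Agent plugin interface or the plugin's real code.

```python
class TelemetrySketch:
    """Illustrative stand-in for the hook pipeline; all signatures are hypothetical."""

    def __init__(self, limit_usd: float, soft_ratio: float = 0.80):
        self.limit_usd = limit_usd
        self.soft_ratio = soft_ratio
        self.spent_usd = 0.0
        self.records = []  # stands in for the SQLite storage step

    def post_api_request(self, session_id: str, cost_usd: float) -> None:
        # Capture usage after the provider has answered, since that is
        # when real token counts and costs are known.
        self.records.append((session_id, cost_usd))
        self.spent_usd += cost_usd

    def pre_llm_call(self):
        # Budget check: return a warning string once the soft threshold is hit.
        if self.spent_usd >= self.soft_ratio * self.limit_usd:
            return (f"budget warning: ${self.spent_usd:.2f} "
                    f"of ${self.limit_usd:.2f} used")
        return None

    def pre_tool_call(self, tool_name: str) -> bool:
        # Tool gate: allow the tool only while spend is under the limit.
        return self.spent_usd < self.limit_usd
```

The point of the split is that capture happens after each API response, while the checks run before the next LLM or tool call, so enforcement always acts on spend that has already been recorded.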

Data Layer

SQLite WAL database — efficient, local, no external deps
Custom pricing.yaml — override provider rates for accurate cost calculation
budget.yaml configuration — flexible limits (daily/monthly, global/scoped)
94 tests covering edge cases and enforcement logic
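
As a rough idea of what a SQLite WAL store for usage records can look like, here is a minimal sketch using Python's standard `sqlite3` module. The table name, columns, scope strings, and function names are assumptions for illustration. The plugin's actual schema may differ.

```python
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a usage database in WAL mode."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS usage (
               id INTEGER PRIMARY KEY,
               scope TEXT NOT NULL,          -- hypothetical, e.g. 'cron:nightly'
               model TEXT NOT NULL,
               prompt_tokens INTEGER NOT NULL,
               completion_tokens INTEGER NOT NULL,
               cost_usd REAL NOT NULL,
               created_at TEXT NOT NULL DEFAULT (datetime('now'))
           )"""
    )
    conn.commit()
    return conn


def record_call(conn, scope, model, prompt_tokens, completion_tokens, cost_usd):
    """Insert one usage row inside a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO usage (scope, model, prompt_tokens,"
            " completion_tokens, cost_usd) VALUES (?, ?, ?, ?, ?)",
            (scope, model, prompt_tokens, completion_tokens, cost_usd),
        )


def cost_by_scope(conn, days: int = 7) -> dict:
    """Total cost per scope over the last `days` days."""
    rows = conn.execute(
        "SELECT scope, SUM(cost_usd) FROM usage"
        " WHERE created_at >= datetime('now', ?)"
        " GROUP BY scope ORDER BY scope",
        (f"-{days} days",),
    ).fetchall()
    return dict(rows)
```

A query like `cost_by_scope(conn, days=7)` is the kind of aggregation a weekly cron comparison would need. WAL mode only applies to file-backed databases, not to `:memory:` ones.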

Provider Compatibility

Works with any provider that follows the Hermes Agent provider interface:

✅ OpenRouter (tested with real usage data)
✅ OpenAI (pricing table included)
✅ Anthropic (pricing table included)
✅ Custom providers (via pricing.yaml overrides)

🎯 Production Ready

This isn't a demo — it's production infrastructure. The plugin includes:

Error handling — graceful fallbacks when providers return no usage data
Hot-reload — update budgets via /budget set without restart
Concurrent safety — SQLite WAL mode handles multiple sessions
Memory efficiency — hook pipeline adds negligible overhead
Comprehensive logging — debug telemetry issues with structured logs

🚀 Installation & Usage

1. Install

cd ~/.hermes/plugins
git clone https://github.com/nujovich/hermes-telemetry.git
# Add 'hermes-telemetry' to plugins.enabled in config.yaml
# Restart gateway: hermes gateway restart

2. Configure Budget (Optional)

# Set daily budget
hermes> /budget set global daily 5.00

# Check status  
hermes> /budget

3. Monitor Usage

# Session stats
hermes> /stats

# Cron job breakdown
hermes> /stats cron week

# Provider analysis
hermes> /stats providers

That's it. The plugin immediately starts capturing usage data for all sessions and cron jobs.

🏆 Why this is a win-win

This plugin solves a universal need in AI systems — cost visibility and control. Every Hermes Agent deployment, from personal automation to enterprise cron jobs, benefits from this infrastructure.

It's not just useful, it's essential. Without budget controls, a misconfigured cron job with an expensive model can cost hundreds of dollars overnight. This plugin prevents that.

Real-world tested. I built, deployed, and validated this with actual usage data across multiple providers and models. It's not a concept — it's working infrastructure that saves money and provides operational insight.

Community impact. This sets a standard for observability in the Hermes ecosystem. Other plugin authors can build on these patterns, and users get immediate operational confidence.

📋 Technical Details

Repository: https://github.com/nujovich/hermes-telemetry
Documentation: Complete README with architecture, configuration, and troubleshooting
Tests: 94 passing tests covering all major functionality
License: MIT
Dependencies: PyYAML only (for config files)

👨‍💻 About the Author

I'm Nadia Ujovich.

I understand the operational challenges of running AI systems at scale, and I built this plugin to solve the observability gap I see in every deployment.

This plugin makes Hermes Agent production-ready for cost-conscious deployments. It's the infrastructure piece that every serious AI system needs, but few teams build themselves.

Give your agents the observability they deserve. Try hermes-telemetry today.

Made with ☕ for the Hermes Agent ecosystem

Top comments (1)

nujovich • Jun 1 • Edited

🆕 Updates since posting — new features just shipped:
1. Dashboard Web UI
Added a standalone HTML dashboard for users who prefer charts and graphs over slash commands. Zero dependencies: just run `python3 serve.py` and open localhost:8765.
2. Pricing Auto-Refresh from OpenRouter
Manually maintaining pricing.yaml for hundreds of models is impractical. The plugin now auto-fetches pricing from OpenRouter's public API (320+ models) once every 24 hours:
python -m hermes_telemetry.pricing_refresh --verbose
Your manual overrides are always preserved — the auto-refresh only adds new models or updates previously auto-fetched ones.
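
The merge rule can be sketched as follows. This is an illustration under assumptions, not the plugin's code: the `source` field and the function name `merge_pricing` are hypothetical, and entries with no `source` tag are treated as manual here so that hand-written prices are never overwritten.

```python
def merge_pricing(existing: dict, fetched: dict) -> dict:
    """Merge auto-fetched prices into a pricing table without touching manual entries."""
    # Copy so the caller's table is left unmodified.
    merged = {model: dict(entry) for model, entry in existing.items()}
    for model, prices in fetched.items():
        current = merged.get(model)
        # Assumption: anything not explicitly tagged "auto" was written by hand.
        if current is not None and current.get("source", "manual") != "auto":
            continue
        # New model, or one that was auto-fetched earlier: take the fresh price.
        merged[model] = {**prices, "source": "auto"}
    return merged
```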
3. Estimated-Price Models + Smart Budget Degradation
Some OpenRouter models have no fixed pricing (e.g. auto routing, experimental models). These showed up with negative prices, which skewed cost calculations.
Now they're handled properly:
- Prices normalized to $0.00 (so placeholder values don't distort your numbers)
- Flagged internally so the budget engine knows they exist
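
A minimal sketch of that normalization step, assuming each fetched entry carries `prompt` and `completion` prices as strings. The input shape, the output keys, and the `estimated_price` flag name are assumptions for illustration.

```python
def normalize_price(raw_price) -> tuple[float, bool]:
    """Return (price, flagged); negative placeholder prices become 0.0 and are flagged."""
    price = float(raw_price)
    if price < 0:
        return 0.0, True
    return price, False


def normalize_entry(model_id: str, pricing: dict) -> dict:
    """Normalize one model's pricing and mark it if either price was a placeholder."""
    prompt, prompt_flag = normalize_price(pricing["prompt"])
    completion, completion_flag = normalize_price(pricing["completion"])
    return {
        "model": model_id,
        "input": prompt,
        "output": completion,
        # Hypothetical flag the budget engine could read later.
        "estimated_price": prompt_flag or completion_flag,
    }
```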
How this affects budgets (on_estimated.mode explained):
The on_estimated setting in budget.yaml controls what happens when your spend includes calls where the exact cost is uncertain (either because the provider returned no usage data, OR because the model has no fixed price):
on_estimated:
  mode: warn_only    # default, safe mode
  # mode: enforce    # strict mode (use instead of warn_only)
Mode: warn_only (default)
What happens: Budget can reach 100% and trigger warnings, but hard tool blocks are downgraded to soft. You get alerts but your agent keeps working.
Safe default because estimates aren't precise enough to justify hard stops.
────────────────────────────────────────
Mode: enforce
What happens: Hard tool blocks fire normally (100% = tools actually stop). Use this only when all your models return real usage data AND have fixed pricing.
Why this matters: without this degradation, a single call to a free-tier model that the provider didn't report usage for could hard-stop your entire agent. The warn_only mode prevents that footgun.
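
The degradation logic can be sketched like this. It is an illustration only: the function name, the string return values, and the assumption that the downgrade applies only when the spend includes uncertain-cost calls are mine, based on the description above.

```python
def enforcement_action(spent_usd: float, limit_usd: float,
                       has_estimated: bool, mode: str = "warn_only") -> str:
    """Return 'ok', 'soft_warning', or 'hard_block' for the current spend."""
    if mode not in ("warn_only", "enforce"):
        raise ValueError(f"unknown on_estimated mode: {mode!r}")
    ratio = spent_usd / limit_usd if limit_usd > 0 else float("inf")
    if ratio >= 1.0:
        # Uncertain spend plus warn_only: downgrade the hard block to a warning.
        if has_estimated and mode == "warn_only":
            return "soft_warning"
        return "hard_block"
    if ratio >= 0.80:
        return "soft_warning"
    return "ok"
```

Under these assumptions, spend made up entirely of real usage data still hard-blocks at 100% in either mode. Only uncertain spend is softened, and only in warn_only.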
All of this is live in the repo. Installation is the same — just pull the latest and restart the gateway.
Repo: github.com/nujovich/hermes-telemetry