Preetha

Posted on May 30

I Built a Medication Safety Companion for Home Care with Hermes Agent

#hermesagentchallenge #devchallenge #agents #healthcare


This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

Across millions of homes, older adults managing multiple prescriptions face the same quiet crisis every day.

"I think I took my blood thinner twice today. Or maybe I forgot it.
I honestly can't remember."

That conversation is where this project started.

What I Built

The Medication Safety Companion is a home-care assistant that runs inside Telegram. Older adults talk to it by voice note or text message — the same way they already use their phone. It records doses, figures out which pill they mean when the description is vague, stops them from taking the same dose twice, and escalates immediately when something dangerous is happening. It also sends proactive reminders when a dose is overdue, with no action needed from the patient.

Everything the patient and caregiver see is delivered through Hermes Agent. The patient never interacts with Python. They talk to Hermes through Telegram, and Hermes handles the full conversation from greeting to confirmation to escalation.

The core design decision: Hermes is in charge of the conversation. A deterministic Python pipeline is in charge of every safety decision. The two never swap roles.

Demo

Here is what an actual Telegram session looks like. All of these exchanges were captured live on the running system.

https://github.com/preetharaj/Rx-Guardian/blob/main/docs/hermes_demo.gif
https://github.com/preetharaj/Rx-Guardian/blob/main/docs/Telegram_demo.gif

Confirming a dose by voice

The patient sends a voice note through Telegram. Hermes receives it via its Telegram gateway, passes it through faster-whisper for local transcription, and routes the transcript to the safety pipeline. The patient sees the transcription echoed back so they can catch any mishearing.

Patient sends voice note: "I took my heart pill"

🎙️ I heard: I took my heart pill

Got it. I've recorded that you took your heart pill at 08:14 UTC.
Let me know if you miss a dose.

When the description is ambiguous

The system never guesses. It asks. Hermes holds the clarification session open across turns using its built-in memory.

Patient: "the white pill"

I found more than one pill that could match.
1) Metoprolol 50mg  — white oval, blue cap bottle
2) Lisinopril 10mg  — white round, orange bottle, morning dose
3) Amlodipine 5mg   — white round, yellow bottle, evening dose

Which one did you take?

Patient: "2"

Got it. I've recorded that you took your pressure pill at 08:17 UTC.

When the dose was already taken

Patient: "I took my heart pill again"

I already have a record that you took your heart pill today at 08:14 UTC.
You don't need to take it again.
If you think there's a mistake, ask your caregiver to check.
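The check behind that reply can be sketched in a few lines. This is a minimal illustration, not the code in duplicate_guard.py: the 6-hour window comes from the module description, while the function name and the list-of-tuples stand-in for the audit log are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

DUPLICATE_WINDOW = timedelta(hours=6)  # window length as described in this post


def find_recent_confirmation(med_name, now, confirmations):
    """Return the earlier confirmation time if med_name was confirmed
    inside the window, else None.

    `confirmations` is a hypothetical stand-in for audit-log rows:
    an iterable of (med_name, confirmed_at) tuples.
    """
    for name, confirmed_at in confirmations:
        age = now - confirmed_at
        if name == med_name and timedelta(0) <= age < DUPLICATE_WINDOW:
            return confirmed_at
    return None


log = [("heart pill", datetime(2026, 5, 29, 8, 14, tzinfo=timezone.utc))]
second_attempt = datetime(2026, 5, 29, 9, 0, tzinfo=timezone.utc)
if find_recent_confirmation("heart pill", second_attempt, log):
    print("DUPLICATE_BLOCKED")
```

Returning the earlier timestamp rather than a bare boolean lets the reply quote the time of the first confirmation, as the message above does.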

When something unsafe is mentioned

Normal flow stops. No dose gets logged. Hermes delivers this message verbatim — the response formatting step is bypassed entirely for escalations.

Patient: "I want to take an old pain pill with my blood thinner"

I need to stop here. That medication may not be safe to take
alongside your current prescriptions. Please do not take it right now.
Contact your doctor or pharmacist before taking anything new.
I've made a note for your caregiver.

When it sounds like an emergency

Patient: "I accidentally took four pills by mistake"

This sounds urgent. Please call 995 right now or ask someone
nearby to call for you. Tell them which medication was taken
and roughly how much. I have notified your caregiver.
Do not take anything else until help arrives.

Proactive reminder — Hermes cron, no patient action needed

Every morning, Hermes's scheduler runs dispatch.py --remind. If a dose is overdue, Hermes sends the reminder to Telegram automatically. The patient does not have to open the app first.

⚠️ Reminder: You haven't confirmed your blood thinner yet today.
This is an important medication. Please take it and reply:
'I took my blood thinner'

The audit log

/logs

📋 Recent events:
🚨 [2026-05-29 01:09] EMERGENCY_ESCALATION
⚠️  [2026-05-29 01:08] DRUG_INTERACTION_ALERT
✅  [2026-05-29 01:06] MED_CONFIRMED — heart pill
✅  [2026-05-29 01:05] MED_CONFIRMED — pressure pill
🔍  [2026-05-29 01:05] MED_AMBIGUOUS
🔒  [2026-05-28 14:22] MED_DUPLICATE_BLOCKED — heart pill
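One way to make an audit table append-only in SQLite is to reject UPDATE and DELETE with triggers. The sketch below is an assumption about how that could look, not the project's actual schema: the table name, the columns, and the trigger approach itself are all hypothetical.

```python
import sqlite3


def open_audit_db(path):
    """Open the database in WAL mode and create an append-only audit table."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS audit_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            created_at TEXT NOT NULL DEFAULT (datetime('now')),
            event TEXT NOT NULL,
            detail TEXT
        );
        -- Hypothetical triggers: any attempt to change or remove a row aborts.
        CREATE TRIGGER IF NOT EXISTS audit_no_update BEFORE UPDATE ON audit_log
        BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
        CREATE TRIGGER IF NOT EXISTS audit_no_delete BEFORE DELETE ON audit_log
        BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
        """
    )
    return conn


def log_event(conn, event, detail=None):
    """Insert one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO audit_log (event, detail) VALUES (?, ?)", (event, detail)
    )
    conn.commit()
    return cur.lastrowid
```

With triggers like these, a caregiver correction has to be written as a new event instead of an edit to an old row.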

Code

GitHub: preetharaj/Rx-Guardian

My Tech Stack

Layer	What it does
Hermes Agent	The entire user-facing layer — receives messages and voice notes via Telegram gateway, manages conversation and session memory, runs the skill, delivers responses, schedules and sends proactive reminders via cron
Safety pipeline	Pure Python — deterministic rules only, zero AI involvement in any safety decision
Response formatting	OpenRouter free tier (nvidia/nemotron:free) — called by Hermes after the pipeline decides, only to rephrase output into warmer language
Voice transcription	faster-whisper running locally — Hermes passes Telegram voice notes through it, no audio leaves the device
Database	SQLite with WAL mode — 4 tables, immutable audit log
Tests	pytest — 94 tests, all passing without any API key

Two rows in that table share responsibility for what most people would call "the AI part": Hermes Agent and Response formatting. They do entirely different things. Hermes Agent is the patient-facing intelligence — it manages the full conversation, remembers what was said two turns ago, routes voice notes, and delivers all messages through Telegram. Response formatting is a narrow utility step downstream, called only after the safety pipeline has already made its decision, and its only job is to turn a structured result like {outcome: CONFIRMED, message: "..."} into a warmer sentence. A bad API response or a model hallucination in that step cannot change whether a dose is confirmed or an escalation fires — the pipeline already ran.

Core Python modules

lookup.py               — matches "heart pill" to Metoprolol, handles ambiguous descriptions
ambiguity_handler.py    — manages multi-turn clarification sessions with session keys
duplicate_guard.py      — checks audit log, blocks repeat doses within the 6-hour window
confidence_rules.py     — handles low-confidence voice transcriptions
emergency_escalation.py — 80+ trigger phrases across 5 categories, fires before anything else
caregiver_override.py   — explicit two-step correction with full audit trail
safety_router.py        — wires all the above together in a fixed priority order
reminder.py             — checks overdue doses and generates reminder messages
dispatch.py             — single CLI entry point that Hermes calls via terminal, returns JSON
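The "fixed priority order" idea in safety_router.py can be illustrated with a small chain of checks where the first non-empty result wins. This is a sketch under assumptions: the two toy checks, their phrase lists, and the function names are invented for the example and are far simpler than the real modules.

```python
def check_emergency(text):
    # Hypothetical two-phrase list; the real module is described as 80+ phrases.
    if any(p in text.lower() for p in ("took four pills", "overdose")):
        return {"outcome": "ESCALATION", "message": "This sounds urgent."}
    return None


def check_known_med(text):
    # Toy stand-in for the lookup step.
    if "heart pill" in text.lower():
        return {"outcome": "CONFIRMED", "message": "Recorded: heart pill."}
    return None


def route(text, checks):
    """Run checks in order; the first one that returns a result decides."""
    for check in checks:
        result = check(text)
        if result is not None:
            return result
    return {"outcome": "UNKNOWN_MED", "message": "I could not match that."}


# Order matters: the emergency check sits first so nothing can pre-empt it.
PIPELINE = (check_emergency, check_known_med)

print(route("I took four pills of my heart pill", PIPELINE)["outcome"])
```

Because the order is a plain tuple, a test can assert that the emergency check is always at index 0.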

How Hermes Agent Powers This Project

Hermes is not a wrapper around a prompt here. It is the patient-facing layer, the conversation manager, the voice transcription router, the scheduler, and the delivery channel. Without Hermes, this is a Python script that nobody can talk to.

Let me be specific about each piece.

1. The Telegram gateway handles voice and text

When a patient sends a voice note to the Telegram bot, Hermes receives it through its native Telegram gateway integration. The gateway downloads the audio, passes it through faster-whisper for local transcription, and forwards the resulting text to the skill for processing. The patient never had to install anything beyond Telegram. They did not type a command or navigate a menu. They just sent a voice message the same way they would send one to a family member.

This is Hermes's gateway doing real work. The voice-to-text pipeline, the Telegram connection, the message routing — all of it is handled by Hermes before the first line of safety Python runs.
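Once a transcript arrives, a low-confidence one should trigger a request to repeat rather than a guess. Below is a rough sketch of such a gate. It assumes the transcriber exposes a per-segment average log-probability (faster-whisper segments carry an avg_logprob field); the threshold, the LOW_CONFIDENCE outcome name, and the function names are hypothetical.

```python
import math

MIN_CONFIDENCE = 0.6  # hypothetical threshold, would need tuning on real audio


def transcript_confidence(avg_logprobs):
    """Map per-segment average log-probabilities to a rough 0..1 score."""
    if not avg_logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in avg_logprobs) / len(avg_logprobs)


def gate_transcript(text, avg_logprobs):
    """Return an ask-to-repeat result for weak transcripts, else None."""
    if not text.strip() or transcript_confidence(avg_logprobs) < MIN_CONFIDENCE:
        return {
            "outcome": "LOW_CONFIDENCE",
            "message": "I did not catch that clearly. Could you say it again?",
        }
    return None
```

Returning None for a good transcript lets the gate sit in the same first-result-wins chain as the other checks.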

2. The skill is the orchestration contract

The project is packaged as a native Hermes skill stored in hermes-skill/med-safety/SKILL.md. Dropping this file into ~/.hermes/skills/ registers /med-safety as a slash command and loads all the rules Hermes will follow for every interaction.

The SKILL.md contains:

A step-by-step procedure for what Hermes does on each turn
An outcome table mapping every pipeline result code to a specific response behaviour
Hard rules Hermes must never break (ESCALATION → deliver verbatim, nothing else)
The path and argument format for calling dispatch.py

Alongside it is a SOUL.md — Hermes's built-in personality system. This defines the voice of the assistant: calm, short sentences, everyday words, no medical jargon, one clear next step at the end of every message. The older adult on the other end of this conversation does not want to parse clinical language when they are worried about whether they double-dosed a blood thinner. The SOUL.md enforces that consistently across every response without repeating the instruction in every prompt.

This is why a skill is a better fit than a system prompt. A SOUL.md + SKILL.md combination gives the agent a spec it treats like a contract, not a suggestion.

3. The terminal tool separates conversation from safety decisions

Hermes calls dispatch.py via its terminal tool on every turn. The script runs the full safety pipeline and returns a JSON result:

{
  "outcome": "CONFIRMED",
  "message": "Got it. I've recorded that you took your heart pill at 08:14 UTC.",
  "session_key": null,
  "log_id": 7
}

Hermes reads the outcome field and acts according to the rules in SKILL.md. It never has to decide whether a dose was safe, whether a combination is dangerous, or whether something is an emergency. Those decisions came back in the JSON. Hermes just delivers.

This separation is intentional. The terminal tool pattern lets you put safety-critical logic in code you can test, audit, and reason about, while leaving the agent to do what it is actually good at: understanding natural language, managing conversation state, and delivering messages to a human.
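To make the contract concrete, here is a sketch of how a caller might consume the JSON that dispatch.py prints. It is illustrative only: Hermes does this through SKILL.md rules rather than through Python, and the function name, the returned dictionary shape, and the fallback wording are assumptions.

```python
import json

VERBATIM_OUTCOMES = {"ESCALATION"}  # outcomes that must never be rephrased

FALLBACK = "I could not process that. Please try again or ask your caregiver."


def handle_dispatch_output(stdout):
    """Parse the pipeline's JSON and decide how the message is delivered."""
    try:
        result = json.loads(stdout)
        outcome, message = result["outcome"], result["message"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Fail closed: malformed output is never treated as a confirmation.
        return {"deliver": FALLBACK, "rephrase": False, "session_key": None}
    return {
        "deliver": message,
        "rephrase": outcome not in VERBATIM_OUTCOMES,
        "session_key": result.get("session_key"),
    }


sample = '{"outcome": "CONFIRMED", "message": "Got it.", "session_key": null, "log_id": 7}'
print(handle_dispatch_output(sample))
```

The caller only routes on the outcome field. It never inspects the patient's words to decide whether something was safe.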

4. Session memory tracks multi-turn clarification

When "the white pill" comes back AMBIGUOUS, the pipeline returns a session_key — a unique identifier for the open clarification session. Hermes holds this key in its session memory and passes it automatically on the next turn:

python dispatch.py --session "ambig_20260529_083200" "2"

The patient just said "2". They did not say which session they were answering, what question was asked, or which medications were in the list. Hermes remembered all of that across turns. The session_key mechanism works because Hermes's memory layer exists and I did not have to build a separate state store to use it.
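On the pipeline side, the session key has to map back to the list that was shown. The sketch below assumes an in-memory dictionary and invented function names. The key format copies the example above, but how ambiguity_handler.py actually stores candidates is not shown in this post.

```python
from datetime import datetime, timezone

_sessions = {}  # session_key -> candidate medication names, in display order


def open_session(candidates, now):
    """Store the candidates and return a key in the ambig_YYYYMMDD_HHMMSS style."""
    key = "ambig_" + now.strftime("%Y%m%d_%H%M%S")
    _sessions[key] = list(candidates)
    return key


def resolve(session_key, reply):
    """Return the chosen medication, or None if the reply cannot be trusted."""
    candidates = _sessions.get(session_key)
    choice = reply.strip()
    if candidates is None or not choice.isdecimal():
        return None
    index = int(choice) - 1
    if not 0 <= index < len(candidates):
        return None  # out of range: leave the session open and ask again
    del _sessions[session_key]  # a session can be answered only once
    return candidates[index]


key = open_session(
    ["Metoprolol 50mg", "Lisinopril 10mg", "Amlodipine 5mg"],
    datetime(2026, 5, 29, 8, 32, 0, tzinfo=timezone.utc),
)
print(key, resolve(key, "2"))
```

An out-of-range or non-numeric reply returns None instead of falling back to a default choice, which keeps the "never guesses" rule intact.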

5. Cron drives the proactive reminder loop

Inside a Hermes session, this instruction registers a scheduled job:

Create a cron job: every day at 08:30
  cd /path/to/project && python dispatch.py --remind
If result is not [SILENT], send the message to Telegram.

dispatch.py --remind checks the reminder engine, compares each medication's scheduled time against the audit log, and returns reminder messages only for doses that are overdue and have not been sent yet today. Hermes's cron scheduler runs this check and delivers the result through Telegram without any polling loop, background thread, or separate infrastructure.
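The overdue check described above reduces to a small pure function. This is a sketch with assumed inputs: the schedule dictionary, the two sets, and the function name are hypothetical, and the real reminder.py reads these from the database.

```python
from datetime import datetime, time, timezone

SILENT = "[SILENT]"  # sentinel the cron instruction checks for


def build_reminders(schedule, confirmed_today, already_reminded, now):
    """Return reminder text for overdue, unconfirmed, not-yet-reminded doses."""
    messages = []
    for med, due in schedule.items():
        overdue = now.time() >= due
        if overdue and med not in confirmed_today and med not in already_reminded:
            messages.append(f"Reminder: You haven't confirmed your {med} yet today.")
    return "\n".join(messages) if messages else SILENT


schedule = {"blood thinner": time(8, 0), "pressure pill": time(20, 0)}
now = datetime(2026, 5, 29, 8, 30, tzinfo=timezone.utc)
print(build_reminders(schedule, set(), set(), now))
```

Passing `now` in as an argument, rather than reading the clock inside the function, is what makes this kind of logic testable without waiting for 08:30.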

A morning reminder that reaches the patient before they have forgotten is more useful than one they have to ask for. Hermes cron made that easy. I expected to spend a day on this part. It took about an hour.

6. The Telegram bot's own JobQueue as a redundancy layer

The Telegram bot also runs an independent JobQueue check every 15 minutes through python-telegram-bot. If Hermes cron misses a window — restart, connectivity, anything — the bot sends the reminder anyway. A patient's medication reminder should not depend on a single point of failure. Two independent schedules pointing at the same reminder.py logic is the right call.
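Two schedulers sharing one reminder path raises an obvious question: what stops the patient from getting the same reminder twice? One possible answer, sketched here as an assumption rather than a description of the project, is to let each scheduler claim the reminder in SQLite first and send only if the claim succeeds. The table and function names are invented.

```python
import sqlite3


def make_store():
    """Create a table whose primary key allows one reminder per med per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reminders_sent ("
        "med TEXT NOT NULL, day TEXT NOT NULL, sender TEXT NOT NULL, "
        "PRIMARY KEY (med, day))"
    )
    return conn


def claim_reminder(conn, med, day, sender):
    """Return True only for the first scheduler to claim (med, day)."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO reminders_sent (med, day, sender) VALUES (?, ?, ?)",
        (med, day, sender),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 means another scheduler already claimed it


conn = make_store()
print(claim_reminder(conn, "blood thinner", "2026-05-29", "hermes-cron"))
print(claim_reminder(conn, "blood thinner", "2026-05-29", "jobqueue"))
```

The database constraint does the deduplication, so neither scheduler needs to know the other exists.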

The architecture in one diagram

Patient voice note or text (Telegram)
    │
    ▼
Hermes Agent — Telegram gateway
    │  receives message, transcribes voice via faster-whisper,
    │  loads med-safety skill, reads SOUL.md personality
    │
    ▼
Hermes calls terminal: python dispatch.py "I took my heart pill"
    │
    ▼
safety_router.py — deterministic Python, no AI
    ├── emergency_escalation.py  — 80+ unsafe keywords, fires first
    ├── confidence_rules.py      — low-confidence STT → ask to repeat
    ├── lookup.py                — 4-pass medication matching
    ├── ambiguity_handler.py     — 2+ matches → clarification session
    └── duplicate_guard.py       — confirmed in last 6h → blocked
    │
    ▼ JSON: {outcome, message, session_key, log_id}
    │
    ▼
If outcome == ESCALATION:
    Hermes delivers message verbatim. Response formatting bypassed.
Else:
    OpenRouter rephrases into SOUL.md-consistent language
    Hermes delivers to Telegram
    │
    ▼
Audit log (SQLite) — immutable, every event written, nothing deleted

Escalation messages never reach OpenRouter. The moment format_response() sees outcome == ESCALATION, it returns the message as-is. The sentence telling someone to call emergency services will never be softened by a model trying to sound less alarming.
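The bypass can be sketched as follows. The name format_response() comes from the project, but the signature, the injected `rephrase` callable, and the fallback behaviour are assumptions made for this example.

```python
def format_response(result, rephrase):
    """Return the text to deliver for a pipeline result.

    `rephrase` is a hypothetical callable wrapping the model call; it is
    injected so the bypass can be tested without any API key.
    """
    message = result["message"]
    if result["outcome"] == "ESCALATION":
        return message  # verbatim: the model is never called
    try:
        warmer = rephrase(message)
    except Exception:
        return message  # a failed model call falls back to the pipeline's text
    if isinstance(warmer, str) and warmer.strip():
        return warmer.strip()
    return message


calls = []


def fake_rephrase(text):
    calls.append(text)
    return "All set! " + text


print(format_response({"outcome": "ESCALATION", "message": "Call 995."}, fake_rephrase))
print(len(calls))
```

Checking the outcome before touching the model means the escalation path has no network dependency at all.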

Safety rules, tested in code

python -m pytest tests/ -v
# 94 passed in 2.4s

Every safety property has a test. A few that matter:

"the white pill" with three candidates → AMBIGUOUS, nothing logged yet
Same medication confirmed twice within 6 hours → DUPLICATE_BLOCKED
MED_UNCERTAIN in the log → also blocks a second dose attempt
"I accidentally took four pills" → EMERGENCY_ESCALATION, message contains "995"
"I want to take ibuprofen" → DRUG_INTERACTION_ALERT, no dose confirmed
"I took fish oil" → SUPPLEMENT response, never logged as a prescription
"I took it again" → UNKNOWN_MED, not confirmed as insulin

That last one was a real bug. "I took it again" strips down to "it" through the intent extractor, and "it" fuzzy-matched "insulin" because both contain the same token. One silent wrong confirmation of a critical medication. The fix was a stopword list that prevents pronouns from reaching the medication matcher at all.
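A stopword filter of that kind might look like the sketch below. The word list and function name are illustrative assumptions, not the project's actual list. The point is only that a phrase made entirely of filler words yields nothing for the matcher to work with.

```python
import re

# Hypothetical list: pronouns and filler words that must never reach the matcher.
STOPWORDS = {
    "i", "took", "take", "taken", "my", "the", "a", "an",
    "it", "them", "that", "this", "again", "just", "have",
}


def extract_med_phrase(text):
    """Return the words left after removing stopwords, or None if nothing is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    return " ".join(kept) or None


print(extract_med_phrase("I took it again"))
print(extract_med_phrase("I took my heart pill"))
```

A None result can then map straight to UNKNOWN_MED, so the fuzzy matcher is never asked to compare a pronoun against a medication name.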

Running it yourself

git clone <your-repo>
cd med-safety-companion
pip install -r requirements.txt
pip install "python-telegram-bot[job-queue]" faster-whisper

cp .env.example .env
# OPENROUTER_API_KEY — free at openrouter.ai, no card required
# TELEGRAM_BOT_TOKEN — free from @BotFather on Telegram

python seed.py
python -m pytest tests/ -v
python cli.py --bot

Send /start to your Telegram bot. Say "I took my heart pill" by voice. Watch it land in the logs.

To run the full Hermes experience with the /med-safety slash command:

cp -r hermes-skill/med-safety ~/.hermes/skills/
hermes -s med-safety
# then type: /med-safety I took my heart pill

Three things I did not expect

Hermes's voice note handling eliminated an entire integration problem. I expected to write a custom webhook, a file download handler, and audio conversion code. The Telegram gateway handled all of that. I only had to write the transcription step and wire it to the pipeline. The hard part of "voice input on Telegram" was not hard at all.

SKILL.md is a more reliable contract than a system prompt. A well-written skill with outcome tables, hard rules, and a separate SOUL.md produces consistent behaviour across hundreds of turns. A long system prompt drifts. The skill acts like a spec the agent is trying to satisfy, not a tone it vaguely remembers.

Cron + Telegram is a complete proactive notification system. I expected to need a separate scheduler service, a notification database, and probably a Redis queue. I ended up with a --remind flag on a Python script and three lines in a Hermes cron definition. The proactive reminder feature — the one that might actually save someone from a missed dose — was the simplest part of the whole build.

What is next

Next is TTS, so the system can speak confirmations back to patients who find reading difficult. After that comes a caregiver summary view showing the week's log in plain language, and a way for caregivers to add or edit medications through the Telegram chat itself. Eventually, refill tracking: a warning when a medication is running low, before the patient runs out entirely.

Hermes is the voice, the memory, the scheduler, and the delivery channel. Python is the safety brain. One without the other is incomplete. Together they form something worth deploying in a real home.

Top comments (3)

Harjot Singh • May 31

Medication safety is a genuinely high-stakes use case and I respect that you picked a domain where being wrong actually matters - it forces the right discipline. The thing I'd be most careful about (and curious how you handled it): an LLM confidently hallucinating a drug interaction or dosage is worse than no tool at all, so the safety layer can't be the model's judgment alone - it needs to be grounded against an authoritative interaction database with the model orchestrating, not deciding.

That "model orchestrates, deterministic source decides" pattern is exactly the kind of gating I lean on in Moonshift (a multi-agent pipeline: prompt to a shipped SaaS on your own GitHub + Vercel) - agents propose, verified data/checks dispose - because in any domain with real consequences you can't let the probabilistic layer be the final authority. Routing keeps it ~$3 flat too. First run's free, no card. Important build - how did you guard against hallucinated medical info? I'd assume a grounded drug database + the agent only as the interface, but curious about your exact safety architecture.

Preetha • May 31

Thank you for the thoughtful feedback.
The 'model orchestrates, deterministic source decides' framing describes precisely what this build is doing. Hermes is the orchestrator; Python is the authority. There is no live drug interaction database in this version. The escalation layer uses a deterministic keyword list compiled from ISMP common error patterns, not a real-time database query. This is a reduced-scope decision for the Hermes challenge submission and is called out in the risk register. The safety architecture goes deeper than the keyword list, though. Hermes's skill file acts as a verifiable contract between the conversation layer and the safety pipeline.
The rules (escalation messages delivered verbatim, no dose confirmed without pipeline approval, uncertain cases never auto-resolved, and so on) are enforced by the skill definition. Hermes follows it as a runbook rather than acting on prompt suggestions. If the deterministic pipeline marks a case blocked, the model follows that decision. It reads {"outcome": "ESCALATION"} and delivers the message it was handed.
The second layer is session memory. In a multi-turn clarification such as 'the white pill' resolving across two turns, the session state is held by Hermes's memory system rather than inferred from context. The conversation relies on stored state, not reinterpretation.
Grounding against a trusted medication interaction API is the right next step for production. The current architecture was designed so that additional safety checks can be added behind the existing validation layer without changing the conversation workflow. Hermes brings the channel, the memory, the scheduling, and the skill contract into a single deployable system, so a verified data layer plugs into something already handling voice notes, proactive reminders, and caregiver correction flows across Telegram. That vertical integration means future safety enhancements slot in without disrupting the user experience or the operational workflows built around it.

Appreciate the thoughtful read.

Harjot Singh • May 31

"Model orchestrates, deterministic layer decides" is exactly the architecture a medical companion needs, and you clearly already have it. In a domain where a confidently-wrong dosage is harm, the LLM should never be the final authority on a fact, it parses and proposes, the rules engine verifies against the actual drug data. That's the same propose-then-verify spine I build everything on. The abstain path matters most here too: "I'm not sure, check with a pharmacist" beats a confident guess every time. Genuinely important thing to be building carefully.