Prashant Maurya

Posted on Jun 1

I Let Hermes Agent Run My Workflow for a Week — Here's What Actually Happened

#hermesagentchallenge #devchallenge #ai #productivity

Hermes Agent Challenge Submission: Write About Hermes Agent

This is a submission for the Hermes Agent Challenge: Write About Hermes Agent

I'll be honest: I've seen a lot of "AI agent" tools that impress in demos and disappoint in daily use. The demo runs perfectly. Then you try it on your actual messy workflow and it falls apart.

So when I set up Hermes Agent, I didn't benchmark it. I just used it — for real tasks, over a week. This is what happened.

Day 0: Setup (Took 4 Minutes)

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
hermes setup --portal
hermes chat

That's it. No Docker config, no Python environment juggling, no API key hunting across five different services. The --portal flag sets up Nous Portal which bundles web search, image generation, TTS, and browser automation under one subscription — no separate keys needed.

One thing worth knowing: Hermes auto-detects your OS and installs prerequisites. On Ubuntu it grabbed uv, ripgrep, and fd automatically. You can also opt into docker backend if you want isolated execution:

# ~/.hermes/config.yaml
terminal:
  backend: docker
  docker_image: python:3.11-slim

I kept local for the first week. Docker later when I needed isolation for untrusted scripts.

Day 1: The First Real Test — A Research Task

I had to summarize three recent papers on multi-agent coordination and write a one-page briefing. Normally: 45 minutes of reading, note-taking, drafting.

I typed: "Research recent papers on multi-agent LLM coordination from 2025–2026, summarize the key approaches, and write a structured briefing I can share with my team. Save it as multi-agent-briefing.md."

It ran web searches, fetched abstracts, organized findings by theme, and wrote the briefing. Took about 6 minutes. The output needed minor edits — some citations were redundant — but the structure was solid and I saved 35 minutes.

More importantly: after finishing, Hermes created a skill.

✓ Created skill: research-briefing
  Procedure for structured literature research and briefing generation.
  Saved to ~/.hermes/skills/research-briefing.md

I didn't ask for this. It just did it because the task involved 7+ tool calls and a non-trivial workflow. That skill is now reusable — /research-briefing next time instead of re-explaining the whole thing.

Day 2: Connecting Telegram

I spend time away from my laptop. I wanted to be able to kick off tasks from my phone.

hermes gateway telegram

It walked me through the BotFather setup, gave me a QR code to scan, and asked which phone number to whitelist. Five minutes later I was sending tasks from Telegram and getting results in the same chat.

What this actually unlocks: I can start a long-running task from Telegram in the morning, have it run while I'm in class or commuting, and get the result waiting for me. The agent doesn't need me babysitting it.

The gateway supports 20+ platforms now — Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Teams, Email, SMS, and more. I stuck with Telegram because it's what I already have on my phone.

Day 3: Subagents — Parallelizing Work

This was the feature that surprised me most. Hermes can spawn child agent instances and run them in parallel:

/delegate Run these three tasks in parallel:
1. Summarize the Q1 changelog for our project
2. Write unit tests for the auth module
3. Search for any known CVEs in our dependencies

It spawned three subagents with isolated contexts and tool access. All three ran concurrently. Results came back in one consolidated reply.

The default is 3 concurrent subagents. You can bump it up in config. Each subagent gets its own terminal session, so they don't step on each other's file state.

This is the kind of thing that sounds like a toy until you realize you just compressed three sequential tasks into one parallel run.

Day 4: The Curator Ran Overnight

I woke up to a notification:

Hermes Curator finished a review cycle.
- Consolidated 3 skills into 1 (research-briefing, research-summary, lit-review → research-workflow)
- Pruned 1 outdated skill (old deploy config — superseded by deploy-runbook)
- Updated 2 skills with corrections from recent task recoveries

The Autonomous Curator is a background agent that runs on a 7-day cycle (configurable). It grades your skill library, consolidates related skills, prunes dead ones, and rewrites skills that have been corrected during use.

I had four days of skills by this point and it had already found redundancies I didn't notice. The pruned deploy skill was genuinely outdated — I'd updated the config and the old skill would have led the agent astray.

This is the self-maintenance loop in practice. The skill library doesn't just grow — it stays accurate.

Day 5: Scheduled Tasks (Cron)

hermes cron add "Every Monday at 8am, search for new arXiv papers on LLM agents, summarize top 3, and send to Telegram" \
  --skill research-workflow

It parsed the natural language, confirmed the schedule (0 8 * * 1), attached the research-workflow skill, and registered the job. Now every Monday morning I have a paper digest in Telegram without touching anything.

You can also use proper cron expressions if you prefer them. Jobs support pause/resume/edit, and results get delivered to whichever platform you specify.

Day 6: The API Server

This one I didn't expect to be useful, but it was. Hermes exposes an OpenAI-compatible endpoint:

hermes proxy start
# Listening on localhost:8080

This means any tool that talks to OpenAI's API — Aider, Cline, VS Code Continue, Codex — can now route through Hermes. You get Hermes's memory, skills, and tool access through whatever interface you already use.

I pointed my VS Code Continue extension at localhost:8080 and immediately had access to all my project skills inside the editor. No context re-explaining. The agent already knew my project structure from previous sessions.

Day 7: What I Actually Think

After a week, a few things are clear:

What works really well:

The install-and-run experience is genuinely smooth
Telegram gateway makes it actually portable
Skill auto-creation means the agent gets better at your specific tasks without you managing it
The API proxy is a quiet force-multiplier — existing tools suddenly get memory
Parallel subagents save real time on decomposable work

What requires adjustment:

Cold starts with many skills loaded can feel slow (v0.12 cut this by ~57%, still noticeable)
The skill auto-creation is aggressive — you'll want to review the library after the first week and delete anything that's too narrow or task-specific
Browser automation occasionally needs a retry on JS-heavy sites

What I underestimated:
The compounding effect. Day 1, it's just a capable agent. By day 7, it has a library of skills tuned to my actual workflow, scheduled tasks running without my input, and memory of my project context. The gap between day 1 and day 7 is larger than I expected.

The Setup That Makes Sense for Most Developers

If you want to reproduce a useful configuration fast:

# 1. Install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# 2. Setup with portal (handles model + tools in one)
hermes setup --portal

# 3. Connect Telegram (or Discord/Slack)
hermes gateway telegram

# 4. Add a cron task you'll actually want
hermes cron add "Every morning at 9am summarize my GitHub notifications and send to Telegram"

# 5. Let it run for a week before judging it
hermes chat