
Mathias Eberlein


A Fact A Day, an autonomous Podcast as my entry 4 Hermes Agent Challenge

Hermes Agent Challenge Submission

This is a submission for the Hermes Agent Challenge

What I Built

"A Fact A Day" is an autonomous, daily educational podcast that summarizes current scientific and technological breakthroughs in approximately 60 seconds.

The project solves a concrete problem: knowledge is everywhere, but hard to consume. People who want to stay up-to-date don't have time to read long articles or listen to hour-long podcasts. A Fact A Day delivers exactly one piece of information per day — well-researched, well-told, in one minute.

The special feature: the entire production process is automated and runs via cron job once per weekday:

  1. Research current breakthroughs from trusted sources (MIT Tech Review, Ars Technica, Wired, NYT Tech)
  2. AI-assisted script creation with strict structure (strong hook → information → open question)
  3. Speech synthesis with Stepfun TTS (female voice, excited emotion)
  4. Automatic audio editing and merging into a finished MP3
  5. Publication on the archive page
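The "strict structure" in step 2 can be checked mechanically. The following is a minimal sketch, not the code from the repository: it assumes the word budget of 120 ± 15 words stated later in this post, and it treats "ends with a question mark" as a rough stand-in for the closing open question. The function name `check_script` is hypothetical.

```python
TARGET_WORDS = 120  # word budget quoted in this post
TOLERANCE = 15      # allowed deviation in either direction


def check_script(script: str) -> list[str]:
    """Return a list of problems; an empty list means the script passes."""
    problems = []
    word_count = len(script.split())
    if abs(word_count - TARGET_WORDS) > TOLERANCE:
        problems.append(
            f"word count {word_count} outside {TARGET_WORDS} +/- {TOLERANCE}"
        )
    # Heuristic only: a trailing question mark approximates the "open question".
    if not script.rstrip().endswith("?"):
        problems.append("script does not end with a question")
    return problems
```

A check like this could run right after the LLM call and trigger a retry when the list is non-empty.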

Demo

https://www.redpandamonium.de/afactaday/

Code

https://github.com/tkdmatze/afactaday/

My Tech Stack

  • Hermes Agent – orchestration, cron job, skill system, Telegram integration
  • Stepfun API – LLM (step) for script creation, TTS (stepaudio-2.5-tts) for speech synthesis
  • Python – main script (generate_lecture.py) with requests, pydub, feedparser
  • defuddle-cli – token-efficient HTML extraction from news sources
  • ffmpeg – merging TTS audio snippets into a seamless MP3
  • Astro – archive page (https://www.redpandamonium.de/afactaday/)
  • nginx + rsync – deployment to my own servers
  • Telegram Bot API – delivery of the finished episode

How I Used Hermes Agent

Hermes Agent is the backbone of the entire system — without it, the podcast wouldn't work.

  1. Skill System as Reusable Building Blocks

The entire production process is implemented as a Skill (aura-lectures) that can be easily called and configured:

hermes run /aura-lectures --topic "quantum computing"

The skill encapsulates all dependencies: RSS feeds, HTML extraction, LLM prompting, TTS synthesis, audio merging. This makes the code maintainable and extensible.

  2. Cron Job for Full Automation

Via cronjob(action='create'), the skill is automatically executed every weekday at 08:30 UTC — without human intervention:

Registered in Hermes Cron
ID: XXXXXX
Schedule: 30 8 * * 1-5 (weekdays)
Command: hermes run /aura-lectures
Delivery: Telegram channel
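To make the schedule concrete, here is a small sketch of how the expression `30 8 * * 1-5` can be read. It assumes Hermes follows the standard five-field cron convention (minute, hour, day of month, month, day of week, with 0 = Sunday), which this post does not confirm. The helper names are hypothetical, and the sketch ignores step values such as `*/5`.

```python
from datetime import datetime, timezone


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field supporting '*', single numbers, lists and ranges."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by the expression."""
    minute, hour, day, month, weekday = expr.split()
    # Python: Monday=1 ... Sunday=7; cron: Sunday=0 ... Saturday=6.
    cron_weekday = when.isoweekday() % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )
```

Under these assumptions, a Monday at 08:30 UTC matches, while the same time on a Saturday or Sunday does not.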

  3. Terminal Tools for the Processing Chain

The Python script combines Python libraries (feedparser, requests) with subprocess calls to external command-line tools (defuddle-cli, ffmpeg):

  • feedparser to collect RSS feeds
  • defuddle-cli to clean HTML (retains only actual article content)
  • ffmpeg to merge TTS audio snippets
  • requests for all API calls (Stepfun Chat + TTS)
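The ffmpeg step can be illustrated with a short sketch. This is not the code from generate_lecture.py: it assumes the snippets are merged with ffmpeg's concat demuxer and re-encoded with libmp3lame at `-q:a 2`. The function name `build_concat_job` and all file paths are hypothetical.

```python
from pathlib import Path


def build_concat_job(chunks: list[str], list_file: str, output: str) -> list[str]:
    """Write an ffmpeg concat list and return the command that would merge it."""
    lines = []
    for chunk in chunks:
        # The concat list quotes paths with single quotes; embedded quotes
        # are written as '\'' (close quote, escaped quote, reopen quote).
        escaped = chunk.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-f", "concat",            # read the inputs from a concat list
        "-safe", "0",              # allow absolute paths in the list
        "-i", list_file,
        "-c:a", "libmp3lame",      # re-encode the merged audio as MP3
        "-q:a", "2",               # VBR quality level named in this post
        output,
    ]
```

Passing the returned list to `subprocess.run(cmd, check=True)` would run the merge, provided ffmpeg is installed and on the PATH. Building the command as a list rather than a shell string avoids quoting problems with unusual file names.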

Noteworthy

What makes this project special for me is how much Hermes Agent handles completely autonomously in the background: things that typically require explicit setup, configuration, or manual steps. That may be because I had run experiments before, so Hermes already knew where to store files, which libraries I prefer, and which design to use.

  • Skill scaffolding: Hermes created the complete skill structure (SKILL.md, scripts/generate_lecture.py) — no manual file creation needed.
  • RSS feed management: The skill automatically discovers and parses multiple RSS feeds (MIT Tech Review, Ars Technica, Wired, NYT Tech), extracts article URLs, and fetches full content — all without any external RSS-to-email service.
  • HTML cleaning: Using defuddle-cli, Hermes automatically strips headers, footers, and navigation from article pages, reducing token consumption by 5–10× before sending to the LLM — a step most podcast automations skip entirely.
  • Script quality enforcement: The LLM prompt is automatically structured to guarantee the exact format: strong hook → information → open-ended question, with strict word count (120 ± 15 words). No post-processing or manual editing required.
  • TTS chunking: When the generated script exceeds Stepfun's 1000-character limit per request, Hermes automatically splits it at sentence boundaries, synthesizes each chunk separately, and merges them back together — transparently.
  • Audio normalization: The tts_synthesize function handles rate limiting, retries, and temp file cleanup automatically. The final MP3 is encoded by ffmpeg at a high-quality variable bitrate setting (-q:a 2), without hand-crafting the ffmpeg command.
  • Telegram delivery: The finished MP3 is sent as a native voice message. Hermes detects the file path and handles the upload automatically, with no bot token management or manual API calls needed.
  • Archive page generation: After each episode, Hermes can automatically copy the MP3 to the server's audio folder and update the Astro page's episode list — the entire "publish to website" step is fully automatable.
  • Cron job registration: The recurring job is created with a single cronjob(action='create') call, complete with natural-language description, weekday schedule, and Telegram delivery target — no crontab editing or systemd service files.
  • Memory persistence: All decisions are stored in MEMORY.md and USER.md, so the skill works immediately on any new Hermes instance without re-running setup steps.
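The TTS chunking described in the list above can be sketched as follows. This is an illustration under assumptions, not the code Hermes generated: it uses the 1000-character limit quoted in this post, a simple regular expression to find sentence ends, and a hypothetical function name `split_for_tts`.

```python
import re

MAX_CHARS = 1000  # per-request limit quoted in this post


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        # A single sentence longer than the limit is hard-split as a last resort.
        while len(sentence) > limit:
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then go to the TTS endpoint in its own request, and the resulting audio files would be merged in order. The greedy packing keeps the number of requests low while never cutting a sentence that fits within the limit.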

In short: from "run the skill" to "episode delivered and published" — zero manual intervention. The entire pipeline is self-contained, robust, and fully documented within the skill system.
