DEV Community: Kunal Pratap Singh

I Built an Autonomous RBI Regulatory Digest Agent with Hermes Agent

Kunal Pratap Singh — Sun, 31 May 2026 18:30:14 +0000

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

The Problem Nobody Talks About

Every time the Reserve Bank of India publishes a circular, somewhere inside an Indian bank, a compliance officer opens a PDF.

They read it. They try to figure out what it means for their institution specifically. They write a summary email. They forward it to five department heads. They chase those department heads for two weeks to confirm it's been actioned. They build a spreadsheet to track all of this. And then the next circular drops and the cycle starts again.

RBI publishes hundreds of circulars a year. SEBI publishes more. MCA publishes more still. Compliance teams at Indian banks are drowning — not because they're incompetent, but because the volume of regulatory output has outpaced any reasonable human ability to track it manually.

The fine for missing a deadline isn't a polite reminder. It's a penalty notice.

This is the problem I built for.

What I Built

RBI Regulatory Digest Agent — an autonomous multi-step agent powered by Hermes Agent that monitors RBI and SEBI publication feeds, reads every new circular, extracts structured action points from the regulatory text, and delivers a formatted intelligence report to compliance teams automatically.

No human reads the circular first. No human decides what's important. No human routes it to the right department. The agent does all of that.

The pipeline

RBI/SEBI feeds → new circular detected → full text extracted →
LLM analysis → structured action points → risk classification →
HTML dashboard generated → email delivered

Every action point extracted contains:

What needs to be done — specific and actionable, not a vague summary
Deadline — parsed from the circular text
Responsible department — Credit, Compliance, Treasury, Operations, IT, Legal
Evidence required — what documentation confirms completion
Priority — Critical (overdue or <7 days), High, Medium, Low

From a new circular to a structured compliance briefing in under 10 seconds.

Demo

The demo shows Hermes Agent receiving a single instruction and autonomously:

Polling the RBI notification feed
Detecting 10 new circulars
Fetching and parsing each circular's text
Running LLM extraction on each one
Generating a risk-classified HTML dashboard with 46 action points across 8 departments
Delivering the digest by email

No human intervention between step 1 and step 6.

Code

GitHub: https://github.com/kunalp-singh/rbi-digest

Project structure

rbi_monitor.py      # Polls RBI/SEBI RSS feeds, deduplicates, stores to SQLite
extractor.py        # Fetches circular text, runs LLM extraction, stores action points  
digest.py           # Builds risk-classified HTML dashboard, sends email
hermes_prompt.txt   # The single instruction Hermes receives to run everything
circulars.db        # SQLite store (gitignored)

My Tech Stack

Hermes Agent — orchestrates the full pipeline via terminal tool
Python 3.11 — core runtime
Ollama + Qwen3:8b — local LLM for circular analysis, runs entirely on-device
BeautifulSoup4 + lxml — RSS feed parsing
SQLite — circular deduplication and action point storage
OpenAI-compatible client — connects Python to local Ollama endpoint
SMTP / Gmail — email delivery
HTML/CSS — dashboard with risk heatmap, compliance timeline, department cards

Everything runs locally. No data leaves the machine. No cloud LLM API required.

How I Used Hermes Agent

This is the part that matters most for the challenge, so I want to be precise about it.

The single instruction

The entire pipeline runs from one message to Hermes:

Use the terminal tool to run this command exactly:
cd ~/Studies/Projects/hermesProject && source venv/bin/activate && 
python3 rbi_monitor.py && python3 extractor.py && python3 digest.py

Hermes takes this, uses its terminal tool to execute the full chain, monitors the output of each step, and reports back a summary — how many circulars were found, how many action points extracted, which departments have critical items, and confirmation that the digest was delivered.

Why Hermes was the right fit

I could have run a cron job directly. I chose Hermes for three specific reasons:

1. Multi-step reasoning across tool calls
Each script depends on the previous one succeeding. Hermes doesn't just fire and forget — it monitors output, detects failures, and can adapt. When the extractor found 0 action points on informational circulars, Hermes correctly reported that as expected behaviour rather than an error.

2. Natural language orchestration
The hermes_prompt.txt file is plain English. Any compliance team member can read it, understand what the agent is doing, and modify it. There's no bash scripting knowledge required to change which feeds are monitored or what gets reported.

3. Episodic memory for future runs
Hermes logs what it processed in each session. Future runs can reference past extractions — "you processed this circular last week, here's what changed" — which is exactly the kind of institutional memory a compliance team needs.

Agentic capabilities used

Terminal tool — executes the monitoring, extraction, and digest pipeline
Multi-step execution — coordinates three dependent scripts in sequence
Output interpretation — reads script output and generates a human-readable summary
Planning — determines correct execution order without being told explicitly

What the Output Looks Like

From the most recent run against live RBI feeds:

46 action points extracted from 20 circulars
8 departments impacted: Compliance, Credit, Treasury, IT, Operations, Legal, Technology, Board/Governance
11 critical items — deadlines already passed or within 7 days
20 items with specific upcoming deadlines tracked in the compliance timeline

Sample action points extracted by the LLM:

Compliance dept — CRITICAL: Submit required documents for claiming Agency Commission — Deadline: April 30, 2026 — Source: RBI Conduct of Government Business Directions

Credit dept — MEDIUM: Update credit risk assessment policies to include calamity impact on borrowers — Deadline: July 1, 2026 — Source: RBI Credit Risk Management Amendment Directions

Operations dept — MEDIUM: Appoint Nodal Officers for pension grievance handling — Deadline: Immediate effect — Source: RBI Disbursement of Government Pension Directions

Every item links directly to the source RBI circular.

What's Next

This is a working prototype, not a finished product. The honest limitations:

Currently monitors RBI and SEBI only — IRDAI, MCA, FATF feeds can be added
LLM extraction runs locally on Qwen3:8b — a larger model would improve accuracy on dense legal text
The dashboard is static HTML — a real deployment would have a live web interface with team login, action acknowledgement, and audit trail

The architecture is modular enough that each of these is an extension, not a rewrite.

The core insight this project validates: regulatory compliance monitoring is exactly the kind of repetitive, high-stakes, structured reasoning task that agentic AI should be doing instead of humans. Hermes Agent made it possible to prototype that in days rather than months.

Built during the Hermes Agent Challenge, May 2026. All data sourced from live RBI publication feeds.

What Nobody Tells You About Running Hermes Agent Locally (M-Series Mac Edition)

Kunal Pratap Singh — Sun, 31 May 2026 17:33:11 +0000

What Nobody Tells You About Running Hermes Agent Locally (M-Series Mac Edition)

I spent a day building a real project with Hermes Agent on my M5 MacBook Air with 16GB RAM and zero API budget. This is the honest account of what broke, what worked, and what I wish someone had told me before I started.

If you're on Apple Silicon and want to run Hermes Agent locally without paying for API credits, this post is for you.

What Hermes Agent Actually Is

Before I get into the setup pain, a quick framing for people who haven't used it yet.

Hermes Agent is an open-source autonomous agent built by NousResearch: the team behind the Hermes family of fine-tuned models. It's not a chatbot wrapper. It's a full agentic loop: it receives a goal, breaks it into steps, selects from 40+ built-in tools (browser, terminal, file system, code execution, cron jobs, messaging platforms), executes those steps, and iterates until the task is done.

The part that makes it genuinely different from most agent frameworks is episodic memory. After each task, Hermes writes a structured record of what worked and what didn't. On future tasks, it retrieves those records and adjusts its approach. It actually learns from its own history.

It's MIT licensed, runs on your own machine, and supports OpenAI, Anthropic, Google, and local models via Ollama.

Step 1: Installation (The Easy Part)

Installation is genuinely one command:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

It installs everything automatically — Node.js, browser dependencies, the works. Takes about 3 minutes. After that:

source ~/.zshrc
hermes

You'll see the Hermes ASCII banner and a list of available tools and skills. This part actually works exactly as advertised.

First lesson: Run hermes postinstall after the main install. The base installer skips Playwright (the browser automation library). If you skip this step, every browser-related task will fail silently and you'll waste an hour debugging.

Step 2: The API Provider Trap

Here's where I hit the first wall.

Hermes supports a huge list of providers — OpenAI, Anthropic, Google Gemini, Ollama, and about 30 others. The interactive setup is clean and fast. But the provider you choose matters enormously for agentic tasks.

What I tried first: Gemini free tier

Google's Gemini API has a free tier. Sounds perfect. The problem is the rate limits:

gemini-2.5-flash: 5 requests per minute on free tier
gemini-flash-latest: slightly better, still very low

For a simple chatbot, 5 requests/minute is fine. For an agentic task where Hermes might make 15-20 API calls to complete a single multi-step workflow (browse a page → take a screenshot → analyze it → decide next step → browse again), you'll hit the rate limit on the second tool call.

The error looks like this:

HTTP 429: Quota exceeded for metric: 
generativelanguage.googleapis.com/generate_content_free_tier_requests
limit: 5, model: gemini-3.5-flash

And then Hermes retries, hits the limit again, and eventually gives up. You end up with a half-completed task and no useful output.

The fix: don't use cloud APIs for agentic tasks on a free tier. The request volume is just too high.

Step 3: Going Local with Ollama

This is where Apple Silicon earns its reputation.

Ollama runs LLMs locally using Apple's Metal framework — your GPU and CPU share the same unified memory pool, which means models load fast and run at genuinely usable speeds.

Install Ollama:

brew install ollama
ollama serve  # keep this running in a separate terminal tab

Now the model choice matters. On a 16GB M-series machine:

Model	Size	Speed	Context	Verdict
qwen3:8b	5.2GB	~50 tok/s	40K	Good for most tasks
gemma3:12b	~8GB	~30 tok/s	128K	Smarter, but slower
llama3.2:3b	2GB	~90 tok/s	128K	Fast but less capable
anything 30B+	>16GB	Unusable	—	Skip entirely

I went with qwen3:8b:

ollama pull qwen3:8b

Then switch Hermes to use it:

hermes config set provider ollama
hermes config set base_url http://localhost:11434/v1
hermes config set model qwen3:8b
hermes config set model.context_length 65536
hermes config set model.ollama_num_ctx 65536

Critical: Those last two lines are not optional. Hermes requires a minimum 64K context window. Qwen3:8b defaults to 40K. Without the override, Hermes will refuse to initialize every single time with this error:

Model qwen3:8b has a context window of 40,960 tokens, which is below 
the minimum 64,000 required by Hermes Agent.

Step 4: The Honest Performance Reality

I'm not going to pretend qwen3:8b on an M5 base model is fast for agentic tasks.

A simple factual question: ~15-20 seconds.
A multi-step agentic task with 5-6 tool calls: 8-12 minutes.

For a demo or a prototype, that's acceptable. For something you'd run continuously in production, you'd want either a paid API or a machine with more RAM to run a larger model.

The tradeoff is clear: speed vs. cost vs. privacy. Local Ollama gives you infinite requests, zero cost, and complete data privacy. You pay for it in latency.

For my use case — an agent that runs once daily to process regulatory documents — the latency is completely fine. The agent runs overnight and the output is ready in the morning.

Step 5: What Hermes Is Actually Good At

Once everything is running, here's what genuinely impressed me:

Terminal tool chaining. Hermes can execute a sequence of shell commands, read the output of each one, and use that output to decide what to do next. This is the core of what makes it an agent rather than a script runner.

Staying on task. With a well-written prompt, Hermes doesn't get distracted. It completes the steps you gave it without asking for clarification on every detail.

The skills system. Hermes ships with 90+ pre-built skills — integrations with GitHub, Obsidian, Spotify, Google Workspace, and dozens more. These aren't just API wrappers; they're prompting strategies that tell Hermes how to use each tool effectively.

What it struggles with on smaller models:

Complex multi-step reasoning where each step builds on the last
Tasks that require reading a long document and making nuanced judgments
Anything where the prompt is ambiguous

The last point is on the user, not Hermes. Clear, specific prompts produce dramatically better results than vague ones.

The Setup That Actually Works

Here's the complete working configuration for M-series Mac, free tier, local model:

# 1. Install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.zshrc
hermes postinstall  # don't skip this

# 2. Install Ollama and pull a model
brew install ollama
ollama serve &  # or run in a separate tab
ollama pull qwen3:8b

# 3. Configure Hermes
hermes config set provider ollama
hermes config set base_url http://localhost:11434/v1
hermes config set model qwen3:8b
hermes config set model.context_length 65536
hermes config set model.ollama_num_ctx 65536

# 4. Start
hermes

Test it works:

Search the web for the latest news about open source AI agents

If Hermes uses the browser tool and returns actual results, you're set.

My Honest Take

Hermes Agent is the most capable open-source agent I've used. The tool ecosystem is genuinely broad, the install experience is smooth, and the episodic memory system is an idea that most commercial agent frameworks haven't caught up to yet.

The documentation gap is real — the official docs cover the happy path well, but edge cases like the Ollama context window requirement or the Playwright install step are nowhere to be found. You find them by hitting errors.

For developers who want to build real agentic workflows without API costs or data privacy concerns, Hermes on Apple Silicon is a genuinely viable stack. The latency is the price you pay. On most tasks, it's worth it.

Built and tested on M5 MacBook Air 16GB, macOS Sequoia, Hermes Agent v0.14.0, Ollama 0.6.x, qwen3:8b.