<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kunal Pratap Singh</title>
    <description>The latest articles on DEV Community by Kunal Pratap Singh (@kunal_pratapsingh_50fdc8).</description>
    <link>https://dev.to/kunal_pratapsingh_50fdc8</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3198969%2F9968cbfe-76d3-41b5-9d77-aa7718b36051.png</url>
      <title>DEV Community: Kunal Pratap Singh</title>
      <link>https://dev.to/kunal_pratapsingh_50fdc8</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kunal_pratapsingh_50fdc8"/>
    <language>en</language>
    <item>
      <title>I Built an Autonomous RBI Regulatory Digest Agent with Hermes Agent</title>
      <dc:creator>Kunal Pratap Singh</dc:creator>
      <pubDate>Sun, 31 May 2026 18:30:14 +0000</pubDate>
      <link>https://dev.to/kunal_pratapsingh_50fdc8/i-built-an-autonomous-rbi-regulatory-digest-agent-with-hermes-agent-2f8g</link>
      <guid>https://dev.to/kunal_pratapsingh_50fdc8/i-built-an-autonomous-rbi-regulatory-digest-agent-with-hermes-agent-2f8g</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Build With Hermes Agent&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Every time the Reserve Bank of India publishes a circular, somewhere inside an Indian bank, a compliance officer opens a PDF.&lt;/p&gt;

&lt;p&gt;They read it. They try to figure out what it means for their institution specifically. They write a summary email. They forward it to five department heads. They chase those department heads for two weeks to confirm it's been actioned. They build a spreadsheet to track all of this. And then the next circular drops and the cycle starts again.&lt;/p&gt;

&lt;p&gt;RBI publishes hundreds of circulars a year. SEBI publishes more. MCA publishes more still. Compliance teams at Indian banks are drowning — not because they're incompetent, but because the volume of regulatory output has outpaced any reasonable human ability to track it manually.&lt;/p&gt;

&lt;p&gt;The fine for missing a deadline isn't a polite reminder. It's a penalty notice.&lt;/p&gt;

&lt;p&gt;This is the problem I built for.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RBI Regulatory Digest Agent&lt;/strong&gt; — an autonomous multi-step agent powered by Hermes Agent that monitors RBI and SEBI publication feeds, reads every new circular, extracts structured action points from the regulatory text, and delivers a formatted intelligence report to compliance teams automatically.&lt;/p&gt;

&lt;p&gt;No human reads the circular first. No human decides what's important. No human routes it to the right department. The agent does all of that.&lt;/p&gt;

&lt;h3&gt;
  
  
  The pipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RBI/SEBI feeds → new circular detected → full text extracted →
LLM analysis → structured action points → risk classification →
HTML dashboard generated → email delivered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every action point extracted contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What needs to be done&lt;/strong&gt; — specific and actionable, not a vague summary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deadline&lt;/strong&gt; — parsed from the circular text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Responsible department&lt;/strong&gt; — Credit, Compliance, Treasury, Operations, IT, Legal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evidence required&lt;/strong&gt; — what documentation confirms completion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority&lt;/strong&gt; — Critical (overdue or &amp;lt;7 days), High, Medium, Low&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From a new circular to a structured compliance briefing in under 10 seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/7Dsgk76K87o"&gt;
  &lt;/iframe&gt;
&lt;br&gt;
The demo shows Hermes Agent receiving a single instruction and autonomously:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Polling the RBI notification feed&lt;/li&gt;
&lt;li&gt;Detecting 10 new circulars&lt;/li&gt;
&lt;li&gt;Fetching and parsing each circular's text&lt;/li&gt;
&lt;li&gt;Running LLM extraction on each one&lt;/li&gt;
&lt;li&gt;Generating a risk-classified HTML dashboard with 46 action points across 8 departments&lt;/li&gt;
&lt;li&gt;Delivering the digest by email&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No human intervention between step 1 and step 6.&lt;/p&gt;


&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/kunalp-singh/rbi-digest" rel="noopener noreferrer"&gt;https://github.com/kunalp-singh/rbi-digest&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Project structure
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rbi_monitor.py      # Polls RBI/SEBI RSS feeds, deduplicates, stores to SQLite
extractor.py        # Fetches circular text, runs LLM extraction, stores action points  
digest.py           # Builds risk-classified HTML dashboard, sends email
hermes_prompt.txt   # The single instruction Hermes receives to run everything
circulars.db        # SQLite store (gitignored)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  My Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hermes Agent&lt;/strong&gt; — orchestrates the full pipeline via terminal tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.11&lt;/strong&gt; — core runtime&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama + Qwen3:8b&lt;/strong&gt; — local LLM for circular analysis, runs entirely on-device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BeautifulSoup4 + lxml&lt;/strong&gt; — RSS feed parsing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite&lt;/strong&gt; — circular deduplication and action point storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI-compatible client&lt;/strong&gt; — connects Python to local Ollama endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SMTP / Gmail&lt;/strong&gt; — email delivery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTML/CSS&lt;/strong&gt; — dashboard with risk heatmap, compliance timeline, department cards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything runs locally. No data leaves the machine. No cloud LLM API required.&lt;/p&gt;


&lt;h2&gt;
  
  
  How I Used Hermes Agent
&lt;/h2&gt;

&lt;p&gt;This is the part that matters most for the challenge, so I want to be precise about it.&lt;/p&gt;
&lt;h3&gt;
  
  
  The single instruction
&lt;/h3&gt;

&lt;p&gt;The entire pipeline runs from one message to Hermes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Use the terminal tool to run this &lt;span class="nb"&gt;command &lt;/span&gt;exactly:
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/Studies/Projects/hermesProject &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; 
python3 rbi_monitor.py &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; python3 extractor.py &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; python3 digest.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hermes takes this, uses its &lt;strong&gt;terminal tool&lt;/strong&gt; to execute the full chain, monitors the output of each step, and reports back a summary — how many circulars were found, how many action points extracted, which departments have critical items, and confirmation that the digest was delivered.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Hermes was the right fit
&lt;/h3&gt;

&lt;p&gt;I could have run a cron job directly. I chose Hermes for three specific reasons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Multi-step reasoning across tool calls&lt;/strong&gt;&lt;br&gt;
Each script depends on the previous one succeeding. Hermes doesn't just fire and forget — it monitors output, detects failures, and can adapt. When the extractor found 0 action points on informational circulars, Hermes correctly reported that as expected behaviour rather than an error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Natural language orchestration&lt;/strong&gt;&lt;br&gt;
The &lt;code&gt;hermes_prompt.txt&lt;/code&gt; file is plain English. Any compliance team member can read it, understand what the agent is doing, and modify it. There's no bash scripting knowledge required to change which feeds are monitored or what gets reported.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Episodic memory for future runs&lt;/strong&gt;&lt;br&gt;
Hermes logs what it processed in each session. Future runs can reference past extractions — "you processed this circular last week, here's what changed" — which is exactly the kind of institutional memory a compliance team needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agentic capabilities used
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Terminal tool&lt;/strong&gt; — executes the monitoring, extraction, and digest pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-step execution&lt;/strong&gt; — coordinates three dependent scripts in sequence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output interpretation&lt;/strong&gt; — reads script output and generates a human-readable summary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Planning&lt;/strong&gt; — determines correct execution order without being told explicitly&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;From the most recent run against live RBI feeds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;46 action points&lt;/strong&gt; extracted from 20 circulars&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 departments&lt;/strong&gt; impacted: Compliance, Credit, Treasury, IT, Operations, Legal, Technology, Board/Governance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;11 critical items&lt;/strong&gt; — deadlines already passed or within 7 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;20 items&lt;/strong&gt; with specific upcoming deadlines tracked in the compliance timeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sample action points extracted by the LLM:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Compliance dept — CRITICAL&lt;/em&gt;: Submit required documents for claiming Agency Commission — Deadline: April 30, 2026 — Source: RBI Conduct of Government Business Directions&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Credit dept — MEDIUM&lt;/em&gt;: Update credit risk assessment policies to include calamity impact on borrowers — Deadline: July 1, 2026 — Source: RBI Credit Risk Management Amendment Directions&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Operations dept — MEDIUM&lt;/em&gt;: Appoint Nodal Officers for pension grievance handling — Deadline: Immediate effect — Source: RBI Disbursement of Government Pension Directions&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every item links directly to the source RBI circular.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This is a working prototype, not a finished product. The honest limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Currently monitors RBI and SEBI only — IRDAI, MCA, FATF feeds can be added&lt;/li&gt;
&lt;li&gt;LLM extraction runs locally on Qwen3:8b — a larger model would improve accuracy on dense legal text&lt;/li&gt;
&lt;li&gt;The dashboard is static HTML — a real deployment would have a live web interface with team login, action acknowledgement, and audit trail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture is modular enough that each of these is an extension, not a rewrite.&lt;/p&gt;

&lt;p&gt;The core insight this project validates: &lt;strong&gt;regulatory compliance monitoring is exactly the kind of repetitive, high-stakes, structured reasoning task that agentic AI should be doing instead of humans.&lt;/strong&gt; Hermes Agent made it possible to prototype that in days rather than months.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built during the Hermes Agent Challenge, May 2026. All data sourced from live RBI publication feeds.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
    <item>
      <title>What Nobody Tells You About Running Hermes Agent Locally (M-Series Mac Edition)</title>
      <dc:creator>Kunal Pratap Singh</dc:creator>
      <pubDate>Sun, 31 May 2026 17:33:11 +0000</pubDate>
      <link>https://dev.to/kunal_pratapsingh_50fdc8/what-nobody-tells-you-about-running-hermes-agent-locally-m-series-mac-edition-434f</link>
      <guid>https://dev.to/kunal_pratapsingh_50fdc8/what-nobody-tells-you-about-running-hermes-agent-locally-m-series-mac-edition-434f</guid>
      <description>&lt;h1&gt;
  
  
  What Nobody Tells You About Running Hermes Agent Locally (M-Series Mac Edition)
&lt;/h1&gt;

&lt;p&gt;I spent a day building a real project with Hermes Agent on my M5 MacBook Air with 16GB RAM and zero API budget. This is the honest account of what broke, what worked, and what I wish someone had told me before I started.&lt;/p&gt;

&lt;p&gt;If you're on Apple Silicon and want to run Hermes Agent locally without paying for API credits, this post is for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Hermes Agent Actually Is
&lt;/h2&gt;

&lt;p&gt;Before I get into the setup pain, a quick framing for people who haven't used it yet.&lt;/p&gt;

&lt;p&gt;Hermes Agent is an open-source autonomous agent built by NousResearch: the team behind the Hermes family of fine-tuned models. It's not a chatbot wrapper. It's a full agentic loop: it receives a goal, breaks it into steps, selects from 40+ built-in tools (browser, terminal, file system, code execution, cron jobs, messaging platforms), executes those steps, and iterates until the task is done.&lt;/p&gt;

&lt;p&gt;The part that makes it genuinely different from most agent frameworks is episodic memory. After each task, Hermes writes a structured record of what worked and what didn't. On future tasks, it retrieves those records and adjusts its approach. It actually learns from its own history.&lt;/p&gt;

&lt;p&gt;It's MIT licensed, runs on your own machine, and supports OpenAI, Anthropic, Google, and local models via Ollama.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Installation (The Easy Part)
&lt;/h2&gt;

&lt;p&gt;Installation is genuinely one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It installs everything automatically — Node.js, browser dependencies, the works. Takes about 3 minutes. After that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
hermes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see the Hermes ASCII banner and a list of available tools and skills. This part actually works exactly as advertised.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First lesson:&lt;/strong&gt; Run &lt;code&gt;hermes postinstall&lt;/code&gt; after the main install. The base installer skips Playwright (the browser automation library). If you skip this step, every browser-related task will fail silently and you'll waste an hour debugging.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: The API Provider Trap
&lt;/h2&gt;

&lt;p&gt;Here's where I hit the first wall.&lt;/p&gt;

&lt;p&gt;Hermes supports a huge list of providers — OpenAI, Anthropic, Google Gemini, Ollama, and about 30 others. The interactive setup is clean and fast. But the provider you choose matters enormously for agentic tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I tried first: Gemini free tier&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Google's Gemini API has a free tier. Sounds perfect. The problem is the rate limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;gemini-2.5-flash&lt;/code&gt;: 5 requests per minute on free tier&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gemini-flash-latest&lt;/code&gt;: slightly better, still very low&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a simple chatbot, 5 requests/minute is fine. For an agentic task where Hermes might make 15-20 API calls to complete a single multi-step workflow (browse a page → take a screenshot → analyze it → decide next step → browse again), you'll hit the rate limit on the second tool call.&lt;/p&gt;

&lt;p&gt;The error looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;HTTP 429: Quota exceeded for metric: 
generativelanguage.googleapis.com/generate_content_free_tier_requests
limit: 5, model: gemini-3.5-flash
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then Hermes retries, hits the limit again, and eventually gives up. You end up with a half-completed task and no useful output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix: don't use cloud APIs for agentic tasks on a free tier.&lt;/strong&gt; The request volume is just too high.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Going Local with Ollama
&lt;/h2&gt;

&lt;p&gt;This is where Apple Silicon earns its reputation.&lt;/p&gt;

&lt;p&gt;Ollama runs LLMs locally using Apple's Metal framework — your GPU and CPU share the same unified memory pool, which means models load fast and run at genuinely usable speeds.&lt;/p&gt;

&lt;p&gt;Install Ollama:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama
ollama serve  &lt;span class="c"&gt;# keep this running in a separate terminal tab&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the model choice matters. On a 16GB M-series machine:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;qwen3:8b&lt;/td&gt;
&lt;td&gt;5.2GB&lt;/td&gt;
&lt;td&gt;~50 tok/s&lt;/td&gt;
&lt;td&gt;40K&lt;/td&gt;
&lt;td&gt;Good for most tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemma3:12b&lt;/td&gt;
&lt;td&gt;~8GB&lt;/td&gt;
&lt;td&gt;~30 tok/s&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Smarter, but slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama3.2:3b&lt;/td&gt;
&lt;td&gt;2GB&lt;/td&gt;
&lt;td&gt;~90 tok/s&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Fast but less capable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;anything 30B+&lt;/td&gt;
&lt;td&gt;&amp;gt;16GB&lt;/td&gt;
&lt;td&gt;Unusable&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Skip entirely&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I went with &lt;code&gt;qwen3:8b&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen3:8b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then switch Hermes to use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes config &lt;span class="nb"&gt;set &lt;/span&gt;provider ollama
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;base_url http://localhost:11434/v1
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model qwen3:8b
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.context_length 65536
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.ollama_num_ctx 65536
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Critical:&lt;/strong&gt; Those last two lines are not optional. Hermes requires a minimum 64K context window. Qwen3:8b defaults to 40K. Without the override, Hermes will refuse to initialize every single time with this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model qwen3:8b has a context window of 40,960 tokens, which is below 
the minimum 64,000 required by Hermes Agent.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 4: The Honest Performance Reality
&lt;/h2&gt;

&lt;p&gt;I'm not going to pretend qwen3:8b on an M5 base model is fast for agentic tasks.&lt;/p&gt;

&lt;p&gt;A simple factual question: ~15-20 seconds.&lt;br&gt;
A multi-step agentic task with 5-6 tool calls: 8-12 minutes.&lt;/p&gt;

&lt;p&gt;For a demo or a prototype, that's acceptable. For something you'd run continuously in production, you'd want either a paid API or a machine with more RAM to run a larger model.&lt;/p&gt;

&lt;p&gt;The tradeoff is clear: &lt;strong&gt;speed vs. cost vs. privacy.&lt;/strong&gt; Local Ollama gives you infinite requests, zero cost, and complete data privacy. You pay for it in latency.&lt;/p&gt;

&lt;p&gt;For my use case — an agent that runs once daily to process regulatory documents — the latency is completely fine. The agent runs overnight and the output is ready in the morning.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 5: What Hermes Is Actually Good At
&lt;/h2&gt;

&lt;p&gt;Once everything is running, here's what genuinely impressed me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terminal tool chaining.&lt;/strong&gt; Hermes can execute a sequence of shell commands, read the output of each one, and use that output to decide what to do next. This is the core of what makes it an agent rather than a script runner.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Staying on task.&lt;/strong&gt; With a well-written prompt, Hermes doesn't get distracted. It completes the steps you gave it without asking for clarification on every detail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills system.&lt;/strong&gt; Hermes ships with 90+ pre-built skills — integrations with GitHub, Obsidian, Spotify, Google Workspace, and dozens more. These aren't just API wrappers; they're prompting strategies that tell Hermes how to use each tool effectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it struggles with on smaller models:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex multi-step reasoning where each step builds on the last&lt;/li&gt;
&lt;li&gt;Tasks that require reading a long document and making nuanced judgments&lt;/li&gt;
&lt;li&gt;Anything where the prompt is ambiguous&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The last point is on the user, not Hermes. Clear, specific prompts produce dramatically better results than vague ones.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Setup That Actually Works
&lt;/h2&gt;

&lt;p&gt;Here's the complete working configuration for M-series Mac, free tier, local model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
hermes postinstall  &lt;span class="c"&gt;# don't skip this&lt;/span&gt;

&lt;span class="c"&gt;# 2. Install Ollama and pull a model&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama
ollama serve &amp;amp;  &lt;span class="c"&gt;# or run in a separate tab&lt;/span&gt;
ollama pull qwen3:8b

&lt;span class="c"&gt;# 3. Configure Hermes&lt;/span&gt;
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;provider ollama
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;base_url http://localhost:11434/v1
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model qwen3:8b
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.context_length 65536
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.ollama_num_ctx 65536

&lt;span class="c"&gt;# 4. Start&lt;/span&gt;
hermes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search the web for the latest news about open source AI agents
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Hermes uses the browser tool and returns actual results, you're set.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Honest Take
&lt;/h2&gt;

&lt;p&gt;Hermes Agent is the most capable open-source agent I've used. The tool ecosystem is genuinely broad, the install experience is smooth, and the episodic memory system is an idea that most commercial agent frameworks haven't caught up to yet.&lt;/p&gt;

&lt;p&gt;The documentation gap is real — the official docs cover the happy path well, but edge cases like the Ollama context window requirement or the Playwright install step are nowhere to be found. You find them by hitting errors.&lt;/p&gt;

&lt;p&gt;For developers who want to build real agentic workflows without API costs or data privacy concerns, Hermes on Apple Silicon is a genuinely viable stack. The latency is the price you pay. On most tasks, it's worth it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built and tested on M5 MacBook Air 16GB, macOS Sequoia, Hermes Agent v0.14.0, Ollama 0.6.x, qwen3:8b.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
