<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Prashant Maurya </title>
    <description>The latest articles on DEV Community by Prashant Maurya  (@_prshant01).</description>
    <link>https://dev.to/_prshant01</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3293209%2Fc1657e23-17b2-4467-9e61-6c78c34f5de4.png</url>
      <title>DEV Community: Prashant Maurya </title>
      <link>https://dev.to/_prshant01</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_prshant01"/>
    <language>en</language>
    <item>
      <title>Hermes Agent's Kanban System Is the Most Underrated Feature in Open Source AI Agents</title>
      <dc:creator>Prashant Maurya </dc:creator>
      <pubDate>Mon, 01 Jun 2026 04:29:32 +0000</pubDate>
      <link>https://dev.to/_prshant01/hermes-agents-kanban-system-is-the-most-underrated-feature-in-open-source-ai-agents-3af6</link>
      <guid>https://dev.to/_prshant01/hermes-agents-kanban-system-is-the-most-underrated-feature-in-open-source-ai-agents-3af6</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;When people talk about Hermes Agent, they talk about the Skills System and the persistent memory. Those are genuinely impressive. But there's a feature in the v0.12 "Tenacity Release" that I think deserves more attention: the &lt;strong&gt;Kanban multi-agent system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This post is about what it actually does, why it matters, and why most agent frameworks haven't solved the problem it's solving.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Agents That Don't Finish
&lt;/h2&gt;

&lt;p&gt;Here's a pattern that anyone who's used AI agents on long tasks will recognize:&lt;/p&gt;

&lt;p&gt;You give the agent a complex, multi-step task. It starts well. Then, somewhere in the middle, something breaks: a tool call fails, a subprocess hangs, the context window fills, or the model loses track of state. The agent loops, produces garbage, or just stops. You come back an hour later to find it stuck, or finished with something completely wrong.&lt;/p&gt;

&lt;p&gt;This isn't a model intelligence problem. It's a &lt;strong&gt;state management and fault tolerance problem&lt;/strong&gt;. The agent has no durable record of what it's done, what's pending, and what failed. When something goes wrong, there's no recovery path.&lt;/p&gt;

&lt;p&gt;Hermes's Kanban system is a direct answer to this.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Kanban System Is
&lt;/h2&gt;

&lt;p&gt;The Kanban ships as a durable multi-agent task board — a structured queue of tasks with explicit state transitions, built-in fault tolerance, and automatic recovery.&lt;/p&gt;

&lt;p&gt;Tasks on the board have states: &lt;code&gt;todo&lt;/code&gt;, &lt;code&gt;in_progress&lt;/code&gt;, &lt;code&gt;blocked&lt;/code&gt;, &lt;code&gt;done&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;. The board persists across restarts. Agents working on tasks emit heartbeats. If a heartbeat stops, the task is automatically reclaimed and either retried or escalated.&lt;/p&gt;

&lt;p&gt;The key components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heartbeat monitoring&lt;/strong&gt; — Every active task has a heartbeat timer. If an agent working on a task misses its heartbeat window (it crashed, hung, or the process died), the system detects this automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zombie detection&lt;/strong&gt; — A "zombie" is an agent that stopped responding but didn't cleanly exit. The system detects zombie agents and reclaims their tasks rather than leaving them stuck in &lt;code&gt;in_progress&lt;/code&gt; forever.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto-block on incomplete exit&lt;/strong&gt; — If a task's assigned agent exits without marking the task done or failed, the board automatically moves the task to &lt;code&gt;blocked&lt;/code&gt; state. Nothing silently falls through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-task retries&lt;/strong&gt; — Failed tasks can be configured to automatically retry up to N times before escalating. You set retry policy per task or per board.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hallucination recovery&lt;/strong&gt; — This one is subtle. When an agent produces output that contradicts its own task log (claims it completed a step it never ran), the board detects the inconsistency and flags it for review rather than silently marking the task done.&lt;/p&gt;
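
&lt;p&gt;To make the heartbeat and retry mechanics concrete, here is a minimal Python sketch of a board that reclaims tasks after a missed heartbeat window. Every name in it (&lt;code&gt;Board&lt;/code&gt;, &lt;code&gt;Task&lt;/code&gt;, &lt;code&gt;reap&lt;/code&gt;, the 30-second window) is my own assumption for illustration, not Hermes's actual schema or API, and treating "retries exhausted" as a move to &lt;code&gt;blocked&lt;/code&gt; is only one possible reading of "escalated."&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical task record; field names are assumptions, not Hermes's schema.
    name: str
    state: str = "todo"
    retries_left: int = 2
    last_heartbeat: float = 0.0
    log: list = field(default_factory=list)


class Board:
    """Illustrative in-memory board with heartbeat-based reclaiming."""

    def __init__(self, heartbeat_window=30.0):
        self.heartbeat_window = heartbeat_window
        self.tasks = {}

    def add(self, task):
        self.tasks[task.name] = task

    def claim(self, name, now):
        # An agent picks up the task and starts its heartbeat timer.
        task = self.tasks[name]
        task.state = "in_progress"
        task.last_heartbeat = now
        task.log.append("claimed")

    def heartbeat(self, name, now):
        self.tasks[name].last_heartbeat = now

    def reap(self, now):
        """Reclaim in-progress tasks whose agent missed the heartbeat window."""
        reclaimed = []
        for task in self.tasks.values():
            if task.state != "in_progress":
                continue
            if now - task.last_heartbeat > self.heartbeat_window:
                if task.retries_left > 0:
                    task.retries_left -= 1
                    task.state = "todo"  # put it back in the queue for a retry
                    task.log.append("reclaimed for retry")
                else:
                    task.state = "blocked"  # assumed escalation path
                    task.log.append("retries exhausted, blocked")
                reclaimed.append(task.name)
        return reclaimed
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; in explicitly, instead of reading the clock inside the board, keeps the sketch deterministic and easy to test.&lt;/p&gt;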




&lt;h2&gt;
  
  
  The &lt;code&gt;/goal&lt;/code&gt; Command: Staying on Target
&lt;/h2&gt;

&lt;p&gt;Alongside Kanban, the v0.12 release added &lt;code&gt;/goal&lt;/code&gt; — what the docs call the "Ralph loop."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/goal Ship the auth module with tests and a PR by end of session
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps the agent locked on a target across turns. Instead of each message being independently interpreted, every subsequent action is evaluated against the declared goal. The aim is to stop drift: if a sub-task would take the agent away from the goal, it recognizes this and gets back on track.&lt;/p&gt;

&lt;p&gt;Combined with Kanban, this means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You declare a goal&lt;/li&gt;
&lt;li&gt;Hermes decomposes it into a Kanban board of tasks&lt;/li&gt;
&lt;li&gt;Subagents pick up tasks and work on them in parallel&lt;/li&gt;
&lt;li&gt;Failed tasks get retried; zombie agents get reclaimed; blocked tasks get escalated&lt;/li&gt;
&lt;li&gt;The agent tracks progress against the original goal and knows when it's actually done&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is what "the agent finishes what it starts" looks like in practice.&lt;/p&gt;




&lt;h2&gt;
  
  
  Subagent Delegation: The Parallelism Layer
&lt;/h2&gt;

&lt;p&gt;The Kanban system is most powerful when combined with Hermes's subagent delegation via the &lt;code&gt;delegate_task&lt;/code&gt; tool.&lt;/p&gt;

&lt;p&gt;A parent agent with a complex task can spawn up to 3 child agents by default (configurable), each with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Isolated context (the subagent knows only what it needs to)&lt;/li&gt;
&lt;li&gt;Restricted toolsets (it can only use the tools relevant to its task)&lt;/li&gt;
&lt;li&gt;Its own terminal session (no file-state collisions between agents)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The parent agent coordinates — it doesn't do the work directly. It delegates, monitors progress via the Kanban board, handles escalations, and synthesizes results.&lt;/p&gt;

&lt;p&gt;In practice, this looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Parent: "Build a REST API with authentication, tests, and documentation"

→ Subagent 1: Implements the core API endpoints
→ Subagent 2: Writes integration tests
→ Subagent 3: Drafts API documentation

Parent: Monitors all three, handles merge conflicts, synthesizes final output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without durable state management, parallel subagents are fragile — if one fails, you don't know which one, and recovery is manual. The Kanban board makes parallel execution safe by making task state explicit and recoverable.&lt;/p&gt;
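
&lt;p&gt;The point about knowing which subagent failed can be sketched in a few lines of Python. This is not how &lt;code&gt;delegate_task&lt;/code&gt; is implemented; &lt;code&gt;run_subtasks&lt;/code&gt;, &lt;code&gt;fake_worker&lt;/code&gt;, and &lt;code&gt;MAX_CHILDREN&lt;/code&gt; are hypothetical names, and a thread pool stands in for real subagents. The only thing it shows is that recording an explicit state per task makes a partial failure attributable.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CHILDREN = 3  # the default cap mentioned above; the constant name is made up


def run_subtasks(subtasks, worker):
    """Run each subtask in its own worker and record an explicit state per task."""
    board = {name: "todo" for name in subtasks}
    results = {}
    with ThreadPoolExecutor(max_workers=MAX_CHILDREN) as pool:
        futures = {name: pool.submit(worker, name) for name in subtasks}
        for name, future in futures.items():
            try:
                results[name] = future.result()
                board[name] = "done"
            except Exception as exc:
                # The failure is tied to a named task instead of being lost.
                results[name] = str(exc)
                board[name] = "failed"
    return board, results


def fake_worker(name):
    # Stand-in for a subagent; the "docs" task fails on purpose.
    if name == "docs":
        raise RuntimeError("tool call failed")
    return name + " ok"
```

&lt;p&gt;Calling &lt;code&gt;run_subtasks(["api", "tests", "docs"], fake_worker)&lt;/code&gt; leaves two tasks marked &lt;code&gt;done&lt;/code&gt; and one marked &lt;code&gt;failed&lt;/code&gt;, so the parent can retry just the failed one.&lt;/p&gt;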




&lt;h2&gt;
  
  
  Checkpoints v2: The Safety Net
&lt;/h2&gt;

&lt;p&gt;Running parallel agents doing real work means real risk. A subagent making file changes can go wrong.&lt;/p&gt;

&lt;p&gt;Hermes's Checkpoints v2 (also part of the Tenacity Release) handles this. Before any file mutation, the system automatically snapshots the working directory. The &lt;code&gt;checkpoint_manager&lt;/code&gt; tracks these snapshots with real pruning — old checkpoints get cleaned up, not accumulated indefinitely.&lt;/p&gt;

&lt;p&gt;If something goes wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/rollback
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. You're back to before the last file-mutating operation. Combined with the Kanban board's task state, this means a failed multi-agent run doesn't leave you with a partially-mutated codebase in an unknown state.&lt;/p&gt;
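
&lt;p&gt;For intuition, snapshot-before-mutation with pruning can be approximated with the standard library alone. The sketch below is an assumption-heavy toy, not the real &lt;code&gt;checkpoint_manager&lt;/code&gt;: the &lt;code&gt;Checkpoints&lt;/code&gt; class, its &lt;code&gt;keep&lt;/code&gt; parameter, and the full-directory copy are all my own choices, and the snapshot store has to live outside the working directory.&lt;/p&gt;

```python
import shutil
import tempfile
from pathlib import Path


class Checkpoints:
    """Hypothetical snapshot manager; not Hermes's checkpoint_manager."""

    def __init__(self, workdir, store, keep=3):
        self.workdir = Path(workdir)
        self.store = Path(store)  # must be outside workdir
        self.keep = keep          # number of snapshots to retain, at least 1
        self.counter = 0
        self.store.mkdir(parents=True, exist_ok=True)

    def snapshot(self):
        """Copy the working directory before a file mutation, then prune."""
        self.counter += 1
        target = self.store / f"{self.counter:06d}"
        shutil.copytree(self.workdir, target)
        snapshots = sorted(self.store.iterdir())
        for old in snapshots[:-self.keep]:
            shutil.rmtree(old)  # real pruning: old snapshots are deleted
        return target

    def rollback(self):
        """Restore the most recent snapshot over the working directory."""
        latest = sorted(self.store.iterdir())[-1]
        shutil.rmtree(self.workdir)
        shutil.copytree(latest, self.workdir)


def demo():
    # Snapshot, break a file, roll back, and return the restored content.
    root = Path(tempfile.mkdtemp())
    work = root / "work"
    work.mkdir()
    (work / "app.py").write_text("v1")
    checkpoints = Checkpoints(work, root / "snaps", keep=2)
    checkpoints.snapshot()
    (work / "app.py").write_text("broken")
    checkpoints.rollback()
    return (work / "app.py").read_text()
```

&lt;p&gt;A production version would more likely store diffs or use copy-on-write than copy the whole tree each time; the full copy just keeps the idea visible.&lt;/p&gt;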




&lt;h2&gt;
  
  
  Gateway Auto-Resume: Surviving Restarts
&lt;/h2&gt;

&lt;p&gt;One more piece of the reliability picture: gateway auto-resume.&lt;/p&gt;

&lt;p&gt;In previous versions, if the Hermes gateway process restarted (server reboot, OOM kill, network drop), all in-progress agent sessions were lost. You'd have to restart tasks manually.&lt;/p&gt;

&lt;p&gt;With the Tenacity Release, the gateway automatically resumes interrupted sessions after restart. The Kanban board state is persisted, in-progress tasks get reclaimed, and the agent picks up roughly where it left off.&lt;/p&gt;

&lt;p&gt;This matters more than it sounds for anyone running Hermes on a VPS or in a container. Process crashes happen. An agent system that survives them gracefully is a different category of tool than one that needs babysitting.&lt;/p&gt;
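
&lt;p&gt;The persist-then-reclaim idea behind auto-resume is simple enough to sketch with SQLite. I have not checked how Hermes stores its board, so the table layout and the &lt;code&gt;open_board&lt;/code&gt; and &lt;code&gt;set_state&lt;/code&gt; helpers below are hypothetical. The sketch assumes that any task still marked &lt;code&gt;in_progress&lt;/code&gt; at startup was interrupted and should go back in the queue.&lt;/p&gt;

```python
import os
import sqlite3
import tempfile


def open_board(path):
    """Open (or create) a persisted board and reclaim interrupted tasks."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tasks "
        "(name TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    # Anything still in_progress was cut off by the previous shutdown.
    db.execute("UPDATE tasks SET state = 'todo' WHERE state = 'in_progress'")
    db.commit()
    return db


def set_state(db, name, state):
    # Write every transition to disk immediately so a crash loses nothing.
    db.execute(
        "INSERT OR REPLACE INTO tasks (name, state) VALUES (?, ?)",
        (name, state),
    )
    db.commit()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "board.db")
    db = open_board(path)
    set_state(db, "api", "done")
    set_state(db, "tests", "in_progress")
    db.close()  # simulate the gateway process dying mid-task

    db = open_board(path)  # restart: interrupted work is reclaimed
    rows = dict(db.execute("SELECT name, state FROM tasks"))
    db.close()
    return rows
```

&lt;p&gt;After the simulated restart, the finished task stays &lt;code&gt;done&lt;/code&gt; and the interrupted one is back at &lt;code&gt;todo&lt;/code&gt;, which is the "picks up roughly where it left off" behavior in miniature.&lt;/p&gt;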




&lt;h2&gt;
  
  
  Why This Architecture Is Rare
&lt;/h2&gt;

&lt;p&gt;Most agent frameworks don't have an equivalent answer to durable multi-agent task management. Here's why:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The research community optimizes for single-agent performance.&lt;/strong&gt; Benchmarks are almost all single-agent: can the agent solve this coding problem, answer this question, or complete this task? Multi-agent coordination with fault tolerance is an engineering problem, not a benchmark problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Durable state is hard.&lt;/strong&gt; Most frameworks store task state in memory or simple files. Real durability — heartbeat monitoring, zombie detection, restart recovery — requires more infrastructure investment than most open source projects make.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure modes are subtle.&lt;/strong&gt; An agent that fails loudly is easy to fix. An agent that succeeds incorrectly — marks a task done when it hallucinated the last step — is hard to detect without explicit verification. Most frameworks don't have hallucination recovery in their task management layer.&lt;/p&gt;

&lt;p&gt;Hermes is, to my knowledge, the only open source agent framework that ships all of these in a single installable package.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to Use the Kanban System
&lt;/h2&gt;

&lt;p&gt;Kanban plus subagent delegation is overkill for simple tasks. Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The task takes more than 20–30 minutes to complete&lt;/li&gt;
&lt;li&gt;The task has multiple independent subtasks that can run in parallel&lt;/li&gt;
&lt;li&gt;You're running unattended (scheduled cron, overnight batch)&lt;/li&gt;
&lt;li&gt;The cost of partial completion and unknown state is high (production deployments, large codebases)&lt;/li&gt;
&lt;li&gt;You need a clear audit trail of what happened&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For conversational tasks, quick lookups, or one-off automations, just use regular Hermes chat. The Kanban is for the serious workloads.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It Together
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start a multi-agent project&lt;/span&gt;
/goal Build a &lt;span class="nb"&gt;complete &lt;/span&gt;user authentication module: JWT, refresh tokens, tests, docs

&lt;span class="c"&gt;# Hermes decomposes into Kanban tasks, spawns subagents, monitors progress&lt;/span&gt;
&lt;span class="c"&gt;# You can check status at any point&lt;/span&gt;
/kanban status

&lt;span class="c"&gt;# If something fails, check what happened&lt;/span&gt;
/kanban log

&lt;span class="c"&gt;# Roll back if needed&lt;/span&gt;
/rollback
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The v0.12 "Tenacity Release" shipped 864 commits and 588 merged PRs, and closed 282 issues (including 13 P0s and 36 P1s). The Kanban system is the centerpiece, but the security wave (WhatsApp rejecting strangers by default, Discord role-allowlists, redaction on by default) and Google Chat as the 20th platform are also worth noting.&lt;/p&gt;

&lt;p&gt;The name "Tenacity" is accurate. This release is about making the agent finish what it starts, survive what it can't prevent, and be honest about what went wrong.&lt;/p&gt;

&lt;p&gt;That's a harder problem than raw capability — and it's the one that actually matters for production use.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://hermes-agent.nousresearch.com/docs/user-guide/features/subagents" rel="noopener noreferrer"&gt;Docs: Subagent Delegation&lt;/a&gt; · &lt;a href="https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.7" rel="noopener noreferrer"&gt;GitHub Release Notes&lt;/a&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Hermes Agent vs. The Rest — An Honest Comparison of Open Agentic Frameworks in 2026</title>
      <dc:creator>Prashant Maurya </dc:creator>
      <pubDate>Mon, 01 Jun 2026 04:28:19 +0000</pubDate>
      <link>https://dev.to/_prshant01/hermes-agent-vs-the-rest-an-honest-comparison-of-open-agentic-frameworks-in-2026-he2</link>
      <guid>https://dev.to/_prshant01/hermes-agent-vs-the-rest-an-honest-comparison-of-open-agentic-frameworks-in-2026-he2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The agent framework space has exploded. AutoGen, CrewAI, LangGraph, OpenAI Agents SDK, Google ADK — each week brings something new. It's genuinely hard to know what to actually use.&lt;/p&gt;

&lt;p&gt;This post compares Hermes Agent against the most popular alternatives across five dimensions that actually matter for developers building real things: infrastructure flexibility, memory/learning, tool ecosystem, messaging/deployment, and openness. No fluff — just an honest breakdown.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Frameworks
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Creator&lt;/th&gt;
&lt;th&gt;License&lt;/th&gt;
&lt;th&gt;Primary Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Nous Research&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;Any (OpenAI-compatible)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AutoGen&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Microsoft&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;Azure/OpenAI preferred&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CrewAI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CrewAI Inc.&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;OpenAI preferred&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LangChain Inc.&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;Any (LangChain integrations)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Google ADK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;Gemini preferred&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;GPT-4o/o-series&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. Infrastructure Flexibility
&lt;/h2&gt;

&lt;p&gt;Where does the agent actually run, and how much does it cost you?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt; offers six terminal backends: local, Docker, SSH, Daytona, Singularity (HPC clusters), and Modal (serverless). The SSH backend means you can run it on any remote machine you already have. The Modal backend means near-zero cost when idle. It runs on Linux, macOS, and WSL2 with zero prerequisites; the installer handles everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; is primarily a Python library. You run it wherever you run Python. No native packaging, no single-command setup, no serverless consideration built in. Flexible but manual.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; is similar — a Python framework. CrewAI+ (their cloud) manages deployment for you, but that's a paid managed service, not open infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; has LangGraph Cloud for managed hosting (paid) and self-hosted options, but the self-hosting story involves more moving parts than you'd want for a quick project.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; is built for Cloud Run. If you're already in GCP, this is seamless. If you're not, the path to deployment involves more ceremony than it should.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; is designed to run in your existing Python environment. No particular infrastructure story — you bring your own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Hermes wins on infrastructure flexibility, especially for developers who want serverless-or-VPS without vendor commitment. Google ADK wins within GCP. Others require more DIY deployment work.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Memory and Learning Over Time
&lt;/h2&gt;

&lt;p&gt;This is the dimension where frameworks differ most dramatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt; has three memory layers working together: a Skills System (procedural memory in inspectable markdown files), persistent cross-session memory (FTS5 search + LLM summarization), and Honcho dialectic user modeling. The Autonomous Curator runs on a 7-day cycle to consolidate, prune, and update the skill library automatically. The agent creates its own skills after complex tasks without prompting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; has &lt;code&gt;ConversableAgent&lt;/code&gt; with basic message history. There's no native cross-session memory — you manage persistence yourself. Community extensions exist but aren't core.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; added long-term memory via &lt;code&gt;LongTermMemory&lt;/code&gt;, short-term via &lt;code&gt;ShortTermMemory&lt;/code&gt;, and entity memory. It's the most structured memory system among the Python frameworks, but it's still session-bound by default and doesn't self-improve procedurally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; supports memory through LangMem and persistence layers. The developer controls what's stored and recalled. Flexible but requires explicit engineering work to get compound learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; has session state and memory tools. Designed for stateful multi-turn conversations within a session. Cross-session persistence requires connecting to Firestore or another backend yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; ships with a basic memory tool and context objects. No autonomous learning or self-improvement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Hermes has the most sophisticated and autonomous memory/learning system. CrewAI has the most structured memory among Python frameworks. Others require significant manual engineering to achieve comparable results.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Tool Ecosystem
&lt;/h2&gt;

&lt;p&gt;How easily can the agent do things — and what things can it do?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt; ships with a broad built-in tool registry: web search, browser automation (Browserbase, Browser Use, local Chrome), terminal execution, file editing, memory operations, subagent delegation, code execution (sandboxed Python RPC), image generation (9 models), voice/TTS, Home Assistant, X/Twitter search, computer use, and vision analysis. MCP servers add any tool from the MCP ecosystem. The Skills Hub adds 200+ site-specific browser automation skills from browse.sh. Channel-level skill bindings let you configure which tools are available per platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; has a solid function-calling framework. You define tools as Python functions and register them. No built-in tool registry — you build what you need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; has a &lt;code&gt;@tool&lt;/code&gt; decorator pattern and a growing library of built-in tools (web search, file operations, code execution). More turnkey than AutoGen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; inherits LangChain's enormous tool ecosystem. If a tool exists in LangChain, it works in LangGraph. The breadth is unmatched — but so is the complexity of managing all the integrations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; has deep Google service integration (Search, Maps, Drive, Calendar, Gmail via MCP) and good built-in tool primitives. Non-Google integrations require more work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; has function tools, hosted tools (web search, code interpreter, file search via OpenAI's own infrastructure), and handoffs. Clean but tightly coupled to OpenAI's platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; LangChain's ecosystem via LangGraph is the broadest in terms of raw number of integrations. Hermes wins on built-in breadth without configuration — everything from browser automation to image generation is ready without extra packages. Google ADK wins within the Google ecosystem. OpenAI Agents SDK is cleanest but most closed.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Messaging and Deployment
&lt;/h2&gt;

&lt;p&gt;Can your agent talk to you where you actually are?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt; supports 20 messaging platforms via a gateway: Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Teams, Email, SMS, Mattermost, DingTalk, Feishu, Google Chat, and more. The gateway is a plugin host, so new platform adapters can be dropped in, and everything runs from a single gateway process. It supports voice memos, cross-platform conversation continuity, and slash commands on every platform. A built-in cron scheduler delivers results to any platform on a schedule.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; has no native messaging integration. You build whatever delivery mechanism you want.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; has no native messaging platform support. CrewAI+ exposes an API you can call from anywhere, but the "talk to your agent from Telegram" story is DIY.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; is in the same position: no native messaging. You'd build this yourself using LangChain's integrations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; integrates with Google Chat and has Vertex AI deployment. Within Google Workspace, this is excellent. Outside it, less so.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; has no native messaging integration. OpenAI's products (ChatGPT, etc.) handle this separately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Hermes is by far the strongest here. 20 supported platforms, single gateway process, cron + delivery built in. If you want your agent reachable from your phone without building infrastructure, Hermes is the only framework where this is a first-class feature.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Openness and Portability
&lt;/h2&gt;

&lt;p&gt;Can you actually own and move your agent?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt; is MIT. No model lock-in — works with any OpenAI-compatible endpoint. Skills are markdown files in &lt;code&gt;~/.hermes/skills/&lt;/code&gt; — portable, inspectable, version-controllable. Memory is local SQLite. The &lt;code&gt;agentskills.io&lt;/code&gt; open standard means skills work across compatible agents. No telemetry, no tracking, all data local.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; is MIT and model-agnostic. Your code is yours. No proprietary data formats.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; has an MIT core, but CrewAI+ (cloud features) is commercial. Skills/crews are Python code, which is portable but not as readable as markdown.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; is MIT. LangSmith (tracing/evaluation) and LangGraph Cloud are commercial. Framework is portable; the ecosystem increasingly nudges toward their paid products.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; is Apache 2.0. Model preference is clearly Gemini. Cloud Run deployment creates GCP coupling if you're not careful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; is MIT, but practically everything interesting (hosted tools, traces, evals) requires OpenAI's platform. Most locked-in of the group.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; AutoGen and Hermes are most open in practice. OpenAI Agents SDK is most closed. Others sit somewhere in between, with commercial upsell pressure present to varying degrees.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Hermes&lt;/th&gt;
&lt;th&gt;AutoGen&lt;/th&gt;
&lt;th&gt;CrewAI&lt;/th&gt;
&lt;th&gt;LangGraph&lt;/th&gt;
&lt;th&gt;Google ADK&lt;/th&gt;
&lt;th&gt;OpenAI SDK&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure flexibility&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory &amp;amp; learning&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool ecosystem&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Messaging &amp;amp; deployment&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐&lt;/td&gt;
&lt;td&gt;⭐&lt;/td&gt;
&lt;td&gt;⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Openness &amp;amp; portability&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Who Should Use What
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use Hermes if:&lt;/strong&gt; You want a general-purpose agent you fully control, that improves over time, that you can reach from your phone, and that runs on infrastructure you own. Best for individual developers and small teams building personal or project-level automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use AutoGen if:&lt;/strong&gt; You're doing research or building multi-agent systems where you want maximum programmatic control over agent interaction patterns. Better for academic or experimental work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use CrewAI if:&lt;/strong&gt; You want a structured role-based multi-agent system and the crew metaphor maps naturally to your problem. Good for pipelines where agents have clear, distinct jobs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use LangGraph if:&lt;/strong&gt; You need the breadth of the LangChain ecosystem and want graph-based control flow for complex stateful workflows. Best when you need a specific integration that only LangChain has.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Google ADK if:&lt;/strong&gt; You're building on GCP and want deep Google service integration. The deployment story is excellent within that ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use OpenAI Agents SDK if:&lt;/strong&gt; You're already invested in OpenAI's platform and want the cleanest, most polished developer experience within that ecosystem. Accept the lock-in as a tradeoff.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Differentiator
&lt;/h2&gt;

&lt;p&gt;Most frameworks solve "can the agent do the task." Hermes solves "will the agent still be useful six months from now without you constantly re-explaining your context."&lt;/p&gt;

&lt;p&gt;That's a different problem, and it matters more the longer you use an agent. The skill library, the Curator, the persistent memory — these compound. The other frameworks generally don't have an equivalent answer to this question.&lt;/p&gt;

&lt;p&gt;Whether that matters to you depends on whether you're building a one-off demo or a long-running workflow. For the latter, Hermes's architecture is genuinely ahead.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try Hermes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://hermes-agent.nousresearch.com/docs" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt; · &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Hermes Agent's Brain: How Its Skills &amp; Memory System Actually Works</title>
      <dc:creator>Prashant Maurya </dc:creator>
      <pubDate>Sun, 31 May 2026 12:24:55 +0000</pubDate>
      <link>https://dev.to/_prshant01/hermes-agents-brain-how-its-skills-memory-system-actually-works-45k2</link>
      <guid>https://dev.to/_prshant01/hermes-agents-brain-how-its-skills-memory-system-actually-works-45k2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Most AI agents have a dirty secret: they forget everything the moment the session ends.&lt;/p&gt;

&lt;p&gt;You explain your project once. Then again next time. And again. The agent never gets better at &lt;em&gt;your&lt;/em&gt; workflow — it just stays a general-purpose tool that happens to be smart.&lt;/p&gt;

&lt;p&gt;Hermes Agent is built differently. It ships with two systems that together form something closer to a genuine long-term memory: a &lt;strong&gt;Skills System&lt;/strong&gt; and a &lt;strong&gt;Persistent Memory&lt;/strong&gt; layer. This post digs into how they actually work — not the marketing summary, but the mechanics.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Stateless Agents
&lt;/h2&gt;

&lt;p&gt;Before getting into Hermes, it's worth understanding what problem this solves.&lt;/p&gt;

&lt;p&gt;Standard LLM-based agents operate inside a context window. Everything the agent knows during a session lives in that window. When the session ends, it's gone. The next time you open a conversation, you're talking to an agent with no memory of you, your codebase, your preferences, or the workflows you've developed together.&lt;/p&gt;

&lt;p&gt;Some tools patch this with naive "memory": they dump a text blob of past conversations into the system prompt. This works up to a point, but it's not selective, it gets expensive as context grows, and it only helps the agent recall facts, not get &lt;em&gt;better&lt;/em&gt; at tasks.&lt;/p&gt;

&lt;p&gt;Hermes takes a different approach with two distinct systems serving different purposes.&lt;/p&gt;




&lt;h2&gt;
  
  
  System 1: The Skills System (Procedural Memory)
&lt;/h2&gt;

&lt;p&gt;Skills in Hermes aren't plugins you install. They're &lt;strong&gt;on-demand knowledge documents&lt;/strong&gt; — markdown files the agent loads when it needs them, and more importantly, &lt;strong&gt;creates on its own&lt;/strong&gt; when it discovers something worth remembering.&lt;/p&gt;

&lt;h3&gt;
  
  
  The SKILL.md Format
&lt;/h3&gt;

&lt;p&gt;Every skill is a structured markdown file with a YAML frontmatter header:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy-runbook&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Our deployment runbook — services, rollback, Slack channels&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1.0.0&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;hermes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;deployment&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;runbook&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;internal&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;requires_toolsets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;terminal&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Deploy Runbook&lt;/span&gt;

&lt;span class="gu"&gt;## When to Use&lt;/span&gt;
Trigger conditions for this skill.

&lt;span class="gu"&gt;## Procedure&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Step one
&lt;span class="p"&gt;2.&lt;/span&gt; Step two

&lt;span class="gu"&gt;## Pitfalls&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Known failure modes and fixes

&lt;span class="gu"&gt;## Verification&lt;/span&gt;
How to confirm it worked.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The structure is deliberate. It teaches the agent &lt;em&gt;when&lt;/em&gt; to use the skill, &lt;em&gt;how&lt;/em&gt; to execute it, &lt;em&gt;what can go wrong&lt;/em&gt;, and &lt;em&gt;how to verify success&lt;/em&gt;. That's not documentation — that's an executable procedure.&lt;/p&gt;
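&lt;p&gt;To make the layout concrete, here is a minimal Python sketch that splits a file shaped like the example above into its frontmatter and its &lt;code&gt;##&lt;/code&gt; sections. It is an illustration only: &lt;code&gt;split_skill&lt;/code&gt; is a hypothetical helper, not part of Hermes, and it reads only top-level &lt;code&gt;key: value&lt;/code&gt; lines instead of using a real YAML parser.&lt;/p&gt;

```python
# Minimal sketch: split a SKILL.md-style file into frontmatter and body sections.
# split_skill is a hypothetical helper, not a Hermes API. Only top-level
# "key: value" scalars are read; nested YAML is skipped. A real loader
# would use a proper YAML parser.

SAMPLE = """---
name: deploy-runbook
description: Our deployment runbook
version: 1.0.0
---

# Deploy Runbook

## When to Use
Trigger conditions for this skill.

## Procedure
1. Step one
2. Step two

## Pitfalls
- Known failure modes and fixes

## Verification
How to confirm it worked.
"""


def split_skill(text):
    """Return (frontmatter_dict, sections_dict) for a SKILL.md-style document."""
    parts = text.split("---\n", 2)
    if len(parts) != 3 or parts[0].strip():
        raise ValueError("expected a leading frontmatter block")
    header, body = parts[1], parts[2]

    meta = {}
    for line in header.splitlines():
        # Skip indented (nested) lines and lines without a key separator.
        if line.startswith(" ") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if value.strip():
            meta[key.strip()] = value.strip()

    sections = {}
    current = None
    for line in body.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return meta, {name: "\n".join(lines).strip() for name, lines in sections.items()}


meta, sections = split_skill(SAMPLE)
print(meta["name"], list(sections))
```

&lt;p&gt;The four section names map directly onto the questions the format is meant to answer: when, how, what can go wrong, and how to verify.&lt;/p&gt;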

&lt;h3&gt;
  
  
  Progressive Disclosure: How Skills Load Efficiently
&lt;/h3&gt;

&lt;p&gt;Here's where it gets clever. Skills don't get dumped into the context window all at once. They use a &lt;strong&gt;progressive disclosure&lt;/strong&gt; pattern across three levels:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Level 0: skills_list()          → [{name, description, category}, ...]   (~3k tokens)
Level 1: skill_view(name)       → Full content + metadata                 (varies)
Level 2: skill_view(name, path) → Specific reference file                 (varies)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent first sees just names and descriptions — a light index. It only loads the full skill content when it actually needs to. And within a skill, supporting reference files are only fetched at level 2 when specifically required.&lt;/p&gt;

&lt;p&gt;This means you can have 50+ skills installed while the up-front token overhead stays limited to the Level 0 index. Full skill content is loaded only when a task needs it.&lt;/p&gt;
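&lt;p&gt;The three levels can be modeled with a small in-memory sketch. The function names &lt;code&gt;skills_list&lt;/code&gt; and &lt;code&gt;skill_view&lt;/code&gt; come from the listing above; the &lt;code&gt;SKILLS&lt;/code&gt; dictionary, its fields, and the file path are assumptions made for illustration and say nothing about how Hermes stores skills internally.&lt;/p&gt;

```python
# Hypothetical in-memory model of the three disclosure levels.
# Function names mirror the article; the data layout is an assumption.

SKILLS = {
    "deploy-runbook": {
        "description": "Deployment runbook: services, rollback, Slack channels",
        "category": "internal",
        "content": "# Deploy Runbook\n\n## Procedure\n1. Step one\n2. Step two\n",
        "files": {"references/rollback.md": "Rollback notes go here."},
    },
    "github-pr-workflow": {
        "description": "Open and update pull requests",
        "category": "github",
        "content": "# GitHub PR Workflow\n\n## Procedure\n1. Create a branch\n",
        "files": {},
    },
}


def skills_list():
    """Level 0: a light index with no skill bodies."""
    return [
        {"name": name, "description": s["description"], "category": s["category"]}
        for name, s in SKILLS.items()
    ]


def skill_view(name, path=None):
    """Level 1 (no path): full skill content. Level 2 (path): one reference file."""
    skill = SKILLS[name]
    if path is None:
        return skill["content"]
    return skill["files"][path]


index = skills_list()
full = skill_view("deploy-runbook")
ref = skill_view("deploy-runbook", "references/rollback.md")
print(len(str(index)), len(full), len(ref))
```

&lt;p&gt;The point of the design is that the index never carries skill bodies, so its size grows with the number of skills rather than with their length.&lt;/p&gt;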

&lt;h3&gt;
  
  
  The Agent Creates Its Own Skills
&lt;/h3&gt;

&lt;p&gt;This is the most underrated part. Hermes automatically creates new skills through the &lt;code&gt;skill_manage&lt;/code&gt; tool when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It completes a complex task (5+ tool calls) successfully&lt;/li&gt;
&lt;li&gt;It had to recover from errors and found the working path&lt;/li&gt;
&lt;li&gt;You corrected its approach mid-task&lt;/li&gt;
&lt;li&gt;It discovers a non-trivial multi-step workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;skill_manage&lt;/code&gt; tool has targeted actions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;create&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;New skill from scratch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;patch&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Targeted fix (preferred — more token-efficient)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;edit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Major structural rewrite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;delete&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Remove entirely&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;write_file&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Add supporting reference files&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In practice, this means the agent gets better at &lt;em&gt;your specific environment&lt;/em&gt; over time. If you have a particular deployment process, a quirky internal API, or a custom build system — after you walk through it once, the agent writes that down as a skill. Next time, it just uses the skill.&lt;/p&gt;
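&lt;p&gt;As a rough illustration, the four triggers listed above could be expressed as a single predicate. Everything in this Python sketch is hypothetical: &lt;code&gt;TaskOutcome&lt;/code&gt;, its fields, and &lt;code&gt;should_create_skill&lt;/code&gt; are names invented here, and requiring overall success for every trigger is my assumption, not documented Hermes behavior.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    # All fields are hypothetical; they only encode the triggers from the list.
    succeeded: bool
    tool_calls: int
    recovered_from_error: bool = False
    user_corrected: bool = False
    multi_step_workflow: bool = False


def should_create_skill(outcome):
    """Return True when any of the four listed triggers applies.

    Assumption: a failed task never produces a skill.
    """
    if not outcome.succeeded:
        return False
    return (
        outcome.tool_calls >= 5
        or outcome.recovered_from_error
        or outcome.user_corrected
        or outcome.multi_step_workflow
    )


quick_lookup = TaskOutcome(succeeded=True, tool_calls=2)
long_deploy = TaskOutcome(succeeded=True, tool_calls=7, recovered_from_error=True)
print(should_create_skill(quick_lookup), should_create_skill(long_deploy))
```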

&lt;h3&gt;
  
  
  Slash Commands: Skills as First-Class UX
&lt;/h3&gt;

&lt;p&gt;Every installed skill is automatically available as a slash command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/deploy-runbook                    &lt;span class="c"&gt;# loads the skill and asks what you need&lt;/span&gt;
/github-pr-workflow create a PR &lt;span class="k"&gt;for &lt;/span&gt;the auth refactor
/plan design a rollout &lt;span class="k"&gt;for &lt;/span&gt;migrating our auth provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can invoke them from the CLI, Telegram, Discord — any platform Hermes is connected to.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Skills Hub: A Community Ecosystem
&lt;/h3&gt;

&lt;p&gt;Beyond agent-created skills, there's a whole ecosystem of installable skills from multiple sources:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes skills search kubernetes
hermes skills &lt;span class="nb"&gt;install &lt;/span&gt;openai/skills/k8s
hermes skills &lt;span class="nb"&gt;install &lt;/span&gt;well-known:https://mintlify.com/docs/.well-known/skills/mintlify
hermes skills &lt;span class="nb"&gt;install &lt;/span&gt;https://sharethis.chat/SKILL.md   &lt;span class="c"&gt;# direct URL&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supported sources include &lt;code&gt;skills.sh&lt;/code&gt; (Vercel's directory), well-known endpoints, GitHub repositories, ClawHub, LobeHub, and browse.sh (200+ site-specific browser automation skills for Airbnb, Amazon, arXiv, etc.).&lt;/p&gt;

&lt;p&gt;Every skill installed from a hub is first checked by a security scanner for data exfiltration, prompt injection, and destructive commands.&lt;/p&gt;

&lt;h3&gt;
  
  
  Skill Bundles: Task Profiles
&lt;/h3&gt;

&lt;p&gt;When you always need the same set of skills together, bundle them:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes bundles create backend-dev &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--skill&lt;/span&gt; github-code-review &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--skill&lt;/span&gt; test-driven-development &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--skill&lt;/span&gt; github-pr-workflow &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"Backend feature work — review, test, PR workflow"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then &lt;code&gt;/backend-dev refactor the auth middleware&lt;/code&gt; loads all three skills at once. You can ship team-wide task profiles by checking the bundle YAML into a shared dotfiles repo.&lt;/p&gt;




&lt;h2&gt;
  
  
  System 2: Persistent Memory (Episodic + Semantic Memory)
&lt;/h2&gt;

&lt;p&gt;Where the Skills System handles &lt;em&gt;how to do things&lt;/em&gt;, the Memory System handles &lt;em&gt;what it knows about you and your context&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Hermes uses &lt;strong&gt;FTS5 full-text search with LLM summarization&lt;/strong&gt; for cross-session recall (FTS5 is SQLite's full-text search extension). But the more interesting part is how it decides what to remember.&lt;/p&gt;
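&lt;p&gt;For readers who have not used FTS5, the following Python sketch shows the kind of keyword recall it enables, using the standard &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;memories&lt;/code&gt; table, its columns, the sample rows, and the &lt;code&gt;recall&lt;/code&gt; helper are all made up for illustration; this is not Hermes's schema, and it assumes your Python build ships SQLite with FTS5 enabled.&lt;/p&gt;

```python
import sqlite3

# Minimal sketch of keyword recall with SQLite's FTS5 extension.
# Table name, columns, and rows are hypothetical, not Hermes's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memories USING fts5(session, summary)")
conn.executemany(
    "INSERT INTO memories (session, summary) VALUES (?, ?)",
    [
        ("2026-05-01", "User deploys the billing service with a blue-green rollout"),
        ("2026-05-03", "User prefers pytest over unittest for new test files"),
        ("2026-05-07", "Rollback for the billing service uses the previous image tag"),
    ],
)


def recall(query, limit=3):
    """Return (session, summary) rows matching a full-text query, best match first."""
    rows = conn.execute(
        "SELECT session, summary FROM memories WHERE memories MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


hits = recall("billing")
print(hits)
```

&lt;p&gt;In a setup like this, stored summaries would presumably come from the LLM summarization step, so the index holds short distilled notes instead of raw transcripts.&lt;/p&gt;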

&lt;h3&gt;
  
  
  The Curator: Selective Memory Formation
&lt;/h3&gt;

&lt;p&gt;Not every conversation detail is worth persisting. Hermes has a "Curator" component that periodically reviews recent interactions and decides what's actually worth storing long-term. This is closer to how human memory consolidation works — important, repeated, or explicitly notable information gets retained; noise gets discarded.&lt;/p&gt;

&lt;p&gt;The agent also &lt;strong&gt;nudges itself&lt;/strong&gt; to persist knowledge — meaning it's not purely passive. When it recognizes it's learned something worth keeping, it actively writes a memory entry rather than waiting for the Curator's next pass.&lt;/p&gt;

&lt;h3&gt;
  
  
  Honcho: Dialectic User Modeling
&lt;/h3&gt;

&lt;p&gt;The third piece of the memory architecture is integration with &lt;a href="https://github.com/plastic-labs/honcho" rel="noopener noreferrer"&gt;Honcho&lt;/a&gt;, which Hermes uses for what the docs call "dialectic user modeling."&lt;/p&gt;

&lt;p&gt;Rather than just storing facts about you as a flat list, Honcho builds a structured model of who you are — your working style, your preferences, the kind of errors you typically make, what you care about. This model updates through interaction, not just through explicit "remember this" commands.&lt;/p&gt;

&lt;p&gt;The practical result: the agent's responses adapt to you over time without you having to constantly re-explain your context.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Architecture Matters
&lt;/h2&gt;

&lt;p&gt;Here's what separates this from "we added memory" marketing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Skills are inspectable and editable.&lt;/strong&gt; They're markdown files in &lt;code&gt;~/.hermes/skills/&lt;/code&gt;. You can read them, edit them, delete them. There's no black box — you can see exactly what procedural knowledge the agent has built up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The agent improves on the right signal.&lt;/strong&gt; It creates skills after complex multi-step tasks, after errors, after corrections — not after every conversation. This keeps the skill library focused on non-trivial knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Memory and skills serve different purposes.&lt;/strong&gt; Skills are for procedures and workflows. Memory is for facts, preferences, and context. Mixing them up is a common mistake in agent design. Hermes keeps them separate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. The ecosystem is open.&lt;/strong&gt; The &lt;code&gt;agentskills.io&lt;/code&gt; standard means skills are portable across compatible agents. Publishing a tap is just pushing to a GitHub repo. No lock-in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Test: Does It Actually Get Better?
&lt;/h2&gt;

&lt;p&gt;The honest answer is: it depends on how you use it.&lt;/p&gt;

&lt;p&gt;If you just ask one-off questions, you won't see much difference from a stateless agent. The memory and skills systems compound in value only over repeated, complex interactions where procedures are worth encoding.&lt;/p&gt;

&lt;p&gt;But if you're using Hermes for a real project — deploying code, managing a workflow, running research pipelines — the self-improving loop starts to show up. After a week of use, the agent's skills directory fills with your actual workflows, not generic templates.&lt;/p&gt;

&lt;p&gt;That's the bet Nous Research is making with Hermes: that the future of useful AI agents isn't a smarter model but an agent that gets smarter &lt;em&gt;at your specific context&lt;/em&gt; over time.&lt;/p&gt;

&lt;p&gt;Whether that bet pays off depends on how much the skills and memory systems can actually automate the knowledge capture process — reducing the burden on you to explicitly teach the agent things. Based on the architecture, the foundation is sound. The proof is in extended use.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install (Linux / macOS / WSL2)&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

&lt;span class="c"&gt;# Setup with Nous Portal (covers model + web search + image gen + TTS)&lt;/span&gt;
hermes setup &lt;span class="nt"&gt;--portal&lt;/span&gt;

&lt;span class="c"&gt;# Browse available skills&lt;/span&gt;
hermes skills browse

&lt;span class="c"&gt;# Start a session and ask Hermes to teach itself your workflow&lt;/span&gt;
hermes chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full documentation: &lt;a href="https://hermes-agent.nousresearch.com/docs" rel="noopener noreferrer"&gt;hermes-agent.nousresearch.com/docs&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you found this breakdown useful, the &lt;a href="https://agentskills.io" rel="noopener noreferrer"&gt;Skills Hub&lt;/a&gt; is worth exploring — the community ecosystem is growing fast.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
    <item>
      <title>SparshAI: I Built an Offline AI Tutor for Students Using Gemma 4 — Here's What Happened</title>
      <dc:creator>Prashant Maurya </dc:creator>
      <pubDate>Sun, 24 May 2026 18:28:40 +0000</pubDate>
      <link>https://dev.to/_prshant01/sparshai-i-built-an-offline-ai-tutor-for-students-using-gemma-4-heres-what-happened-glk</link>
      <guid>https://dev.to/_prshant01/sparshai-i-built-an-offline-ai-tutor-for-students-using-gemma-4-heres-what-happened-glk</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;There is a district in Uttar Pradesh called Sonbhadra.&lt;/p&gt;

&lt;p&gt;It sits in the southernmost corner of the state, surrounded by forests and hills. &lt;br&gt;
It is one of India's most tribal, most remote, and most underserved districts. &lt;br&gt;
Mobile signals disappear between villages. Internet is not something you plan &lt;br&gt;
around — it is something you hope for.&lt;/p&gt;

&lt;p&gt;I am a student at IIT Jodhpur. Sonbhadra is where I come from.&lt;/p&gt;

&lt;p&gt;Every time I go back home, I carry two things with me — the education I am &lt;br&gt;
getting at one of India's top institutions, and the quiet guilt of knowing &lt;br&gt;
that most kids from my area will never have access to what I have.&lt;/p&gt;

&lt;p&gt;This time, I decided to try and do something about it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;People talk about the digital divide all the time. But the conversation usually &lt;br&gt;
focuses on devices — "give students smartphones" or "build more computer labs."&lt;/p&gt;

&lt;p&gt;That misses the deeper problem.&lt;/p&gt;

&lt;p&gt;In Sonbhadra, even when a student has a device, consistent internet is not &lt;br&gt;
available. 4G signal is weak and patchy. Broadband does not exist in most &lt;br&gt;
villages. Mobile data runs out. And even when the internet works, it works &lt;br&gt;
in bursts — five minutes here, ten minutes there.&lt;/p&gt;

&lt;p&gt;Cloud-based AI tools like ChatGPT are simply not an option in this reality. &lt;br&gt;
You cannot have a tutoring session that depends on a connection that might &lt;br&gt;
disappear mid-sentence.&lt;/p&gt;

&lt;p&gt;The other problem is language. Most educational AI tools respond only in &lt;br&gt;
English. The students I grew up with are smart and curious, but they think &lt;br&gt;
in Hindi. An AI that cannot meet them in their own language is an AI that &lt;br&gt;
cannot help them.&lt;/p&gt;

&lt;p&gt;These two problems — internet dependency and language barrier — are what &lt;br&gt;
SparshAI was built to solve.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is SparshAI?
&lt;/h2&gt;

&lt;p&gt;SparshAI is a local AI tutoring system that runs entirely on a single laptop, &lt;br&gt;
with no internet connection required after the initial setup.&lt;/p&gt;

&lt;p&gt;The name comes from the Hindi word "Sparsh" — which means touch, or connection. &lt;br&gt;
That is exactly what this project is about: creating a connection between &lt;br&gt;
students who have been left behind and the knowledge they deserve access to.&lt;/p&gt;

&lt;p&gt;The idea is simple. One laptop sits in a school or community center. Students &lt;br&gt;
gather around it, or connect to it over a basic local WiFi network. They type &lt;br&gt;
their questions — in Hindi, in English, or in a mix of both. SparshAI answers &lt;br&gt;
them, patiently, clearly, in whatever language they used.&lt;/p&gt;

&lt;p&gt;No internet. No monthly fees. No cloud. No data leaving the room.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Gemma 4 Made This Possible
&lt;/h2&gt;

&lt;p&gt;I had thought about building something like this before. The problem was always &lt;br&gt;
the model. Local AI models that were capable enough for real tutoring were too &lt;br&gt;
large to run on affordable hardware. Models small enough to run locally were &lt;br&gt;
too weak to give useful explanations.&lt;/p&gt;

&lt;p&gt;Gemma 4 changed that equation completely.&lt;/p&gt;

&lt;p&gt;Google's Gemma 4 is an open model family — meaning anyone can download and run &lt;br&gt;
it locally, for free. But what makes it genuinely special is the range of sizes &lt;br&gt;
it comes in, and how capable even the smaller models are.&lt;/p&gt;

&lt;p&gt;The Gemma 4 family comes in three main tiers:&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;E2B and E4B&lt;/strong&gt; models are built for edge devices — phones, low-RAM laptops, &lt;br&gt;
even a Raspberry Pi. They are small, efficient, and designed to run without a GPU.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;31B Dense&lt;/strong&gt; model is a full-power model for high-end machines — great &lt;br&gt;
quality, but needs serious hardware.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;27B MoE&lt;/strong&gt; model is built for speed and reasoning, best suited for GPU setups.&lt;/p&gt;

&lt;p&gt;For SparshAI, I chose the &lt;strong&gt;E4B model&lt;/strong&gt; — the 4 billion parameter variant. &lt;br&gt;
This was not a default choice. It was a deliberate one.&lt;/p&gt;

&lt;p&gt;Here is my reasoning: the schools and community centers in Sonbhadra that &lt;br&gt;
could realistically host a setup like this would have access to a basic &lt;br&gt;
second-hand laptop — something with 8GB of RAM and no dedicated graphics card. &lt;br&gt;
That is the hardware reality on the ground.&lt;/p&gt;

&lt;p&gt;The E2B model, while even smaller, does not give deep enough explanations for &lt;br&gt;
real academic concepts. I tested both. E2B answers are often too surface-level &lt;br&gt;
for a student genuinely trying to understand something.&lt;/p&gt;

&lt;p&gt;The 31B model gives richer answers, but it needs hardware that costs three to &lt;br&gt;
four times more. That puts it out of reach for the use case I was designing for.&lt;/p&gt;

&lt;p&gt;E4B sits exactly in the middle. Capable enough to explain photosynthesis, &lt;br&gt;
Newton's laws, fractions, grammar concepts, and historical events in meaningful &lt;br&gt;
depth. Small enough to run smoothly on an ₹18,000 second-hand laptop with no GPU.&lt;/p&gt;

&lt;p&gt;That is intentional model selection. Not picking what sounds most impressive — &lt;br&gt;
picking what actually works for the people you are building for.&lt;/p&gt;




&lt;h2&gt;
  
  
  The LENTERA Inspiration
&lt;/h2&gt;

&lt;p&gt;While researching how others had approached this problem, I came across a project &lt;br&gt;
called LENTERA, which was built during the Gemma 3n Impact Challenge for remote &lt;br&gt;
schools in Indonesia.&lt;/p&gt;

&lt;p&gt;Their core insight stopped me in my tracks.&lt;/p&gt;

&lt;p&gt;LENTERA found that in educational settings, students tend to ask the same &lt;br&gt;
questions repeatedly. "What is photosynthesis?" gets asked by a new student &lt;br&gt;
every single day. If you make the AI regenerate that answer from scratch every &lt;br&gt;
time, you waste time and processing power unnecessarily.&lt;/p&gt;

&lt;p&gt;Their solution was intelligent caching — storing answers to common questions &lt;br&gt;
locally so that repeat queries get instant responses, and the model only works &lt;br&gt;
hard on genuinely new questions. This reduced their response time from 90 &lt;br&gt;
seconds down to under 1 second for common queries.&lt;/p&gt;

&lt;p&gt;I built this same principle into SparshAI. The result is that the most &lt;br&gt;
frequently asked questions (basic science concepts, grammar rules, math &lt;br&gt;
fundamentals) are answered almost instantly. The system gets faster the more &lt;br&gt;
it is used, because it builds up a local library of answers &lt;br&gt;
that are relevant to that specific school's students.&lt;/p&gt;
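&lt;p&gt;The caching idea can be sketched in a few lines of Python. This is a simplified illustration, not SparshAI's or LENTERA's actual code: the &lt;code&gt;AnswerCache&lt;/code&gt; class, the &lt;code&gt;normalize&lt;/code&gt; helper, and the JSON file are assumptions, the cache matches only on normalized question text, and &lt;code&gt;fake_model&lt;/code&gt; stands in for the slow local model call.&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path


def normalize(question):
    """Collapse case, whitespace, and trailing punctuation so repeats share a key."""
    return " ".join(question.lower().split()).rstrip("?.! ")


class AnswerCache:
    """Disk-backed cache: repeat questions skip the model entirely."""

    def __init__(self, path, generate):
        self.path = Path(path)
        self.generate = generate  # callable that runs the local model
        if self.path.exists():
            self.answers = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.answers = {}

    def ask(self, question):
        key = normalize(question)
        if key not in self.answers:
            # Cache miss: run the model once, then persist the answer.
            self.answers[key] = self.generate(question)
            self.path.write_text(
                json.dumps(self.answers, ensure_ascii=False), encoding="utf-8"
            )
        return self.answers[key]


calls = []


def fake_model(question):
    # Stand-in for the slow local model call; records each invocation.
    calls.append(question)
    return "Answer to: " + question


cache_file = Path(tempfile.mkdtemp()) / "answers.json"
cache = AnswerCache(cache_file, fake_model)
first = cache.ask("What is photosynthesis?")
second = cache.ask("  what is PHOTOSYNTHESIS ")
print(len(calls))
```

&lt;p&gt;Exact-match keys are the simplest possible choice. Rephrased questions still go to the model, which is the trade-off for never serving a wrong cached answer to a different question.&lt;/p&gt;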

&lt;p&gt;This felt right for Sonbhadra specifically. The NCERT curriculum is standardized &lt;br&gt;
across India. Class 8 students in Sonbhadra ask the same questions as Class 8 &lt;br&gt;
students anywhere else. A cached answer to "What is the water cycle?" is just &lt;br&gt;
as useful the hundredth time as the first.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Actually Tested
&lt;/h2&gt;

&lt;p&gt;I brought a working version of SparshAI back to Sonbhadra during my last visit. &lt;br&gt;
I set it up in a room with five students between the ages of 12 and 16.&lt;/p&gt;

&lt;p&gt;I want to be honest about what this was. It was not a formal study. It was not &lt;br&gt;
a controlled experiment. It was five curious kids, a laptop, and an afternoon.&lt;/p&gt;

&lt;p&gt;But what happened in that afternoon told me everything I needed to know.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The language thing worked better than I expected.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first student typed her question entirely in Hindi. SparshAI responded in &lt;br&gt;
Hindi. Her face when she saw that — the small surprise of being answered in her &lt;br&gt;
own language by a machine — is something I will not forget quickly.&lt;/p&gt;

&lt;p&gt;She asked a follow-up question. Then another. Within twenty minutes she had &lt;br&gt;
gone deeper into the topic of plant biology than her textbook had taken her &lt;br&gt;
in an entire chapter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The patience factor is real.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the boys asked the same question three different ways because he did not &lt;br&gt;
understand the first two answers. A tired teacher with 50 students would not &lt;br&gt;
have the bandwidth for that. SparshAI answered each time without any indication &lt;br&gt;
of frustration. On the third explanation, something clicked for him. He nodded &lt;br&gt;
and moved on.&lt;/p&gt;

&lt;p&gt;That patience is not a small thing. For students who feel embarrassed asking &lt;br&gt;
their teacher to repeat something, having a system that will explain the same &lt;br&gt;
concept ten different ways without judgment is genuinely significant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The offline test was the most important one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Midway through the session, I turned off the WiFi router deliberately — without &lt;br&gt;
telling the students. Nothing changed. SparshAI kept working exactly as before &lt;br&gt;
because everything was running locally on the laptop. No internet. No &lt;br&gt;
interruption. No awareness on their part that anything had changed.&lt;/p&gt;

&lt;p&gt;That is the whole point. A tool that works only when the internet works is not &lt;br&gt;
a tool for Sonbhadra. A tool that keeps working regardless of connectivity — &lt;br&gt;
that is something real.&lt;/p&gt;




&lt;h2&gt;
  
  
  What SparshAI Is Not
&lt;/h2&gt;

&lt;p&gt;I want to be clear about the limitations because honesty matters more than &lt;br&gt;
hype, especially when you are talking about something that affects students &lt;br&gt;
who already have limited options.&lt;/p&gt;

&lt;p&gt;SparshAI is not a replacement for a good teacher. A good teacher brings &lt;br&gt;
energy, relationship, observation, and human judgment that no AI can replicate. &lt;br&gt;
What SparshAI can do is fill the hours when no teacher is available — evenings, &lt;br&gt;
weekends, exam seasons, the long gaps between school hours and the next day.&lt;/p&gt;

&lt;p&gt;The Hindi support is good, but not perfect. Complex questions with regional &lt;br&gt;
dialect mixing sometimes produce answers that are technically correct but &lt;br&gt;
slightly awkward in phrasing. This is an area that needs improvement.&lt;/p&gt;

&lt;p&gt;Response speed on very old hardware can be slow for complex questions — &lt;br&gt;
sometimes 15 to 20 seconds. For a student used to waiting, this is acceptable. &lt;br&gt;
For someone expecting ChatGPT speed, it would feel frustrating. Setting the &lt;br&gt;
right expectations matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Gemma 4 Unlocked That Nothing Else Could
&lt;/h2&gt;

&lt;p&gt;I want to step back and say this directly, because I think it gets lost in &lt;br&gt;
technical discussions.&lt;/p&gt;

&lt;p&gt;Before Gemma 4, building something like SparshAI was not practically possible &lt;br&gt;
for the specific constraints of rural India. The models capable of real &lt;br&gt;
educational dialogue required cloud infrastructure. The models small enough &lt;br&gt;
to run locally were not capable enough to be genuinely useful.&lt;/p&gt;

&lt;p&gt;Gemma 4 E4B sits at an intersection that did not exist before — capable enough &lt;br&gt;
to teach, small enough to run on affordable hardware, open enough to deploy &lt;br&gt;
without ongoing costs.&lt;/p&gt;

&lt;p&gt;For a student from Sonbhadra trying to build something for Sonbhadra, that &lt;br&gt;
intersection is everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where SparshAI Goes Next
&lt;/h2&gt;

&lt;p&gt;This is still early. What I have right now is a working proof of concept that &lt;br&gt;
I have tested with five students over a single afternoon.&lt;/p&gt;

&lt;p&gt;But I know what the next steps look like.&lt;/p&gt;

&lt;p&gt;The most important one is fine-tuning on NCERT content. The entire Class 6 &lt;br&gt;
through Class 10 NCERT curriculum is publicly available. A version of Gemma 4 &lt;br&gt;
fine-tuned specifically on this content would be dramatically more useful for &lt;br&gt;
Indian school students than the base model. The answers would be more aligned &lt;br&gt;
with what students are actually studying, the examples would be culturally &lt;br&gt;
relevant, and the Hindi quality would improve.&lt;/p&gt;

&lt;p&gt;The second step is voice input. Typing is a barrier for younger students and &lt;br&gt;
for students who are less comfortable with keyboards. Adding offline &lt;br&gt;
speech-to-text — so a student can simply speak their question — would open &lt;br&gt;
SparshAI up to a much wider age range.&lt;/p&gt;

&lt;p&gt;The third step is scale. One laptop per school, shared over a basic local &lt;br&gt;
network, can serve an entire student body. The hardware cost is a one-time &lt;br&gt;
investment. After that, there are no recurring fees. Those economics make &lt;br&gt;
SparshAI potentially replicable across hundreds of schools in districts &lt;br&gt;
like Sonbhadra without requiring ongoing funding.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Final Thought
&lt;/h2&gt;

&lt;p&gt;I got into IIT Jodhpur. That happened because I had access to things — &lt;br&gt;
preparation resources, guidance, a support system — that most students from &lt;br&gt;
my district simply do not have.&lt;/p&gt;

&lt;p&gt;I have thought about that gap for a long time. It always felt too large, &lt;br&gt;
too structural, too deeply embedded in inequality to be addressed by a &lt;br&gt;
single person building a single thing.&lt;/p&gt;

&lt;p&gt;SparshAI has not changed my mind about the scale of that gap. It is still &lt;br&gt;
enormous. But it has changed my mind about whether technology can be part &lt;br&gt;
of bridging it.&lt;/p&gt;

&lt;p&gt;Gemma 4 running locally on an ₹18,000 laptop, answering a 13-year-old &lt;br&gt;
girl's question about plant biology in Hindi, with no internet connection, &lt;br&gt;
for free — that is not a small thing.&lt;/p&gt;

&lt;p&gt;That is a door opening.&lt;/p&gt;

&lt;p&gt;And sometimes, a door is enough to start with.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Student at IIT Jodhpur | From Sonbhadra, Uttar Pradesh&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Project: SparshAI — Local offline AI tutor for rural students&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Model used: Gemma 4 E4B | Hardware: 8GB RAM laptop, no GPU&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Inspired by: LENTERA (Gemma 3n Impact Challenge)&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Tags: #devchallenge #gemmachallenge #gemma&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
  </channel>
</rss>
