Emmanuel Aiyenigba

Posted on May 22

Hermes, The Self-Improving Agent You Can Actually Run Yourself

#hermesagentchallenge #devchallenge #agents

Hermes Agent Challenge Submission: Write About Hermes Agent

This is a submission for the Hermes Agent Challenge

Introduction

Many AI agents have a shelf life. You set one up, it handles a few tasks, and then you discover it has no idea what it did last week, requires you to repeat context every session, and collapses the moment you try to connect it to something it was not explicitly designed for. The agent market is full of impressive demos that do not hold up as long-running, real-world tools.

Hermes Agent, built by Nous Research and released under the MIT license, takes a fundamentally different approach to the problem. Instead of being optimized for a single impressive demo, it is designed to compound over time. The longer it runs, the more context it builds about your projects and preferences. The more tasks you throw at it, the more reusable skills it generates. It is not a coding copilot, a chatbot wrapper, or a one-trick pipeline. It is a persistent agent that lives on your infrastructure, reaches you across messaging platforms, and genuinely gets more capable the longer you use it.

At the time of writing, the project has 160,000 stars on GitHub and is at v0.14.0, a meaningful signal for a tool this young. This article walks through what Hermes actually is under the hood, how to get it running locally and connected to real tools, and how it stacks up against OpenClaw and LangGraph so you can make an honest decision about when to reach for it.

What Hermes Agent Actually Is

Before walking through setup, it is worth spending a moment on the architecture, because understanding how Hermes works explains why the setup steps matter.

At the core of Hermes is what Nous Research calls a closed learning loop. When you give Hermes a complex task, it does not just complete it and discard the work. Instead, it automatically generates a reusable skill from that interaction, a structured procedure it can recall and improve upon in future sessions. These skills live in your local filesystem under ~/.hermes/skills/ and are indexed so the agent can search and invoke them without you having to remember they exist. Over time, this means Hermes builds a growing library of procedures tailored to your specific workflows.

Alongside the skills system, Hermes maintains persistent memory across conversations. This includes project context, user preferences, and a searchable session history powered by SQLite FTS5 with LLM-summarized recall. When you return to a project after two weeks, Hermes can surface relevant prior context without you having to re-explain everything from scratch.
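To make the FTS5 part concrete, here is a minimal sketch of full-text search over session history using Python's built-in sqlite3 module. The table name, columns, and sample rows are hypothetical and chosen for illustration; the article does not document the schema Hermes actually uses. The LLM summarization step is left out, and the sketch assumes your Python build ships SQLite with FTS5 enabled, which most do.

```python
import sqlite3

# Hypothetical schema: one row per past session note. Hermes's real
# tables and column names are not documented in this article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(project, content)")
conn.executemany(
    "INSERT INTO sessions (project, content) VALUES (?, ?)",
    [
        ("billing-api", "Migrated the invoices table to Postgres 16"),
        ("billing-api", "Fixed flaky retry logic in the webhook handler"),
        ("docs-site", "Rewrote the deployment guide for the new CI pipeline"),
    ],
)


def search_history(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (project, content) rows matching the query, best match first."""
    rows = conn.execute(
        "SELECT project, content FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# Both terms must appear in a row for it to match (FTS5 implicit AND).
results = search_history("webhook retry")
```

The appeal of this design is that recall is a plain indexed query against a local file, so it stays fast and private even as the history grows.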

Beyond memory, the architecture supports isolated subagents. When a task has parallel components, Hermes can spawn isolated child agents with their own terminal environments and Python RPC sessions, then aggregate the results. This is particularly useful for long-running research or code tasks where you want to avoid context blowout in a single conversation thread.
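The fan-out and aggregate pattern described here can be sketched with Python's standard library. This is not how Hermes implements subagents; run_subtask is a hypothetical stand-in for a child agent, and threads stand in for isolated environments. What it does show is the shape of the idea: the parent only ever sees the short summaries, not the full working context of each child.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtask(topic: str) -> dict:
    """Hypothetical stand-in for an isolated child agent.

    A real child would have its own terminal and context; this one just
    returns a short summary for the parent to aggregate.
    """
    return {"topic": topic, "summary": f"findings for {topic}"}


def fan_out(topics: list[str]) -> list[dict]:
    """Run each subtask concurrently and collect results in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_subtask, topics))


report = fan_out(["pricing", "competitors", "licensing"])
```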

Finally, Hermes ships with a gateway process that connects the agent to messaging platforms. You can run a task from your laptop terminal, switch to Telegram while commuting, and pick up the same session without any ceremony. This cross-platform continuity is something most agentic frameworks simply do not address.

Setting Up Hermes Agent Locally

The install process is straightforward on Linux, macOS, WSL2, and Termux. A single curl command handles the dependency chain, including uv, Python 3.11, Node.js, ripgrep, and ffmpeg.

Step 1: Install

Open your terminal and run:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

After the installer finishes, reload your shell so the hermes command is available:

source ~/.bashrc   # or: source ~/.zshrc

If you are on Windows, the project supports native PowerShell as an early beta, though WSL2 is the more battle-tested path. The PowerShell one-liner is:

iex (irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1)

Step 2: Choose a Model Provider

One of the more useful design decisions in Hermes is that it has no default model lock-in. Before starting a conversation, run:

hermes model

This opens an interactive selector where you can configure your provider and model. Hermes supports OpenAI, Anthropic, OpenRouter (which gives access to 200+ models), Nous Portal, NVIDIA NIM, Hugging Face, and several others including Kimi, MiniMax, and z.ai. You can also point it at a local Ollama endpoint if you want to run fully offline. Switching providers later requires only running hermes model again. No code changes, no environment variable archaeology.

To set up your API key for, say, OpenRouter, you configure it once through the setup wizard and Hermes stores it securely in ~/.hermes/.env rather than requiring you to export it in every session.

Step 3: Run the Setup Wizard

Rather than manually editing config files, Hermes provides a guided setup that walks you through the most important configuration decisions:

hermes setup

The wizard covers your preferred model, which toolsets to enable, your execution backend (more on this shortly), and whether to set up the messaging gateway. If you are migrating from OpenClaw, the wizard automatically detects a ~/.openclaw directory and offers to import your settings, memories, skills, and API keys in one step.

Step 4: Start the Agent

Once setup is complete, you start an interactive session with:

hermes

This opens the full TUI, which includes multiline editing, slash-command autocomplete, streaming tool output, and conversation history. From here, you can issue natural language tasks and watch Hermes reason through them using whatever tools are enabled.

Next, let's talk about how you can connect Hermes to tools.

Connecting Hermes to Tools

This is where Hermes earns its keep. The tool ecosystem is organized into toolsets, which are named groups of capabilities you enable or disable as needed. To see what is available and toggle individual tools, run:

hermes tools

Out of the box, Hermes ships with more than 40 tools spanning web search, browser automation, vision and image generation, text-to-speech, file operations, shell execution, and multi-model reasoning. These cover the vast majority of everyday agent workflows without requiring any additional setup.

Execution Backends

One of the more operationally significant features in Hermes is its support for seven distinct execution backends: local, Docker, SSH, Singularity, Modal, Daytona, and Vercel Sandbox. Most agents run shell commands directly on your local machine, which is fine for personal use but problematic when you want isolation, reproducibility, or serverless cost profiles.

If you are running tasks that touch sensitive files or need a clean environment for each run, switching to the Docker backend takes a single configuration change:

hermes config set sandbox docker
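To give a sense of what container isolation buys you, here is a rough sketch of wrapping a shell command in a throwaway Docker container. The function name, image, and mount layout are assumptions made for illustration, and this is not a description of what the Hermes Docker backend actually runs. The docker run flags themselves (--rm, --network, -v, -w) are standard.

```python
import shlex


def docker_argv(command: str, workdir: str, image: str = "python:3.11-slim") -> list[str]:
    """Build a `docker run` invocation for a throwaway, network-less container."""
    return [
        "docker", "run",
        "--rm",                    # delete the container when the command exits
        "--network", "none",       # no network access from inside the sandbox
        "-v", f"{workdir}:/work",  # mount only the project directory
        "-w", "/work",             # run the command from the mounted directory
        image,
        "sh", "-c", command,
    ]


argv = docker_argv("ls -la", "/home/me/project")
print(shlex.join(argv))  # the full command line, safely quoted
```

Passing argv as a list to subprocess.run would execute it. The point is that the agent's command only sees the one directory you mounted, and nothing survives once it exits.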

For production workloads where you want the agent to hibernate between sessions and only incur compute costs when actively running, the Modal and Daytona backends offer serverless persistence. The agent's environment wakes on demand and sleeps when idle, which makes running Hermes on a cloud VM genuinely cost-efficient for non-continuous workloads.

MCP Integration

Hermes supports the Model Context Protocol (MCP), which means you can connect it to any MCP-compatible server and immediately expose those capabilities as native tools. If you have an MCP server for your database, your internal APIs, or a third-party service, Hermes can use it without requiring any custom adapter code. The MCP configuration lives in ~/.hermes/config.yaml and follows the same server definition format used by other MCP hosts.

For example, to connect Hermes to an MCP server that it launches locally as a Node.js process:

# ~/.hermes/config.yaml
mcp_servers:
  my-tool-server:
    command: node
    args: ["/path/to/mcp-server/index.js"]

Once added, the tools exposed by that server appear automatically in the hermes tools list and become available to the agent in every subsequent session.
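Under the hood, MCP is JSON-RPC 2.0, and tool discovery happens through a tools/list request. The sketch below builds that request and pulls tool names out of a response. The sample response and the tool names in it are hand-written for illustration, and a real client also performs an initialize handshake first, which is omitted here.

```python
import json


def tools_list_request(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server for its tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def tool_names(response_line: str) -> list[str]:
    """Extract tool names from a tools/list response."""
    message = json.loads(response_line)
    return [tool["name"] for tool in message["result"]["tools"]]


# Hand-written sample response; a real server would write this to stdout.
sample = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [
        {"name": "query_db", "description": "Run a read-only SQL query",
         "inputSchema": {"type": "object"}},
        {"name": "list_tables", "description": "List table names",
         "inputSchema": {"type": "object"}},
    ]},
})

names = tool_names(sample)
```

Because every server describes its tools in this same shape, a host can register them without any adapter code specific to that server.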

Messaging Gateway and Cron

Beyond the terminal, Hermes supports a gateway process that connects the agent to Telegram, Discord, Slack, WhatsApp, Signal, and email. Setting up the gateway takes a few minutes:

hermes gateway setup
hermes gateway start

The setup wizard walks you through generating bot tokens for each platform. Once the gateway is running, messages you send through any connected platform reach the same agent, with the same memory and skills, as your CLI sessions. Cross-platform conversation continuity is handled automatically.

Hermes also ships a built-in cron scheduler that lets you describe recurring tasks in natural language. Rather than writing cron syntax and wiring up shell scripts, you tell the agent what you want:

During a conversation: "Every weekday at 7am, send me a summary of unread GitHub notifications to Telegram."

The agent translates that instruction into a scheduled task and runs it unattended through the gateway, delivering the result to whichever platform you specified.
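For reference, "every weekday at 7am" corresponds to the standard cron expression 0 7 * * 1-5. The sketch below works out when such a schedule would next fire. The function is a hypothetical illustration of the scheduling logic, not Hermes's scheduler, and it ignores time zones for brevity.

```python
from datetime import datetime, timedelta

# "Every weekday at 7am" in standard five-field cron syntax
# (minute hour day-of-month month day-of-week).
CRON_EXPR = "0 7 * * 1-5"


def next_weekday_run(now: datetime, hour: int = 7) -> datetime:
    """Return the next Monday-to-Friday occurrence of `hour`:00 after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


friday_evening = datetime(2025, 5, 23, 18, 30)
following_run = next_weekday_run(friday_evening)  # skips the weekend
```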

How Hermes Compares to OpenClaw and LangGraph

To make an honest recommendation, it helps to put Hermes next to two frameworks that represent meaningfully different philosophies: OpenClaw, which shares a direct lineage with Hermes, and LangGraph, which represents the programmatic, graph-based end of the spectrum.

Hermes vs. OpenClaw

OpenClaw is the predecessor to Hermes. The migration command in the Hermes CLI (hermes claw migrate) exists specifically to help OpenClaw users move their settings, memories, skills, and API keys across, which tells you most of what you need to know about the relationship.

OpenClaw established many of the patterns that Hermes inherits: persistent memory, a skills system, multi-platform messaging, and model-agnostic configuration. If you are currently running OpenClaw and it is meeting your needs, the case for migrating is largely about access to what has been added since the fork. Hermes extends the architecture in several concrete directions.

The execution backend system is considerably broader. Hermes supports seven backends where OpenClaw effectively assumes local execution. If you want Docker isolation, SSH-based remote execution, or serverless backends like Modal, Hermes is where that work has happened.

The subagent system is also more mature in Hermes. Spawning isolated child agents with their own terminals and Python RPC sessions, and collapsing multi-step pipelines into zero-context-cost turns, is a Hermes-specific capability. OpenClaw does not have an equivalent.

The skills system has also evolved. In Hermes, skills are compatible with the agentskills.io open standard, which means you can pull community-contributed skills from the Skills Hub rather than building everything from scratch. The cross-session search is powered by SQLite FTS5 with LLM summarization, which is a meaningful step up from simple key-value memory storage.

So when should you stick with OpenClaw rather than migrating? Honestly, the cases are narrow. If you have a heavily customized OpenClaw setup with skills and personas that you rely on daily and do not want to risk disrupting, waiting until you have time for a proper migration is reasonable. The hermes claw migrate command handles most of the work, but any migration carries some risk of edge cases. Otherwise, the feature trajectory strongly favors Hermes, and the migration tooling makes the switch low-cost.

Hermes vs. LangGraph

LangGraph is a Python framework from LangChain for building stateful, multi-actor agent applications using directed graphs. It is a genuinely different tool serving a genuinely different use case, so comparing them is less about which is better and more about which fits your situation.

LangGraph's core proposition is control. You define your agent as an explicit graph where nodes are computation steps and edges control flow. You can add conditional branching, loops, checkpoints, and human-in-the-loop interruptions at precisely defined points. When you need to reason carefully about how an agent makes decisions, audit its behavior, or integrate it into a production system with strict reliability requirements, LangGraph gives you the scaffolding to do that. It is a framework for building agents, not an agent itself.
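To show what "explicit graph" means in practice, here is a toy graph runner written with nothing but the standard library. It deliberately does not use LangGraph's API; every name in it is hypothetical. It only illustrates the idea of nodes as computation steps, edges as routing functions, and a conditional edge that loops until a check passes.

```python
from typing import Callable

State = dict
Node = Callable[[State], State]


def draft(state: State) -> State:
    """Produce a draft and count the attempt."""
    attempts = state.get("attempts", 0) + 1
    return {**state, "draft": f"answer to: {state['question']}", "attempts": attempts}


def review(state: State) -> State:
    """Toy review rule: approve from the second attempt onward."""
    return {**state, "approved": state["attempts"] >= 2}


def route_after_review(state: State) -> str:
    """Conditional edge: loop back to draft until the review approves."""
    return "END" if state["approved"] else "draft"


NODES: dict[str, Node] = {"draft": draft, "review": review}
EDGES: dict[str, Callable[[State], str]] = {
    "draft": lambda state: "review",
    "review": route_after_review,
}


def run_graph(entry: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph from `entry` until an edge routes to END."""
    current = entry
    for _ in range(max_steps):
        if current == "END":
            return state
        state = NODES[current](state)
        current = EDGES[current](state)
    raise RuntimeError("graph did not reach END within max_steps")


final = run_graph("draft", {"question": "what changed?"})
```

Every possible path through this agent is visible in NODES and EDGES, which is exactly the property that makes graph-based agents easier to audit and test.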

Hermes is an agent. You install it, configure it, and talk to it. The internal reasoning loop is not something you define through code. The tradeoff is that you give up some low-level control in exchange for a system that handles a large surface area of practical tasks without requiring you to write Python to describe how it should think.

Where LangGraph genuinely outperforms Hermes is in situations where you need to embed agentic behavior into a larger software system. If you are building a production workflow where an agent is one component among many, and you need that component to have defined inputs, outputs, retry logic, and observability hooks, LangGraph's programmatic model is the right tool. It integrates naturally with LangSmith for tracing, supports streaming state updates to frontends, and has a large ecosystem of pre-built components.

Where Hermes outperforms LangGraph is in personal and team productivity use cases where you want an agent that can handle a wide variety of tasks, remember what it knows about you and your projects, reach you on Telegram or Slack, run scheduled jobs overnight, and get better the longer you use it. LangGraph requires you to write and maintain code for every new capability. Hermes accumulates new skills automatically from experience.

The persistent memory system is also a meaningful differentiator. LangGraph supports checkpoints and state persistence, but these are designed for resuming specific workflow instances. Hermes's memory operates at a higher level, building a model of who you are, what your projects involve, and how you prefer to work that persists and enriches across all future sessions.

When to Reach for Hermes

Thinking through the comparisons above, a few clear use cases emerge where Hermes is the right choice.

Reach for Hermes when:

You want a general-purpose autonomous agent that runs persistently on your own infrastructure. If the core problem is "I need something that can handle a wide variety of tasks across multiple sessions and get smarter over time," Hermes is built exactly for that.
Cross-platform availability matters. If you want to kick off a task from your laptop, check on it from your phone via Telegram, and have results delivered to a Slack channel, the messaging gateway handles all of that with a single gateway process running on your server.
You are working across many different projects and want the agent to carry context from one to the next. The persistent memory and skills system mean that the work you put into one project gradually informs how the agent handles related work later.

Reach for LangGraph instead when you are building an agentic component that needs to be embedded in a production Python application with defined interfaces, observability, and strict control flow. If your primary concern is software engineering correctness rather than agent productivity, LangGraph gives you the structure to guarantee it.

Reach for OpenClaw instead if you have an existing OpenClaw setup that is working well and the migration overhead is not worth the improvements. But given that the migration tooling is mature and the hermes claw migrate command handles the heavy lifting, the window for that justification is narrowing.

Closing Thoughts

Hermes Agent sits in an interesting position. It is not trying to be a coding assistant that lives in your IDE, and it is not trying to be a framework you build agents on top of. It is trying to be the agent itself, one that earns increasing usefulness by actually learning from what it does.

The install process is genuinely simple. The model-agnostic design means you can point it at whatever provider makes sense for your budget and requirements. The MCP integration means you are not locked out of the tools your team already uses. And the learning loop means the return on the setup cost compounds over time rather than resetting each session.

For developers and technical teams who want an agent they actually control, that runs on their own infrastructure, and that respects the fact that context built over months of work is worth preserving, Hermes is worth a serious look.
