DEV Community

정상록
정상록

Posted on

Hermes Agent: The Self-Evolving AI Agent That Learns From Your Workflow

TL;DR

Hermes Agent is an MIT-licensed AI agent framework by Nous Research that genuinely learns from experience. Auto-generates skills after 5+ repeated tasks, maintains 5-layer persistent memory, supports 200+ models via OpenRouter, and can self-evolve its own prompts using GEPA (ICLR 2026 Oral). Just hit v0.4.0 with 300 PRs merged in one week.

The Problem: AI Agents That Forget Everything

Every AI agent I've used has the same fundamental issue: session ends, memory gone.

You spend 2 hours teaching it your project structure, coding conventions, and deployment pipeline. Next morning? Clean slate.

Hermes Agent solves this with a genuinely different architecture.

Self-Improving Loop: How It Works

The core innovation is a 4-step cycle:

Step 1 - Auto Skill Generation:
When you repeat a tool call 5+ times, the agent automatically synthesizes the procedure into a Python-based skill.

Step 2 - Skill Nudge:
Periodic prompts suggest saving completed workflows as reusable skills.

Step 3 - Skill Refinement:
When a skill fails or runs inefficiently, the agent iteratively improves it.

Step 4 - Persistent Storage:
Skills are saved to ~/.hermes/skills/ in the open agentskills.io format.

# Your skills grow over time
ls ~/.hermes/skills/
# deploy-staging.py
# git-feature-branch.py
# db-migration-check.py
Enter fullscreen mode Exit fullscreen mode

5-Layer Memory System

Layer Mechanism Persistence
MEMORY.md Searchable markdown Permanent
USER.md User model (preferences, coding style) Permanent
Honcho AI-native dual-peer memory Cross-session
SessionDB SQLite + FTS5 full-text search Permanent
Conversation Messages + compression Session

The Honcho integration is particularly interesting -- it builds both a "user peer" (learning your goals and communication style) and an "AI peer" (building the agent's knowledge representation).

Self-Evolution via GEPA + DSPy

This is the wildcard feature. A separate repo (hermes-agent-self-evolution) provides genetic prompt evolution:

# Optimize a prompt for code review quality
python evolve.py --target "code review quality" --budget 10
Enter fullscreen mode Exit fullscreen mode

How it works:

  1. Collects execution traces (errors, profiling, reasoning logs)
  2. Diagnoses why things failed
  3. Generates candidate prompt variants (genetic algorithm)
  4. Evaluates each variant
  5. Auto-creates a PR with the best performer

No GPU training. API calls only. ~$2-10 per optimization cycle.

Based on GEPA (Genetic-Pareto Prompt Evolution), which was an ICLR 2026 Oral paper.

Quick Start

# One-line install (Linux, macOS, WSL2)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Setup (model selection, API keys)
hermes setup

# Start
hermes
Enter fullscreen mode Exit fullscreen mode

Only prerequisite: git. The install script handles Python, Node.js, and dependencies.

200+ Models, Zero Lock-in

# Switch models with one command
hermes model set openrouter/anthropic/claude-3.5-sonnet
hermes model set openai/gpt-4o
hermes model set ollama/llama3.1  # Local
Enter fullscreen mode Exit fullscreen mode

Works with OpenRouter (200+ models), OpenAI, Anthropic (via proxy), Ollama, vLLM, llama.cpp.

6 Terminal Backends

Backend Use Case
Local Direct host execution
Docker Isolated, reproducible environments
SSH Remote server management
Daytona Serverless with hibernation
Singularity HPC containers
Modal Serverless (~$0 when idle)

Honest Comparison

Feature Hermes Agent Claude Code Cursor
Self-evolution Yes (GEPA) No No
Open source MIT Partial No
Data privacy Fully self-hosted Cloud Cloud
Model diversity 200+ Claude only Multi
Persistent memory 5 layers Limited Limited
Code quality Model-dependent Excellent Excellent

Where Hermes wins: Self-evolution, data privacy, model flexibility, cost ($5/mo VPS).

Where others win: Code output quality (Claude Code), community size (Cursor), polished UX.

v0.4.0 Highlights (March 23, 2026)

  • OpenAI-compatible API server
  • 6 new messaging adapters (Signal, DingTalk, SMS, Mattermost, Matrix, Webhook)
  • MCP server management + OAuth 2.1
  • Prompt caching + streaming by default
  • 200+ bug fixes
  • 300 PRs merged in one week

Links:

What's your experience with self-improving AI agents? Have you tried Hermes or something similar? Would love to hear about your setup.

Top comments (0)