TL;DR
Hermes Agent is an MIT-licensed AI agent framework by Nous Research that genuinely learns from experience. Auto-generates skills after 5+ repeated tasks, maintains 5-layer persistent memory, supports 200+ models via OpenRouter, and can self-evolve its own prompts using GEPA (ICLR 2026 Oral). Just hit v0.4.0 with 300 PRs merged in one week.
The Problem: AI Agents That Forget Everything
Every AI agent I've used has the same fundamental issue: session ends, memory gone.
You spend 2 hours teaching it your project structure, coding conventions, and deployment pipeline. Next morning? Clean slate.
Hermes Agent solves this with a genuinely different architecture.
Self-Improving Loop: How It Works
The core innovation is a 4-step cycle:
Step 1 - Auto Skill Generation:
When you repeat a tool call 5+ times, the agent automatically synthesizes the procedure into a Python-based skill.
Step 2 - Skill Nudge:
Periodic prompts suggest saving completed workflows as reusable skills.
Step 3 - Skill Refinement:
When a skill fails or runs inefficiently, the agent iteratively improves it.
Step 4 - Persistent Storage:
Skills are saved to ~/.hermes/skills/ in the open agentskills.io format.
```bash
# Your skills grow over time
ls ~/.hermes/skills/
# deploy-staging.py
# git-feature-branch.py
# db-migration-check.py
```
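The threshold-and-synthesize behavior of Step 1 can be sketched in a few lines. This is a minimal illustration, not Hermes's internals: the `SkillTracker` class, the threshold constant, and the filename scheme are all hypothetical stand-ins for "5+ repeated tool calls trigger skill generation."

```python
from collections import Counter

SKILL_THRESHOLD = 5  # the article's "5+ repeated tasks" trigger; name is illustrative

class SkillTracker:
    """Counts repeated tool calls and emits a skill stub once a call recurs often enough."""

    def __init__(self, threshold: int = SKILL_THRESHOLD):
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.generated: set = set()

    def record(self, tool: str, args_signature: str):
        """Record one tool invocation; return a skill filename when the threshold is crossed."""
        key = f"{tool}:{args_signature}"
        self.counts[key] += 1
        if self.counts[key] >= self.threshold and key not in self.generated:
            self.generated.add(key)
            # In the real system this is where the procedure would be
            # synthesized into a Python skill under ~/.hermes/skills/.
            return f"{tool.replace('_', '-')}.py"
        return None

tracker = SkillTracker()
skill = None
for _ in range(5):
    skill = tracker.record("git_feature_branch", "base=main") or skill
print(skill)  # the fifth repeat crosses the threshold
```

Once generated, the skill is only emitted once; later repeats of the same call hit the `generated` set and return nothing new.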
5-Layer Memory System
| Layer | Mechanism | Persistence |
|---|---|---|
| MEMORY.md | Searchable markdown | Permanent |
| USER.md | User model (preferences, coding style) | Permanent |
| Honcho | AI-native dual-peer memory | Cross-session |
| SessionDB | SQLite + FTS5 full-text search | Permanent |
| Conversation | Messages + compression | Session |
The Honcho integration is particularly interesting: it builds both a "user peer" (a model of your goals and communication style) and an "AI peer" (the agent's own knowledge representation).
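The SessionDB layer is the easiest one to picture concretely: SQLite plus an FTS5 virtual table gives ranked full-text search over past sessions. The table and column names below are illustrative, not Hermes's actual schema.

```python
import sqlite3

# Minimal sketch of a SessionDB-style store: SQLite with an FTS5 virtual table.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE messages USING fts5(role, content)")
con.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("user", "deploy the staging environment"),
        ("assistant", "running deploy-staging skill"),
        ("user", "check the db migration status"),
    ],
)

# FTS5 MATCH performs tokenized full-text search; ORDER BY rank sorts by relevance.
rows = con.execute(
    "SELECT role, content FROM messages WHERE messages MATCH ? ORDER BY rank",
    ("deploy",),
).fetchall()
print(rows)
```

Because FTS5 tokenizes on punctuation, a query for `deploy` matches both "deploy the staging environment" and "deploy-staging skill", which is exactly the fuzziness you want when recalling old sessions.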
Self-Evolution via GEPA + DSPy
This is the wildcard feature. A separate repo (hermes-agent-self-evolution) provides genetic prompt evolution:
```bash
# Optimize a prompt for code review quality
python evolve.py --target "code review quality" --budget 10
```
How it works:
- Collects execution traces (errors, profiling, reasoning logs)
- Diagnoses why things failed
- Generates candidate prompt variants (genetic algorithm)
- Evaluates each variant
- Auto-creates a PR with the best performer
No GPU training. API calls only. ~$2-10 per optimization cycle.
Based on GEPA (Genetic-Pareto prompt evolution), presented as an oral paper at ICLR 2026.
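The mutate-evaluate-select loop above can be sketched as a toy. In the real optimizer, mutation and scoring are LLM API calls (diagnosis and evaluation); here both are stand-in functions so the selection logic is visible. Everything below is illustrative, not the repo's code.

```python
import random

random.seed(0)

# Stand-in mutations; the real system generates prompt variants via an LLM.
MUTATIONS = [
    " Cite the specific line for every issue.",
    " Prioritize correctness bugs over style nits.",
    " Suggest a concrete fix for each finding.",
]

def score(prompt: str) -> int:
    # Stand-in evaluator: reward prompts that contain more review directives.
    # The real evaluator would run the prompt and measure output quality.
    return sum(m.strip() in prompt for m in MUTATIONS)

def evolve(seed_prompt: str, budget: int = 10) -> str:
    best = seed_prompt
    for _ in range(budget):
        candidate = best + random.choice(MUTATIONS)   # mutation step
        if score(candidate) > score(best):            # evaluation + selection
            best = candidate
    return best

result = evolve("Review this diff for defects.")
print(score(result))
```

The `--budget` flag maps onto the loop count: each iteration costs evaluation calls, which is where the ~$2-10 per cycle comes from.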
Quick Start
```bash
# One-line install (Linux, macOS, WSL2)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Setup (model selection, API keys)
hermes setup

# Start
hermes
```
Only prerequisite: git. The install script handles Python, Node.js, and dependencies.
200+ Models, Zero Lock-in
```bash
# Switch models with one command
hermes model set openrouter/anthropic/claude-3.5-sonnet
hermes model set openai/gpt-4o
hermes model set ollama/llama3.1  # Local
```
Works with OpenRouter (200+ models), OpenAI, Anthropic (via proxy), Ollama, vLLM, llama.cpp.
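Provider-prefixed model strings like the ones above can route to different OpenAI-compatible endpoints with a trivial lookup. The base URLs below are the providers' published defaults; the routing code itself is a sketch, not Hermes's implementation.

```python
from dataclasses import dataclass

# Published OpenAI-compatible endpoints for each provider prefix.
BASE_URLS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",  # Ollama's local OpenAI-compatible server
}

@dataclass
class ModelTarget:
    base_url: str
    model: str

def resolve(model_string: str) -> ModelTarget:
    """Split 'provider/model...' at the first slash and look up the endpoint."""
    provider, _, model = model_string.partition("/")
    if provider not in BASE_URLS:
        raise ValueError(f"unknown provider: {provider}")
    return ModelTarget(BASE_URLS[provider], model)

target = resolve("openrouter/anthropic/claude-3.5-sonnet")
print(target.base_url, target.model)
```

This is the whole trick behind "zero lock-in": every provider speaks the same chat-completions protocol, so switching is just swapping `base_url` and the model id.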
6 Terminal Backends
| Backend | Use Case |
|---|---|
| Local | Direct host execution |
| Docker | Isolated, reproducible environments |
| SSH | Remote server management |
| Daytona | Serverless with hibernation |
| Singularity | HPC containers |
| Modal | Serverless (~$0 when idle) |
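All six backends can sit behind one execution interface, so the agent doesn't care where a command actually runs. The `Protocol` and class names below are hypothetical; they only illustrate the shape such an abstraction takes.

```python
import subprocess
from typing import Protocol

class TerminalBackend(Protocol):
    """One run() surface, regardless of where the command executes."""
    def run(self, command: str) -> str: ...

class LocalBackend:
    """Direct host execution via a subprocess shell."""
    def run(self, command: str) -> str:
        return subprocess.run(
            command, shell=True, capture_output=True, text=True, check=True
        ).stdout

class SSHBackend:
    """Remote execution: identical interface, command wrapped in ssh."""
    def __init__(self, host: str):
        self.host = host

    def run(self, command: str) -> str:
        return subprocess.run(
            ["ssh", self.host, command], capture_output=True, text=True, check=True
        ).stdout

backend: TerminalBackend = LocalBackend()
print(backend.run("echo hello").strip())
```

Docker, Daytona, Singularity, and Modal would each be one more class satisfying the same protocol, which is why adding a backend doesn't touch the agent loop.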
Honest Comparison
| Feature | Hermes Agent | Claude Code | Cursor |
|---|---|---|---|
| Self-evolution | Yes (GEPA) | No | No |
| Open source | MIT | Partial | No |
| Data privacy | Fully self-hosted | Cloud | Cloud |
| Model diversity | 200+ | Claude only | Multi |
| Persistent memory | 5 layers | Limited | Limited |
| Code quality | Model-dependent | Excellent | Excellent |
Where Hermes wins: Self-evolution, data privacy, model flexibility, cost ($5/mo VPS).
Where others win: Code output quality (Claude Code), community size (Cursor), polished UX.
v0.4.0 Highlights (March 23, 2026)
- OpenAI-compatible API server
- 6 new messaging adapters (Signal, DingTalk, SMS, Mattermost, Matrix, Webhook)
- MCP server management + OAuth 2.1
- Prompt caching + streaming by default
- 200+ bug fixes
- 300 PRs merged in one week
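The OpenAI-compatible API server means any existing OpenAI SDK or raw HTTP client can talk to Hermes. The request shape below is the standard chat-completions payload; the localhost port is an assumption on my part, so check the docs for the real bind address.

```python
import json

BASE_URL = "http://localhost:8000/v1"  # assumed default port, not confirmed

# Standard OpenAI chat-completions payload, as any SDK would serialize it.
payload = {
    "model": "openrouter/anthropic/claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Summarize my open PRs"}],
    "stream": True,  # v0.4.0 enables streaming by default
}
body = json.dumps(payload)
print(f"POST {BASE_URL}/chat/completions\n{body}")
```

Pointing an OpenAI SDK at this server is just `base_url=BASE_URL`; no Hermes-specific client is needed.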
Links:
- GitHub (10k+ stars)
- Documentation
- Self-Evolution Repo
What's your experience with self-improving AI agents? Have you tried Hermes or something similar? Would love to hear about your setup.