S M Tahosin

Hermes Just Killed OpenClaw (Here's Why)

This is a submission for the Hermes Agent Challenge.

I do not think OpenClaw is dead.

That title is deliberately dramatic because the shift is dramatic. OpenClaw did something important: it made a lot of developers believe that a personal AI assistant could be more than a chat box. It could sit on your machine, connect to your messages, call tools, browse, run commands, and actually move work forward.

But Hermes Agent changes the question.

OpenClaw asks:

What if I could run a personal AI assistant on my own devices?

Hermes asks:

What if my agent could live on my infrastructure, remember how I work, improve its own procedures, use tools across channels, and become more useful every week?

That second question is why Hermes feels like the next step.

Not because OpenClaw is bad. OpenClaw is popular for a reason. The official repo describes it as a personal AI assistant that runs on your own devices, answers through the channels you already use, and uses a Gateway as the control plane. That is a strong idea.

The problem is that the AI agent market is moving from "assistant I operate" to "worker I supervise." Once that happens, the winning system is not the one with the loudest demo. It is the one with the better memory model, execution boundary, skill lifecycle, tool surface, and deployment story.

That is where Hermes starts to pull ahead.

The short version

If I had to explain the difference in one line:

OpenClaw feels like a local-first assistant. Hermes feels like agent infrastructure that happens to chat.

That distinction matters.

A real agent has to do more than respond. It needs to run somewhere reliable. It needs to work while I am away. It needs to remember the parts of my environment that matter. It needs to learn repeatable procedures. It needs to make tool use safer, especially when those tools touch files, browsers, credentials, APIs, and servers.

OpenClaw helped prove the demand.

Hermes is making the operating model more serious.

The five claims that matter

The loudest Hermes pitch right now is simple: install it, connect it, give it skills, run it on a server, and let it become your agent.

That pitch is exciting, but I would not judge Hermes by hype. I would judge it by which claims survive contact with architecture.

| Claim | Why it matters | My read |
| --- | --- | --- |
| "One-command install" | Agents die when setup is fragile. If the first hour is dependency pain, most people quit. | Useful, but not the real moat. Setup gets you to day one. Memory and skills decide day thirty. |
| "Run it on a VPS or sandbox" | A serious agent should not need your personal laptop open all day. | This is one of Hermes' strongest arguments. Persistent agents belong on persistent infrastructure. |
| "Built-in skills" | Skills turn vague AI behavior into repeatable procedures. | Strong, especially because Hermes treats skills as something the agent can improve, not just something a user installs. |
| "Messaging integrations" | Telegram, Discord, Slack, WhatsApp, and similar channels make the agent reachable from normal life. | Important, but only if paired with background sessions. Otherwise it is just another bot in another inbox. |
| "Safer execution" | Agents touch terminals, files, browsers, APIs, and credentials. That is dangerous by default. | This is where Hermes feels more mature: command approval, allowlists, Docker, SSH, sandbox backends, and scoped toolsets all matter. |

That is the lens for the rest of this post.

I do not care whether Hermes can produce a flashy demo once. Most agent frameworks can do that now.

I care whether Hermes has the bones for repeated work: memory, procedural learning, sandboxed execution, remote availability, and enough tool scoping to avoid turning convenience into a security incident.

Why OpenClaw won attention first

OpenClaw's strength is obvious from its own README. It is broad, local, channel-heavy, and familiar to developers who want an assistant they can own.

The official repo highlights:

  • WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Microsoft Teams, Matrix, LINE, WeChat, and many more channels
  • A local-first Gateway that owns messaging surfaces and routes requests
  • First-class tools for browser, files, exec, canvas, cron, sessions, image generation, video generation, TTS, and sub-agents
  • Skills based on SKILL.md
  • Native onboarding with openclaw onboard
  • Companion apps and nodes for macOS, iOS, Android, and headless devices

That is not small. That is why OpenClaw became a reference point for personal agents.

It also has a massive community. At the time I checked the GitHub API, OpenClaw had far more stars than Hermes. Popularity alone does not decide technical direction, but it does tell you something: OpenClaw made the category legible.

For context, I checked the public repos directly: openclaw/openclaw and NousResearch/hermes-agent. OpenClaw has the bigger gravity right now. Hermes has the more interesting agent-runtime thesis.

The issue is that popularity also brings a harsh spotlight. Once strangers, groups, plugins, browsers, shells, and personal accounts all meet inside one assistant, the security model becomes the product.

OpenClaw's own security docs are honest about this. The guidance assumes a personal assistant trust boundary: one trusted operator boundary per gateway. It says OpenClaw is not a hostile multi-tenant security boundary for adversarial users sharing one gateway. It also says the product default for trusted single-operator setups allows host execution in the gateway or node context unless you tighten it.

That is not a cheap criticism. It is the tradeoff OpenClaw chose: powerful local assistant first, hardening second.

Hermes starts from a different center.

Hermes is built around compounding

The most important Hermes idea is not Telegram integration. It is not browser automation. It is not even the tool count.

The key idea is compounding.

Hermes describes itself as a self-improving agent with a built-in learning loop. Its docs talk about agent-curated memory, autonomous skill creation, skill improvement during use, session search, external memory providers, and user modeling.

That sounds abstract until you translate it into developer terms:

If the agent solves a hard workflow today, it should not rediscover that workflow next week.

That is the difference between a chatbot with tools and an agent that grows.

Hermes has two memory layers that are easy to reason about:

  • MEMORY.md for environment facts, project conventions, lessons learned, and workflow notes
  • USER.md for preferences, communication style, expectations, and profile details

Those are bounded on purpose. Hermes keeps them focused instead of stuffing an infinite pile of text into every prompt. For older conversations, it uses SQLite session storage with FTS5 search and summarization.

That design feels practical. The always-loaded memory stays small. The deeper history is searchable when needed.
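To make the "searchable when needed" half concrete, here is a minimal sketch of full-text search over past sessions using SQLite's FTS5 extension from Python. The table name, column names, and sample rows are assumptions for illustration, not Hermes' actual schema, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_session_index(rows):
    """Create an in-memory FTS5 index over (session_id, content) pairs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: session_id is stored but not tokenized for search.
    conn.execute(
        "CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, content)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", rows)
    return conn


def search_sessions(conn, query, limit=3):
    """Return ids of sessions matching the query, ordered by FTS5 relevance."""
    cur = conn.execute(
        "SELECT session_id FROM sessions WHERE sessions MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in cur]


conn = build_session_index([
    ("s1", "Fixed the nginx reverse proxy timeout on the staging server"),
    ("s2", "Drafted a DEV post outline about agent memory"),
    ("s3", "Rotated the staging database credentials"),
])
print(search_sessions(conn, "staging"))
```

The point of the split is that only the small curated files ride along in every prompt, while a query like the one above pulls older context in on demand.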

This is exactly how I want a serious agent to behave. I do not want it to remember everything equally. I want it to remember what changes future behavior.

The skill system is the real "DNA"

Skills are where Hermes becomes interesting.

OpenClaw has skills too. Its docs explain that skills are AgentSkills-compatible SKILL.md folders that teach the agent how to use tools. OpenClaw loads bundled skills, managed/local skills, personal skills, project skills, and workspace skills.

Hermes takes the same basic idea and pushes it closer to procedural memory.

The Hermes docs say the agent can create, update, and delete its own skills through skill_manage. It creates skills after complex successful tasks, when it finds a working path through errors, when a user corrects its approach, or when it discovers a non-trivial workflow.

That is the part that matters.

Not "skills as a plugin folder."

Skills as the agent writing down how to be better next time.

This is the difference between installing extensions and building organizational memory. A good senior developer does not just solve an incident. They improve the runbook. Hermes is trying to make the agent do the same thing.
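One way to picture those skill-creation triggers is as a small predicate over a finished task. Everything in this sketch is hypothetical: the `TaskOutcome` type, its field names, and the five-tool-call threshold are my own illustration of the triggers the docs describe, not Hermes' internal logic.

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    """Hypothetical summary of a finished agent task."""
    succeeded: bool
    tool_calls: int
    errors_recovered: int
    user_corrected: bool


def should_save_skill(outcome: TaskOutcome, complex_threshold: int = 5) -> bool:
    """Decide whether a finished task is worth writing down as a skill."""
    if not outcome.succeeded:
        return False  # only successful paths become procedures
    return (
        outcome.tool_calls >= complex_threshold  # complex multi-step task
        or outcome.errors_recovered > 0          # found a way through errors
        or outcome.user_corrected                # the user fixed the approach
    )


print(should_save_skill(TaskOutcome(True, 2, 1, False)))
```

A one-step lookup never qualifies, which is the "improve the runbook only when the incident taught you something" behavior.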

And it is not only local skills. Hermes supports:

  • Official optional skills
  • skills.sh
  • Well-known skill endpoints
  • Direct URL skills
  • GitHub skill installs
  • Community registries
  • External read-only skill directories
  • Security scanning and audit commands for installed hub skills

That gives Hermes a useful middle ground. It can learn locally, but it can also participate in a broader open skill ecosystem.

The execution story is stronger

This is where the comparison gets practical.

An agent that can run commands should make you slightly nervous. That is healthy.

Hermes treats terminal execution as a configurable backend. Commands can run locally, in Docker, over SSH, in Singularity, in Modal, in Daytona, or in Vercel Sandbox. The docs are clear about the tradeoff:

  • local is easy, but has no isolation
  • Docker gives container isolation
  • SSH moves execution to another server
  • Modal and Daytona give cloud sandbox options
  • Vercel Sandbox gives microVM-style cloud execution with snapshot persistence

The security page goes further. With Docker, Hermes applies hardened container flags: drop capabilities, no new privileges, PID limits, tmpfs mounts, and explicit resource limits. It also avoids forwarding host environment variables by default.

That matters for one simple reason:

The agent should not automatically inherit your entire laptop just because you wanted it to scrape a page or refactor a file.
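As a rough illustration of what that hardening looks like at the Docker CLI level, the sketch below builds a `docker run` argument list with the categories of flags mentioned above. The Docker flags themselves are standard, but the specific limits, the image name, and the `hardened_docker_argv` helper are assumptions of mine; Hermes' exact values may differ.

```python
def hardened_docker_argv(image, command, env_allowlist=None, host_env=None):
    """Build a `docker run` argv with restrictive defaults (illustrative values)."""
    env_allowlist = env_allowlist or []
    host_env = host_env or {}
    argv = [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--security-opt=no-new-privileges",  # block privilege escalation
        "--pids-limit=256",                  # cap the number of processes
        "--memory=1g", "--cpus=1.0",         # explicit resource limits
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m",  # scratch space in RAM
    ]
    # Forward only explicitly allowlisted variables, never the whole host env.
    for name in env_allowlist:
        if name in host_env:
            argv += ["-e", f"{name}={host_env[name]}"]
    argv.append(image)
    argv += list(command)
    return argv


argv = hardened_docker_argv(
    "python:3.12-slim",
    ["python", "-c", "print('hi')"],
    env_allowlist=["LANG"],
    host_env={"LANG": "C.UTF-8", "AWS_SECRET_ACCESS_KEY": "do-not-forward"},
)
print(" ".join(argv))
```

Nothing here launches a container; it only shows that the secret in the host environment never reaches the command line unless someone allowlists it.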

OpenClaw can sandbox too. Its README points to Docker, SSH, and OpenShell options, and it recommends sandboxing for non-main sessions. Its security docs are detailed and serious.

But the default mental model is different.

OpenClaw is a personal assistant with optional hardening.

Hermes is an agent runtime where isolated execution is part of the normal deployment conversation.

That is why I would rather run Hermes on a VPS or cloud sandbox for always-on work.

Messaging is not the win. Remote agency is.

Both tools can talk through messaging platforms.

OpenClaw has a huge channel list. Hermes also supports a wide set: Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Matrix, Mattermost, Home Assistant, DingTalk, Feishu/Lark, WeCom, Microsoft Teams, and more.

The interesting Hermes feature is not that you can message it.

The interesting feature is that messaging becomes a control surface for background work.

Hermes supports background sessions from messaging platforms. You can start a separate task, keep chatting in the main thread, and receive the result back in the same channel. That is a small feature on paper, but it changes the feel of the system.

It stops being:

I am chatting with a bot.

It becomes:

I am dispatching work to an agent that lives somewhere else.

That is the future I care about.

I do not want my personal agent trapped inside the laptop I am currently using. I want it on a server, reachable from my phone, able to run a long task, report back, and remember the result.

Hermes is built for that shape.
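The dispatch pattern described above can be sketched with plain asyncio: a `/background` message starts a separate task, the main thread keeps answering, and the result lands in the same channel later. The handler names and the in-memory `outbox` are assumptions for illustration, not Hermes' gateway code.

```python
import asyncio


async def run_background(task_text, reply):
    """Stand-in for long-running agent work that reports back when done."""
    await asyncio.sleep(0.01)  # placeholder for real research or tool calls
    await reply(f"[background done] {task_text}")


async def handle_message(text, reply, background_tasks):
    """Route a chat message: dispatch background work or answer inline."""
    prefix = "/background "
    if text.startswith(prefix):
        job = asyncio.create_task(run_background(text[len(prefix):], reply))
        background_tasks.add(job)  # keep a reference so the task is not lost
        job.add_done_callback(background_tasks.discard)
        await reply("Started in the background.")
    else:
        await reply(f"[chat] {text}")


async def demo():
    outbox = []

    async def reply(message):
        outbox.append(message)  # a real gateway would send to the channel

    tasks = set()
    await handle_message("/background summarize release notes", reply, tasks)
    await handle_message("what time is it?", reply, tasks)
    await asyncio.gather(*tasks)  # wait for outstanding background work
    return outbox


outbox = asyncio.run(demo())
print(outbox)
```

The chat reply arrives before the background result, which is exactly the "keep chatting while the work runs" feel.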

Tool breadth is now table stakes

There was a time when "this agent can browse the web and run commands" sounded wild.

That time is over.

Both OpenClaw and Hermes have serious tool surfaces.

OpenClaw ships built-in tools for shell execution, code execution, browser control, web search, file I/O, patching, messaging, canvas, nodes, cron, images, music, video, TTS, sessions, and sub-agents.

Hermes ships a broad registry too: web search, extraction, terminal, file editing, browser automation, vision, image generation, TTS, memory, session search, cron, messaging, delegation, code execution, Home Assistant, MCP tools, RL tools, and more.

So the question is not:

Which one has tools?

The better question is:

Which one makes tools safer, more composable, and easier to scope per situation?

Hermes has a clear toolset model. Toolsets can be enabled per session, per platform, or per task. There are platform presets such as hermes-cli and hermes-telegram, as well as dynamic MCP toolsets. That gives you a cleaner way to say:

"This Telegram agent can do X, but not Y."

For me, that is more important than raw tool count.
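Here is a minimal sketch of that scoping idea. The preset names come from the Hermes docs, but the tools listed under each preset, the `resolve_tools` helper, and the rule that a task can only narrow a platform's tools are assumptions I am making for illustration.

```python
# Hypothetical contents for two platform presets.
PLATFORM_TOOLSETS = {
    "hermes-cli": {"terminal", "file_edit", "web_search", "browser"},
    "hermes-telegram": {"web_search", "memory", "session_search"},
}


def resolve_tools(platform, task_scope=None):
    """Return the tools usable on a platform, optionally narrowed per task."""
    base = PLATFORM_TOOLSETS.get(platform, set())  # unknown platform: nothing
    if task_scope is None:
        return set(base)
    return base & set(task_scope)  # a task can narrow access, never widen it


def can_call(tool, platform, task_scope=None):
    """Check a single tool call against the resolved toolset."""
    return tool in resolve_tools(platform, task_scope)


print(can_call("terminal", "hermes-telegram"))
print(can_call("terminal", "hermes-cli"))
```

Using an intersection for the task scope means a badly written task definition cannot grant the Telegram agent a terminal it never had.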

Hermes vs OpenClaw

Here is my practical comparison.

| Area | OpenClaw | Hermes Agent |
| --- | --- | --- |
| Core identity | Personal AI assistant | Self-improving agent runtime |
| Mental model | Local-first Gateway assistant | Persistent worker on your infrastructure |
| Setup | CLI onboarding and Gateway daemon | CLI, Gateway, and multiple runtime backends |
| Messaging | Very broad channel coverage | Channels plus background sessions |
| Skills | Skills loaded from many locations | Skills as procedural memory |
| Memory | Workspace and session context | Curated memory plus session search |
| Tooling | Broad built-in tools | Toolsets, MCP, delegation, media, web |
| Security | Personal trust boundary, hardening available | Approval, isolation, env filtering, scoped tools |
| Deployment | Device or Gateway host | Local, VPS, Docker, SSH, Modal, Daytona, Vercel Sandbox |
| Ideal user | Power user with a device assistant | Developer building a supervised digital worker |
| Biggest risk | Too much power in one assistant boundary | Newer ecosystem still proving itself |

This table is why I do not read Hermes as "another OpenClaw clone."

Hermes is competing on a different axis.

OpenClaw made the assistant powerful.

Hermes is trying to make the assistant compound.

The practical playbook

If you are reading this and wondering "okay, but what do I actually try first?", this is the path I would take.

First, run Hermes somewhere disposable. A local machine is fine for learning, but the interesting path is Docker, SSH, Modal, Daytona, or another sandbox backend. The whole point is to avoid giving an experimental agent unlimited access to your daily machine on day one.

Then connect one messaging surface, not five. Telegram or Discord is enough. Make sure allowlists or DM pairing are enabled before you give the agent terminal access.

Then give Hermes one recurring workflow:

/background Research the latest Hermes Agent docs changes, summarize the developer impact, and send me 5 possible DEV post angles.

After that, watch for the compounding moment. If the workflow takes several tool calls, has a repeatable structure, or needs a correction from you, that is exactly the kind of thing that should become a skill.

A good first Hermes skill would not be "write blog posts." Too vague.

A better one would be:

research-release-notes

When given a GitHub repo or docs page:
1. Find the latest release or docs update.
2. Prefer primary sources.
3. Extract concrete changes.
4. Separate confirmed facts from opinion.
5. Produce a DEV-ready outline with links.

That is where Hermes becomes more than a chat assistant. You are not just asking it to do a task. You are teaching it a durable way to do that class of task.
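If you want to see what that skill might look like on disk, the sketch below renders it as a SKILL.md-style document. It assumes the common layout of YAML frontmatter followed by Markdown instructions; the exact fields Hermes expects are not confirmed here, and `render_skill` is a hypothetical helper of mine.

```python
def render_skill(name, description, steps):
    """Render a skill as SKILL.md-style text (assumed frontmatter layout)."""
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
        "",
        f"# {name}",
        "",
    ]
    # Number the procedure so the agent can follow and later revise each step.
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n"


skill_text = render_skill(
    "research-release-notes",
    "Research a GitHub repo or docs page and produce a DEV-ready outline.",
    [
        "Find the latest release or docs update.",
        "Prefer primary sources.",
        "Extract concrete changes.",
        "Separate confirmed facts from opinion.",
        "Produce a DEV-ready outline with links.",
    ],
)
print(skill_text)
```

The narrow name and the numbered steps are the important part: they give the agent something specific to amend after the next correction.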

Where OpenClaw still wins

A good comparison should admit the other side.

OpenClaw still has big advantages:

  1. It has enormous attention and community gravity.
  2. Its channel ecosystem is very broad.
  3. Its native app and node story is compelling.
  4. Its local-first assistant feel is easier to explain to people who are new to agents.
  5. It has already shaped how people talk about personal AI assistants.

If your goal is "I want a personal AI assistant connected to my messaging apps and devices," OpenClaw is still a serious answer.

But if your goal is "I want an agent that can become operational infrastructure," Hermes is the more interesting answer.

Where Hermes wins

Hermes wins because it is opinionated about the hard parts.

1. It treats memory as a product surface

Memory is not just chat history. It is a curated behavioral layer. The split between MEMORY.md, USER.md, and searchable session history is simple enough to trust and flexible enough to grow.

2. It treats skills as learning

The agent can create and update skills after hard tasks. That is the closest thing to compounding engineering knowledge in this category.

3. It treats execution location as a first-class choice

Local, Docker, SSH, Modal, Daytona, Vercel Sandbox, Singularity. That is not a footnote. That is the difference between a toy assistant and something you can deploy with intent.

4. It treats messaging as dispatch

I can talk to the agent through Telegram or Discord, but the real value is sending background work and getting results back. That makes the chat app a command center, not the product itself.

5. It treats safety as architecture, not a disclaimer

Allowlists, DM pairing, command approval, container isolation, MCP credential filtering, context scanning, env var filtering, and scoped toolsets are not glamorous features. They are the features you need after the first impressive demo.
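Two of those layers, sender allowlists and command approval, compose naturally into a single gate. The sketch below is a simplified assumption of how such a gate could work; the `DANGEROUS_PROGRAMS` list and the three outcomes are mine, and a real policy would be much more thorough.

```python
import shlex

# Hypothetical list of programs that should always require human approval.
DANGEROUS_PROGRAMS = {"rm", "sudo", "dd", "mkfs", "shutdown", "chmod", "chown"}


def needs_approval(command: str) -> bool:
    """Flag a command if any of its tokens is a listed dangerous program."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input (e.g. unbalanced quotes): fail closed
    return any(token in DANGEROUS_PROGRAMS for token in tokens)


def gate(sender_id: str, command: str, allowlist: set) -> str:
    """Return 'reject', 'ask', or 'run' for a command requested over chat."""
    if sender_id not in allowlist:
        return "reject"  # unknown senders never reach the terminal
    if needs_approval(command):
        return "ask"     # pause and wait for explicit human approval
    return "run"


print(gate("me", "ls -la", {"me"}))
print(gate("me", "sudo rm -rf /tmp/x", {"me"}))
```

The check is deliberately conservative: it flags a dangerous program name anywhere in the command, so a few harmless commands will ask for approval too.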

The bigger point

The agent space is splitting into two philosophies.

One philosophy says:

Give the user a powerful assistant and let them connect everything.

The other says:

Give the user an agent runtime that can be supervised, isolated, taught, and deployed, and that remembers what it learns.

OpenClaw represents the first philosophy extremely well.

Hermes represents the second.

That is why I think Hermes is the more important project to study right now.

OpenClaw proved people want agents with hands.

Hermes is asking what happens when those hands also get memory, runbooks, safer execution, background work, and a home outside your current laptop.

That is the jump.

What I would build with Hermes

If I were turning this into a real project, I would build a developer publishing agent.

Not a blog spammer. A proper assistant for technical writing:

  1. Watch official docs, GitHub releases, and challenge pages.
  2. Summarize what changed with links to primary sources.
  3. Keep a memory of my writing preferences and recurring projects.
  4. Create reusable skills for research, outline creation, source checking, and DEV formatting.
  5. Draft posts in my style, but keep claims grounded in citations.
  6. Send drafts to Telegram for review.
  7. Track comments and suggest follow-up posts based on real discussion.

That would use the Hermes shape well:

  • long-running background research
  • web extraction
  • session search
  • persistent memory
  • skills that improve over time
  • messaging delivery
  • scoped tool access
  • scheduled tasks

That is the kind of workflow where Hermes makes more sense than a one-shot chat assistant.

The point is not that Hermes can write.

The point is that Hermes can build a writing operation around memory, tools, and feedback.

Final take

Did Hermes literally kill OpenClaw?

No.

OpenClaw is too useful, too popular, and too culturally important to dismiss.

But Hermes may have killed the idea that a personal agent is only a local assistant with a chat interface.

That is the real shift.

The next generation of agents will not be judged only by how many apps they connect to. They will be judged by whether they can:

  • remember the right things
  • forget the wrong things
  • learn procedures
  • run in isolated environments
  • work asynchronously
  • integrate with open tools
  • stay useful after the first demo

By that standard, Hermes is not just another agent.

It is a strong argument for where agent software is going next.

That is my real test for any agent framework now:

Does it get more useful because I used it yesterday?

If the answer is no, it is still mostly a tool wrapper.

If the answer is yes, we are finally talking about agent software.

And yes, that is why the title says it:

Hermes just killed OpenClaw.

Not by replacing it overnight.

By making the category grow up.

The first thing I would personally validate is not whether Hermes can write a pretty paragraph. It is whether a Docker or SSH-backed Hermes research agent can run for a week, keep useful memory, and avoid turning one bad tool call into a machine-level mess. If you have tried either backend already, I would genuinely like to hear which one felt smoother and where it broke.

What do you think?

Is Hermes actually the next step after OpenClaw, or is OpenClaw still the better model for personal agents?

And of the five claims above, which one matters most to you: one-command install, running the agent on real infrastructure, built-in skills, messaging, or safer execution?

Top comments (12)

Jake Sullivan

This is probably the first post that actually explains why so many devs are quietly moving away from OpenClaw. Everyone was focused on features and agent hype, but reliability and architecture matter way more once you start running real workflows.

Hermes feels much more intentional instead of “ship first, patch later.” That difference becomes obvious after a few weeks of usage.

S M Tahosin

Really appreciate this, Jake. You nailed the exact point I was trying to make. Features create hype, but reliability and architecture are what actually matter when you start using these tools in real workflows.
That "ship first, patch later" comparison is honestly a great way to put it. Curious, how long have you been using Hermes compared to OpenClaw?

Jake Sullivan

I’ve been testing OpenClaw for a few months and started using Hermes more seriously recently. The difference became noticeable once I moved beyond simple demos into longer workflows and multi-step automation.

OpenClaw still has insane potential, but Hermes feels way more predictable during actual usage. That consistency matters a lot more than people realize.

S M Tahosin

That’s actually super valuable insight, Jake. A lot of tools look impressive in demos, but longer multi-step workflows expose the real strengths and weaknesses fast.

"Predictable" is probably the perfect word here. Potential is exciting, but consistency is what makes something actually usable. Thanks for sharing your real experience.

Jordan Miles

Really interesting breakdown. I think the biggest takeaway isn’t that OpenClaw is “dead” but that Hermes shifted the conversation toward what people expect from an agent now. The memory structure and the self-improvement loop definitely feel like the next step, even if the ecosystem still has a long way to go.
OpenClaw still has a huge community and some things it does better, but it’s hard to ignore how fast Hermes is moving. Feels like we’re watching the early days of a real evolution in personal AI tools.
Curious to see how both projects respond over the next few months.

S M Tahosin

That’s a great take, Jordan. I agree, the bigger shift is really in expectations. Hermes feels like it pushed the conversation toward more adaptive and autonomous agents, not just task execution.
And totally fair point about OpenClaw too. Strong community support is a massive advantage. The next few months should be really interesting to watch. Which side do you think wins long term, speed of innovation or ecosystem strength?

Jordan Miles

Honestly, I think it might come down to which project can balance both. Speed without stability burns people out, but a big ecosystem without fresh ideas can get stagnant. Hermes has the momentum right now, but OpenClaw has the kind of community depth that doesn’t disappear overnight.

If either of them manages to blend rapid iteration with a solid long term foundation, that’s probably the one that ends up winning. Until then, it’s fun watching both push each other forward.

S M Tahosin

That’s a really balanced perspective, Jordan. I think you nailed it. Speed alone creates hype, community alone creates staying power, but combining both is the real game changer.

And honestly, competition like this is great for all of us. Both projects pushing each other forward probably means better agents faster.

Israt Ritu

That "day-1 setup vs. day-30 utility" line hits the nail on the head. We've all been so caught up in easy installations that we overlooked what happens when an agent actually needs to grow over time. Moving from static markdown skills to an agent that dynamically updates its own procedural memory feels like the exact leap forward we need.

Quick question for you: Do you think OpenClaw will pivot its architecture to match this infrastructure-first approach, or will they double down on being the ultimate local device assistant?

S M Tahosin

Love this perspective. That’s exactly the shift I was trying to highlight. Easy setup gets attention, but long-term adaptability is what makes an agent actually useful.
Great question too. My guess is OpenClaw probably leans into its local assistant strength rather than fully mirroring Hermes, but if Hermes keeps pushing this direction, some architectural evolution feels inevitable.

Berming Hamng

Really solid breakdown. The point about compounding skills + isolated execution being the real game changer was spot on. Great read 👏
