Phil Rentier Digital

Posted on Apr 21 • Originally published at rentierdigital.xyz

Hermes Agent: The Self-Hosted AI That Finally Grew Up. Here's the Two-VPS Setup Under $10.

#ai #technology #aiagents #hermesagent

Last weekend I installed Hermes Agent on two VPS. A brand-new Hostinger box in 1-click Docker. My existing Contabo box via SSH and a single curl command. Same model config on both sides: Sonnet 4.6 as primary, DeepSeek V4 for delegation. Two install philosophies. Both ship a working agent that replies in Telegram within minutes.

TL;DR: Two install paths tested end-to-end (zero terminal versus pure SSH), a model stack that completely shifted since February, one architectural move that Nous Research made while OpenClaw was busy patching, and a community pattern I wasn't expecting around who's actually migrating (and who isn't).

If you've been reading here since February, you know I documented my $15/month OpenClaw migration after the Claude Max ban. Hadn't touched it since. It worked. Then last week changed my mind. Anthropic officially pulled third-party access to Pro/Max on April 4. The public OpenClaw CVE tracker crossed 138 entries on the 10th. Nous Research shipped Hermes v0.9 on the 13th, a release that merged more pull requests in one drop than some projects ship in a quarter. Triple-hit inside ten days. Hard to keep ignoring it after that.

The Moment I Knew It Was a Different Beast

Five minutes into the Contabo install, the wizard asked me which terminal backend I wanted: local, Docker, SSH, Daytona, Singularity, or Modal. OpenClaw never asked me that question. OpenClaw just ran. Which was great until the afternoon a skill tried to clean temp files and nearly clipped a directory I'd rather it didn't touch. Hermes making the isolation question explicit, before install completes, tells you what generation you're dealing with.

Same with the auto-detection step further down the wizard. It scanned for ~/.openclaw, saw mine, and offered to import skills, memories, and API keys. Not in a migration guide you have to read on a Tuesday. In the installer. That's someone who designed for a specific user (the one leaving OpenClaw) and built the ramp.

Two small choices. Both say the same thing. Someone watched six months of OpenClaw happen and took notes.

Why I Bothered: What Six Months of OpenClaw Taught Me

Credit where it's due first. OpenClaw defined the self-hosted agent category. 347k GitHub stars in six months, an ecosystem of 13k+ community-built skills, a Discord that feels alive. Without OpenClaw, there's no Hermes to write about. The prototype did the hard job of proving the category was real.

But a prototype that grows fast accumulates architecture debt. Three places I felt that debt firsthand.

The UX breaks non-geeks. I've spent evenings debugging obscure configuration issues that made no sense until I'd read three Discord threads and one angry Medium post. Shadow, OpenClaw's official maintainer, said it directly on Discord (paraphrased): if you can't use a command line, you should not be using OpenClaw. When the person maintaining the product tells you it's a geek tool, believe them.

Security is patched, not designed. The public CVE tracker logged more than 138 entries in roughly two months between February and April 2026. A separate exposure analysis from ARMO counted roughly 135k OpenClaw instances publicly reachable, the majority without authentication. Reco flagged a campaign of malicious skills in the hundreds. Microsoft's guidance in February, paraphrased: don't deploy OpenClaw on machines holding sensitive data. This is not just a bug count. This is an architecture that trusts inputs by default and spends its time patching when someone finds the next hole.

Governance is turbulent. Three names in twelve months (Clawdbot, Moltbot, OpenClaw). OpenAI acquisition late 2025. For a tool I want to keep running for three years, that's too much weather to sit through.

None of this aims at Peter Steinberger. The guy shipped something huge and defined a category. But an architecture designed for a prototype cannot outgrow its debt through patching, no matter how diligent the patching is.

Which is why next generations exist.

What Makes Hermes a Product, Not a Prototype

Quick context on Nous Research. AI safety lab behind the Hermes, Nomos, and Psyche model families, serious reputation in the open-weight crowd, MiniMax partnership announced early 2026. Hermes Agent launched in February, crossed 64k GitHub stars in two months, and shipped v0.9.0 on April 13: nine releases in seven weeks. Aggressive velocity.

Four architectural moves I watched firsthand during the installs.

Security treated as a constraint. Tirith, the pre-execution scanner, inspects shell commands before they run. Sub-agents live in their own namespace, each one isolated from the others and from the host. Containers ship hardened with a read-only root filesystem and dropped capabilities. Filesystem checkpoints happen automatically before any destructive operation, with a rollback command that does what it says. Zero agent-specific CVEs to date according to The New Stack (paraphrased). The move here is architectural, not cosmetic.

A closed learning loop. After complex tasks (five or more tool calls), the agent pauses, evaluates, and writes a reusable skill (a SKILL.md plus the code that goes with it). Nous's own benchmark (paraphrased) claims roughly 40% faster performance on research tasks once the agent has built up its own skill library. I saw the mechanism in action the first time I asked it to set up a recurring task. It wrote a SKILL.md covering the cron-plus-auth dance it had just figured out, so the next cron request starts from that skill instead of from scratch. Feels weird the first time. Useful by day three.
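To make the loop concrete, here is a toy sketch of the trigger described above: count the tool calls in a finished task, and once there are five or more, draft a SKILL.md from the steps that worked. Everything in it (`TaskTrace`, `should_write_skill`, `draft_skill_md`, the tool-call names) is hypothetical and invented for illustration. It is not Hermes code, and the real agent presumably evaluates far more than a call count.

```python
from dataclasses import dataclass, field

# Threshold taken from the article's "five or more tool calls".
SKILL_THRESHOLD = 5


@dataclass
class TaskTrace:
    """Hypothetical record of one finished task (not a Hermes data structure)."""
    goal: str
    tool_calls: list[str] = field(default_factory=list)


def should_write_skill(trace: TaskTrace, threshold: int = SKILL_THRESHOLD) -> bool:
    """Return True when the task used enough tool calls to be worth saving."""
    return len(trace.tool_calls) >= threshold


def draft_skill_md(trace: TaskTrace) -> str:
    """Render a bare-bones SKILL.md body listing the steps in order."""
    steps = "\n".join(f"{i}. {call}" for i, call in enumerate(trace.tool_calls, 1))
    return f"# Skill: {trace.goal}\n\n## Steps that worked\n{steps}\n"


if __name__ == "__main__":
    # Made-up tool-call names for a cron-plus-auth task.
    trace = TaskTrace(
        goal="set up a recurring task",
        tool_calls=["read_crontab", "check_auth", "write_script", "install_cron", "verify_run"],
    )
    if should_write_skill(trace):
        print(draft_skill_md(trace))
```

The point of the sketch is the shape of the loop: the skill is written after the work succeeds, so the next similar request starts from recorded steps rather than from rediscovery.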

A standardized runtime. Same dependency set, same isolation model, same behavior across Linux, macOS, WSL2, and Android via Termux. The runtime doesn't drift depending on where you deploy (local dev machine, $5 VPS, bare-metal homelab, a phone), which sounds obvious until you try to rebuild a drifted OpenClaw install from memory on a new box at 11pm. There's no native Windows support, which doesn't affect me or 95% of the readers here.

A model-agnostic routing layer. Nous Portal OAuth (400+ models), OpenRouter (200+), direct Anthropic/OpenAI, Ollama local, vLLM, SGLang. Switch primary or delegation with a single hermes model command. No code change, no restart, no reconfig. Testing a new model on a specific task takes about two seconds.

The New Stack paraphrased the bet neatly: OpenClaw optimized for ecosystem breadth, Hermes optimizes for depth of learning. Different architectural bets, neither universally right. Hermes fits the use case where you want the thing to compound over time.

Install Path One: Hostinger (Zero Terminal)

KVM 2 plan specs: 2 vCPU, 8 GB RAM, 100 GB NVMe, 8 TB bandwidth, Ubuntu 24.04 LTS. Price: $8.99/month. Pre-configured Hermes Agent template sitting in the Docker catalog. Zero Docker to install on your side.

How it went. hPanel → Docker Manager → Catalog → typed "Hermes Agent" in the search → Select → Deploy. The template asked for the provider API key during deploy. I pasted my OpenRouter key (one key handles Sonnet 4.6, DeepSeek V4, and the fallbacks). Under fifteen minutes from clicking Deploy to the first "Hi" in Telegram, and most of that was the VPS provisioning itself.

No real friction. The wizard is what Hostinger has always been good at: opinionated defaults, minimal questions, works.

One detail worth noting. The same Hostinger catalog also offers OpenClaw as a 1-click template. Not a commercial pick on my end. A user choice in the same store. Provider stays neutral.

Who this path is for: the reader who followed my OpenClaw articles, who wants to test Hermes without getting into systemd, ufw, and Docker networking. Zero terminal end to end. Deploy, paste key, chat.

Hostinger Docker catalog Hermes Agent template.

Install Path Two: Contabo (I Already Had One)

My Contabo box has been running for a while now, handling WooCommerce store ops plus a handful of partner webhooks, with Traefik in front. I wanted to see if Hermes would drop onto an existing box without drama.

Cloud VPS 10 specs: 3 vCPU, 8 GB RAM, 75 GB NVMe. Price: $4.95/month, same price in year 1, 2, and 3. No renewal surprise. That's the part I keep coming back to.

How it went. SSH in as a regular user with sudo rights (not root, and yes we'll come back to that). Then the official one-liner from Nous Research (verbatim):

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

Obligatory confession: yes, this is curl | bash, the pattern every sysadmin has been yelling about for a decade. On a box that runs an actual ecommerce store. Read the script before you run it. I did. You should too. The installer itself is clean, handles Python 3.11, Node.js, uv, ripgrep, ffmpeg on its own, and never touches anything outside the Hermes working directory. That said, if the words "curl bash" gave you a rash just now, clone the repo and run the install from a local checkout. Works the same.

Then the interactive wizard. Choices that actually matter: LLM provider → model → TTS (I picked Edge TTS, free) → terminal backend (Docker, for isolation, out of the six options) → messaging working directory → sudo support → max tool iterations → tool progress display → session reset mode → messaging platform (Telegram).

Ten questions, maybe fifteen. Reading them beats skipping them, because the terminal backend choice alone is the difference between "agent in a sandbox" and "agent with the keys to the kitchen".

The auto-detection step is the one I want to flag. Because I had ~/.openclaw sitting on this same VPS, the wizard offered to import my existing skills, memories, settings, and API keys in one go. I took it. Three seconds, done. Whatever OpenClaw taught my agent over six months is now sitting in Hermes, which saves me from rebuilding the personalization layer from zero. If you don't have OpenClaw on the box, the wizard just skips that step and moves on.

One documented trap, not to be missed. If you already run a Telegram bot under OpenClaw, do NOT reuse its token. Create a NEW bot via BotFather or both break. A YouTube demo from early April walked straight into it on camera (paraphrased). Free lesson, courtesy of someone else's mistake.

Under twenty minutes total to a working agent on Telegram, most of it spent reading the wizard questions carefully instead of mashing Enter.

The Contabo arguments, condensed. RAM-per-dollar is unbeatable at roughly $0.62/GB, which is $4.95 spread over 8 GB (for reference, you're around $6/GB on DigitalOcean). Full OS control (Ubuntu 22/24, Debian, Rocky, CentOS). Data centers across Europe, Asia, the Americas, Australia. A CLI wizard that teaches you what it's installing instead of hiding it behind a panel. Same price over three years.
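If you want to recheck the RAM-per-dollar claim yourself, the arithmetic is one division. A minimal sketch using only the two plan prices quoted in this article (the `price_per_gb` helper is just a name picked for the example):

```python
def price_per_gb(monthly_price_usd: float, ram_gb: float) -> float:
    """Monthly price divided by RAM, in dollars per GB."""
    return monthly_price_usd / ram_gb


# Plan figures as quoted in this article: (monthly price in USD, RAM in GB).
plans = {
    "Contabo Cloud VPS 10": (4.95, 8),
    "Hostinger KVM 2": (8.99, 8),
}

for name, (price, ram) in plans.items():
    print(f"{name}: ${price_per_gb(price, ram):.2f}/GB")
```

On these numbers Contabo lands at about $0.62/GB and Hostinger at about $1.12/GB. Swap in your own provider's price and RAM to compare.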

Who this path is for: the reader who wants to understand the commands that ran, who already hosts other services, who plans in three-to-five-year chunks instead of thirty-day ones.

Contabo Cloud VPS 10.

The Model Stack (Two Months Later, Everything Shifted)

In my February article I was running Kimi K2.5 + MiniMax + GLM-4.7-Flash. Optimal stack for OpenClaw at the time. For Hermes, the landscape moved and my priorities moved with it.

Technical context first. In Hermes v0.9, roughly 73% of the tokens in each API call are fixed overhead (tool definitions around 8,700 tokens, system prompt around 5,200 tokens). In Telegram mode the overhead climbs to 15-20K tokens per message, two to three times CLI mode, per Nous's own docs. In that context, reliable tool-calling becomes the critical factor. A cheap model that misfires tool calls loops into error and burns more tokens than a premium model running clean.
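A quick back-of-envelope on those overhead figures, as a sketch. It assumes the 73% is a share of the tokens in a single call, which is my reading of the number rather than something the docs spell out here, and it uses only the two token counts quoted above.

```python
TOOL_DEFINITION_TOKENS = 8_700  # from the article
SYSTEM_PROMPT_TOKENS = 5_200    # from the article
FIXED_SHARE = 0.73              # from the article, read as a share of tokens per call


def fixed_overhead_tokens() -> int:
    """Tokens sent on every call before any conversation content."""
    return TOOL_DEFINITION_TOKENS + SYSTEM_PROMPT_TOKENS


def implied_call_size(fixed_tokens: int, fixed_share: float) -> float:
    """Total tokens per call if fixed_tokens makes up fixed_share of it."""
    return fixed_tokens / fixed_share


fixed = fixed_overhead_tokens()
total = implied_call_size(fixed, FIXED_SHARE)
print(f"fixed overhead:            {fixed} tokens")
print(f"implied call size:         {total:.0f} tokens")
print(f"left for the conversation: {total - fixed:.0f} tokens")
```

Under that reading, about 13,900 tokens go out on every call before you have said anything, and only around 5,000 carry the actual exchange. That is why a model that needs three retries to land one tool call gets expensive fast.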

Actual config after two weeks of iteration:

provider: openrouter
model: anthropic/claude-sonnet-4-6    # primary

delegation:
  model: deepseek/deepseek-v4
  provider: openrouter

Claude Sonnet 4.6 ($3/$15 per million input/output tokens) as primary. Consensus pick in the Hermes-in-production community right now (r/LocalLLaMA threads, r/singularity, Berkeley Function Calling Leaderboard). Reliable tool-calling, solid multi-step reasoning, no error spirals. DeepSeek V4 ($0.30/$0.50 per million) as delegation. Its 90% cache discount makes the overhead nearly free. Around 90% of Claude's quality on sub-agent tasks. Honest caveat: DeepSeek's infra throws 503s at peak hours, but the fallback is clean (delegation drops back to primary without drama).
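To put a dollar figure on "nearly free", here is a sketch that prices the fixed overhead of one call on each model. The prices and the 90% discount are the ones quoted above, and the 13,900-token overhead is the sum of the tool-definition and system-prompt figures. The assumption that the whole overhead is served from cache is a best case I picked for illustration, and any caching on the Sonnet side is ignored.

```python
def input_cost_usd(
    tokens: int,
    price_per_million: float,
    cached_share: float = 0.0,
    cache_discount: float = 0.0,
) -> float:
    """Input cost of one call; cached_share of the tokens is billed at a discount."""
    cached = tokens * cached_share
    fresh = tokens - cached
    billable = fresh + cached * (1.0 - cache_discount)
    return billable * price_per_million / 1_000_000


OVERHEAD_TOKENS = 13_900  # 8,700 tool definitions + 5,200 system prompt

sonnet = input_cost_usd(OVERHEAD_TOKENS, 3.00)
deepseek_cold = input_cost_usd(OVERHEAD_TOKENS, 0.30)
# Best case: the entire overhead is a cache hit at the quoted 90% discount.
deepseek_warm = input_cost_usd(OVERHEAD_TOKENS, 0.30, cached_share=1.0, cache_discount=0.90)

print(f"Sonnet 4.6, overhead per call:        ${sonnet:.4f}")
print(f"DeepSeek V4, uncached:                ${deepseek_cold:.5f}")
print(f"DeepSeek V4, overhead fully cached:   ${deepseek_warm:.6f}")
```

That works out to roughly four cents of overhead per Sonnet call against a few hundredths of a cent on a warm DeepSeek cache, which is the whole argument for pushing sub-agent work to the delegation model.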

Models to avoid. GPT-5.4 Mini is "terrible at tool calling" by explicit r/LocalLLaMA warning. MiniMax 2.5 was unusable, 2.7 fixed it. Qwen 3.x breaks tool-call parsing because of its <think> tags. Pure reasoning models talk themselves out of using tools. Don't ask me why, they just do.

Real monthly cost depends on your usage pattern. At roughly 10 messages per day, you'll probably land around $15-25 all-in. At 30 per day, closer to $40-70. At 50+, $80-120. The Telegram overhead is the variable that moves the needle.
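You can sanity-check those ranges with a floor estimate. The sketch below counts input tokens only, on Sonnet 4.6 at $3 per million, using the 15-20K Telegram overhead per message. The assumptions are mine: a 30-day month, one API call per message, no caching, no output tokens. Real bills will sit above this floor.

```python
def monthly_input_cost_usd(
    messages_per_day: int,
    tokens_per_message: int,
    price_per_million: float,
    days: int = 30,
) -> float:
    """Input-token cost for a month, assuming one API call per message."""
    tokens = messages_per_day * days * tokens_per_message
    return tokens * price_per_million / 1_000_000


SONNET_INPUT_PRICE = 3.00  # USD per million input tokens, as quoted in the article

for per_day in (10, 30, 50):
    low = monthly_input_cost_usd(per_day, 15_000, SONNET_INPUT_PRICE)
    high = monthly_input_cost_usd(per_day, 20_000, SONNET_INPUT_PRICE)
    print(f"{per_day:>2} messages/day: ${low:.2f} to ${high:.2f} input floor")
```

The floor comes out at $13.50 to $18 for 10 messages a day, $40.50 to $54 for 30, and $67.50 to $90 for 50. Those sit at or just under the low end of the ranges above, which is what you'd expect once output tokens and multi-call tasks pile on top.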

Fallback plan if something derails: hermes model, switch primary to DeepSeek V4, effective immediately, no reconfig. Safety net is one command.
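The automatic half of that safety net, where delegation drops back to the primary model when DeepSeek returns a 503, can be pictured as a try/except around the delegated call. This is a toy sketch with made-up names (`ProviderUnavailable`, `run_delegated`, the two stand-in model functions). It shows the pattern, not how Hermes implements it.

```python
from typing import Callable


class ProviderUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 from a model provider."""


ModelCall = Callable[[str], str]


def run_delegated(task: str, delegate: ModelCall, primary: ModelCall) -> str:
    """Try the delegation model first; fall back to the primary if it is unavailable."""
    try:
        return delegate(task)
    except ProviderUnavailable:
        return primary(task)


def flaky_delegate(task: str) -> str:
    """Stand-in for the delegation model during a peak-hour outage."""
    raise ProviderUnavailable("503 at peak hours")


def steady_primary(task: str) -> str:
    """Stand-in for the primary model."""
    return f"primary handled: {task}"


if __name__ == "__main__":
    print(run_delegated("summarize logs", flaky_delegate, steady_primary))
```

The trade-off is cost: every fallback runs the sub-agent task at primary-model prices, so a long DeepSeek outage shows up on the invoice even when nothing looks broken.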

My SOUL.md opens with the four integrity lines from my prompt contract. Never lie. Never hide the truth. Never conceal a problem. Never fail silently. Same clause that sat on top of my old OpenClaw CLAUDE.md. It still makes the dashboard yellow instead of fake-green, and I still prefer yellow.

What Hermes Doesn't Do Yet (Honestly)

Four caveats worth stating plainly.

Anthropic OAuth does NOT work natively. If you're Claude-first (me, probably you), you need OpenRouter or a direct Anthropic API key. Pro and Max subscriptions cover the web interface, not the API, so you can't plug them into an agent anyway. The real friction is having to manage a separate pay-as-you-go balance on OpenRouter or the Anthropic console on top of whatever web subscription you already pay for. Two invoices, two dashboards, one usage to monitor. Biggest caveat on my list right now.

The skills ecosystem is young. No ClawHub equivalent with 13k+ community-built skills. Hermes creates its own skills through the learning loop, but you start without a shared library. The compounding effect takes two to four weeks to become visible, based on what I observed and what r/LocalLLaMA reports.

v0.9 is five days old. Hermes is two months old total. CVEs will come (no architecture is immune). The design should keep them less catastrophic. Nous's aggressive velocity also means a massive surface of change, which means a massive surface of bugs too. Hundreds of PRs merged in a single release is not a calm number.

And a community nuance that matters. Power users aren't migrating. They're running both in parallel via the ACP protocol (OpenClaw as orchestrator, Hermes as execution specialist). Source: a Kilo analysis of r/openclaw threads, paraphrased. Full migration isn't the only valid path. I'm not dual-running, but I'm not telling you not to either.

Hermes is architecturally superior. I'll stand on that. But it's a two-month-old product, not a messiah. Temper accordingly.

Who Should Actually Do This

Four quick segments so you don't have to squint at the decision.

If you're new to self-hosted agents, go Hermes direct via the Hostinger 1-click. No OpenClaw debt to migrate. Sonnet 4.6 + DeepSeek V4 on OpenRouter. Roughly $15-25/mo all-in for personal use.

If you already run OpenClaw with a stable setup, dual-run via ACP instead of migrating. OpenClaw keeps orchestrating your automations, Hermes runs as execution specialist on new tasks. The Hermes wizard detects ~/.openclaw and offers to import the personalization layer, which means the cost of trying is basically zero. (If your setup already runs the 21 advanced automations I documented here, Hermes won't break any of them.)

If you migrated post-Claude-Max-ban (my case, February), it's Hermes + OpenRouter + Sonnet 4.6 + DeepSeek V4. Direct upgrade from the old Kimi/MiniMax stack. Same price range, better tool-calling reliability.

For critical production, wait. v1.0 or three months of v0.x stability. For personal use or side projects, it's fine now. For your client's prod, it's not.

Your client pays you to be boring about their uptime.

I took install notes on both paths while I was doing them. If there's interest, I'll clean them up into a proper guide: the 2-path checklist, the SOUL.md integrity template, the Sonnet 4.6 / DeepSeek V4 config. Say so in the comments.

Three months from now, Hermes will have its own CVEs. Every architecture ends up with some. That's not the question.

OpenClaw had six months. It took on the debt. Hermes looked at that debt first. Good prototype. But honestly, spending time debugging (even with Claude) is not my passion. I'd rather be building. C'est la vie 😊

Sources

Public OpenClaw CVE tracker (GitHub, April 2026)
ARMO exposure analysis on OpenClaw instances (February 2026)
Reco campaign report on malicious OpenClaw skills (March 2026)
Nous Research Hermes Agent documentation and v0.9 release notes (April 2026)

This article may contain affiliate links. I may earn a small commission if you purchase through them.

(*) The cover is AI-generated. Midjourney took one look at the Hermes launch schedule and blamed me for the deadline.
