Hermes Agent is a self-hosted, model-agnostic AI assistant that runs on a local machine or low-cost VPS, works through terminal and messaging interfaces, and improves over time by turning repeated tasks into reusable skills.
Functionally, it sits close to OpenClaw, another self-hosted assistant stack built around tools, memory, and local control.
If you want the wider picture of self-hosted assistants, retrieval, and local infrastructure around Hermes, this overview of AI systems ties those topics to the same problems Hermes is trying to solve.
For deployment trade-offs and runtime choices, LLM Hosting in 2026: Local, Self-Hosted & Cloud Infrastructure Compared provides the hosting map, while LLM Performance in 2026: Benchmarks, Bottlenecks & Optimization covers the throughput and latency side once Hermes is running.
My biased take: Hermes is most interesting when treated as infrastructure, not a tab you occasionally open. Once it runs as a service and has a stable home directory, your prompts start to look less like "chat" and more like "ops".
What Hermes Agent is and why it matters
Hermes Agent is an open-source AI agent built by Nous Research. It is designed to run persistently, use tools (terminal, files, web, and more), and improve its own behaviour over time with a skills and memory system.
Two design choices are worth spelling out because they shape everything else in this guide.
First, Hermes is not locked to a single model provider. The official setup flow supports multiple providers and any OpenAI-compatible endpoint, and switching is done via the hermes model command rather than code edits.
Second, Hermes draws a hard line between "conversation" and "execution". The agent can talk all day, but when it needs to act, it does so through explicit tools and a configurable execution backend. That is where safety, reproducibility, and troubleshooting live.
Cost and licensing are refreshingly boring. Hermes Agent itself is free software under the MIT licence. If you use hosted models, the ongoing cost is whatever your provider charges. If you run local models, you can avoid API fees entirely.
Install Hermes Agent
Hermes has a fast install path for Linux, macOS, and WSL2. The official docs keep it intentionally simple.
One-line install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
After installation, reload your shell and start the CLI.
source ~/.bashrc # or source ~/.zshrc
hermes
The installer is not just a thin wrapper. According to the installation guide, it sets up dependencies, the repo, a virtual environment, and the hermes command, then gets you to a first chat-ready state.
Windows and Android notes
Native Windows is not supported. The docs recommend WSL2 and running Hermes inside it.
For Android, Hermes supports a Termux install path. It is designed to detect Termux and adapt its dependency and environment setup accordingly.
Quickstart
The quickest first run is literally just hermes, but a meaningful quickstart involves two extra decisions: which model provider to use and which tools to enable.
Pick a provider and model
Hermes exposes three complementary entry points:
- hermes model to pick a provider and default model
- hermes tools to enable or disable toolsets
- hermes setup to run an interactive wizard across major configuration areas
A minimal flow looks like this:
hermes model
hermes tools
hermes
In terms of what is actually supported, the official Quickstart lists a range of providers and also calls out that Hermes works with OpenAI-compatible APIs. That matters because it includes both hosted services and self-hosted endpoints.
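"OpenAI-compatible" is a wire-format claim: any server that accepts the OpenAI chat-completions request shape can sit behind Hermes. A minimal sketch of that shape, assuming a hypothetical local server at `localhost:8000` (the URL and model name here are placeholders, not from the Hermes docs):

```shell
# Sketch of the request an OpenAI-compatible endpoint accepts.
# BASE_URL is an assumption: any self-hosted server exposing the
# chat-completions API (vLLM and llama.cpp's server are common choices).
BASE_URL="${BASE_URL:-http://localhost:8000/v1}"

cat > request.json <<'EOF'
{
  "model": "my-local-model",
  "messages": [{"role": "user", "content": "ping"}]
}
EOF

# The call a client such as Hermes would make against that endpoint:
echo "curl -sS $BASE_URL/chat/completions -H 'Content-Type: application/json' -d @request.json"
```

If a server answers that request, Hermes can treat it like any hosted provider.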
Prove tool execution early
Before you build habits around Hermes, it is worth verifying that tool use works in your environment. The Quickstart explicitly suggests terminal usage as a first feature to try.
In practice, a small "smoke test" prompt does two jobs: it checks the terminal tool and it validates permission prompts.
Example prompt:
Show my disk usage and the five largest directories.
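A successful run of that prompt boils down to the agent executing something like the following plain-shell equivalent (not Hermes's exact command, just what the answer requires):

```shell
# Overall disk usage, then the five largest directories under $HOME --
# roughly what the smoke-test prompt asks the terminal tool to produce.
df -h /
du -sh "$HOME"/*/ 2>/dev/null | sort -rh | head -5
```

If the agent's answer matches what these commands print, the terminal tool and its permission prompts are working.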
If Hermes cannot run the terminal tool, skip ahead to Troubleshooting. Terminal backend misconfiguration is one of the most common causes, and the fix is usually obvious once you look at config.
Configuration that scales
Hermes rewards people who understand where it stores state and how it resolves configuration. This is also where many "it worked yesterday" problems come from.
Where configuration and state live
Hermes stores its settings and state under ~/.hermes. The official configuration guide documents the layout, including config.yaml for settings, .env for secrets, auth.json for OAuth credentials, SOUL.md for identity, and folders for memories, skills, cron, sessions, and logs.
This matters for two reasons.
- Debugging becomes mechanical because you know exactly where to look.
- Backups become straightforward because one directory captures most of the agent state you care about.
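Because one directory captures the state, the backup itself is a one-liner. A minimal sketch (the HERMES_DIR override is only there so the script degrades gracefully when the directory does not exist):

```shell
# One-directory backup: archive everything under ~/.hermes.
HERMES_DIR="${HERMES_DIR:-$HOME/.hermes}"
BACKUP="hermes-backup-$(date +%Y%m%d).tar.gz"
if [ -d "$HERMES_DIR" ]; then
  tar -czf "$BACKUP" -C "$(dirname "$HERMES_DIR")" "$(basename "$HERMES_DIR")"
  echo "wrote $BACKUP"
else
  echo "nothing to back up: $HERMES_DIR does not exist"
fi
```

Treat the archive itself as a secret: it contains .env and auth.json, so store it accordingly.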
Config precedence and keeping secrets out of config.yaml
Hermes resolves configuration with a precedence order. At the top are CLI overrides, then config.yaml, then .env, with built-in defaults at the bottom.
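The same layering can be sketched with plain shell parameter expansion (variable names here are illustrative, not anything Hermes defines):

```shell
# Precedence sketch: a CLI override beats config.yaml, which beats .env,
# which beats the built-in default. The first non-empty layer wins.
CLI_MODEL=""                      # e.g. set by a command-line flag
CONFIG_MODEL="gpt-from-config"    # e.g. read from config.yaml
DOTENV_MODEL="gpt-from-dotenv"    # e.g. read from .env
DEFAULT_MODEL="builtin-default"

MODEL="${CLI_MODEL:-${CONFIG_MODEL:-${DOTENV_MODEL:-$DEFAULT_MODEL}}}"
echo "effective model: $MODEL"
```

With no CLI override set, the config.yaml value wins, which is exactly the behaviour the precedence order promises.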
The nice detail is that hermes config set routes values to the right file: API keys to .env and non-secret settings to config.yaml.
hermes config set model openrouter/meta-llama/llama-3.1-70b-instruct
hermes config set terminal.backend docker
hermes config set OPENROUTER_API_KEY sk-or-v1-xxxxxxxx
Hermes also supports environment variable substitution inside config.yaml via ${VAR_NAME} syntax. This is useful when you want to keep certain values in the environment while still referencing them in structured config.
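A minimal sketch of that substitution (the key names below are hypothetical, chosen for illustration; the ${VAR_NAME} syntax is the documented part):

```yaml
# ~/.hermes/config.yaml -- illustrative keys only; ${VAR_NAME} is
# resolved from the environment when the config is loaded.
terminal:
  backend: ssh
  ssh_host: "${HERMES_SSH_HOST}"   # e.g. export HERMES_SSH_HOST=compute.example.com
```

The structured config stays readable and versionable while the actual value lives in the environment.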
Sandbox and execution backends
Hermes supports multiple terminal backends that define where shell commands actually run. The config guide lists local, docker, ssh, modal, daytona, and singularity.
The opinionated but non-evangelical way to think about this is:
- local is fastest and simplest, but it is not isolated
- docker is a pragmatic safety and reproducibility layer
- ssh is a clean way to separate your chat device from your compute box
- modal and daytona fit "serverless but persistent enough" workflows
- singularity is the HPC-friendly option
A minimal Docker backend example:
# ~/.hermes/config.yaml
terminal:
  backend: docker
  docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  docker_volumes:
    - "/home/user/projects:/workspace/projects"
  docker_forward_env:
    - "GITHUB_TOKEN"
The docs also describe security hardening for the Docker backend, such as dropping capabilities and disabling privilege escalation.
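Those two hardenings correspond to standard Docker options, shown here as a plain docker run sketch rather than Hermes's exact invocation:

```shell
# Standard Docker hardening flags: drop all Linux capabilities and
# block privilege escalation inside the container.
HARDEN_FLAGS="--cap-drop=ALL --security-opt no-new-privileges"
echo "docker run --rm $HARDEN_FLAGS <image> <command>"
```

Whatever keys Hermes uses to express this in config.yaml, the effect at the container level is these flags.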
Skills, memory, and profiles
Hermes has two related mechanisms for compounding value.
Skills are procedural memory. Hermes can create, update, and delete its own skills and can offer to save an approach as a skill after completing a complex task.
Built-in memory is stored as files like MEMORY.md and USER.md under ~/.hermes, and Hermes can also use external memory providers for deeper recall. The memory docs list multiple provider plugins, and the memory providers guide documents an interactive setup flow.
If you want multiple independent agents on the same machine, Hermes profiles provide isolation. Each profile gets its own directory with its own config, secrets, memories, sessions, skills, cron jobs, and gateway state.
Typical workflow
If you treat Hermes like an agent you will keep around, the workflow starts to look like service engineering.
A stable baseline
A baseline that tends not to rot is:
- Install and run a first chat in the CLI.
- Pick a provider and model with hermes model, then confirm costs.
- Configure toolsets and decide whether terminal execution is local or sandboxed.
- Edit SOUL.md only after you have used the default identity for a while. Identity changes matter more than people expect because SOUL.md occupies "slot 1" in the system prompt.
Daily usage that compounds
Hermes has a terminal UI rather than a web UI, and it is designed for long sessions with slash commands, resumable sessions, and streaming tool output.
In practice, a useful cadence is:
- run work in a named session for a project
- compress context when it grows too large
- let Hermes turn repeated routines into skills
- keep a mental boundary between "ask" and "act" so tool execution stays auditable
Messaging gateway for 24/7 access
The messaging gateway is the piece that makes Hermes feel like an assistant rather than a terminal app. The docs describe it as a single process that connects to multiple platforms, handles sessions, runs cron jobs, and delivers messages.
Setup is invoked via hermes gateway setup, and the gateway can run in the foreground or as a user service. The CLI reference documents gateway subcommands like run, install, start, stop, status, and restart.
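Put together, a typical first-run sequence looks like the plan below. The sketch is guarded so that on a machine without Hermes it prints the plan instead of running it; only `status` is a safe read-only check:

```shell
# First-run gateway sequence using the documented subcommands.
PLAN="hermes gateway setup && hermes gateway install && hermes gateway start && hermes gateway status"
if command -v hermes >/dev/null 2>&1; then
  hermes gateway status   # safe, read-only check
else
  echo "$PLAN"
fi
```

Running the gateway as a user service (install/start) rather than in the foreground is what makes the 24/7 part real.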
Security for a tool-using bot matters. The gateway docs describe allowlists for specific platforms and a DM pairing flow that issues one-time pairing codes and requires approval via hermes pairing approve.
Updates without drama
Hermes updates are a first-class command. The updating guide documents hermes update, config migration checks, and a small post-update validation routine including hermes doctor and hermes gateway status.
hermes update
hermes doctor
hermes gateway status
Troubleshooting and diagnostics
Most Hermes failures are not mysterious. They look mysterious because people only check the model layer and ignore the runtime layer.
Fast triage commands
The CLI reference explicitly positions three commands as the core loop:
- hermes doctor for interactive diagnostics
- hermes status for a quick overview
- hermes dump for a shareable, redacted setup summary
For logs, hermes logs tails files stored under ~/.hermes/logs.
hermes doctor --fix
hermes status
hermes dump --show-keys
hermes logs errors -f
Common installation failures
The FAQ and troubleshooting guide lists several recurring problems and their fixes, including Python version issues, uv not found, and permission problems caused by mixing sudo installs with user installs.
If you encounter these errors, the docs provide specific remediation steps such as upgrading Python, installing uv, and reinstalling Hermes without sudo.
Provider and model issues
When API keys do not work, the FAQ recommends checking configuration, re-running hermes model, or setting a key directly via hermes config set. It also calls out a common gotcha: keys are provider-specific.
For "model not found" problems, the FAQ points back to using hermes model to pick a valid identifier and shows both config and per-session overrides.
Rate limiting and context-length issues are also covered. The FAQ suggests backing off when you hit 429 errors, switching providers or models, and reducing context pressure via compression or a fresh session.
Terminal backend and gateway issues
If terminal commands fail immediately, the configuration guide includes a "common terminal backend issues" section and points at the typical causes per backend, including Docker not running and missing SSH variables. It also notes that falling back to local is a valid debugging move when sandbox configuration is in question.
For gateway problems, the messaging guide highlights allowlists and pairing as the safe defaults, which means many "bot is silent" incidents are actually authorisation doing its job.