Onah Sunday.


How I Connected Hermes Agent to My Next.js App (And Why It's Not Just Another Chatbot Wrapper)


This is a submission for the Hermes Agent Challenge

Live app: https://devbrief-tau.vercel.app

Repo: github.com/sundayonah/devbrief


I was skeptical.

Every week there's a new "autonomous AI agent" that turns out to be a thin wrapper around a chat API with a fancy UI on top. So when I heard about Hermes Agent — Nous Research's open-source agent that "grows with you" — I filed it under probably hype and moved on.

Then I actually used it. And I ended up rebuilding an entire side project around it.

This is a practical guide to setting up Hermes Agent locally and connecting it to a real Next.js app. I'll walk through exactly what I did to power DevBrief — a tool that reads your GitHub activity and writes standups, PR changelogs, or work logs — using Hermes as the brain. I'll also include the bugs I hit (wrong hermes binary, empty API keys, IPv6 localhost, OpenRouter credit reservation) so you don't have to rediscover them.


What Makes Hermes Different

Before the setup steps, let me explain why this matters — because the architecture is genuinely different from what you're probably used to.

Most "AI-powered" apps work like this:

Your app → LLM API → response → done

Every request is stateless. The model has no memory of the last call. You're just sending text and getting text back.

Hermes works like this:

Your app → Hermes Agent → skills + memory + tools → response
                ↓
         can learn from the interaction over time

Hermes is a persistent agent that runs on your machine (or server). It has:

  • Skills — Markdown files that teach it how to handle specific tasks.
  • Memory — Cross-session context (Hermes can use this to calibrate over time).
  • Tools — Terminal, files, web search, and more on the agent side (not in your Next.js app).
  • An OpenAI-compatible API — So connecting from a backend is straightforward.
  • Cron scheduling — Natural language scheduling for recurring jobs (optional; I wired this as a next step).

For DevBrief, the important part is: my Next.js app does not call OpenRouter directly. It calls Hermes, which already has the model, tools, and skills configured. That's the difference between a wrapper and an agent-backed product.


What DevBrief Actually Does

DevBrief isn't only a standup generator. After you connect GitHub (OAuth), you can:

  • Pick a repo, time range, branch, and PR filters
  • Choose output mode: standup, PR changelog, or work log
  • Pick a tone: casual, formal, or concise

The Next.js route POST /api/summary fetches GitHub activity, then calls lib/hermes.ts → Hermes on port 8642. If Hermes is down, you still get a fallback template so the UI isn't broken.
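The "try Hermes, fall back to a template" behaviour can be sketched roughly like this. It is a minimal illustration rather than DevBrief's actual route code: `withFallback` is a hypothetical helper, and the two callbacks stand in for the real Hermes call and the template generator.

```typescript
// Hypothetical helper: try the agent first, fall back to a local template.
export async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: (reason: unknown) => T,
): Promise<{ value: T; usedFallback: boolean }> {
  try {
    return { value: await primary(), usedFallback: false };
  } catch (err) {
    // Any failure (gateway down, timeout, non-2xx status) lands here.
    return { value: fallback(err), usedFallback: true };
  }
}
```

Returning `usedFallback` alongside the text lets the route log which path ran, which helps when the UI shows bullets instead of prose.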

Try it: devbrief-tau.vercel.app — any GitHub user can sign in with OAuth and use their own repos.


Step 1 — Install Hermes (and use the right binary)

One command. Works on Linux, macOS, and WSL2 on Windows:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The installer handles Python 3.11, Node.js, and dependencies. On WSL, the Nous Hermes CLI usually lands at ~/.local/bin/hermes.

Verify:

~/.local/bin/hermes --version
which hermes

Windows gotcha: If you also have Rust/Cargo installed, which hermes might point at ~/.cargo/bin/hermes — that's the IBC relayer, not Nous Hermes. Use ~/.local/bin/hermes explicitly, or put ~/.local/bin before ~/.cargo/bin in your PATH.

Test the agent CLI before touching your app:

~/.local/bin/hermes chat -q "Say hello in one word"

Step 2 — Pick your model provider

Hermes is model-agnostic: OpenRouter, Anthropic, local Ollama, and more.

~/.local/bin/hermes model

I used OpenRouter. Set your key in ~/.hermes/.env (Hermes reads this file; DevBrief does not):

OPENROUTER_API_KEY=sk-or-v1-your-key-here

Pick a model in the wizard. For development I used a free/cheap route (openrouter/owl-alpha) after hitting billing quirks with Sonnet (more in Troubleshooting). For production quality, something like anthropic/claude-sonnet-4.6 on OpenRouter works — but watch max_tokens reservation (Step 2b).

Duplicate key trap: Run grep -n OPENROUTER_API_KEY ~/.hermes/.env. There must be exactly one OPENROUTER_API_KEY line, and it must be non-empty. If a second, empty OPENROUTER_API_KEY= appears lower in the file, python-dotenv uses the last value (empty), and every Hermes call fails with HTTP 400 even though a direct curl to OpenRouter still works with the key exported in your shell.
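To make the "last value wins" problem concrete, here is a small sketch that lists every value assigned to one key in .env-style text. `valuesFor` is a hypothetical helper written for this post; it ignores quoting and `export` prefixes, so treat it as an illustration of the failure mode rather than a full dotenv parser.

```typescript
// Collect every value assigned to `key`, in file order.
export function valuesFor(envText: string, key: string): string[] {
  const values: string[] = [];
  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    if (line.slice(0, eq).trim() === key) values.push(line.slice(eq + 1).trim());
  }
  return values;
}

const sample =
  "OPENROUTER_API_KEY=sk-or-v1-your-key-here\nAPI_SERVER_ENABLED=true\nOPENROUTER_API_KEY=\n";
const found = valuesFor(sample, "OPENROUTER_API_KEY");
const effective = found[found.length - 1]; // the last assignment is the one that counts
console.log(found.length, JSON.stringify(effective)); // prints: 2 ""
```

Two assignments with an empty effective value is exactly the state that produced the HTTP 400 for me.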


Step 2b — Optional: cap max_tokens for OpenRouter

Hermes may request the model's full output budget (e.g. 64000 tokens). OpenRouter pre-reserves credits for that ceiling. With a small balance you can get:

HTTP 402: You requested up to 64000 tokens, but can only afford 2661.

Fix in ~/.hermes/config.yaml:

model:
  max_tokens: 2048

Or add credits at openrouter.ai/settings/credits.
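If you want to pick a cap from the error itself, the two numbers in the 402 message can be pulled out with a regular expression. This is a sketch under a loud assumption: the message format is taken from the single example above and may differ in practice. `parseTokenBudget` and `suggestMaxTokens` are hypothetical helpers, not part of Hermes or OpenRouter.

```typescript
// Extract "requested" and "affordable" from a message shaped like the 402 above.
export function parseTokenBudget(
  message: string,
): { requested: number; affordable: number } | null {
  const match = /requested up to (\d+) tokens, but can only afford (\d+)/.exec(message);
  if (!match) return null;
  return { requested: Number(match[1]), affordable: Number(match[2]) };
}

// Suggest a max_tokens value that fits the affordable budget, never above `ceiling`.
export function suggestMaxTokens(message: string, ceiling = 2048): number | null {
  const budget = parseTokenBudget(message);
  if (!budget) return null;
  return Math.min(budget.affordable, ceiling);
}
```

For the message shown above this suggests 2048, since the affordable budget (2661) is larger than the default ceiling. The affordable figure moves with your balance, so a fixed cap in config.yaml is still the simpler fix.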


Step 3 — Enable the API server and start the gateway

This is the step most tutorials skip. DevBrief talks to Hermes over the OpenAI-compatible API server, which runs inside the gateway — not hermes serve (outdated in some docs).

In ~/.hermes/.env:

API_SERVER_ENABLED=true

# Optional for local dev (gateway may accept all requests without a key)
# API_SERVER_KEY=change-me-local-dev

Start the gateway (keep this terminal open):

~/.local/bin/hermes gateway run

You might see warnings about allowlists or missing API keys — that's normal for local dev. Confirm the API is up:

curl http://127.0.0.1:8642/health

Expected:

{"status": "ok", "platform": "hermes-agent"}
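The same health check can be done from Node before the app tries a generation. This is a minimal sketch that assumes only the /health payload shown above; `isHealthy` and `checkHermes` are hypothetical names, and the injectable `fetchImpl` parameter exists only so the function can be exercised without a running gateway.

```typescript
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// True when the body looks like the health payload shown above.
export function isHealthy(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { status?: unknown }).status === "ok"
  );
}

// Resolves to false instead of throwing, so callers can branch to a fallback.
export async function checkHermes(
  base: string,
  fetchImpl: FetchLike = fetch,
): Promise<boolean> {
  try {
    const res = await fetchImpl(`${base}/health`);
    return res.ok && isHealthy(await res.json());
  } catch {
    return false; // connection refused, DNS failure, invalid JSON, and so on
  }
}
```

Calling `checkHermes("http://127.0.0.1:8642")` at startup gives a quick yes/no before any slow generation request is sent.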

Test chat completions (omit Authorization if you didn't set API_SERVER_KEY):

curl http://127.0.0.1:8642/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hermes-agent",
    "messages": [{"role": "user", "content": "Hello! Are you running?"}],
    "max_tokens": 20
  }'

Use 127.0.0.1, not localhost, on Windows. Node 17 and later keep the address order returned by the system resolver, so localhost often resolves to IPv6 ::1 first. I got ECONNREFUSED ::1:8642 until I set HERMES_ENDPOINT=http://127.0.0.1:8642.
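One way to guard against this in code is to normalize the endpoint once, when the config is read. The sketch below uses the standard URL class so only the hostname is rewritten; `normalizeEndpoint` is a hypothetical helper and not necessarily what the repo does.

```typescript
// Rewrite a "localhost" hostname to 127.0.0.1 and drop any trailing slash.
export function normalizeEndpoint(raw: string): string {
  const url = new URL(raw); // throws on malformed input, which is useful at startup
  if (url.hostname === "localhost") url.hostname = "127.0.0.1";
  const path = url.pathname.replace(/\/+$/, "");
  return `${url.origin}${path}`;
}

console.log(normalizeEndpoint("http://localhost:8642")); // prints: http://127.0.0.1:8642
```

Parsing the URL avoids touching other parts of the string that happen to contain "localhost", and a remote HTTPS endpoint passes through unchanged.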

WSL + Windows: Run hermes gateway run in WSL and pnpm dev on Windows. WSL2 forwards 127.0.0.1:8642 to the gateway when it's running.

CORS: DevBrief calls Hermes from the Next.js server (/api/summary), not from the browser. You do not need API_SERVER_CORS_ORIGINS unless your frontend calls Hermes directly.


Step 4 — Write a skill

Instead of stuffing a giant system prompt into every API call, you can add a skill file — Markdown that teaches Hermes how to handle standups.

DevBrief ships hermes-skills/standup-writer.md (abbreviated):

# Skill: standup-writer

## Purpose
Given raw GitHub activity (commits, PRs, issues), generate a clean
daily standup summary in three sections: Yesterday, Today, Blockers.

## Style Rules
- Keep each bullet under 12 words
- Use plain English, no jargon
- Infer "Today" from open PRs and unresolved issues

## Output Format
**Yesterday**
- ...

**Today**
- ...

**Blockers**
- None / [describe blocker]

Install it:

cp hermes-skills/standup-writer.md ~/.hermes/skills/

In the app, standup mode uses a system hint plus prompts (the API doesn't pass a separate skill field). PR changelog and work log modes use tailored user prompts in buildPrompt(). The skill file still helps Hermes when you're in standup mode.
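To give a feel for what a per-mode prompt builder might look like, here is a hypothetical sketch. It is not the repo's buildPrompt(): the `ActivityItem` shape, the instruction strings, and the name `buildPromptSketch` are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real GitHubActivity type in the repo may differ.
type OutputMode = "standup" | "pr_changelog" | "work_log";
type Tone = "casual" | "formal" | "concise";
interface ActivityItem {
  kind: "commit" | "pr" | "issue";
  title: string;
}

// One instruction per output mode (wording is illustrative only).
const MODE_INSTRUCTIONS: Record<OutputMode, string> = {
  standup: "Write a standup with three sections: Yesterday, Today, Blockers.",
  pr_changelog: "Write a PR changelog with a Summary and a Changes list.",
  work_log: "Write a chronological work log.",
};

export function buildPromptSketch(
  items: ActivityItem[],
  mode: OutputMode,
  tone: Tone,
): string {
  const lines = items.map((item) => `- [${item.kind}] ${item.title}`);
  return [
    MODE_INSTRUCTIONS[mode],
    `Tone: ${tone}.`,
    "GitHub activity:",
    ...(lines.length > 0 ? lines : ["- (no activity in this range)"]),
  ].join("\n");
}
```

Keeping the mode instructions in one lookup table means adding a fourth output mode is a type change plus one string, with no branching in the builder.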


Step 5 — Connect it to your Next.js app

Hermes exposes POST /v1/chat/completions. DevBrief's lib/hermes.ts calls it from the server with a system + user message, optional bearer auth, and a long timeout (generations can take 30–60 seconds because the agent runs tools).

Simplified version of the real integration:

lib/hermes.ts

export async function generateBrief(input: {
  activity: GitHubActivity;
  tone: "casual" | "formal" | "concise";
  outputMode: "standup" | "pr_changelog" | "work_log";
}) {
  // Force IPv4: Node may resolve "localhost" to ::1, which gave me ECONNREFUSED.
  const base = (process.env.HERMES_ENDPOINT || "http://127.0.0.1:8642")
    .replace("localhost", "127.0.0.1");

  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (process.env.HERMES_API_KEY) {
    headers.Authorization = `Bearer ${process.env.HERMES_API_KEY}`;
  }

  const res = await fetch(`${base}/v1/chat/completions`, {
    method: "POST",
    headers,
    body: JSON.stringify({
      model: process.env.HERMES_MODEL_NAME || "hermes-agent",
      messages: [
        {
          role: "system",
          content:
            "You are DevBrief. Turn structured GitHub activity into human-readable summaries.",
        },
        { role: "user", content: buildPrompt(input) },
      ],
      stream: false,
    }),
    signal: AbortSignal.timeout(300_000),
  });

  if (!res.ok) throw new Error(`Hermes ${res.status}`);

  const data = await res.json();
  return data.choices[0].message.content;
}
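The simplified version above reads data.choices[0].message.content directly, which throws an unhelpful TypeError if the body has an unexpected shape. A slightly more defensive sketch is shown below; `extractContent` is a hypothetical helper, and it assumes only the OpenAI-style choices[0].message.content layout already used above.

```typescript
// Narrow an unknown JSON body to the assistant text, or throw a descriptive error.
export function extractContent(data: unknown): string {
  const choices = (data as { choices?: unknown } | null)?.choices;
  if (!Array.isArray(choices) || choices.length === 0) {
    throw new Error("Hermes response has no choices");
  }
  const first = choices[0] as { message?: { content?: unknown } } | null;
  const content = first?.message?.content;
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("Hermes response has empty content");
  }
  return content;
}
```

Throwing here keeps the caller's error handling in one place: a malformed body can take the same fallback path as an unreachable gateway.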

.env.local (Next.js — not the same file as ~/.hermes/.env):

HERMES_ENDPOINT=http://127.0.0.1:8642

# Only if you set API_SERVER_KEY in ~/.hermes/.env
# HERMES_API_KEY=change-me-local-dev

OPENROUTER_API_KEY in .env.local does nothing for DevBrief; only Hermes uses it via ~/.hermes/.env.

Run the app:

pnpm dev

Click Generate. In the terminal you want to see POST /api/summary 200 and no "Hermes Agent not reachable, using fallback generator" message. The UI should show prose (e.g. a PR changelog with Summary / Changes), not only raw "- PR #17 [closed]: title" bullets.


Step 6 — Automated scheduling (roadmap)

Hermes can schedule work in plain English and deliver via Telegram/Slack through the gateway. I documented this as a next step in DevBrief's README:

hermes schedule "Every weekday at 8:30am, POST to http://localhost:3000/api/summary ..."

To wire Telegram:

hermes gateway setup

Troubleshooting (what actually bit me)

| Symptom | Cause | Fix |
| --- | --- | --- |
| hermes: command not found or wrong behavior | Wrong hermes on PATH (IBC relayer) | Use ~/.local/bin/hermes |
| curl to OpenRouter works, hermes chat HTTP 400 | Duplicate empty OPENROUTER_API_KEY at bottom of ~/.hermes/.env | Keep one key line only |
| hermes chat HTTP 402 on Sonnet | OpenRouter reserves credits for huge max_tokens | model.max_tokens: 2048 or add credits |
| DevBrief ECONNREFUSED ::1:8642 | IPv6 localhost / gateway not running | 127.0.0.1, run hermes gateway run |
| Bullet-list output only | Fallback path: Hermes unreachable | Fix gateway + endpoint; check server logs |
| Request takes ~40s | Normal: full agent + tools | Expected for first successful run |

What I learned building DevBrief

The skill + prompt split is useful. Standup format lives in standup-writer.md and in prompt builders; app code stays about GitHub data and UX.

The fallback matters. When Hermes wasn't running, I still tested the UI. generateFallbackBrief() produces a minimal template until the gateway is up.

The OpenAI-compatible API is a real advantage. One fetch to /v1/chat/completions — no custom Hermes SDK in Next.js.

Memory and cron are powerful — and optional. Hermes supports memory and scheduling; I focused the submission on the working path: GitHub → Next.js API → Hermes gateway → formatted brief.

Honest latency. Agent-backed generation is slower than a single LLM call. Worth it for quality; show a loading state in the UI.


Running it in production

DevBrief and Hermes deploy as two services. Hermes is a long-lived gateway, so it cannot run inside Vercel's short-lived serverless functions. On serverless-only Next.js hosts, Hermes must run somewhere else; I use Render (Docker) for Hermes and Vercel for the Next.js app.

Users → DevBrief (Vercel)
            ↓  POST /v1/chat/completions
      Hermes (Render Docker)
            ↓
      OpenRouter (API key only on Hermes)

1. Hermes on Render

The repo ships docker/hermes/Dockerfile (Ubuntu + official Hermes installer + bundled standup-writer skill).

  1. Push the repo to GitHub (include docker/, hermes-skills/, render.yaml).
  2. In Render: New → Web Service → Docker, connect the repo.
  3. Dockerfile path: docker/hermes/Dockerfile · Context: repository root.
  4. Health check path: /health.
  5. Environment variables on the Render service:
     | Variable | Value |
     | --- | --- |
     | OPENROUTER_API_KEY | Your OpenRouter key (Hermes only, not on Vercel) |
     | API_SERVER_KEY | Strong secret; same value as DevBrief HERMES_API_KEY |
     | HERMES_MODEL | e.g. openrouter/owl-alpha (optional) |

API_SERVER_ENABLED and API_SERVER_HOST are already set in the Dockerfile. The first deploy can take 15–20+ minutes while install.sh runs inside the image.

Verify:

curl https://YOUR-SERVICE.onrender.com/health
# {"status":"ok","platform":"hermes-agent"}

OpenRouter tip: If you see HTTP 402, lower model.max_tokens in Hermes config (e.g. 2048) — Hermes may request a huge default budget and OpenRouter pre-reserves credits.

2. DevBrief on Vercel

  1. Import the repo in Vercel (Next.js).
  2. Environment variables (Production):
     | Variable | Example |
     | --- | --- |
     | GITHUB_ID / GITHUB_SECRET | GitHub OAuth app |
     | NEXTAUTH_SECRET | openssl rand -base64 32 |
     | NEXTAUTH_URL | https://your-app.vercel.app |
     | HERMES_ENDPOINT | https://YOUR-SERVICE.onrender.com (HTTPS, no :8642) |
     | HERMES_API_KEY | Same as Render API_SERVER_KEY |
  3. In your GitHub OAuth app, set Authorization callback URL to:

     https://your-app.vercel.app/api/auth/callback/github

  4. /api/summary can run 30–60+ seconds while Hermes generates; this repo sets maxDuration = 60, so you need a Vercel plan that allows ≥60s (typically Pro).

  5. History on Vercel is stored under /tmp (writable but ephemeral across cold starts). For durable history, add a database later.
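One detail worth checking when maxDuration is 60: the simplified lib/hermes.ts earlier in this post aborts after 300 seconds, which is longer than the function is allowed to live, so the platform would end the request before the fallback could run. A possible way to line the two up is sketched below; `hermesTimeoutMs` and its headroom default are hypothetical, not taken from the repo.

```typescript
// Route segment config for /api/summary, in seconds (the value the post describes).
export const maxDuration = 60;

// Hypothetical helper: abort the Hermes call a little before the platform limit,
// so there is still time left to return the fallback template.
export function hermesTimeoutMs(maxDurationSec: number, headroomSec = 5): number {
  const usableSec = Math.max(1, maxDurationSec - headroomSec);
  return usableSec * 1000;
}

// In the fetch call: signal: AbortSignal.timeout(hermesTimeoutMs(maxDuration))
console.log(hermesTimeoutMs(maxDuration)); // prints: 55000
```

For local development, where nothing enforces a limit, the longer 300-second timeout is still reasonable.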

Do not put OPENROUTER_API_KEY on Vercel for the normal flow — only the Hermes container needs it.

3. Smoke test

  1. curl https://YOUR-SERVICE.onrender.com/health
  2. Open the Vercel app → Connect GitHub → pick a repo → Generate
  3. You want real prose from Hermes, not the fallback bullet template.

Repo: github.com/sundayonah/devbrief · Full checklist: docs/DEPLOYMENT.md.


Optional: local Docker before Render

export OPENROUTER_API_KEY=sk-or-v1-...
export API_SERVER_KEY=your-dev-secret
docker compose -f docker-compose.hermes.yml up --build
curl http://127.0.0.1:8642/health


The full picture

GitHub API
    ↓
Next.js API route (/api/summary)
    ↓
lib/hermes.ts → POST /v1/chat/completions @ 127.0.0.1:8642
    ↓
Hermes gateway (model via OpenRouter, tools, skills)
    ↓
Standup / PR changelog / work log
    ↓
Next.js UI → copy, history

(Optional later)
Hermes cron → POST /api/summary → Telegram via gateway



If you're building with Hermes: start the gateway, hit /health, then hit /v1/chat/completions with curl before writing application code. On Windows, use 127.0.0.1 and keep hermes gateway run alive while you develop.

Drop a comment if you hit any snags — happy to help.

