DEV Community

Debarpan Jha
Debarpan Jha

Posted on

I Built a Skill Registry for Hermes — npm for Agent Skills

Hermes Agent Challenge Submission: Build With Hermes Agent

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

Hermes Skill Registry — npm for Hermes skills. A publish-discover-install-rate marketplace where any agent can share skills and any agent can find them.

Hermes has 85+ skills covering everything from GitHub workflows to Spotify to Minecraft servers. They're what make Hermes different from Claude Code or Codex — the ability to persist procedural knowledge across sessions. But they have a distribution problem: when an agent creates a new skill, it lives in ~/.hermes/skills/ on that one machine. There's no way for other agents to discover it, install it, or rate it. Every agent learns from scratch.

The registry fixes that. It turns skills from local artifacts into shared infrastructure.

It's three things:

- FastAPI registry server — REST API with search, publish, install, rate, star, and trending endpoints backed by SQLite. Skills stored as raw SKILL.md files with metadata extracted from frontmatter.
- CLI client — 7 commands: search, install, publish, rate, inspect, trending, star. Install writes directly to ~/.hermes/skills/<category>/<name>/SKILL.md.
- Web frontend — Single HTML file, dark theme, no external CDN. Searchable skill cards, category filters, trending sidebar, install button that copies the CLI command.

There's also a skill-registry SKILL.md that teaches agents auto-discovery: when you encounter an unknown task, search the registry, inspect, install, use, rate.
Enter fullscreen mode Exit fullscreen mode

Demo

Start the server
cd ~/hermes-skill-registry && python3 -m registry.main

Search, install, publish, rate
python3 registry/cli.py search "testing" --sort rating
python3 registry/cli.py install 3
python3 registry/cli.py publish ./my-skill/
python3 registry/cli.py rate 3 5 --review "Works great"


API docs at http://localhost:8000/docs. Docker: docker-compose up --build.
Enter fullscreen mode Exit fullscreen mode

Code

Hermes Skill Registry

npm for Hermes skills — a community-driven marketplace where agents can publish, discover, install, and rate skills.

Quick Start

# Install dependencies
pip install -r requirements.txt

# Start the registry server
python3 -m registry.main

# Seed with sample skills
python3 -m registry.seed

# Search
python3 registry/cli.py search "testing"

# Install a skill
python3 registry/cli.py install 1

# Publish your skill
python3 registry/cli.py publish ./my-skill/

# Rate a skill
python3 registry/cli.py rate 3 5 --review "Excellent!"
Enter fullscreen mode Exit fullscreen mode

API

Base URL: http://localhost:8000







































































Method Path Description
GET /api/v1/search?q=&category=&tags=&sort= Search skills
GET /api/v1/trending Trending skills
GET /api/v1/skills/{id} Skill details
GET /api/v1/skills/{id}/download Download SKILL.md
POST /api/v1/skills Publish skill (API key)
PUT /api/v1/skills/{id} Update skill (API key)
DELETE /api/v1/skills/{id} Delete skill (API key)
POST /api/v1/skills/{id}/rate Rate skill (API key)
GET /api/v1/skills/{id}/ratings List ratings
POST /api/v1/skills/{id}/star Star a skill
GET /health Health check
GET /docs Swagger UI

Authentication

Write operations (publish…




registry/ → FastAPI server, DB layer, CLI, schemas, auth, seed
web/ → index.html (single-file frontend)
tests/ → test_api.py (23 tests)

My Tech Stack

Python 3.11, FastAPI, SQLite + aiosqlite, Pydantic v2, Click, httpx, pytest, Docker. No ORM. No build pipeline. No external runtime deps.

How I Used Hermes Agent

This project isn't something I sat down and typed out line by line. It was built through a directed conversation with Hermes Agent across two sessions, with a multi-hour break in between. The agent was the executor; I was the architect. Here's exactly which agentic capabilities I leaned on and why each one was the right fit.

1. Multi-File Project Scaffolding

The project has ~20 files across 4 directories — server code, database layer, CLI, tests, web frontend, Docker config. I never once asked Hermes to "create a directory structure." I described the components I wanted (FastAPI + SQLite, a CLI client, a web UI, tests) and Hermes figured out the file layout, created all the files, and wired them together. This is the baseline capability that makes everything else possible. Without it, you're playing 20 questions about file paths before any real work happens.

Why it was the right fit: Scaffolding is pure execution, zero creative judgment. An agent gets it right every time; a human drifts. Let the agent do it.

2. Terminal Execution + Iterative Debugging

When the FastAPI 307 redirect bug hit — httpx was getting redirects instead of 201 responses on publish — I didn't theorize about it. I told Hermes "POST returns 307 instead of 201, fix it," and Hermes:
- Read the route code
- Identified that FastAPI redirects /skills to /skills/
- Added follow_redirects=True to the httpx client
- Re-ran the tests to verify

Same pattern for the test data collision bug (test names overlapping after a session restart — fixed by seeding with os.getpid() * 1000) and the port-already-in-use issue after waking from sleep. Each cycle took under two minutes. Doing this manually would have been: read the error, Google it, try a fix, test it. Five to ten minutes per cycle, multiplied by the number of cycles.

Why it was the right fit: Debugging is where agentic terminal access earns its keep. The agent reads code, forms a hypothesis, patches, and verifies — all in one turn. That tight loop is the difference between a project that takes hours and one that takes an afternoon.

3. File Reading + Targeted Patching (the patch tool)

I never asked Hermes to rewrite a whole file. When the duplicate route registration needed fixing, it found the exact lines in skills.py and surgical-strike replaced them. When the test import needed cleanup after removing routes, it patched the import line. Using patch instead of write_file meant the agent never accidentally wiped code it shouldn't touch.

Why it was the right fit: Real projects aren't write-once. They're edit-in-place hundreds of times. The ability to find a specific block in a 100-line file and change only what needs changing is what makes agent-driven development scale beyond toy scripts.

4. Test Writing + Verification Loop

Hermes wrote all 23 tests in test_api.py — covering every endpoint, edge case, and auth scenario. More importantly, after each bug fix, it re-ran the test suite. That's the part that matters: the agent doesn't just write tests, it runs them and acts on the output. When all 23 passed after the redirect fix, I had confidence the fix was real, not a hand-wave.

Why it was the right fit: Tests are the agent's mirror. Without them, the agent is guessing whether its changes work. With them, you have a verifiable contract. The pytest integration through the terminal tool closes the write-test-fix loop entirely within the agent.

5. Web Frontend in a Single File

I told Hermes: "One HTML file. Dark theme, GitHub palette. Search cards, category filter, trending sidebar, modal preview, install button that copies the CLI command. No external dependencies, no CDN, no build step." And it produced the entire frontend — HTML structure, CSS styling, and JavaScript that fetches from the API — in one shot. The design quality is genuinely good. I didn't touch a line of it.

Why it was the right fit: Frontend work is tedious when you know what you want but don't want to write it. Constraints ("dark theme, no CDN") give the agent enough guardrails to produce something usable on the first try, without you needing to iterate on pixel-level details.

6. Persistent Memory + Session Recovery

This is the underrated one. I slept for three hours between sessions. The Hermes session context was gone — no conversation history, no memory of what happened. But Hermes had written durable facts to its memory store (project path, Python version specifics, key gotchas like the 307 redirect). When I came back and said "resume," it recovered the full project state in under a minute: where the files were, what was done, what remained.

Why it was the right fit: Real development doesn't happen in one sitting. The ability to pause, come back hours later, and pick up without re-explaining everything is what makes agent-driven development viable for multi-hour projects. Without persistent memory, every session restart is a context tax.
Enter fullscreen mode Exit fullscreen mode

Top comments (0)