This is a submission for the Hermes Agent Challenge: Write About Hermes Agent
Most "give your agent a new model provider" tutorials are stories of bravely subclassing things. You inherit from a base class, you override three methods, you read the wire-format docs of two vendors, you write an adapter, you handle the streaming chunks, you wire it into a settings page somewhere. By the time you've shipped, you've forgotten what you were trying to do.
Hermes Agent's provider-plugin SDK refuses to play that game. The whole thing is a declarative dataclass and one registry call. If the thing you want to add already speaks OpenAI on the wire which most modern gateways and aggregators do — you can be done in twenty-six lines.
This post walks you through the pattern with a real working example: the omnizen-provider plugin I shipped to expose Omnizen's gateway to Hermes. You don't have to care about Omnizen specifically the same shape works for any OpenAI-compatible gateway (Together, Groq, Fireworks, your home-grown vLLM, your in-house router). Omnizen just happens to be a convenient real example because the URL is the only thing you'd swap out.
The shape Hermes wants
A Hermes provider plugin lives at plugins/model-providers/<your-name>/ and ships two files:
-
plugin.yaml— a short manifest so Hermes can discover and version the plugin. -
__init__.py— instantiates aProviderProfile, then callsregister_provider()on it.
That's it. There is no adapter class to subclass, no chat-completions method to implement, no streaming-chunk handler. The ProviderProfile is declarative: you describe what the provider is, and Hermes's existing OpenAI-compatible call path handles all the actual work.
The fields ProviderProfile cares about:
| Field | What it's for |
|---|---|
name |
Internal identifier; the key shown in hermes model
|
aliases |
Short aliases users can type instead of the full name |
env_vars |
Tuple of env vars the plugin reads — Hermes uses this for "is this provider configured?" detection |
display_name |
Human-friendly name in hermes model
|
description |
One-line pitch under the name in the picker |
signup_url |
Where Hermes sends users who don't have a key yet |
base_url |
The OpenAI-compatible chat-completions endpoint |
default_aux_model |
Model Hermes uses for internal calls (planning probes, tool description embeddings) when no model is specified |
fallback_models |
Models Hermes tries in order when the primary fails or isn't picked |
That's the whole API surface. If you can fill in those fields, you have a working provider.
My Plugin
Here's the entire __init__.py for the Omnizen plugin — no abbreviation, no "…and so on":
And the manifest next to it:
You can checkout the code here:
Hermes-Omnizen
What happens at runtime
The flow is symmetric on both ends:
-
At Hermes startup, the plugin's
__init__.pyexecutes once. Theregister_provider(omnizen)call drops theProviderProfileinto Hermes's in-memory registry. From Hermes's point of view, the provider now exists; nothing more is needed. -
The user runs
hermes model, picks the provider from the menu, and Hermes stores their choice. -
The user runs
hermes chat— or invokes a tool, or kicks off a multi-step plan, or hops to another agent through the Agent Communication Protocol. Hermes builds a standard OpenAI chat-completions request, readsOMNIZEN_API_KEYfrom the env, and POSTs tobase_url. The gateway answers in OpenAI's SSE envelope. Hermes's existing parser handles the stream and the tool-call frames. None of that code knows the difference between Omnizen and OpenAI proper — the wire format is identical, so the same call path serves both.
The reason this works is the OpenAI Chat Completions API has become the lingua franca for "talk to an LLM." If the thing you're building a Hermes plugin for already speaks OpenAI — and most modern gateways do — your job is describe the gateway, not implement an adapter. The runtime does the rest.
Why the pattern matters (even if you skip the rest of the post)
A few lessons that generalise beyond Hermes:
- Treat aggregators as a single "provider" in your agent's mental model. It keeps the pluggable-model interface clean — one wire format, one auth, one place to swap. Don't try to make your agent multiplex five providers itself if a gateway is already doing that work.
- OpenAI-compatible is the lingua franca, even when the backend isn't OpenAI. If the thing you're calling speaks OpenAI on the wire, your agent doesn't need to know what's behind it — Anthropic, MiniMax, your home-grown Llama, doesn't matter.
- Provider plugins should be declarative, not procedural. Hermes gets this right: I described what the provider is, I didn't write any code about what to do. The runtime knows what to do because the wire format is fixed.
-
You shouldn't have to maintain a fork to ship a vendor. When your provider SDK is this small (
ProviderProfile+register_provider), adding a new vendor is a PR-sized commit, not a months-long integration. Hermes ships fifteen or so of these out of the box — the cost is so low it might as well be free.
If you're building anything that wants to be "model-agnostic," this is the seam to expose.
Gotchas
The things you only learn by shipping one of these:
-
The
default_aux_modelmatters more than you think. Hermes uses it for any internal call where the user didn't pick a model — planning probes, tool-description embeddings, name-it. If you set it to a heavyweight model, every interaction feels slow and twice as expensive before the user has even said anything. Pick a cheap-fast model for the aux; let the user spend on the chat call. I default tokimi-k2here because it's roughly the speed of a thought. -
Fallbacks are silent.
fallback_modelsswap in automatically when the primary fails (rate limit, 5xx). Great — until you're debugging "why does my answer have a different vibe today?" and realise Hermes quietly fell back two models down the chain. Log the actual responding model: theusage.modelfield on the SSE stream tells you the truth; the picker only tells you the intent. -
The plugin registers at import time. Which means if the plugin module isn't on Hermes's discovery path, Hermes won't see it. Symptom:
hermes modeldoesn't list your provider. Cause: missing__init__.pyin a parent directory, wrong working directory at launch, or the plugin folder is in~/Documents/cool-hermes-stuffand Python can't see it. Fix: install the plugin as a module so it's onsys.path, don't just copy-paste it into a random folder and hope. -
OpenAI-compatible ≠ identical to OpenAI. Most gateways disagree with the OpenAI SDK on at least one streaming-chunk shape — usually the final
usageframe, sometimes the role on the first delta. Hermes is forgiving here, but if you build your own provider plugin and watch your assistant's last token vanish into thin air, this is where to start looking. Send one real chat through and watch the stream end-to-end before you ship.
References
- Hermes Agent:
NousResearch/hermes-agent - The worked example plugin:
anchorblock/hermes-agent·plugins/model-providers/omnizen/ - Omnizen gateway:
omnizen.ai - HermsexOmnizen:
hermes-omnizen
The gateway-side of this — how the Omnizen API actually fans one virtual key out to multiple model brands behind the curtain — is a separate post for another day. For this tutorial, all you need to know is it speaks OpenAI on the wire. The pattern works the same for any gateway that does.



Top comments (0)