Three hackathons in three months. Three different agent setups. Three times I wrote a system prompt that said something like "you are a helpful assistant that can use tools and reason across steps." Three times I shipped something, closed the terminal, and the agent forgot everything it had ever done.
I didn't think of it as a problem. I thought of it as just how agents work.
That's the part worth interrogating.
There's a normalization happening in how we build with AI. We've collectively agreed that agents are stateless, that every session starts cold, that the "memory" problem is either solved by shoving things into a context window or punted to some RAG pipeline you bolt on after the fact. We treat amnesia as a default and then we build around it. Bigger context windows. Better prompts. Smarter retrieval.
But none of that solves the actual problem, which is: the agent isn't getting better at working with you specifically. It's not accumulating judgment. It's not noticing that you always want JSON back, or that your last three API integrations broke on rate limits, or that you prefer terse explanations over verbose ones. Every session, you re-establish who you are. Every session, the agent starts from scratch.
That's not intelligence. That's a very fast amnesiac with a lot of knowledge.
The thing that made me stop and actually think was this line from Hermes Agent's README:
"The self-improving AI agent... it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions."
Not "we have a memory feature." Not "long-term context support." Skills from experience. A deepening model of who you are.
That's a different claim. And it's architectural, not cosmetic.
When Hermes completes a task, it can crystallize what it did into a reusable skill, tag it, store it, and pull from it the next time a similar problem comes up. The first time you ask it to do something gnarly, it figures it out. By the third time, it's refined the approach. This is not a context window trick. The learning loop is built into the agent itself.
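To make that loop concrete, here is a toy sketch of the crystallize, store, recall, refine cycle. It is not Hermes' implementation or API: `Skill`, `SkillLibrary`, and every method name are hypothetical, and matching by tag overlap is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    # Hypothetical record of "what worked" for one kind of task.
    name: str
    tags: set[str]
    steps: list[str]
    uses: int = 0  # how many times the skill has been reused


@dataclass
class SkillLibrary:
    # Hypothetical in-memory store; a real agent would persist this to disk.
    skills: dict[str, Skill] = field(default_factory=dict)

    def crystallize(self, name, tags, steps):
        """Store the approach that solved a task as a reusable skill."""
        skill = Skill(name=name, tags=set(tags), steps=list(steps))
        self.skills[name] = skill
        return skill

    def recall(self, task_tags):
        """Return the skill sharing the most tags with the task, or None."""
        wanted = set(task_tags)
        best, best_overlap = None, 0
        for skill in self.skills.values():
            overlap = len(skill.tags & wanted)
            if overlap > best_overlap:
                best, best_overlap = skill, overlap
        return best

    def refine(self, name, extra_step):
        """Record a reuse and append a lesson learned along the way."""
        skill = self.skills[name]
        skill.uses += 1
        if extra_step not in skill.steps:
            skill.steps.append(extra_step)
        return skill


library = SkillLibrary()
# First encounter: the agent figures the task out, then stores the approach.
library.crystallize(
    "integrate-rest-api",
    tags={"api", "http", "integration"},
    steps=["read the API docs", "write a thin client", "add tests"],
)
# Later encounter: a similar task pulls the skill back and improves it.
hit = library.recall({"api", "rate-limit"})
library.refine(hit.name, "back off and retry on HTTP 429")
```

The point of the sketch is where the state lives: the skill outlasts any single conversation, so the third attempt starts from the refined steps rather than from a blank prompt.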
Here's a concrete thing to sit with: imagine you're onboarding a junior developer. Day one, you explain your codebase, your conventions, your preferences. Day two, they remember. Day thirty, they're fast because they've internalized how you work. Now imagine that junior dev reset every morning with zero memory of the previous day. You'd re-onboard them every single time. You'd eventually stop delegating anything non-trivial because the overhead is too high.
That's what most of us are doing with agents right now, and calling it "agentic AI."
Hermes Agent is at least trying to solve the actual problem instead of building better workarounds for it.
The model-agnostic architecture is worth one paragraph because it doesn't get enough attention. Hermes runs on anything: Claude, GPT-4o, Gemini, 200+ models through OpenRouter, your own endpoint. Switch models with `hermes model`, no code changes. This matters less as a convenience feature and more as a philosophical stance. The model is not the agent. The learning loop, the skill library, the memory system, the planning layer: those are the agent. The model is just the reasoning engine you slot in. When the next frontier model drops in six months and blows the current one out of the water, Hermes users swap it in with one command. Everyone building on a single-model API foundation has to rearchitect.
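The "model is not the agent" split can be sketched in a few lines. This is an illustration of the idea, not Hermes' code: `ReasoningEngine`, `EchoModel`, `Agent`, and `swap_model` are all hypothetical names, and `EchoModel` is a stand-in that only echoes its prompt where a real engine would call a provider API.

```python
from typing import Protocol


class ReasoningEngine(Protocol):
    # Hypothetical interface: anything that turns a prompt into text.
    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for a hosted model; it just labels and echoes the prompt."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


class Agent:
    def __init__(self, model: ReasoningEngine) -> None:
        self.model = model
        self.memory: list[str] = []  # state owned by the agent, not the model

    def swap_model(self, model: ReasoningEngine) -> None:
        # Only the reasoning engine changes; memory is left untouched.
        self.model = model

    def run(self, task: str) -> str:
        context = "; ".join(self.memory)
        prompt = f"{task} (known: {context})" if context else task
        answer = self.model.complete(prompt)
        self.memory.append(task)
        return answer


agent = Agent(EchoModel("model-a"))
first = agent.run("prefers JSON output")
agent.swap_model(EchoModel("model-b"))
second = agent.run("summarize the logs")
print(second)
```

After the swap, the second call still carries what the agent learned under the first model. That is the property that makes a model upgrade a configuration change instead of a rewrite.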
The infrastructure story also compounds with everything above. You can run this on a $5 VPS. Not "you can try it on a $5 VPS before you upgrade," but actually run it there. The serverless option means you're paying essentially nothing when it's idle. You can message it from Telegram while it does work on a cloud machine. That's not a demo feature; it's a deployment story that actually works outside your laptop.
I want to be honest: I haven't run Hermes for six months and watched it compound knowledge into something that feels like a real collaborator. Neither has basically anyone outside the Nous Research team. The learning loop architecture is genuinely interesting, but the proof will be in sustained use cases, and those take time to surface.
But here's what I do know: the agents I've built and shipped have all hit the same ceiling. They can reason well within a session. They fall apart across sessions. Every improvement I made was either a bigger context window, a more detailed system prompt, or a retrieval system I had to design and maintain separately. None of it made the agent better at working with me. I was compensating for statelessness, not solving it.
Hermes Agent is trying to solve it. The architecture is open source, the model layer is swappable, and the whole thing runs on infrastructure you control.
In a space where most "agents" are prompt chains with an API call attached, that's worth taking seriously.
[NousResearch/hermes-agent](https://github.com/NousResearch/hermes-agent) on GitHub. Quickstart video is linked in the docs. Worth an afternoon.