<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: hey foo</title>
    <description>The latest articles on DEV Community by hey foo (@hey_foo_7d1e3daa7575b0914).</description>
    <link>https://dev.to/hey_foo_7d1e3daa7575b0914</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3323296%2F989536b0-4212-45cc-bfdf-4467355cfed6.png</url>
      <title>DEV Community: hey foo</title>
      <link>https://dev.to/hey_foo_7d1e3daa7575b0914</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hey_foo_7d1e3daa7575b0914"/>
    <language>en</language>
    <item>
      <title>Agents From First Principles: What Hermes Actually Is When You Strip The Marketing Off</title>
      <dc:creator>hey foo</dc:creator>
      <pubDate>Mon, 01 Jun 2026 02:31:14 +0000</pubDate>
      <link>https://dev.to/hey_foo_7d1e3daa7575b0914/agents-from-first-principles-what-hermes-actually-is-when-you-strip-the-marketing-off-446i</link>
      <guid>https://dev.to/hey_foo_7d1e3daa7575b0914/agents-from-first-principles-what-hermes-actually-is-when-you-strip-the-marketing-off-446i</guid>
      <description>&lt;p&gt;x;lc&lt;/p&gt;

&lt;h1&gt;
  
  
  Agents From First Principles: What Hermes Actually Is When You Strip The Marketing Off
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;A rebel's walkthrough — no hand-waving, no vibes, just the bones of the thing.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  0. Why I'm writing this
&lt;/h2&gt;

&lt;p&gt;Every other post about Hermes opens by telling you it's "self-improving" and "persistent" and "the future of agents." Fine. True, even. But if that's all you take away, you're going to install it, poke it for an afternoon, and walk away thinking it's a fancier ChatGPT wrapper. You'll have missed the entire point.&lt;/p&gt;

&lt;p&gt;The point isn't the features. The point is a &lt;em&gt;worldview shift&lt;/em&gt; about what an agent fundamentally is. And once that shift clicks, the features stop looking like a feature list and start looking like the &lt;em&gt;only sensible answers&lt;/em&gt; to a set of problems most of the industry is still pretending don't exist.&lt;/p&gt;

&lt;p&gt;So we're going to do this the hard way. We're going to start from absolutely nothing — no frameworks, no jargon, no "LangThis vs CrewThat" — and rebuild the concept of an autonomous agent from the ground up. By the time we get to Hermes, you won't need me to sell it to you. The design will sell itself, because you'll already have invented half of it in your head.&lt;/p&gt;

&lt;p&gt;Buckle up. We're going to first principles.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. What is an "agent," really?
&lt;/h2&gt;

&lt;p&gt;Strip every blog post, every demo video, every breathless tweet thread. Reduce the concept of an "AI agent" to its skeleton.&lt;/p&gt;

&lt;p&gt;First Principles Derivation&lt;br&gt;
Information is physical. (Landauer’s principle)&lt;br&gt;
Every bit of information you store or process costs energy. An agent must be parsimonious—it cannot model everything; it must model what’s relevant to its goals. That’s the origin of attention and state abstraction.&lt;br&gt;
Control requires counterfactuals.&lt;br&gt;
To choose action A over B, you must predict what would happen if you did A vs B. That requires a generative world model, even if rudimentary.&lt;br&gt;
Goals are constraints on future states.&lt;br&gt;
Without a preference ordering over possible futures, action selection is random. Goals provide the gradient.&lt;br&gt;
You have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A model that can produce text&lt;/strong&gt; given some input text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A loop&lt;/strong&gt; that takes the model's output, does something in the world based on it, then feeds the result back in as the next input.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A goal&lt;/strong&gt; — some target state the loop is supposed to drive toward.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. That's an agent. Everything else — planning, tool use, memory, subagents, GUIs, voice, "reasoning traces" — is &lt;em&gt;plumbing&lt;/em&gt; around those three things. Important plumbing. But plumbing.&lt;/p&gt;

&lt;p&gt;A chatbot has #1 and a degenerate version of #2 (the "loop" is one turn: user types, bot replies, conversation ends in memory's eyes). It doesn't really have #3 — its goal is just "respond plausibly to whatever the human just said."&lt;/p&gt;

&lt;p&gt;An agent has all three, and crucially, the loop runs &lt;strong&gt;without a human in every step&lt;/strong&gt;. The model decides what to do next, executes it, observes the result, decides again. The human sets the goal at the top and (hopefully) checks in occasionally.&lt;/p&gt;

&lt;p&gt;The moment you internalize this, two consequences fall out immediately, and they're the consequences nobody talks about loudly enough:&lt;/p&gt;

&lt;p&gt;Therefore, the minimal viable agent loop is:&lt;br&gt;
text&lt;br&gt;
while not terminal:&lt;br&gt;
    observation = sense(environment)&lt;br&gt;
    world_model.update(observation)&lt;br&gt;
    goal = infer_current_goal(world_model, user_context)&lt;br&gt;
    plan = planner.generate(world_model, goal)&lt;br&gt;
    action = plan.select_first_step()&lt;br&gt;
    execute(action)&lt;br&gt;
    environment.apply(action)&lt;br&gt;
But that’s just the surface. The magic is in how world_model.update and planner.generate work. Let’s build them from scratch.&lt;/p&gt;
&lt;h3&gt;
  
  
  Consequence A: The model is now writing the program as it runs.
&lt;/h3&gt;

&lt;p&gt;A traditional program is a static artifact: you write the code, you ship it, it executes deterministically. An agent is a &lt;em&gt;dynamic&lt;/em&gt; program — the "code" being executed is whatever the model decides to output this turn. Your program is being rewritten on every iteration, by a stochastic process, in response to inputs you don't fully control.&lt;/p&gt;

&lt;p&gt;This is wild. This is genuinely new. Most software-engineering instincts (testing, version control, code review, deterministic behavior) were designed for the static case. They translate to the dynamic case awkwardly at best.&lt;/p&gt;
&lt;h3&gt;
  
  
  Consequence B: Time becomes a resource the agent spends, not a constraint you impose.
&lt;/h3&gt;

&lt;p&gt;A chatbot only consumes resources when you're typing. An agent can — and will — consume resources while you sleep. A stateless chatbot only costs money when you type. A scheduled, always-on agent that delegates to subagents costs money when you're asleep.&lt;/p&gt;

&lt;p&gt;That's not a bug; it's the entire value proposition. You &lt;em&gt;want&lt;/em&gt; the agent to work while you sleep. The horror story of waking up to a $47 surprise bill from an overnight run isn't an exotic failure mode — it's what happens when you build a thing whose job is to keep working unattended and then don't put guardrails on it.&lt;/p&gt;

&lt;p&gt;Hold these two consequences in your head. We'll come back to them.&lt;/p&gt;


&lt;h2&gt;
  
  
  2. The three holes every agent falls into
&lt;/h2&gt;

&lt;p&gt;Now that we know what an agent &lt;em&gt;is&lt;/em&gt;, let's figure out what an agent &lt;em&gt;needs&lt;/em&gt; — by watching one fail.&lt;/p&gt;

&lt;p&gt;Imagine the simplest possible agent. A while-loop. Inside the loop: feed the conversation so far to a model, get back either (a) a tool call to execute or (b) a final answer. If tool call, execute it, append the result to the conversation, loop. If final answer, stop.&lt;/p&gt;

&lt;p&gt;This is the entire architecture of about 80% of "agent frameworks" you've heard of. It works. Demo it on stage, you'll get applause.&lt;/p&gt;

&lt;p&gt;Now try to use it for actual work. You will fall into three holes, in this order:&lt;/p&gt;
&lt;h3&gt;
  
  
  Hole #1: Context blows up.
&lt;/h3&gt;

&lt;p&gt;The conversation grows every turn. After fifteen tool calls, your context window is half-full of &lt;code&gt;git diff&lt;/code&gt; output and curl responses. After thirty, you're truncating. The model starts hallucinating because it can't see the original goal anymore. By turn fifty, the agent is wandering — solving problems it invented, forgetting why it started, repeating work.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;stateful work problem&lt;/strong&gt;. Real tasks span more tokens than any model can hold.&lt;/p&gt;
&lt;h3&gt;
  
  
  Hole #2: Memory dies at session end.
&lt;/h3&gt;

&lt;p&gt;You close the terminal. Tomorrow you re-open it. The agent has no idea who you are, what you were working on, or that it figured out yesterday that your codebase uses pnpm, not npm. You re-explain everything. The agent makes the same mistakes again. You open a terminal, give an agent a task, watch it spin through a dozen steps, and feel genuinely impressed — right up until you close the session. Tomorrow, you start completely from scratch. The agent has no memory of what it learned, no idea who you are, and zero awareness that it made the same mistake three sessions ago. You're its first user. Every single time.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;amnesia problem&lt;/strong&gt;. The agent never gets &lt;em&gt;better at you&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Hole #3: The agent has no skills, only flailing.
&lt;/h3&gt;

&lt;p&gt;When you ask the agent to "deploy this to staging," it doesn't have a procedure. It has a model that's read about deployments on the internet and will improvise one, from scratch, every single time you ask. Sometimes it gets it right. Sometimes it tries to &lt;code&gt;kubectl apply&lt;/code&gt; on a repo that uses Vercel. The model knows everything in general and nothing in particular.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;no-procedures problem&lt;/strong&gt;. The agent has knowledge but no &lt;em&gt;competence&lt;/em&gt; — no calcified, repeatable, "I have done this exact thing before and here is what works" muscle memory.&lt;/p&gt;

&lt;p&gt;Almost every agent framework you've used has all three of these holes, and the framework's response to them is "well, the user can solve that by writing better prompts" or "we have a vector store, just bolt it on." Those are duct-tape answers. They don't &lt;em&gt;solve&lt;/em&gt; the problem; they push it onto you.&lt;/p&gt;

&lt;p&gt;A real solution would have to do three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Compress state&lt;/strong&gt; intelligently as the agent works, so context doesn't drown.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persist memory&lt;/strong&gt; across sessions, indexed in a way the agent can actually search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crystallize procedures&lt;/strong&gt; out of successful runs so the agent gets &lt;em&gt;competent&lt;/em&gt;, not just knowledgeable.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you sat down to design an agent from first principles with these three requirements, you'd end up reinventing a lot of what Nous Research built into Hermes. Let's go look at what they actually did.&lt;/p&gt;


&lt;h2&gt;
  
  
  3. Enter Hermes — and why each piece exists
&lt;/h2&gt;

&lt;p&gt;Hermes Agent is an open-source autonomous AI agent framework built by Nous Research, released in February 2026 under the MIT license. Within roughly twelve weeks of release it crossed 140,000+ GitHub stars and became the most-used agent on OpenRouter.&lt;/p&gt;

&lt;p&gt;That growth isn't pure hype. It's because Hermes' design is a direct, point-by-point answer to the three holes above. Let me walk you through it not as a feature tour but as an &lt;em&gt;engineering argument&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  3.1 Answering Hole #3 first: the skills system
&lt;/h3&gt;

&lt;p&gt;I'm going out of order on purpose, because skills are the most distinctive thing Hermes does and the easiest to misunderstand.&lt;/p&gt;

&lt;p&gt;At the core of Hermes is what Nous Research calls a closed learning loop. When you give Hermes a complex task, it does not just complete it and discard the work. Instead, it automatically generates a reusable skill from that interaction, a structured procedure it can recall and improve upon in future sessions. These skills live in your local filesystem under ~/.hermes/skills/ and are indexed so the agent can search and invoke them without you having to remember they exist. Over time, this means Hermes builds a growing library of procedures tailored to your specific workflows.&lt;/p&gt;

&lt;p&gt;Stop and think about what that means. A "skill" is a &lt;em&gt;file&lt;/em&gt;. It lives on your disk. You can &lt;code&gt;cat&lt;/code&gt; it. You can &lt;code&gt;git diff&lt;/code&gt; it. You can delete it. You can edit it by hand. The agent's growing competence is not locked inside some opaque vector blob — it's &lt;em&gt;legible&lt;/em&gt;, &lt;em&gt;portable&lt;/em&gt;, &lt;em&gt;yours&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This is the first-principles answer to Hole #3. The agent solves a problem; it writes down how it solved the problem; next time it sees a similar problem it grabs the writeup and follows it. That's exactly how human competence works. We don't re-derive how to make coffee every morning; we have a procedure cached, and we tweak it when the procedure fails.&lt;/p&gt;

&lt;p&gt;Skills are also why Hermes is the only mainstream agent framework with a built-in learning loop that creates, edits, and improves its own skills during normal use. Most other frameworks treat the agent as a fixed graph — you wire up the nodes, ship it, and that's the agent you have forever. Hermes treats the agent as a &lt;em&gt;process that grows&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  3.2 Answering Hole #2: persistent memory with structure
&lt;/h3&gt;

&lt;p&gt;A skill is "how to do X." Memory is "what's been happening with this particular user / project / thread of work."&lt;/p&gt;

&lt;p&gt;Hermes splits this into layers. It builds a three-layer memory system as it works: short-term conversation context, medium-term session summaries, long-term skill documents that capture how it solved specific problems.&lt;/p&gt;

&lt;p&gt;The structural choice here is doing serious work. Short-term is the raw turn-by-turn context (what's in the model's window right now). Medium-term is LLM-summarized session history — distilled, searchable, not the raw transcript. Long-term is the skills library plus persistent project/user facts.&lt;/p&gt;

&lt;p&gt;The medium layer is the one most frameworks skip entirely, and it's the one that quietly makes the whole thing function. Alongside the skills system, Hermes maintains persistent memory across conversations. This includes project context, user preferences, and a searchable session history powered by SQLite FTS5 with LLM-summarized recall. When you return to a project after two weeks, Hermes can surface relevant prior context without you having to re-explain everything from scratch.&lt;/p&gt;

&lt;p&gt;Note the implementation: &lt;strong&gt;SQLite FTS5&lt;/strong&gt;. Not Pinecone. Not Weaviate. Not "we deployed a vector database cluster." A SQLite file. On your disk. Indexed for full-text search.&lt;/p&gt;

&lt;p&gt;This is a &lt;em&gt;deeply&lt;/em&gt; opinionated technical choice and it's the right one. SQLite is the most boring database in the world, which is exactly why it's the right database for something that has to run on your laptop, your VPS, your Termux session on a phone, and a $5 droplet, all without ops overhead. The agent's brain is a file. You can back it up by copying a file. You can move it to a new machine by copying a file. You can inspect what it remembers by opening it with the sqlite3 CLI.&lt;/p&gt;

&lt;p&gt;The first-principles principle: &lt;strong&gt;the simplest representation that can do the job is almost always the correct one&lt;/strong&gt;, because anything fancier multiplies your failure modes. A vector store is impressive in a slide deck. A SQLite database with FTS5 is impressive in production for fifteen years straight.&lt;/p&gt;
&lt;h3&gt;
  
  
  3.3 Answering Hole #1: subagents for context isolation
&lt;/h3&gt;

&lt;p&gt;When work blows past one model's context window, what do you do? You don't make the window bigger — that's a losing race. You &lt;em&gt;decompose&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Beyond memory, the architecture supports isolated subagents. When a task has parallel components, Hermes can spawn isolated child agents with their own terminal environments and Python RPC sessions, then aggregate the results. This is particularly useful for long-running research or code tasks where you want to avoid context blowout in a single conversation thread.&lt;/p&gt;

&lt;p&gt;Why is this the right shape? Because of the same reason multi-threading works in regular programming: most "big" tasks are actually a bunch of small independent tasks pretending to be one big task. "Audit this repo for security issues" is fifty small, independent file audits glued together by a final summary step. If you do that in one conversation, you choke on context. If you spawn fifty subagents, each with its own clean context, and aggregate their outputs, the parent agent only ever sees the &lt;em&gt;summaries&lt;/em&gt; — which fit comfortably.&lt;/p&gt;

&lt;p&gt;This is structurally identical to MapReduce, or to how a manager handles a project. Decompose, delegate, aggregate. The fact that the workers are themselves LLM agents instead of cores or junior engineers is a detail. The shape of the solution is ancient and proven.&lt;/p&gt;
&lt;h3&gt;
  
  
  3.4 The reach problem: surfaces, not interfaces
&lt;/h3&gt;

&lt;p&gt;There's a fourth hole I didn't list above because it's less obvious until you've lived it: &lt;strong&gt;the agent is only useful if it's where you are.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A CLI agent is useful when you're at your laptop. The moment you're walking to lunch and remember "oh I needed to ask the agent about that PR," it's useless. You're not going to SSH from your phone.&lt;/p&gt;

&lt;p&gt;Hermes' answer is what they call a gateway. Hermes ships with a gateway process that connects the agent to messaging platforms. You can run a task from your laptop terminal, switch to Telegram while commuting, and pick up the same session without any ceremony. This cross-platform continuity is something most agentic frameworks simply do not address.&lt;/p&gt;

&lt;p&gt;The full surface list is genuinely sprawling: A CLI / TUI you run locally (hermes). A messaging gateway that turns Telegram / Discord / Slack / WhatsApp / Signal / Email / Matrix into agent surfaces. A web UI and an Agent Client Protocol (ACP) endpoint for AI-native editors. A cron scheduler for unattended work.&lt;/p&gt;

&lt;p&gt;The first-principles framing: &lt;strong&gt;the agent is one process; the surfaces are many.&lt;/strong&gt; This is the inverse of how most products work — most products are one app you have to open. Hermes treats itself as infrastructure that &lt;em&gt;you&lt;/em&gt; reach into from whatever channel you're already in. That's how email works. That's how SMS works. That's how good infrastructure works.&lt;/p&gt;
&lt;h3&gt;
  
  
  3.5 The portability problem: backends and providers
&lt;/h3&gt;

&lt;p&gt;Two more design decisions that look like trivia but are actually load-bearing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Execution backends.&lt;/strong&gt; Hermes ships multiple execution backends — local, Docker, SSH, Singularity, Modal, Daytona — precisely so you can isolate what the agent can touch. This matters because the same agent process should be able to run "harmlessly explore my notes folder" locally and "execute scary stuff" inside a Docker container, &lt;em&gt;without&lt;/em&gt; you rewriting the agent. The blast radius is a config flag, not a rewrite.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Models.&lt;/strong&gt; It supports 200+ models through Nous Portal, OpenRouter, OpenAI, Anthropic, NVIDIA NIM, Hugging Face, NovitaAI, z.ai/GLM, Kimi, MiniMax, xAI Grok, and any OpenAI-compatible endpoint. Switching providers is hermes model — no code change.&lt;/p&gt;

&lt;p&gt;The principle: &lt;strong&gt;never marry your tool to your dependencies.&lt;/strong&gt; Models will get better, cheaper, weirder. The provider you're using today will not be the optimal one next quarter. An agent framework that locks you to a vendor is signing you up for a forced migration in twelve months. Hermes refused to make that bet, and that refusal is going to age extraordinarily well.&lt;/p&gt;


&lt;h2&gt;
  
  
  4. The architecture in one breath
&lt;/h2&gt;

&lt;p&gt;Let me give you the whole thing as one mental picture, because once you can hold the picture in your head, everything else is just zooming in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       [ You, via any channel ]
                 │
                 ▼
        [ Gateway / CLI / TUI / Web ]
                 │
                 ▼
   ┌─────────────────────────────┐
   │     The Agent Loop          │
   │  plan → act → observe → ... │
   └─────────────────────────────┘
        │            │
        ▼            ▼
[ Memory (SQLite) ] [ Skills (~/.hermes/skills/) ]
        │            │
        ▼            ▼
   ┌─────────────────────────────┐
   │   Execution Backend          │
   │ local / Docker / SSH / Modal │
   └─────────────────────────────┘
        │
        ▼
   [ The actual world: files, APIs, shells ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Every "feature" Hermes has is a layer on that diagram. Subagents are recursive instances of the agent loop. Cron is just "trigger the loop on a schedule instead of on user input." Multi-channel reach is just the gateway box fanning out.&lt;/p&gt;

&lt;p&gt;Once you see the architecture this way, the marketing copy stops mattering. You can predict what Hermes can do without reading the docs. &lt;em&gt;Can it monitor a website and ping me on Telegram when it changes?&lt;/em&gt; Of course — cron triggers loop, loop uses a fetch skill, loop calls the Telegram surface. &lt;em&gt;Can it spawn a research subagent that runs for an hour without blowing my main context?&lt;/em&gt; Of course — that's literally the subagent box. &lt;em&gt;Can I move my entire setup to a new laptop?&lt;/em&gt; Of course — copy &lt;code&gt;~/.hermes/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is the test of a good architecture: when you understand the shape, the capabilities become obvious. When you have to memorize a feature list, the shape is wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. A walkthrough you can actually do tonight
&lt;/h2&gt;

&lt;p&gt;Enough theory. Let's get our hands on it. I'm going to compress the install because there's nothing clever to say about it — the script does what you'd expect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Install
&lt;/h3&gt;

&lt;p&gt;The install process is straightforward on Linux, macOS, WSL2, and Termux. A single curl command handles the dependency chain, including uv, Python 3.11, Node.js, ripgrep, and ffmpeg. That's the whole install step. No dependency hunting. No pip install hell. The script handles everything and places the agent at ~/.hermes/hermes-agent.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;~/.hermes/&lt;/code&gt; directory is going to become your agent's entire being — skills, memory, configs, all of it. Bookmark that path. Back it up. That folder is the agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Pick a model
&lt;/h3&gt;

&lt;p&gt;This is the most important setup step. Run hermes model for an interactive selection menu. Here are the main options worth knowing: ⚠️ Critical: Hermes requires a model with at least 64,000 tokens of context. Most hosted models (Claude, GPT, Gemini, Qwen, DeepSeek) meet this easily. If you're running locally, set your context size to at least 64K (e.g., --ctx-size 65536 in llama.cpp). You can switch providers at any time with hermes model — no lock-in.&lt;/p&gt;

&lt;p&gt;The 64K context floor is real. Don't try to be clever and run a 16K local model "just to see." The agent will work, badly, and you'll blame Hermes for it. The architecture &lt;em&gt;assumes&lt;/em&gt; enough headroom for plan + working context + skill recall.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Launch and verify
&lt;/h3&gt;

&lt;p&gt;hermes # classic CLI hermes --tui # modern TUI with overlays and mouse support (recommended) You'll see a welcome banner showing your provider, model, available tools, and loaded skills. Start with something easy to verify: Summarize this repo in 5 bullets and tell me the main entrypoint.&lt;/p&gt;

&lt;p&gt;I want you to do something specific the first time. Don't ask it to write you a snake game. Don't ask it to plan your week. Ask it to do something &lt;em&gt;boring, real, and verifiable&lt;/em&gt; — like that repo summary. The reason: you want to see the loop actually work end to end on a task where you can immediately tell if the output is correct. Save the wow-moment prompts for after you trust the basics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Watch a skill get born
&lt;/h3&gt;

&lt;p&gt;Now do something slightly nontrivial. Something with three or four steps. "Find all TODO comments in this repo, group them by file, and write a markdown report." Watch it work. When it finishes, check &lt;code&gt;~/.hermes/skills/&lt;/code&gt;. There will be a new file there, or an updated one. Open it. Read it.&lt;/p&gt;

&lt;p&gt;That moment — when you see a file appear on your disk that represents &lt;em&gt;the procedure the agent just figured out&lt;/em&gt; — is the moment Hermes' design clicks for you. The agent isn't a black box anymore. The agent's growing competence has a physical address.&lt;/p&gt;

&lt;p&gt;Tomorrow, ask it to do a similar TODO-style task on a different repo. Watch the skill get invoked. Watch it work &lt;em&gt;faster and more reliably&lt;/em&gt; than the first time. This is the closed learning loop in action, and it is genuinely different from anything stateless.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Try the cron piece
&lt;/h3&gt;

&lt;p&gt;Here's a thing that'll change how you think about agents. The cron scheduling part especially. In Hermes, you set scheduled tasks in natural language — "every morning at 8am, pull tech news and brief me" — and it handles the cron job internally. No YAML. No crontab entries. You just describe what you want and it figures out the execution.&lt;/p&gt;

&lt;p&gt;Pick something small you actually want to know every day. "Every morning at 8, check if there are new issues on this GitHub repo and DM me a summary on Telegram." Set it up. Walk away.&lt;/p&gt;

&lt;p&gt;The first morning it works, you will feel something. That feeling is the agent crossing the line from "a thing I prompt" to "a thing that does work in the world for me." That's the line everything else is pointing at. Most agent frameworks never cross it, because crossing it requires &lt;em&gt;all the other pieces&lt;/em&gt; — persistence, scheduling, multi-channel reach — to actually exist and work together. Hermes crosses it on day one.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. The dark side: what nobody tells you in the demos
&lt;/h2&gt;

&lt;p&gt;I'd be selling you a fairy tale if I stopped here. The same architecture that makes Hermes powerful makes it &lt;em&gt;expensive in ways stateless tools aren't&lt;/em&gt;, and you need to design for that from day zero, not after the bill arrives.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.1 Autonomy removes the natural brake on cost
&lt;/h3&gt;

&lt;p&gt;I quoted this above and I'm bringing it back because it's the single most important thing in this entire post:&lt;/p&gt;

&lt;p&gt;Autonomy removes the natural brake. A stateless chatbot only costs money when you type. A scheduled, always-on agent that delegates to subagents costs money when you're asleep.&lt;/p&gt;

&lt;p&gt;A scheduled agent with subagent capability is a fork bomb waiting for the wrong prompt. Set it loose on an open-ended task — "monitor my industry and tell me what matters" — and it can spawn subagents recursively, each making API calls, each costing money, each running at 3am while you dream.&lt;/p&gt;

&lt;p&gt;The mitigation isn't "trust the model." The mitigation is &lt;em&gt;engineering controls&lt;/em&gt;: hard token caps per scheduled task, hard subagent depth limits, alerting on cost spikes, and reading your provider's billing dashboard with the same discipline you read your bank statement. Set budget alerts at OpenRouter, Anthropic, OpenAI — wherever you've pointed Hermes — &lt;em&gt;before&lt;/em&gt; you set up your first cron job, not after.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.2 The blast radius is your scopes, not your chat
&lt;/h3&gt;

&lt;p&gt;The blast radius is your scopes, not your chat. A stateless chatbot that's tricked says something dumb. An autonomous agent that's tricked acts — with your API keys, your file system, your messaging integrations. Sandboxing is a choice, not a default to ignore.&lt;/p&gt;

&lt;p&gt;This is the part of agent ops most people are sleepwalking through. The agent isn't producing &lt;em&gt;text&lt;/em&gt; anymore; it's producing &lt;em&gt;actions&lt;/em&gt;. If it gets prompt-injected by a malicious webpage during a research task, the consequence isn't a weird reply — the consequence is &lt;code&gt;rm -rf&lt;/code&gt; on whatever your execution backend has access to.&lt;/p&gt;

&lt;p&gt;The good news: the design anticipates this. The bad news: the safe configuration is the one you select. "It runs on your server" is liberating and is also the entire threat model in five words.&lt;/p&gt;

&lt;p&gt;The minimum sane posture: run Hermes in Docker, not local-shell, the moment it touches anything from the open internet. Give it scoped API tokens, not your personal-account tokens. Treat its filesystem access like you'd treat a fresh intern's filesystem access — read-only by default, write where needed, never to your home directory.&lt;/p&gt;

&lt;p&gt;None of this means "don't run it." It means the correct emotional posture toward a self-improving agent is the one you'd have toward a sharp, fast, sleep-deprived junior engineer with production access: enormous upside, and you do not skip code review.&lt;/p&gt;

&lt;p&gt;That metaphor is the best one I've heard for this whole class of tool. Hermes is a junior engineer who never sleeps, gets smarter every week, works for pennies, and would absolutely push to main if you let it. Treat it accordingly.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.3 Output quality is context-dependent in ways you'll learn the hard way
&lt;/h3&gt;

&lt;p&gt;Real users have flagged the gap. The gaps are real though. Output quality is heavily context-dependent. The shallow-domain problem on complex workflows (the code review checklist, the cold intervention notes) is a real limitation. Silent failures on misconfiguration — like the GitHub token scope issue — need better error communication.&lt;/p&gt;

&lt;p&gt;What this means in practice: Hermes is fantastic at domains where its memory has been built up over weeks and its skills have been refined. It's &lt;em&gt;uneven&lt;/em&gt; at domains where it's working cold. The first time you ask it to do something in a new domain, expect mediocrity. The fifth time, expect excellence. This is by design — it's the closed loop loop'ing — but it means the demo-on-day-one isn't the same as the experience-after-a-month, and you need to be patient enough to get to the second one.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Where Hermes fits in the larger picture
&lt;/h2&gt;

&lt;p&gt;Let me try to give you a single sentence that captures why Hermes matters at this moment in the agent ecosystem.&lt;/p&gt;

&lt;p&gt;Hermes treats an agent as a process that accumulates capability over time. Concretely: Said differently: LangGraph is a build-time framework. Hermes is a run-time being. The two are not competitors so much as different scales of the same problem — LangGraph is excellent for building a deterministic flow inside an enterprise app; Hermes is excellent when you want an agent that lives somewhere, hears you across channels, and gets better at you over months.&lt;/p&gt;

&lt;p&gt;That distinction — &lt;em&gt;build-time framework&lt;/em&gt; vs &lt;em&gt;run-time being&lt;/em&gt; — is the cleanest articulation I've seen of why Hermes is a different &lt;em&gt;kind&lt;/em&gt; of thing rather than just a better one of the same thing. If you want an agent embedded inside your SaaS product to handle a specific bounded workflow, use a graph framework. If you want an agent &lt;em&gt;for you&lt;/em&gt;, that lives on your box, hears you on Telegram, remembers what matters, and gets sharper every week, you want Hermes. They're answers to different questions.&lt;/p&gt;

&lt;p&gt;The agentic frameworks that win the next few years will not be the ones with the prettiest LangGraph diagrams. They'll be the ones that let a single human and a single agent build a working relationship that compounds. That's a much weirder, much more personal product category than "AI dev tools." It's closer to "domesticating a familiar." And the design constraints for it — local, persistent, learning, reachable, sandboxable — are exactly what Hermes shipped.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. The first-principles takeaway
&lt;/h2&gt;

&lt;p&gt;If you remember nothing else from this post, remember this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;An agent is a model + a loop + a goal.&lt;/strong&gt; Everything else is plumbing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The three holes in naive agent design are context blowup, session amnesia, and no crystallized procedures.&lt;/strong&gt; Any agent framework not directly addressing all three is leaving the hard problems to you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hermes' answers — subagents, layered SQLite memory, skill files on disk — are the simplest solutions that can possibly work.&lt;/strong&gt; The simplicity is the feature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomy is a leverage multiplier in both directions.&lt;/strong&gt; It multiplies your useful output and it multiplies your blast radius. Engineer the brakes deliberately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The right mental model is a junior engineer who never sleeps.&lt;/strong&gt; Productive, eager, occasionally catastrophically wrong, deserving of code review and a Docker container.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A piece of advice from the team behind it that's worth tattooing somewhere visible: don't overthink the "agent" part. Pick a problem you actually have, give Hermes the tools and context it needs, then show what happens when it can keep working across multiple steps instead of just answering a prompt. Small, useful, and real beats flashy every time.&lt;/p&gt;

&lt;p&gt;The agents that are about to change your life are not the ones doing breathtaking demos. They're the ones quietly making your Tuesday morning slightly better, every Tuesday, forever. Hermes is built to be that. Whether it becomes that for you is mostly about whether you give it a real problem instead of a clever one.&lt;/p&gt;

&lt;p&gt;Go install it. Give it a boring task. Watch the skill file appear. Then come back and tell me first principles are dead.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Sources &amp;amp; further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Challenge launch post&lt;/strong&gt; — &lt;a href="https://dev.to/devteam/join-the-hermes-agent-challenge-1000-in-prizes-13cd"&gt;Join the Hermes Agent Challenge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Challenge rules &amp;amp; submission templates&lt;/strong&gt; — &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The cost/risk side&lt;/strong&gt; — &lt;a href="https://dev.to/chintanonweb/hermes-agent-gets-smarter-every-day-so-does-the-bill-4i8o"&gt;Hermes Agent Gets Smarter Every Day. So Does the Bill.&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture deep-dive vs OpenClaw/GoClaw&lt;/strong&gt; — &lt;a href="https://dev.to/truongpx396/hermes-agent-the-self-improving-agent-framework-and-how-it-compares-to-openclaw-goclaw-22mc"&gt;Hermes Agent: A Practical Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Beginner-friendly overview&lt;/strong&gt; — &lt;a href="https://dev.to/emmanuelthecoder/hermes-the-self-improving-agent-you-can-actually-run-yourself-555l"&gt;Hermes, The Self-Improving Agent You Can Actually Run Yourself&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field test under pressure&lt;/strong&gt; — &lt;a href="https://dev.to/syedahmershah/i-gave-hermes-agent-5-impossible-tasks-1k16"&gt;I Gave Hermes Agent 5 Impossible Tasks&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-world build example&lt;/strong&gt; — &lt;a href="https://dev.to/abraham_airco_67/this-is-a-submission-for-the-hermes-agent-challenge-autonomous-github-pr-reviewer-with-hermes-4kl9"&gt;Autonomous GitHub PR Reviewer with Hermes Agent&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Tagged: #hermesagentchallenge #devchallenge #agents #ai&lt;/em&gt;&lt;/p&gt;




</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
