
Anup Karanjkar

Originally published at wowhow.cloud

I Wired Claude Code Into Hermes Agent (And Hermes Into Claude Code): The Full Tool Gateway Reference

The Hermes Tool Gateway is the piece of the Hermes architecture that most documentation glosses over. Every guide shows you how to start Hermes, pick a model, and chat. Almost none of them explain how the MCP (Model Context Protocol) integration actually works in production — specifically, how to configure Hermes as an MCP client so it can call external tools, and simultaneously as an MCP server so Claude Code or any other agent can call Hermes as a tool. That bidirectionality is what makes Hermes genuinely useful in a multi-agent stack.

I spent a week wiring Hermes v0.13.0 into my production setup — Claude Code as the primary coder, Hermes as the orchestration layer, with shared tool access across both. What follows are 12 working patterns from that process, in increasing order of complexity. Every YAML block is a copy-paste starting point. I have run each of these in production and debugged the failure modes so you do not have to.

Before we get into the patterns, one prerequisite: run the sanity check. Everything else depends on your environment being clean.

Setup Step Zero: The Doctor Check

Before touching a single config file, run Hermes's built-in diagnostics command:

hermes doctor

A healthy environment produces output that looks like this:

[hermes doctor] Checking environment...

  ✓ hermes binary          v0.13.0
  ✓ config file            ~/.hermes/config.yaml (found)
  ✓ default model          anthropic/claude-sonnet-4-6 (reachable)
  ✓ API credentials        ANTHROPIC_API_KEY set (not expired)
  ✓ MCP runtime            @modelcontextprotocol/server v1.9.2
  ✓ Node.js                v22.4.0 (required ≥ 20)
  ✓ filesystem permissions ~/.hermes/ writable
  ✓ log directory          ~/.hermes/logs/ writable

[hermes doctor] All checks passed. Tool gateway ready.

If you see any red entries, fix them before proceeding. The two most common failures are a stale ANTHROPIC_API_KEY and a Node.js version below 20. The MCP runtime in v0.13.0 requires Node 20 or later, as the doctor output above shows. On Node 18, you will get silent failures in the tool invocation layer: no error, just missing tool calls. Upgrade first.
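
The two common failures can also be caught by a small script before hermes doctor ever runs, for example in CI. What follows is a minimal Python sketch, not part of Hermes: the helper names parse_node_major, preflight_problems, and detect_node_version are hypothetical, and the minimum of 20 is taken from the doctor output above. It only checks that the key is set; it cannot tell whether the key is stale.

```python
import re
import shutil
import subprocess
from typing import Optional

MIN_NODE_MAJOR = 20  # from the doctor output above ("required >= 20")


def parse_node_major(version_text: str) -> int:
    """Extract the major version from a string like 'v22.4.0'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version: {version_text!r}")
    return int(match.group(1))


def preflight_problems(node_version: Optional[str], env: dict) -> list:
    """Return human-readable problems; an empty list means both checks passed."""
    problems = []
    if node_version is None:
        problems.append("node not found on PATH")
    elif parse_node_major(node_version) < MIN_NODE_MAJOR:
        problems.append(f"Node.js {node_version} is older than v{MIN_NODE_MAJOR}")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


def detect_node_version() -> Optional[str]:
    """Ask the local node binary for its version, or return None if it is absent."""
    if shutil.which("node") is None:
        return None
    result = subprocess.run(["node", "--version"], capture_output=True, text=True)
    return result.stdout.strip() or None
```

In a real script you would call preflight_problems(detect_node_version(), dict(os.environ)) and abort if the list is non-empty.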

If the config file is missing entirely:

hermes init
# Creates ~/.hermes/config.yaml with defaults

Pattern 1: Outbound — Filesystem MCP Server

The simplest outbound MCP connection gives Hermes read and write access to a directory on your local filesystem. This is the foundation for every pattern that follows — once you understand how the mcpServers key works in config.yaml, the rest of the patterns are variations on the same structure.

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

mcpServers:
  filesystem:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-filesystem"
      - /Users/yourname/projects        # root directory Hermes can access
    env:
      NODE_ENV: production

With this config, Hermes gains the following tools automatically: read_file, write_file, list_directory, create_directory, move_file, search_files, and get_file_info. You do not define these tools yourself — they come from the MCP server binary. Hermes discovers them at startup via the MCP handshake and makes them available to the model during every conversation.

Verify it is working:

hermes tools list
# Expected output includes:
# [mcp:filesystem] read_file
# [mcp:filesystem] write_file
# [mcp:filesystem] list_directory
# ... (7 tools total)

The command: npx pattern is important. Hermes spawns the MCP server as a child process and talks to it over stdio transport. The -y flag tells npx to skip its install confirmation prompt, so the package is downloaded automatically if it is not already cached. In production I pin the version to avoid surprises:

args:
  - -y
  - "@modelcontextprotocol/server-filesystem@1.9.2"
  - /Users/yourname/projects

Pattern 2: Outbound — GitHub Issues with Tool Whitelist

The GitHub MCP server exposes a large surface area by default — pull request management, repo creation, branch operations, issue management, gist operations. In most workflows you only want a subset. Use the include key to whitelist exactly the tools Hermes is allowed to call:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

mcpServers:
  github:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-github"
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
      - list_pull_requests
      - get_pull_request

The include key is a capability filter. Tools not in the list are invisible to the model — they are filtered out during the MCP handshake response. This matters for two reasons. First, it reduces the tool count in the model's context, which measurably improves tool selection accuracy. When Hermes has 60 GitHub tools available, it occasionally picks the wrong one. With 6 relevant tools, it consistently picks correctly. Second, it eliminates accidental destructive operations. A model that cannot see delete_repository cannot call delete_repository, no matter what a user asks.

Note the ${GITHUB_TOKEN} syntax — Hermes resolves environment variables in the config file at startup. Set the variable in your shell profile before running Hermes. Never hardcode tokens in the config file.
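
To make the ${VAR} behavior concrete, here is a minimal Python sketch of how this kind of placeholder resolution generally works. It is an illustration, not Hermes's implementation: resolve_env is a hypothetical helper, and raising on a missing variable is my assumption about the safest behavior (the alternative, silently substituting an empty string, tends to surface later as a confusing authentication error).

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value: str, env: dict) -> str:
    """Replace each ${NAME} in value with env[NAME]; raise if NAME is unset or empty."""
    def substitute(match):
        name = match.group(1)
        if not env.get(name):
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, value)


# Placeholder values only; never commit a real token.
print(resolve_env("${GITHUB_TOKEN}", {"GITHUB_TOKEN": "ghp_example"}))  # ghp_example
```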

Pattern 3: Outbound — OAuth-Protected Remote Servers (Stripe Example)

Not all MCP servers run as local stdio processes. Some run as remote HTTP servers that use OAuth for authentication. The Stripe MCP server is the canonical example, and it is the pattern I use for payment-related Hermes tasks:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

mcpServers:
  stripe:
    transport: http
    url: "https://mcp.stripe.com/v1"
    auth:
      type: oauth2
      clientId: "${STRIPE_MCP_CLIENT_ID}"
      clientSecret: "${STRIPE_MCP_CLIENT_SECRET}"
      tokenUrl: "https://mcp.stripe.com/oauth/token"
      scopes:
        - customers:read
        - charges:read
        - subscriptions:read
    include:
      - list_customers
      - retrieve_customer
      - list_charges
      - retrieve_subscription

The transport: http key switches Hermes from stdio to HTTP/SSE transport. Hermes handles the OAuth token lifecycle automatically — it fetches a token on first use and refreshes it before expiry. You do not need to manage token rotation in your application code.

For the Stripe MCP server specifically, request only read scopes unless your workflow explicitly requires write operations. Hermes is good at knowing when to call tools, but defense in depth means limiting what a tool can do even when called correctly. I use customers:read, charges:read, and subscriptions:read for my billing workflow — no write access, no refund operations, no configuration changes.

Pattern 4: Multi-Server Composition

The real power of the mcpServers key is that you can define multiple servers simultaneously. Hermes aggregates all the tools from all connected servers into a single unified toolset. The model sees one flat list of available tools and picks from it based on the task. Here is my baseline five-server composition:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

systemPrompt: |
  You are a senior developer assistant with access to filesystem tools,
  git operations, GitHub API, a Postgres database, and project documentation.
  Always prefer reading documentation before modifying code. Always commit
  changes to git before marking a task complete.

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]

  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-git@0.6.2"]
    env:
      GIT_AUTHOR_NAME: "Hermes Agent"
      GIT_AUTHOR_EMAIL: "hermes@yourproject.local"

  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment

  postgres:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-postgres"
      - "${DATABASE_URL}"
    include:
      - query
      - list_tables
      - describe_table

  docs:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-fetch"]
    env:
      ALLOWED_DOMAINS: "docs.yourproject.com,api.yourproject.com"
    include:
      - fetch
      - search

The systemPrompt key at the top-level config is essential in multi-server setups. Without a system prompt, the model treats all tools as equally available and equally appropriate. With a system prompt that sets behavioral priorities — "read documentation before modifying code", "commit before marking complete" — tool selection becomes more intentional and the overall task completion quality improves significantly.

One thing to watch in multi-server setups: tool name collisions. If two MCP servers expose a tool called search, Hermes namespaces them as mcp:docs:search and mcp:github:search in its internal registry but presents both to the model. Whether the model picks the right one depends heavily on how well the server's tool descriptions distinguish the two operations. If you see the model consistently picking the wrong search variant, add an include filter to one of the servers to remove the ambiguous tool.
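
Collisions are easy to spot before the model ever sees them. The sketch below is a hypothetical Python helper (find_collisions is not a Hermes command) that takes the per-server tool names, which you could copy from hermes tools list, and reports any name exposed by more than one server. The tool lists in the example are made up for illustration.

```python
from collections import defaultdict


def find_collisions(tools_by_server: dict) -> dict:
    """Map each tool name exposed by more than one server to the sorted server names."""
    owners = defaultdict(list)
    for server, tools in tools_by_server.items():
        for tool in tools:
            owners[tool].append(server)
    return {tool: sorted(servers) for tool, servers in owners.items() if len(servers) > 1}


# Hypothetical tool lists for three servers.
example = {
    "docs": ["fetch", "search"],
    "github": ["list_issues", "search"],
    "filesystem": ["read_file", "search_files"],
}
print(find_collisions(example))  # {'search': ['docs', 'github']}
```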

Pattern 5: Inbound — Hermes as MCP Server

Everything so far has been outbound: Hermes calling external MCP servers. Pattern 5 flips the direction. Hermes itself becomes an MCP server that external agents — Claude Code, Cursor, custom tools — can call as a tool.

# Start Hermes as an MCP server
hermes mcp serve \
  --transport stdio \
  --name "hermes-orchestrator" \
  --description "Multi-step task orchestrator with filesystem, git, and GitHub access" \
  --port 0

When run in this mode, Hermes exposes itself via stdio using the standard MCP protocol. External agents connect to it exactly as they connect to any other MCP server. From the external agent's perspective, Hermes is a tool with one primary method: run_task, which accepts a natural-language task description and returns the result of Hermes completing that task using its own tool chain.

To make this configuration permanent, put it in a dedicated file, here ~/.hermes/server.yaml:

# ~/.hermes/server.yaml
serve:
  transport: stdio
  name: hermes-orchestrator
  description: |
    Hermes orchestration agent. Accepts natural language task descriptions
    and completes them autonomously using filesystem, git, GitHub, and
    Postgres tools. Returns structured results with tool call traces.
  tools:
    - name: run_task
      description: |
        Execute a multi-step development task. Provide a clear task description
        including success criteria. Hermes will plan and execute the task using
        its available tools and return a structured result.
      inputSchema:
        type: object
        properties:
          task:
            type: string
            description: Natural language task description with success criteria
          context:
            type: string
            description: Optional additional context (file paths, constraints, etc.)
          max_steps:
            type: integer
            default: 20
            description: Maximum tool calls before halting
        required: [task]

Start it:

hermes mcp serve --config ~/.hermes/server.yaml
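
If you call run_task from your own client code, it helps to validate arguments before sending them. This is a minimal Python sketch of the inputSchema above written by hand, with no schema library. validate_run_task_args is a hypothetical helper, and rejecting an empty task string is stricter than the schema itself, which only requires a string.

```python
def validate_run_task_args(args: dict) -> dict:
    """Check args against the run_task inputSchema and fill in the max_steps default."""
    if not isinstance(args.get("task"), str) or not args["task"].strip():
        raise ValueError("'task' is required and must be a non-empty string")
    if "context" in args and not isinstance(args["context"], str):
        raise ValueError("'context' must be a string when provided")
    max_steps = args.get("max_steps", 20)  # default taken from the schema
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_steps, bool) or not isinstance(max_steps, int):
        raise ValueError("'max_steps' must be an integer")
    return {**args, "max_steps": max_steps}


print(validate_run_task_args({"task": "Open a draft PR"}))
# {'task': 'Open a draft PR', 'max_steps': 20}
```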

Pattern 6: The Dual-Stack — Claude Code as Coder, Hermes as Orchestrator

This is the pattern I use every day and the one that makes the biggest difference to how I work. Claude Code handles code generation, edits, and TypeScript/React tasks. Hermes handles orchestration — running sequences of tasks, managing git state, interacting with GitHub and Postgres, and coordinating work across multiple files and tools. Both agents share tool access, but they have different roles.

Here is how to wire it up. First, configure Hermes with its own outbound tool servers and a serve block that exposes it as an MCP server (the connection to Claude Code's tool surface is added in the third step):

# ~/.hermes/config.yaml (Hermes side)
model: anthropic/claude-sonnet-4-6

systemPrompt: |
  You are an orchestration agent. You plan multi-step development workflows,
  manage git state, coordinate with GitHub, and delegate code-generation
  subtasks to Claude Code when needed. You do not write code directly — you
  describe what code is needed and call the claude_code tool to generate it.
  Always verify git status before and after code changes.

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]

  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-git@0.6.2"]

  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - create_pull_request
      - add_issue_comment

serve:
  transport: stdio
  name: hermes-orchestrator

Second, add Hermes as an MCP server on the Claude Code side, in ~/.claude.json (the file is plain JSON, so it cannot carry a # comment):

{
  "mcpServers": {
    "hermes": {
      "command": "hermes",
      "args": ["mcp", "serve", "--config", "/Users/yourname/.hermes/config.yaml"],
      "description": "Hermes orchestration agent — use for multi-step tasks, git workflows, GitHub operations, and database queries"
    }
  }
}

Third, add Claude Code as an MCP server in Hermes's config, completing the bidirectional bridge:

# Add to ~/.hermes/config.yaml under mcpServers:
  claude_code:
    command: claude
    args: ["mcp", "serve"]
    description: "Claude Code: use for code generation, TypeScript, React, file edits"
    include:
      - edit_file
      - create_file
      - read_file
      - run_bash_command

With both sides configured, the workflow looks like this in practice:

  1. I describe a feature to Hermes: "Implement the UserProfile component, wire it to the /api/profile endpoint, commit to a feature branch, and open a draft PR."

  2. Hermes plans the steps, reads the existing codebase via filesystem tools, and calls Claude Code via MCP to generate the actual component code.

  3. Claude Code returns the generated code. Hermes writes it to disk via filesystem tools, runs the git commit via git tools, and creates the PR via GitHub tools.

  4. Hermes reports the PR URL and a summary of what was done.

Neither agent is trying to do everything. Claude Code is better at code generation. Hermes is better at multi-step planning and tool orchestration. The dual-stack lets each do what it does best.

One critical thing to get right: the systemPrompt on the Hermes side must explicitly tell it NOT to write code directly and to call Claude Code for code generation. Without that instruction, Hermes will try to generate code itself using its own model — which works, but loses the specialization advantage. The system prompt is the architectural boundary.

Pattern 7: Cron-Driven Scheduled MCP Workflows

Hermes supports scheduled task execution via its cron integration. This is useful for automated workflows — daily digest emails, scheduled database cleanups, periodic GitHub sync, and similar recurring operations. The cron config goes in a separate cron.yaml file:

# ~/.hermes/cron.yaml
jobs:
  daily-digest:
    # Run at 8 AM IST (2:30 AM UTC) every weekday
    schedule: "30 2 * * 1-5"
    task: |
      1. Query the Postgres database for all open issues created in the last 24 hours.
      2. Fetch the corresponding GitHub issues to get current status.
      3. Generate a markdown summary grouped by priority.
      4. Create a new page in the project docs directory with today's date as the filename.
      5. Post the summary as a comment on the tracking GitHub issue #1.
    model: anthropic/claude-haiku-4-5-20251001    # cheaper model for scheduled tasks
    max_steps: 15
    on_failure:
      notify: "hermes-alerts"    # Telegram integration (configured separately)

  weekly-cleanup:
    # Every Sunday at midnight UTC
    schedule: "0 0 * * 0"
    task: |
      Query the Postgres database for rows in the task_log table older than 30 days.
      Delete them in batches of 100 to avoid locking. Report the row count deleted.
    model: anthropic/claude-haiku-4-5-20251001
    max_steps: 10
    dry_run: false    # set to true to preview without executing

Start the cron runner:

hermes cron start --config ~/.hermes/cron.yaml --daemon

Check cron status:

hermes cron status
# Output:
# daily-digest    next run: 2026-05-18T02:30:00Z    last: completed (14 steps, 42s)
# weekly-cleanup  next run: 2026-05-17T00:00:00Z    last: completed (8 steps, 12s)

The model override in cron jobs is important. Scheduled tasks that run unattended do not need the highest-capability model. Using Haiku for daily digest generation costs roughly 10x less than Sonnet and produces equivalent quality for structured, well-defined tasks. I only use Sonnet or Opus in cron jobs when the task involves genuine reasoning or ambiguous input.
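
Cron schedules are written in UTC here, so the IST comment in cron.yaml is worth double-checking whenever you change a schedule. A minimal Python sketch for the daily-digest expression "30 2 * * 1-5" follows. It handles only this one shape of schedule (fixed time, Monday to Friday); next_weekday_run is a hypothetical helper, not a Hermes API.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))  # India Standard Time, no DST


def next_weekday_run(now_utc: datetime, hour: int = 2, minute: int = 30) -> datetime:
    """Next firing of 'minute hour * * 1-5' strictly after now_utc (UTC)."""
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    while candidate.weekday() > 4:  # Python: Monday=0 ... Friday=4
        candidate += timedelta(days=1)
    return candidate


now = datetime(2026, 5, 16, 12, 0, tzinfo=timezone.utc)  # a Saturday
run = next_weekday_run(now)
print(run.isoformat())                  # 2026-05-18T02:30:00+00:00
print(run.astimezone(IST).isoformat())  # 2026-05-18T08:00:00+05:30
```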

Pattern 8: Per-Server Gateway Routing with use_gateway

By default, all MCP tool calls go through Hermes's primary model with the default gateway settings. The use_gateway key lets you route one server's tool calls through a different configuration: a different model, different retry behavior, or a different timeout. This is useful when one server in your stack is unreliable or slow:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]

  slow-api:
    command: npx
    args: ["-y", "@company/internal-mcp-server"]
    env:
      API_URL: "${INTERNAL_API_URL}"
      API_KEY: "${INTERNAL_API_KEY}"
    use_gateway:
      timeout_seconds: 120      # Default is 30s — this server is slow
      retry:
        max_attempts: 3
        backoff: exponential
        initial_delay_ms: 1000
      circuit_breaker:
        failure_threshold: 5    # Open circuit after 5 consecutive failures
        recovery_timeout_s: 60  # Try again after 60s

The use_gateway block is per-server, not per-tool. Every tool from slow-api inherits the 120-second timeout and exponential backoff. If you need different timeouts for different tools from the same server, you need to run two instances of the server with different configs — there is no per-tool timeout override in v0.13.0.
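
It is worth working out what these numbers mean for a single tool call in the worst case. The Python sketch below assumes the delay doubles on each retry, that there is no jitter, and that max_attempts counts the first attempt. None of those details are confirmed by the config keys alone, so treat the result as an estimate.

```python
def backoff_delays_ms(max_attempts: int, initial_delay_ms: int, factor: int = 2) -> list:
    """Delays waited before each retry: one fewer than the number of attempts."""
    return [initial_delay_ms * factor ** i for i in range(max_attempts - 1)]


def worst_case_seconds(max_attempts: int, timeout_seconds: int, initial_delay_ms: int) -> float:
    """Upper bound if every attempt times out and every retry delay is waited."""
    delays = backoff_delays_ms(max_attempts, initial_delay_ms)
    return max_attempts * timeout_seconds + sum(delays) / 1000


print(backoff_delays_ms(3, 1000))        # [1000, 2000]
print(worst_case_seconds(3, 120, 1000))  # 363.0
```

Under those assumptions, one call to slow-api can block for about six minutes before Hermes gives up, which matters when you size max_steps and cron timeouts.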

The circuit breaker is the most important part of the use_gateway config for production. Without it, Hermes will keep trying a failing MCP server on every tool call, accumulating timeouts and burning model tokens on retries. With a circuit breaker, after 5 consecutive failures the server is marked as unavailable for 60 seconds. Hermes continues with the remaining tools and reports that the unavailable server's tools are offline. This degrades gracefully instead of hanging.
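
For readers who have not met the pattern before, here is a minimal Python sketch of a consecutive-failure circuit breaker with the same two knobs. It illustrates the general technique rather than Hermes's internals; the class and method names are hypothetical, and the clock is injectable so the behavior can be tested without waiting.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=5, recovery_timeout_s=60, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_s = recovery_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_call(self) -> bool:
        if self._opened_at is None:
            return True
        # After the cooldown, let a trial call through (half-open state).
        return self._clock() - self._opened_at >= self.recovery_timeout_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Also restarts the cooldown when a half-open trial call fails.
            self._opened_at = self._clock()
```

The caller checks allow_call() before each tool invocation and reports the outcome with record_success() or record_failure().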

Pattern 9: Auxiliary Models for Sub-Tasks

Hermes supports a models block that lets you define auxiliary model configurations for specific purposes. The three patterns I use most are: a fast router model for initial task classification, a reasoning model for complex multi-step planning, and a cheap model for tool result summarization:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6    # primary model

models:
  router:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0                   # deterministic routing decisions
    use_for: task_classification       # Hermes uses this model to classify tasks before routing

  reasoner:
    provider: anthropic
    model: claude-opus-4-7
    temperature: 0.3
    use_for: complex_planning          # Used when a task exceeds routing.complexity_threshold steps
    max_tokens: 8192

  summarizer:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: result_summarization      # Condenses verbose tool output before adding to context

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]

routing:
  classify_tasks: true                 # Enable automatic task classification via router model
  auto_route_complex: true             # Auto-upgrade to reasoner for complex tasks
  complexity_threshold: 8              # Tasks requiring > 8 steps use the reasoner model

The routing.complexity_threshold is calibrated by trial and error. At 8, about 20% of my tasks get routed to Opus. Those are the tasks where the reasoning model's stronger multi-step planning genuinely produces better outcomes. If you set it too low, you pay Opus rates for tasks that Sonnet handles fine. If you set it too high, complex orchestration tasks fail partway through because Sonnet loses the thread.
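
The boundary condition is the part people get wrong, so here it is spelled out as a tiny Python sketch. pick_model is a hypothetical helper; it assumes the comparison is strictly greater than the threshold, as the config comment says, and that a step estimate is already available from the router.

```python
def pick_model(estimated_steps: int, complexity_threshold: int = 8) -> str:
    """Return the reasoner for tasks above the threshold, else the primary model."""
    if estimated_steps > complexity_threshold:
        return "anthropic/claude-opus-4-7"   # reasoner from the models block
    return "anthropic/claude-sonnet-4-6"     # primary model


print(pick_model(8))  # anthropic/claude-sonnet-4-6
print(pick_model(9))  # anthropic/claude-opus-4-7
```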

The summarizer model is particularly valuable in multi-server setups where some tools return verbose output. Postgres query results with hundreds of rows, GitHub API responses with nested JSON, fetch results from documentation pages — all of these can balloon the context window quickly. The summarizer runs after each tool call and condenses the output to the relevant facts before it goes into the conversation context. This alone cuts my average token cost by roughly 30% on database-heavy workflows.

Pattern 10: Fallback Provider for Resilience

If Anthropic's API is degraded — which happens a few times a year — a Hermes instance without a fallback becomes completely non-functional. The fallback key configures an alternative provider that Hermes switches to automatically when the primary provider returns errors:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

fallback:
  model: openai/gpt-4o
  env:
    OPENAI_API_KEY: "${OPENAI_API_KEY}"
  trigger:
    on_error_codes: [429, 500, 502, 503, 504]
    consecutive_failures: 3            # Switch after 3 consecutive failures
  recovery:
    check_interval_seconds: 300        # Check primary provider every 5 minutes
    return_after_successful_checks: 2  # Return to primary after 2 successful health checks

The fallback model does not need to match the primary model's capability tier exactly — it needs to be good enough to handle the tasks your cron jobs and automated workflows run while the primary provider is down. I use GPT-4o as the fallback for Sonnet because the capability overlap is high and the MCP tool calling behavior is similar enough that my existing workflows work without modification.

One thing to watch: fallback models may have different tool calling output formats, especially for complex nested tool calls. If your downstream code parses Hermes's output programmatically, test it against the fallback model before relying on automatic switching in production.
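
The trigger and recovery settings describe a small state machine. The Python sketch below models it under two assumptions of mine: a non-error response resets the failure count, and the successful health checks must be consecutive. FallbackSwitch is a hypothetical class for illustration, not how Hermes implements it.

```python
class FallbackSwitch:
    """Tracks which provider to use, following the trigger and recovery settings."""

    TRIGGER_CODES = {429, 500, 502, 503, 504}

    def __init__(self, consecutive_failures=3, return_after_successful_checks=2):
        self.failure_limit = consecutive_failures
        self.checks_needed = return_after_successful_checks
        self.using_fallback = False
        self._failures = 0
        self._good_checks = 0

    def record_primary_response(self, status_code: int) -> None:
        """Call after each request sent to the primary provider."""
        if status_code in self.TRIGGER_CODES:
            self._failures += 1
            if self._failures >= self.failure_limit:
                self.using_fallback = True
                self._good_checks = 0
        else:
            self._failures = 0

    def record_health_check(self, healthy: bool) -> None:
        """Call after each periodic check of the primary while on the fallback."""
        if not self.using_fallback:
            return
        self._good_checks = self._good_checks + 1 if healthy else 0
        if self._good_checks >= self.checks_needed:
            self.using_fallback = False
            self._failures = 0
```

With check_interval_seconds at 300 and two checks required, the earliest return to the primary is roughly ten minutes after the switch.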

Pattern 11: Capability Filter Discipline

The include and exclude keys apply at the server level, but there are two additional capability types you can filter that most documentation does not mention: prompts and resources. MCP servers can expose three capability categories — tools, prompts, and resources — and Hermes enables all three by default:

# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6

mcpServers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"

    # Tool capability filter (whitelist)
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment

    # Explicitly disable prompt and resource capabilities
    # These add unnecessary context overhead in tool-only workflows
    capabilities:
      prompts: false
      resources: false
      tools: true

  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]
    capabilities:
      prompts: false      # No prompt templates needed — we use the top-level systemPrompt
      resources: true     # Keep resource access for file content fetching
      tools: true

Disabling unused capability types reduces the size of the MCP capability advertisement that Hermes sends to the model at the start of each conversation. In a five-server setup, a full capability advertisement can run to 4,000-6,000 tokens before the user has said a word. Turning off prompts and resources on servers where you only need tools cuts this by 30-50%.

The exclude key is the alternative to include when you want most tools but need to block specific ones:

mcpServers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    exclude:
      - delete_repository
      - create_repository
      - delete_branch
      - force_push

Use include when you want a small, well-defined set of tools (fewer than 10). Use exclude when you want most tools but need to block specific destructive operations. Never use both on the same server — if you specify both include and exclude, Hermes v0.13.0 applies include first and ignores exclude. This behavior may change in future versions.
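
The precedence rule is easier to remember as code. This minimal Python sketch mirrors the v0.13.0 behavior described above (include wins, exclude is ignored when both are present). visible_tools is a hypothetical helper and the advertised list is made up for the example.

```python
def visible_tools(advertised, include=None, exclude=None):
    """Return the tool names the model would see, in advertised order."""
    if include:  # include wins; exclude is ignored, per the behavior described above
        allowed = set(include)
        return [t for t in advertised if t in allowed]
    blocked = set(exclude or [])
    return [t for t in advertised if t not in blocked]


advertised = ["list_issues", "get_issue", "create_issue", "delete_repository"]
print(visible_tools(advertised, include=["list_issues", "get_issue"]))
# ['list_issues', 'get_issue']
print(visible_tools(advertised, exclude=["delete_repository"]))
# ['list_issues', 'get_issue', 'create_issue']
print(visible_tools(advertised, include=["list_issues"], exclude=["list_issues"]))
# ['list_issues']
```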

Pattern 12: Production Diagnostics

When something goes wrong in production — a tool call hangs, a cron job fails, the dual-stack produces unexpected results — these are the diagnostic commands I run in order:

Step 1: Full doctor check

hermes doctor --verbose
# The --verbose flag checks MCP server reachability individually
# Output includes per-server connection test results

Step 2: List connected tools

hermes tools list --mcp
# Shows all tools currently registered from all connected MCP servers
# With --mcp flag, groups by server and shows server health status
# Example output:
#
# [mcp:filesystem] HEALTHY (7 tools)
#   read_file, write_file, list_directory, create_directory,
#   move_file, search_files, get_file_info
#
# [mcp:github] HEALTHY (4 tools, 2 excluded)
#   list_issues, get_issue, create_issue, add_issue_comment
#
# [mcp:slow-api] CIRCUIT_OPEN (circuit opened 3s ago, recovery in 57s)
#   (no tools available — circuit breaker active)

Step 3: Inspect recent tool call logs

hermes logs --last 50 --format json | jq '.[] | select(.type == "tool_call")'
# Shows the last 50 log entries filtered to tool calls only
# Each entry includes: timestamp, tool_name, server, duration_ms, status, error (if any)

Step 4: Replay a failed task with debug logging

hermes run --task "your failed task description" \
  --debug \
  --log-file /tmp/hermes-debug-$(date +%Y%m%d-%H%M%S).json
# Runs the task with full tool call tracing enabled
# Writes structured log to /tmp/ for inspection

Step 5: Test a specific MCP server connection in isolation

hermes mcp test --server github
# Runs the connection handshake for a single MCP server
# Reports capability negotiation, tool list, and a test tool call
# Useful for isolating whether a failure is in Hermes itself or the MCP server

Step 6: Cron job inspection

hermes cron logs --job daily-digest --last 10
# Shows the last 10 runs of a specific cron job
# Includes task output, step count, duration, and failure reason if applicable

The structured JSON log format (Pattern 12, step 3) is essential for debugging failures that are intermittent or time-sensitive. I pipe these logs to a simple monitoring script that alerts via Telegram when error rates exceed a threshold. The log format is stable across Hermes patch versions — the fields type, tool_name, server, duration_ms, and status are documented in the v0.13.0 changelog as stable public API.
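
A stripped-down version of that monitoring idea looks roughly like this. It is a Python sketch that reads the JSON array produced by hermes logs --format json and computes a per-server failure rate. The field names come from the list above; the success value "ok" is an assumption, so check it against your own log output, and the alerting step is left out.

```python
import json
from collections import Counter


def error_rates(log_json: str) -> dict:
    """Fraction of failed tool calls per server, from a JSON array of log entries."""
    totals, failures = Counter(), Counter()
    for entry in json.loads(log_json):
        if entry.get("type") != "tool_call":
            continue
        totals[entry["server"]] += 1
        if entry.get("status") != "ok":  # assumed success value
            failures[entry["server"]] += 1
    return {server: failures[server] / totals[server] for server in totals}


# Hand-written sample entries, not real Hermes output.
sample = json.dumps([
    {"type": "tool_call", "tool_name": "query", "server": "postgres", "duration_ms": 40, "status": "ok"},
    {"type": "tool_call", "tool_name": "query", "server": "postgres", "duration_ms": 60000, "status": "error"},
    {"type": "model_call"},
    {"type": "tool_call", "tool_name": "read_file", "server": "filesystem", "duration_ms": 3, "status": "ok"},
])
print(error_rates(sample))  # {'postgres': 0.5, 'filesystem': 0.0}
```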

Full Production Config Reference

Here is the complete config.yaml that pulls the patterns above into a single production-ready file. Cron jobs (Pattern 7) stay in their own cron.yaml. This is my actual config, sanitized for sharing:

# ~/.hermes/config.yaml
# Hermes v0.13.0 — Full production config
# Last updated: 2026-05-16

model: anthropic/claude-sonnet-4-6

systemPrompt: |
  You are a senior development orchestration agent. You plan and execute
  multi-step development workflows using your available tools. You do not
  write code directly — delegate code generation to the claude_code tool.

  Behavioral rules:
  - Always read the current git status before modifying files
  - Always run tests after modifying code (use the run_bash_command tool to run npm test)
  - Always commit changes to a feature branch, never directly to main
  - Prefer reading documentation before making architectural decisions
  - Summarize what you did at the end of every task

models:
  router:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: task_classification
  reasoner:
    provider: anthropic
    model: claude-opus-4-7
    temperature: 0.3
    use_for: complex_planning
    max_tokens: 8192
  summarizer:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: result_summarization

fallback:
  model: openai/gpt-4o
  env:
    OPENAI_API_KEY: "${OPENAI_API_KEY}"
  trigger:
    on_error_codes: [429, 500, 502, 503, 504]
    consecutive_failures: 3
  recovery:
    check_interval_seconds: 300
    return_after_successful_checks: 2

routing:
  classify_tasks: true
  auto_route_complex: true
  complexity_threshold: 8

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]
    capabilities:
      prompts: false
      resources: true
      tools: true

  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-git@0.6.2"]
    env:
      GIT_AUTHOR_NAME: "Hermes Agent"
      GIT_AUTHOR_EMAIL: "hermes@yourproject.local"
    capabilities:
      prompts: false
      resources: false
      tools: true

  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
      - list_pull_requests
      - create_pull_request
    capabilities:
      prompts: false
      resources: false
      tools: true

  postgres:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-postgres", "${DATABASE_URL}"]
    include:
      - query
      - list_tables
      - describe_table
    use_gateway:
      timeout_seconds: 60
      retry:
        max_attempts: 2
        backoff: exponential
        initial_delay_ms: 500
      circuit_breaker:
        failure_threshold: 3
        recovery_timeout_s: 120
    capabilities:
      prompts: false
      resources: false
      tools: true

  claude_code:
    command: claude
    args: ["mcp", "serve"]
    description: "Claude Code: use for all code generation, TypeScript, React, and file edits"
    include:
      - edit_file
      - create_file
      - read_file
      - run_bash_command

serve:
  transport: stdio
  name: hermes-orchestrator
  description: |
    Hermes multi-step development orchestrator. Accepts natural language task
    descriptions and executes them autonomously using filesystem, git, GitHub,
    Postgres, and Claude Code tools.

What the Dual-Stack Changes Day-to-Day

After running this setup for three months, the practical difference is that I spend significantly less time context-switching between tools. A task like "implement the subscription cancellation flow, write the API route, the client component, and the email notification, then open a PR" — which previously required me to coordinate multiple Claude Code sessions, manually run git commands, and interact with the GitHub UI — now runs end-to-end in a single Hermes invocation. Claude Code handles the code. Hermes handles the coordination.

The failure modes are real but manageable. MCP server startup time adds 2-5 seconds to the first tool call in a new session — this is the child process startup latency. In interactive sessions this is invisible. In cron jobs it is worth noting in your timeout calculations. The circuit breaker (Pattern 8) has saved me from cascading failures twice when my internal API was down. Without it, those cron jobs would have hung for their full timeout duration on every scheduled run until I manually intervened.

Version pinning (Pattern 1, the @1.9.2 syntax) is not optional in production. MCP server packages ship frequently and minor versions occasionally contain breaking changes in tool output format. Pin your versions, test before upgrading, and keep a note of which version is in production. I learned this the hard way when server-filesystem changed the list_directory output format in a patch version and broke a downstream script that was parsing the output.
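
A pinning rule is easy to enforce mechanically. The Python sketch below scans a server's npx args and flags scoped npm packages that carry no @version suffix. unpinned_packages is a hypothetical helper, and it deliberately looks only at scoped names (those starting with @), since unscoped package names are hard to tell apart from other arguments.

```python
import re

# Scoped npm package followed by "@<version>", e.g. @scope/name@1.2.3
_PINNED = re.compile(r"^@[^/@\s]+/[^/@\s]+@[^@\s]+$")


def unpinned_packages(args):
    """Return npx args that look like scoped npm packages but have no @version."""
    found = []
    for arg in args:
        looks_like_package = arg.startswith("@") and "/" in arg
        if looks_like_package and not _PINNED.match(arg):
            found.append(arg)
    return found


print(unpinned_packages(["-y", "@modelcontextprotocol/server-github"]))
# ['@modelcontextprotocol/server-github']
print(unpinned_packages(["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/Users/yourname/projects"]))
# []
```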

Run hermes doctor as part of your deployment verification. If your deployment process changes environment variables or Node versions, a doctor check immediately after deploy catches configuration problems before they cause silent failures in production workflows. I have it in my post-deploy script alongside the standard HTTP health check.
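
A post-deploy gate along those lines can be a few lines of Python. This sketch assumes hermes doctor exits with a non-zero status when a check fails, which I have not confirmed in the documentation, so verify it on your version. doctor_passes is a hypothetical helper, and the run parameter exists only so the function can be tested without the binary installed.

```python
import subprocess
import sys


def doctor_passes(run=subprocess.run) -> bool:
    """Run `hermes doctor`; treat a zero exit code as healthy (an assumption)."""
    try:
        result = run(["hermes", "doctor"], capture_output=True, text=True, timeout=120)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    if result.returncode != 0:
        # Surface the doctor report so the deploy log shows which check failed.
        print(result.stdout, file=sys.stderr)
    return result.returncode == 0
```

In the deploy script itself, the last line would be sys.exit(0 if doctor_passes() else 1).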

