<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Avinash Seethalam</title>
    <description>The latest articles on DEV Community by Avinash Seethalam (@avinash431).</description>
    <link>https://dev.to/avinash431</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3898593%2F6809101e-2ef9-4462-9473-8d6d51649985.png</url>
      <title>DEV Community: Avinash Seethalam</title>
      <link>https://dev.to/avinash431</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/avinash431"/>
    <language>en</language>
    <item>
      <title>Running Hermes Agent with NVIDIA-Hosted Models and Local Ollama</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sat, 09 May 2026 11:59:12 +0000</pubDate>
      <link>https://dev.to/avinash431/running-hermes-agent-with-nvidia-hosted-models-and-local-ollama-214j</link>
      <guid>https://dev.to/avinash431/running-hermes-agent-with-nvidia-hosted-models-and-local-ollama-214j</guid>
      <description>&lt;p&gt;I spent about a week migrating my agent setup off &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; and onto &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;Hermes Agent&lt;/a&gt;, with a hybrid backend. NVIDIA-hosted inference for the heavy stuff, a local Ollama daemon for everything I didn't want leaving the box. This is what I ended up with after two false starts and one evening of yelling at WSL networking.&lt;/p&gt;

&lt;p&gt;If you're already happy with your current loop, skip this. If you've been hitting the same things I have (flaky routing, agents that lose the plot on multi-file edits, surprise bills), maybe useful.&lt;/p&gt;

&lt;p&gt;References I actually opened while writing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hermes repo: &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;https://github.com/NousResearch/hermes-agent&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hermes docs: &lt;a href="https://hermes-agent.nousresearch.com/docs/" rel="noopener noreferrer"&gt;https://hermes-agent.nousresearch.com/docs/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenClaw repo: &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenClaw docs: &lt;a href="https://docs.openclaw.ai" rel="noopener noreferrer"&gt;https://docs.openclaw.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;NVIDIA NIM / build catalog: &lt;a href="https://build.nvidia.com" rel="noopener noreferrer"&gt;https://build.nvidia.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Ollama repo: &lt;a href="https://github.com/ollama/ollama" rel="noopener noreferrer"&gt;https://github.com/ollama/ollama&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenRouter coding category: &lt;a href="https://openrouter.ai/apps/category/coding" rel="noopener noreferrer"&gt;https://openrouter.ai/apps/category/coding&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note on framing. OpenClaw and Hermes overlap but they are not the same shape of tool. OpenClaw is a personal-assistant gateway whose main surface is messaging channels (WhatsApp, Telegram, Discord, iMessage). Hermes ships a terminal TUI plus a messaging gateway plus a skills/memory loop, and includes a &lt;code&gt;hermes claw migrate&lt;/code&gt; command that imports OpenClaw configs directly. So the comparison below is based on how I was using OpenClaw in practice (terminal-first), not its actual elevator pitch. If you came to OpenClaw for the WhatsApp bot, your mileage will differ.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Bothered
&lt;/h2&gt;

&lt;p&gt;The OpenRouter coding category keeps growing. Hermes started showing up in those threads, and Nous shipping an explicit OpenClaw migration path made "try it for a week" cheap.&lt;/p&gt;

&lt;p&gt;My OpenClaw setup had drifted into a state I didn't trust. Three things in particular:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long sessions silently lost context after compaction. Asked it to recall a migration plan we'd sketched two hours earlier and got back a confidently wrong summary that mixed in details from a totally different repo.&lt;/li&gt;
&lt;li&gt;Provider routing was opaque. I'd ask for a specific model and it'd quietly fall back to something cheaper. Only noticed because the latency dropped.&lt;/li&gt;
&lt;li&gt;Multi-file refactors needed too much hand-holding. Edit file A correctly, edit B as if A's edit hadn't happened, loop.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not OpenClaw-specific. General failure mode of agents that conflate "context window" with "memory." But it added up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Hermes Over an OpenClaw-Style Workflow
&lt;/h2&gt;

&lt;p&gt;What got better, in roughly a week of use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skills + persistent memory as first-class concepts. Hermes has a built-in skill loop and FTS5 session search. OpenClaw has a skills system too (ClawHub) but cross-session recall in Hermes felt tighter. Asking "how did we set up the OpenRouter pinning last week" actually returned the snippet.&lt;/li&gt;
&lt;li&gt;A real terminal UI. &lt;code&gt;hermes&lt;/code&gt; drops into a TUI with multiline editing, slash-command autocomplete, conversation history, streaming tool output. OpenClaw's chat surface is fine. Hermes' is just better suited to how I work.&lt;/li&gt;
&lt;li&gt;Config is YAML. Everything in &lt;code&gt;~/.hermes/config.yaml&lt;/code&gt;, secrets in &lt;code&gt;~/.hermes/.env&lt;/code&gt;. You can diff it. You can copy it.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;hermes model&lt;/code&gt; for switching providers. Or &lt;code&gt;hermes config set model openrouter/google/gemini-2.5-flash&lt;/code&gt; directly. No restart dance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Where OpenClaw is still the better pick:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Messaging-first workflows. OpenClaw's channel coverage is broader (WeChat, Matrix, Feishu, LINE, Nostr, the long tail). If your bot lives on WhatsApp, stay there.&lt;/li&gt;
&lt;li&gt;Live Canvas and Voice Wake are nice if you're building a voice assistant rather than a coding agent. Hermes has voice memo transcription, not the same thing.&lt;/li&gt;
&lt;li&gt;If you're on Node-only infra, &lt;code&gt;npm install -g openclaw@latest&lt;/code&gt; is one line. Hermes pulls in &lt;code&gt;uv&lt;/code&gt;, Python 3.11, Node, ripgrep, ffmpeg.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The thing that mattered most to me architecturally: Hermes treats the provider as configuration, not code. The same &lt;code&gt;model.base_url&lt;/code&gt; field handles NVIDIA NIM, Ollama (local or Cloud), OpenRouter, anything OpenAI-compatible. One CLI command flips between them. OpenClaw can do this too. Hermes' YAML-first version is just faster to reason about when something breaks at 11pm.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;macOS daily, Ubuntu workstation, WSL2 on a Windows laptop I travel with. Same one-liner everywhere.&lt;/p&gt;

&lt;h3&gt;
  
  
  macOS
&lt;/h3&gt;

&lt;p&gt;The official installer handles &lt;code&gt;uv&lt;/code&gt;, Python 3.11, Node, ripgrep, ffmpeg:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output is illustrative, lines vary by version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;==&amp;gt; Installing uv
==&amp;gt; Installing Python 3.11
==&amp;gt; Installing Node.js
==&amp;gt; Cloning hermes-agent
==&amp;gt; Symlinking ~/.local/bin/hermes
✓ Hermes installed. Run: source ~/.zshrc &amp;amp;&amp;amp; hermes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
hermes &lt;span class="nt"&gt;--version&lt;/span&gt;
hermes doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;hermes doctor&lt;/code&gt; is the most useful command during setup. Checks PATH, config location, provider reachability. Run it before anything else.&lt;/p&gt;

&lt;p&gt;One macOS thing that cost me twenty minutes: if you have an older Homebrew Python on PATH, the installer prefers its own &lt;code&gt;uv&lt;/code&gt;-managed Python (correct), but &lt;code&gt;python3&lt;/code&gt; in your shell is now a different interpreter than the one Hermes is using. Mostly fine, occasionally surprising when you're debugging. If you're hacking on Hermes itself, prefer the dev path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/NousResearch/hermes-agent.git
&lt;span class="nb"&gt;cd &lt;/span&gt;hermes-agent
./setup-hermes.sh
./hermes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ubuntu / Linux
&lt;/h3&gt;

&lt;p&gt;Same one-liner.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're on a minimal server image without &lt;code&gt;curl&lt;/code&gt;, install it first (&lt;code&gt;sudo apt install -y curl&lt;/code&gt;). Installer pulls into &lt;code&gt;~/.hermes/&lt;/code&gt; and symlinks &lt;code&gt;~/.local/bin/hermes&lt;/code&gt;. Make sure that's on PATH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'export PATH="$HOME/.local/bin:$PATH"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
which hermes
hermes doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;which hermes&lt;/code&gt; prints nothing, something in your shell rc file is resetting PATH after the installer's line. zsh with oh-my-zsh does this. Grep your &lt;code&gt;.zshrc&lt;/code&gt; for &lt;code&gt;export PATH=&lt;/code&gt; lines that come after the installer's edits.&lt;/p&gt;
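
&lt;p&gt;A minimal sketch of that PATH check, assuming bash or zsh. The &lt;code&gt;on_path&lt;/code&gt; helper is my own name, not something Hermes ships, and the rc file list is a guess at where an override usually lives:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Succeeds if the given directory is an exact entry in the current PATH.
# on_path is a hypothetical helper, not part of Hermes.
on_path() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if on_path "$HOME/.local/bin"; then
  echo "ok: ~/.local/bin is on PATH"
else
  echo "missing: ~/.local/bin is not on PATH"
  # Show every PATH assignment in the usual rc files, with line numbers,
  # so a late override is easy to spot. -s silences missing-file errors.
  grep -ns 'PATH=' "$HOME/.zshrc" "$HOME/.bashrc" || true
fi
```

&lt;p&gt;Matching on &lt;code&gt;:dir:&lt;/code&gt; avoids false positives from a longer path that merely contains the same prefix.&lt;/p&gt;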

&lt;h3&gt;
  
  
  Windows + WSL2
&lt;/h3&gt;

&lt;p&gt;The Hermes README has a native Windows PowerShell installer flagged as &lt;strong&gt;early beta&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;irm&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;iex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I tried it. It works, but I went back to WSL2 within a day. Inside WSL2 Ubuntu the Linux one-liner is fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WSL was annoying as hell, mostly for reasons that aren't Hermes' fault.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't put your project on &lt;code&gt;/mnt/c/...&lt;/code&gt;. The 9P translation layer makes file watches and large reads slow enough that tool calls visibly lag. Workspaces on the WSL native filesystem (&lt;code&gt;~/work/...&lt;/code&gt;) only.&lt;/li&gt;
&lt;li&gt;If you install Ollama on Windows and Hermes inside WSL, you have to reach the Windows host from WSL. The host IP is the &lt;code&gt;nameserver&lt;/code&gt; entry in &lt;code&gt;/etc/resolv.conf&lt;/code&gt;. On some configurations it changes between reboots. I gave up. Installed Ollama inside WSL.&lt;/li&gt;
&lt;li&gt;Ctrl+C in some Windows terminals doesn't propagate cleanly to long tool calls. Windows Terminal is better than the legacy console here. Don't use the legacy one.&lt;/li&gt;
&lt;/ul&gt;
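
&lt;p&gt;If you do want to try the Windows-host route anyway, here is a rough sketch of reading that &lt;code&gt;nameserver&lt;/code&gt; entry. &lt;code&gt;wsl_host_ip&lt;/code&gt; is a name I made up, and it assumes the first &lt;code&gt;nameserver&lt;/code&gt; line is the Windows host, which held on my setup but may not on yours:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Print the first nameserver entry from a resolv.conf-style file.
# Assumption: under WSL2 this is the Windows host address.
# wsl_host_ip is a hypothetical helper name.
wsl_host_ip() {
  awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Hypothetical usage: point a client at Ollama running on the Windows side.
# OLLAMA_BASE_URL="http://$(wsl_host_ip):11434/v1"
```

&lt;p&gt;Because the address can change between reboots, resolve it at shell startup rather than pasting it into a config file. Ollama on the Windows side also has to listen on something other than &lt;code&gt;127.0.0.1&lt;/code&gt; for this to work.&lt;/p&gt;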

&lt;p&gt;The Hermes browser-based dashboard chat pane requires WSL2 specifically (it uses a POSIX PTY). Classic CLI and gateway run natively. So if you only need the terminal, the PowerShell install is technically fine. I just didn't trust it enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  First-Run Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wizard. Walks you through provider selection, key entry, writes the config. If you have an existing &lt;code&gt;~/.openclaw&lt;/code&gt; it offers to migrate skills, memories, command allowlists, API keys. From the README:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes claw migrate              &lt;span class="c"&gt;# Interactive migration&lt;/span&gt;
hermes claw migrate &lt;span class="nt"&gt;--dry-run&lt;/span&gt;    &lt;span class="c"&gt;# Preview what would be migrated&lt;/span&gt;
hermes claw migrate &lt;span class="nt"&gt;--preset&lt;/span&gt; user-data   &lt;span class="c"&gt;# Migrate without secrets&lt;/span&gt;
hermes claw migrate &lt;span class="nt"&gt;--overwrite&lt;/span&gt;  &lt;span class="c"&gt;# Overwrite existing conflicts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;--dry-run&lt;/code&gt; first. It prints exactly what would be copied where. Useful, and the kind of thing that suggests someone actually thought about the migration UX. I imported user-data only and re-pasted my keys by hand because the OpenClaw config had three stale keys I'd forgotten about.&lt;/p&gt;

&lt;p&gt;After setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.hermes/
├── config.yaml     # Settings (model, terminal, TTS, compression, etc.)
├── .env            # API keys and secrets
├── auth.json       # OAuth provider credentials
├── SOUL.md         # Primary agent identity
├── memories/       # Persistent memory
├── skills/         # Agent-created and imported skills
├── cron/           # Scheduled jobs
├── sessions/       # Gateway sessions
└── logs/           # Error and gateway logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;HERMES_HOME&lt;/code&gt; overrides the location if you want parallel installations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi6yiegjqmx22y8wcnmar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi6yiegjqmx22y8wcnmar.png" alt="hermes doctor diagnostic output" width="800" height="1385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;hermes config&lt;/code&gt; after setup gives you a one-screen view of where everything resolves from. Useful to verify the model is actually pointed where you think.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2jpjoz669960evscc7ln.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2jpjoz669960evscc7ln.png" alt="hermes config showing resolved settings" width="800" height="798"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  NVIDIA Model Configuration
&lt;/h2&gt;

&lt;p&gt;NVIDIA's hosted endpoints (&lt;a href="https://build.nvidia.com" rel="noopener noreferrer"&gt;build.nvidia.com&lt;/a&gt;, the NIM-style ones) are OpenAI-compatible. Hermes already speaks OpenAI-compatible. So plugging them in is base URL plus key.&lt;/p&gt;

&lt;p&gt;Why I leaned on them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency was good and stayed good. I expected hosted endpoints to be uneven. Over a week they weren't, with one exception (more on that below).&lt;/li&gt;
&lt;li&gt;Llama variants, Qwen-Coder variants, DeepSeek-Coder, Nemotron, all reachable from one provider with one key. No juggling four credentials.&lt;/li&gt;
&lt;li&gt;A 70B-class model running in the cloud is, from my workstation's perspective, free RAM.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The exception: on April 18 the integrate.api.nvidia.com endpoint started throwing 5xx for about twenty minutes around midday Pacific. Hermes retried with backoff but the session was effectively frozen until I noticed and flipped to local Ollama. Not a big deal. Worth knowing the failure mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting a key
&lt;/h3&gt;

&lt;p&gt;Sign in at &lt;a href="https://build.nvidia.com" rel="noopener noreferrer"&gt;build.nvidia.com&lt;/a&gt;, generate an API key. Treat it like any other credential.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wiring it into Hermes
&lt;/h3&gt;

&lt;p&gt;Fastest path is the env-var route. Hermes reads provider keys and base URLs from &lt;code&gt;~/.hermes/.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# ~/.hermes/.env&lt;/span&gt;
&lt;span class="nv"&gt;NVIDIA_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nvapi-...
&lt;span class="nv"&gt;NVIDIA_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://integrate.api.nvidia.com/v1
&lt;span class="nv"&gt;HERMES_INFERENCE_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nvidia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;NVIDIA_BASE_URL&lt;/code&gt; defaults to &lt;code&gt;https://integrate.api.nvidia.com/v1&lt;/code&gt; per the Hermes &lt;a href="https://hermes-agent.nousresearch.com/docs/reference/environment-variables" rel="noopener noreferrer"&gt;environment variables reference&lt;/a&gt;, so you can omit it unless you're hitting a self-hosted NIM. &lt;code&gt;HERMES_INFERENCE_PROVIDER&lt;/code&gt; accepts values like &lt;code&gt;nvidia&lt;/code&gt;, &lt;code&gt;openrouter&lt;/code&gt;, &lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;ollama-cloud&lt;/code&gt; (from the same reference). It's the global "which provider is the default" switch.&lt;/p&gt;

&lt;p&gt;Pick a model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes model
&lt;span class="c"&gt;# or directly&lt;/span&gt;
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model nvidia/meta/llama-3.1-70b-instruct
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model identifier depends on what NVIDIA exposes in the catalog at the time. The catalog moves. Verify the slug before pasting. Common ones I've used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;meta/llama-3.1-70b-instruct&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;qwen/qwen2.5-coder-32b-instruct&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;deepseek-ai/deepseek-coder-...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Various &lt;code&gt;nemotron&lt;/code&gt; variants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the slug doesn't resolve, Hermes tells you on first call rather than at config time. Mildly annoying. Fine once you know.&lt;/p&gt;

&lt;h3&gt;
  
  
  YAML equivalent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ~/.hermes/config.yaml&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;meta/llama-3.1-70b-instruct&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
  &lt;span class="na"&gt;base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;        &lt;span class="c1"&gt;# leave empty to use NVIDIA_BASE_URL from .env&lt;/span&gt;
  &lt;span class="na"&gt;context_length&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;32768&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Hermes config docs are explicit about &lt;code&gt;base_url&lt;/code&gt;: when set, Hermes ignores the provider and calls that endpoint directly. Useful for self-hosted NIMs. Footgun if you forget about a stale URL from an experiment three weeks ago. Empty string is the safe default.&lt;/p&gt;

&lt;h3&gt;
  
  
  Operational notes
&lt;/h3&gt;

&lt;p&gt;Rate limits exist and I haven't found a definitive published cap. In practice agent loops hit limits well before chat sessions do, because every tool call is another request and every tool result goes back into the context.&lt;/p&gt;

&lt;p&gt;Free-tier quotas are real. I'd planned to do bulk repo analysis on hosted models. Switched to local once I realized how fast the quota burns. Reserve the hosted ones for the parts that benefit from a 70B-class model.&lt;/p&gt;

&lt;p&gt;Advertised context windows and the windows that actually behave well are not the same. Past ~32K tokens on some models the recall got noticeably worse. I cap &lt;code&gt;context_length&lt;/code&gt; at 32768 even on models that claim more. (There's a separate question about whether the model is "using" the long context or just paying its memory cost. I haven't dug in.)&lt;/p&gt;

&lt;p&gt;Default timeouts were fine for chat, occasionally too short for long tool-augmented planning. Bump the timeout if you're seeing premature aborts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueb8ut1petu9uzjlcbom.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueb8ut1petu9uzjlcbom.png" alt="build.nvidia.com API key management" width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ollama Configuration
&lt;/h2&gt;

&lt;p&gt;NVIDIA-hosted is great until you're on a plane, on a hotspot, or working on something you don't want leaving the machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;

&lt;p&gt;macOS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama
brew services start ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Linux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Default listen: &lt;code&gt;127.0.0.1:11434&lt;/code&gt;. Need it reachable on a LAN, set &lt;code&gt;OLLAMA_HOST=0.0.0.0:11434&lt;/code&gt; before starting. Heads up: there's no auth on the Ollama API. Don't expose it on a public network. Don't bind it to &lt;code&gt;0.0.0.0&lt;/code&gt; on a coffee-shop wifi.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pulling models
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:14b      &lt;span class="c"&gt;# if you have the VRAM&lt;/span&gt;
ollama pull deepseek-coder-v2:16b   &lt;span class="c"&gt;# MoE, surprisingly fast for its size&lt;/span&gt;
ollama pull llama3.1:8b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output is illustrative, the digest and size vary by model:&lt;br&gt;
&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pulling manifest
pulling abc123... 100% ▕████████████████▏ 4.7 GB
verifying sha256 digest
writing manifest
success
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify it runs before wiring it into Hermes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run qwen2.5-coder:7b &lt;span class="s2"&gt;"write a python function that reverses a string"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If that hangs &amp;gt;30s on first invocation, the model is loading. Subsequent calls are fast. If every call crawls, you're probably on CPU fallback because the GPU couldn't initialize. Check &lt;code&gt;ollama ps&lt;/code&gt; (the processor column shows the CPU/GPU split) and &lt;code&gt;nvidia-smi&lt;/code&gt; (or Activity Monitor on Mac).&lt;/p&gt;

&lt;h3&gt;
  
  
  Wiring local Ollama into Hermes
&lt;/h3&gt;

&lt;p&gt;This is the gotcha that cost me an hour. Hermes' default for &lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; is &lt;code&gt;https://ollama.com/v1&lt;/code&gt;, which is &lt;em&gt;Ollama Cloud&lt;/em&gt;, not your local daemon. Want local? Override it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# ~/.hermes/.env&lt;/span&gt;
&lt;span class="nv"&gt;OLLAMA_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama                       &lt;span class="c"&gt;# any non-empty string; local Ollama ignores it&lt;/span&gt;
&lt;span class="nv"&gt;OLLAMA_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434/v1   &lt;span class="c"&gt;# local daemon, NOT Ollama Cloud&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Doc-verified path uses the YAML &lt;code&gt;provider: custom&lt;/code&gt; form, which bypasses provider-name routing and calls &lt;code&gt;base_url&lt;/code&gt; directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ~/.hermes/config.yaml&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qwen2.5-coder:14b&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;custom&lt;/span&gt;
  &lt;span class="na"&gt;base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/v1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or from the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.provider custom
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.base_url http://localhost:11434/v1
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.default qwen2.5-coder:14b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Ollama Cloud, leave &lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; at default and set &lt;code&gt;HERMES_INFERENCE_PROVIDER=ollama-cloud&lt;/code&gt;. The env-vars reference lists &lt;code&gt;ollama-cloud&lt;/code&gt; explicitly. A bare &lt;code&gt;ollama&lt;/code&gt; provider isn't documented there at the time of writing, so I stuck with &lt;code&gt;provider: custom&lt;/code&gt; for local rather than guess.&lt;/p&gt;

&lt;p&gt;Sanity check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes doctor
hermes config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;hermes doctor&lt;/code&gt; will tell you if the configured base URL is unreachable. Faster signal than waiting for the first chat turn to fail.&lt;/p&gt;
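
&lt;p&gt;When I want to rule Hermes out entirely, I probe the endpoint directly. A sketch, assuming the endpoint serves the OpenAI-style &lt;code&gt;/models&lt;/code&gt; listing under its base URL (verify that against the provider's docs). &lt;code&gt;models_url&lt;/code&gt; and &lt;code&gt;probe_endpoint&lt;/code&gt; are hypothetical helper names:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Build the OpenAI-style model-listing URL from a base URL.
# A single trailing slash on the base URL is tolerated.
models_url() {
  printf '%s/models\n' "${1%/}"
}

# Probe an endpoint. Args: base URL, optional bearer token.
# Exit status is non-zero on connection failure, timeout, or HTTP 4xx/5xx.
probe_endpoint() {
  curl -fsS --max-time 10 \
    -H "Authorization: Bearer ${2:-none}" \
    "$(models_url "$1")"
}

# Example calls (not run here):
# probe_endpoint http://localhost:11434/v1
# probe_endpoint https://integrate.api.nvidia.com/v1 "$NVIDIA_API_KEY"
```

&lt;p&gt;If the probe succeeds and Hermes still fails, the problem is in the Hermes config rather than the network.&lt;/p&gt;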

&lt;h3&gt;
  
  
  Operational notes
&lt;/h3&gt;

&lt;p&gt;VRAM is the constraint. A &lt;code&gt;14b&lt;/code&gt; Q4 quant runs comfortably on a 16 GB GPU. A &lt;code&gt;32b&lt;/code&gt; does not. On my M2 Pro 16 GB Mac, &lt;code&gt;14b&lt;/code&gt; is the practical ceiling and I notice the memory pressure with a browser open.&lt;/p&gt;
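
&lt;p&gt;The back-of-envelope arithmetic behind that, as a sketch: weight memory is roughly parameter count times bits per weight, divided by eight. The 4.8 bits per weight for a Q4-class quant is my assumption, not a published figure, and KV cache plus runtime overhead come on top:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Rough weight footprint in GB: billions of parameters * bits per weight / 8.
# Excludes KV cache and runtime overhead, which come on top.
# weights_gb is a hypothetical helper; the bits-per-weight value is an assumption.
weights_gb() {
  awk -v params_b="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", params_b * bits / 8 }'
}

weights_gb 14 4.8   # 14B at an assumed ~4.8 bits per weight
weights_gb 32 4.8   # 32B at the same assumed quantization
```

&lt;p&gt;This prints 8.4 and 19.2. The 14B leaves room for context on a 16 GB card; the 32B is over budget before a single token of context is allocated.&lt;/p&gt;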

&lt;p&gt;Quantization matters more than I expected going in. &lt;code&gt;q4_K_M&lt;/code&gt; is the sweet spot for coding tasks. &lt;code&gt;q8_0&lt;/code&gt; is noticeably better on nuanced refactors but the memory cost is real and you'll feel it.&lt;/p&gt;

&lt;p&gt;CPU fallback is unusable for interactive work. A &lt;code&gt;7b&lt;/code&gt; on pure CPU can take 30+ seconds per response. Fine for batch, painful for an agent loop.&lt;/p&gt;

&lt;p&gt;Ollama's default context is 2048 tokens on some models. Trips people up constantly. Set &lt;code&gt;num_ctx&lt;/code&gt; via the model's Modelfile or pass it through Hermes; verify with &lt;code&gt;ollama show &amp;lt;model&amp;gt;&lt;/code&gt;. I lost an evening to this before realizing the model wasn't dumb, it was just blind past the first 2K tokens.&lt;/p&gt;
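
&lt;p&gt;A sketch of the Modelfile route. &lt;code&gt;write_modelfile&lt;/code&gt; is a hypothetical helper, and the variant name and the 8192 value are illustrative choices, not recommendations; pick a context size your memory can actually hold:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Write a Modelfile that derives a variant of a base model with a larger context window.
# Args: base model tag, context size in tokens, output path.
# The file contents are also echoed to stdout by tee.
write_modelfile() {
  printf 'FROM %s\nPARAMETER num_ctx %s\n' "$1" "$2" | tee "$3"
}

# Hypothetical usage (variant name and 8192 are illustrative):
# write_modelfile qwen2.5-coder:14b 8192 Modelfile
# ollama create qwen2.5-coder-14b-8k -f Modelfile
# ollama show qwen2.5-coder-14b-8k
```

&lt;p&gt;After that, point &lt;code&gt;model.default&lt;/code&gt; at the new variant name instead of the base tag.&lt;/p&gt;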

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm738oy2xrdehkcuyejqx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm738oy2xrdehkcuyejqx.png" alt="Ollama runtime with model loaded" width="800" height="142"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended Model Setup
&lt;/h2&gt;

&lt;p&gt;Qualitative, week of real use, no benchmarks.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Coding Quality&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;VRAM&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Llama 3.1 70B Instruct&lt;/td&gt;
&lt;td&gt;NVIDIA&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;n/a (hosted)&lt;/td&gt;
&lt;td&gt;Free-tier OK&lt;/td&gt;
&lt;td&gt;Planning, long-context reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen2.5-Coder 32B&lt;/td&gt;
&lt;td&gt;NVIDIA&lt;/td&gt;
&lt;td&gt;Very strong&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;n/a (hosted)&lt;/td&gt;
&lt;td&gt;Free-tier OK&lt;/td&gt;
&lt;td&gt;Multi-file refactors, code review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek-Coder (large)&lt;/td&gt;
&lt;td&gt;NVIDIA&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;n/a (hosted)&lt;/td&gt;
&lt;td&gt;Free-tier OK&lt;/td&gt;
&lt;td&gt;Algorithmic / DSA-style tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Nemotron family&lt;/td&gt;
&lt;td&gt;NVIDIA&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;n/a (hosted)&lt;/td&gt;
&lt;td&gt;Free-tier OK&lt;/td&gt;
&lt;td&gt;Worth A/B-testing on your domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen2.5-Coder 14B (q4_K_M)&lt;/td&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;Solid&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;~10–12 GB&lt;/td&gt;
&lt;td&gt;Local only&lt;/td&gt;
&lt;td&gt;Daily driver, offline work&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen2.5-Coder 7B (q4_K_M)&lt;/td&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;OK&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;td&gt;~5–6 GB&lt;/td&gt;
&lt;td&gt;Local only&lt;/td&gt;
&lt;td&gt;Quick edits, autocomplete-style use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek-Coder-V2 16B (MoE)&lt;/td&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;~10–12 GB&lt;/td&gt;
&lt;td&gt;Local only&lt;/td&gt;
&lt;td&gt;Surprisingly capable for its footprint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 3.1 8B&lt;/td&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;OK&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;td&gt;~5–6 GB&lt;/td&gt;
&lt;td&gt;Local only&lt;/td&gt;
&lt;td&gt;Lightweight planning / chat&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Day-to-day I use Qwen2.5-Coder 32B on NVIDIA for serious work, Qwen2.5-Coder 14B locally for everything else, and Llama 3.1 70B on NVIDIA when I need long-context planning. Tried the rest, rotated them out. A coworker on an M3 Max says the local 32B is usable for him; on my 16 GB Pro it isn't, so don't take the VRAM numbers above as the floor for everyone.&lt;/p&gt;

&lt;p&gt;Switching between them is one line for hosted, three for local because of the &lt;code&gt;base_url&lt;/code&gt; switch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# hosted&lt;/span&gt;
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model nvidia/qwen/qwen2.5-coder-32b-instruct

&lt;span class="c"&gt;# local&lt;/span&gt;
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.provider custom
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.base_url http://localhost:11434/v1
hermes config &lt;span class="nb"&gt;set &lt;/span&gt;model.default qwen2.5-coder:14b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;hermes model&lt;/code&gt; (the interactive picker) does the same thing in fewer keystrokes once you've used it twice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real Workflow Improvements
&lt;/h2&gt;

&lt;p&gt;Concrete things that got better:&lt;/p&gt;

&lt;p&gt;Repository analysis. Pointing at a 200-file Python repo and asking "where does the auth flow start" used to be a coin flip. With Hermes routing the analysis pass to a 32B-class hosted model and edits to a local 14B, I get useful answers in under a minute, with file paths I can actually open.&lt;/p&gt;

&lt;p&gt;Multi-file refactors. Renaming a domain concept across a service used to require me to micromanage every file. Hermes' tool-call sequencing handles "edit A, re-read A, then edit B based on A's new state" without me nudging it at each step. Not magic: it still gets confused on circular imports, but the baseline is better.&lt;/p&gt;

&lt;p&gt;Long-context exploration works. Pasting a stack trace plus three relevant files into context and asking for a hypothesis is reliable on the 70B hosted model. Local 14B handles shorter cases.&lt;/p&gt;

&lt;p&gt;Cross-session recall is the feature I miss most when I temporarily switch back to anything else. "How did I configure the NVIDIA timeout last week" returns the actual config snippet, not a guess. Different in kind.&lt;/p&gt;

&lt;p&gt;Skills. I haven't gone deep here yet. The bundled &lt;code&gt;openclaw-migration&lt;/code&gt; skill walked me through the import with dry-run previews and that alone saved a chunk of time. The autonomous skill creation after complex tasks is the part I want to evaluate over a longer horizon. Ask me in a month.&lt;/p&gt;

&lt;p&gt;What didn't change:&lt;/p&gt;

&lt;p&gt;Tool calls run my tests fine. Interpreting flaky test output is still on me.&lt;/p&gt;

&lt;p&gt;Frontend work. All current models are mediocre at non-trivial CSS. Hermes doesn't fix that.&lt;/p&gt;

&lt;p&gt;Truly novel architectural decisions. The agent produces something plausible, which is worse than producing nothing if you're not careful.&lt;/p&gt;

&lt;h2&gt;
  
  
  Failure Modes and Rough Edges
&lt;/h2&gt;

&lt;p&gt;The section that made me want to write the post.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; defaults to Ollama Cloud, not local. Most common silent failure I've seen. Override to &lt;code&gt;http://localhost:11434/v1&lt;/code&gt; for local.&lt;/li&gt;
&lt;li&gt;API key not picked up. Hermes reads &lt;code&gt;~/.hermes/.env&lt;/code&gt; at startup. If you edit it while Hermes is running, restart the session or run &lt;code&gt;hermes config check&lt;/code&gt;. (I keep meaning to file an issue about a &lt;code&gt;hermes reload&lt;/code&gt; command. Haven't.)&lt;/li&gt;
&lt;li&gt;OpenRouter routing inconsistencies. The underlying provider OpenRouter selects can change between requests. Pin a provider preference if reproducibility matters.&lt;/li&gt;
&lt;li&gt;Ollama context default of 2048 on some models. Your model isn't dumb, it just can't see most of the prompt. Set &lt;code&gt;num_ctx&lt;/code&gt;, verify with &lt;code&gt;ollama show &amp;lt;model&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;WSL filesystem. File watch events on &lt;code&gt;/mnt/c/...&lt;/code&gt; are unreliable. Keep workspaces on the WSL-native filesystem.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;model.base_url&lt;/code&gt; overrides &lt;code&gt;model.provider&lt;/code&gt; silently per the docs. A stale base_url from an earlier experiment will quietly route everything to the wrong endpoint. I did this to myself twice.&lt;/li&gt;
&lt;li&gt;Free-tier throttling. NVIDIA's free tier will throttle. Hermes retries on 429s. You'll see a session pause for 5–30s with no obvious indicator unless you're tailing &lt;code&gt;~/.hermes/logs/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Reasoning-heavy variants. Some Nemotron-family reasoning models produce great output 90% of the time and absolute nonsense the other 10%. Worth keeping in your config, but don't make them the default.&lt;/li&gt;
&lt;li&gt;Token cost surprises. Long agent loops consume an order of magnitude more tokens than chat sessions because every tool call result goes back in. Watch the dashboard the first few days.&lt;/li&gt;
&lt;li&gt;Migration imports more than you might want. Default preset brings API keys over. Use &lt;code&gt;--preset user-data&lt;/code&gt; to skip.&lt;/li&gt;
&lt;/ul&gt;
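&lt;p&gt;To put a rough number on the token-cost item above, here is a back-of-the-envelope sketch. It assumes every call resends the base context plus all earlier tool results, and the figures (2,000 tokens of base context, 1,500 tokens per tool result, 20 tool calls) are invented for illustration, not measured from Hermes.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Rough input-token estimate for an agent loop in which each call resends
# the base context plus every earlier tool result.
# All numbers passed in are illustrative assumptions, not measurements.
loop_input_tokens() {
  local base="$1" per_result="$2" calls="$3"
  local total=0 i=0
  while [ "$i" -lt "$calls" ]; do
    # Call number i carries the base context plus i earlier tool results.
    total=$(( total + base + i * per_result ))
    i=$(( i + 1 ))
  done
  echo "$total"
}

loop_input_tokens 2000 1500 20
```

&lt;p&gt;With those assumed numbers the loop sends 325,000 input tokens in total, and the last call alone carries 30,500, against 2,000 for a single chat turn on the same context. The total grows with the square of the number of tool calls, which is why long loops are the expensive ones.&lt;/p&gt;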

&lt;p&gt;The defaults aren't great. Not wrong, exactly. Just the combination of "Ollama base URL pointing at Cloud, plus 2048 context, plus free-tier quota" produces a setup that works for twenty minutes and then mysteriously degrades, and you spend an evening figuring out which knob.&lt;/p&gt;
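&lt;p&gt;A quick way to rule out the first of those knobs is to look at what &lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; is actually set to. The sketch below only reads &lt;code&gt;~/.hermes/.env&lt;/code&gt;; the function name is made up, and it assumes the last assignment in the file is the one that wins, which may not match how Hermes parses the file.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Report where OLLAMA_BASE_URL points in an env file.
# Prints "local", "remote", or "unset".
# Assumption: if the key appears more than once, the last assignment wins.
ollama_target() {
  local env_file="$1" url
  url="$(grep -E '^OLLAMA_BASE_URL=' "$env_file" | tail -n 1 | cut -d= -f2-)"
  # Strip one pair of surrounding double quotes, if present.
  url="${url#\"}"
  url="${url%\"}"
  case "$url" in
    "") echo "unset" ;;
    http://localhost:*|http://127.0.0.1:*) echo "local" ;;
    *) echo "remote" ;;
  esac
}

env_file="$HOME/.hermes/.env"
if [ -f "$env_file" ]; then
  echo "OLLAMA_BASE_URL is $(ollama_target "$env_file")"
else
  echo "no env file at $env_file"
fi
```

&lt;p&gt;Anything other than "local" means requests may be leaving the machine. An "unset" result is not safe either, given that the default points at Ollama Cloud.&lt;/p&gt;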

&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Quick reference for things I've actually hit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;hermes: command not found&lt;/code&gt;. &lt;code&gt;~/.local/bin&lt;/code&gt; not on PATH. Add it.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PermissionError&lt;/code&gt; on config write. Set &lt;code&gt;HERMES_HOME&lt;/code&gt; to a writable path.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;401 Unauthorized&lt;/code&gt; from NVIDIA. Key not in &lt;code&gt;~/.hermes/.env&lt;/code&gt;, or rotated. Check with &lt;code&gt;grep NVIDIA ~/.hermes/.env&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;connection refused&lt;/code&gt; to Ollama. Daemon not running. &lt;code&gt;ollama serve&lt;/code&gt;, or &lt;code&gt;brew services start ollama&lt;/code&gt;, or &lt;code&gt;systemctl start ollama&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Hermes calls &lt;code&gt;https://ollama.com/v1&lt;/code&gt; instead of localhost. &lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; not overridden.&lt;/li&gt;
&lt;li&gt;Ollama model "doesn't follow instructions". Almost always the 2048-context default.&lt;/li&gt;
&lt;li&gt;Tool calls hang forever. Provider timeout too short, or the model is in a tool-call loop. Inspect &lt;code&gt;~/.hermes/logs/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Hermes "loses" the workspace. You're on WSL with the project on &lt;code&gt;/mnt/c/...&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Different answers from the same prompt. Provider-side cache, or a routing layer selecting a different upstream. Pin the provider, disable cache while debugging.&lt;/li&gt;
&lt;li&gt;Migration wizard doesn't see OpenClaw. Wizard looks at &lt;code&gt;~/.openclaw&lt;/code&gt;. Symlink if elsewhere.&lt;/li&gt;
&lt;li&gt;Sudden latency spike. Check the provider's status page. NVIDIA's hosted endpoints have been stable, mostly, but they're not magic and April 18 happened.&lt;/li&gt;
&lt;/ul&gt;
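&lt;p&gt;For the 2048-context item, one way to make a larger &lt;code&gt;num_ctx&lt;/code&gt; stick is to bake it into a derived model with a Modelfile. This is a minimal sketch: &lt;code&gt;FROM&lt;/code&gt; and &lt;code&gt;PARAMETER num_ctx&lt;/code&gt; are standard Modelfile directives, but the helper name and the 16384 value are illustrative choices, not recommendations.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Print a Modelfile that derives a larger-context variant of a local model.
# The base model and context size are whatever the caller passes in;
# 16384 below is only an example value.
modelfile_with_ctx() {
  local base="$1" num_ctx="$2"
  printf 'FROM %s\nPARAMETER num_ctx %s\n' "$base" "$num_ctx"
}

modelfile_with_ctx qwen2.5-coder:14b 16384
```

&lt;p&gt;Save the output as &lt;code&gt;Modelfile&lt;/code&gt;, build it with &lt;code&gt;ollama create qwen2.5-coder-14b-16k -f Modelfile&lt;/code&gt; (the new model name is arbitrary), point &lt;code&gt;model.default&lt;/code&gt; at that name, and confirm the setting with &lt;code&gt;ollama show&lt;/code&gt;. A larger context window costs more memory, so the VRAM figures in the table will not hold.&lt;/p&gt;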

&lt;p&gt;When in doubt, &lt;code&gt;hermes doctor&lt;/code&gt; first. Catches more first-line problems than you'd expect.&lt;/p&gt;
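&lt;p&gt;Two of the items above can be checked even when &lt;code&gt;hermes&lt;/code&gt; itself won't start. A minimal sketch, with made-up function names, that tests whether &lt;code&gt;~/.local/bin&lt;/code&gt; is on PATH and whether the NVIDIA key is present, without printing the key:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Two checks that do not depend on Hermes being runnable.

# Succeeds if the given directory is an entry in PATH.
path_has() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the env file defines a non-empty value for the given key.
# grep -q prints nothing, so the secret never reaches the terminal.
env_has_key() {
  grep -Eq "^$1=.+" "$2"
}

if path_has "$HOME/.local/bin"; then
  echo "PATH: ok"
else
  echo "PATH: add $HOME/.local/bin"
fi

if [ -f "$HOME/.hermes/.env" ]; then
  if env_has_key NVIDIA_API_KEY "$HOME/.hermes/.env"; then
    echo "NVIDIA_API_KEY: set"
  else
    echo "NVIDIA_API_KEY: missing"
  fi
fi
```

&lt;p&gt;This only shows that a key is present. A rotated or revoked key still needs a real request to detect.&lt;/p&gt;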

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Hermes is best for engineers who already have an opinion about how their tooling should work and want an agent that exposes its config rather than hiding it. If you want defaults that just work with no thought, OpenClaw and the more polished alternatives are friendlier on day one. And OpenClaw is genuinely the better tool if your primary surface is messaging channels rather than a terminal.&lt;/p&gt;

&lt;p&gt;Where Hermes still needs work, in my opinion:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;OLLAMA_BASE_URL&lt;/code&gt; defaulting to Cloud is a usability footgun. A clearer default or a louder warning on first call would help.&lt;/li&gt;
&lt;li&gt;Configuration documentation lags the schema in places. I read source more than once.&lt;/li&gt;
&lt;li&gt;Error messages on misconfig are sometimes cryptic. I'd take slower startup for better diagnostics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hybrid setups make sense right now because hosted inference is fast and capable but unreliable in ways out of your control (rate limits, quotas, occasional regressions on newly-deployed models). Local inference is reliable but capacity-constrained. Running both, routing deliberately, gives you a setup that degrades gracefully. Not a revolution. Just how the production-engineering side of any "use a service" problem has always worked. The fact that we're now doing it for inference is the new part.&lt;/p&gt;

&lt;p&gt;I'll keep using this. If the NVIDIA endpoints change, or the Hermes config schema churns again, I'll update the post. Probably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendix A — Suggested image directory layout
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;images/
├── hermes-doctor.png               # 'hermes doctor' diagnostic output
├── hermes-config.png               # 'hermes config' resolved settings
├── nvidia-dashboard.png            # build.nvidia.com API key management
├── nvidia-key-creation.png         # NVIDIA API key creation (optional)
├── ollama-running.png              # ollama ps / loaded model
├── ollama-pull.png                 # 'ollama pull' progress bar (optional)
├── workflow-multifile-refactor.gif # multi-file refactor session (optional)
├── workflow-repo-analysis.png      # repo analysis output (optional)
└── failure-mode-429.png            # 429 retry-with-backoff (optional)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Appendix B — &lt;code&gt;assets/commands.sh&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Snippets I keep around as quick-reference. Adjust paths and model identifiers for your setup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="c"&gt;# --- Hermes install (Linux/macOS/WSL2) ---&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;span class="c"&gt;# Put ~/.local/bin on PATH so 'hermes' resolves for the rest of this script&lt;/span&gt;
&lt;span class="c"&gt;# (sourcing an interactive rc file from bash under 'set -u' is fragile)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.local/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# --- First-run setup (also offers OpenClaw migration if ~/.openclaw exists) ---&lt;/span&gt;
hermes setup

&lt;span class="c"&gt;# --- Optional: explicit OpenClaw migration ---&lt;/span&gt;
hermes claw migrate &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;span class="c"&gt;# hermes claw migrate --preset user-data&lt;/span&gt;
&lt;span class="c"&gt;# hermes claw migrate --overwrite&lt;/span&gt;

&lt;span class="c"&gt;# --- Provider env: append to ~/.hermes/.env ---&lt;/span&gt;
&lt;span class="nv"&gt;ENV_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/.env"&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes"&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"NVIDIA_API_KEY=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NVIDIA_API_KEY&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;-me&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1"&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"OLLAMA_API_KEY=ollama"&lt;/span&gt;                       &lt;span class="c"&gt;# any non-empty value&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"OLLAMA_BASE_URL=http://localhost:11434/v1"&lt;/span&gt;   &lt;span class="c"&gt;# local Ollama, NOT Ollama Cloud&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"HERMES_INFERENCE_PROVIDER=nvidia"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ENV_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# --- Ollama install + local models ---&lt;/span&gt;
&lt;span class="c"&gt;# macOS: brew install ollama &amp;amp;&amp;amp; brew services start ollama&lt;/span&gt;
&lt;span class="c"&gt;# Linux: curl -fsSL https://ollama.com/install.sh | sh &amp;amp;&amp;amp; sudo systemctl enable --now ollama&lt;/span&gt;

ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:14b
ollama pull deepseek-coder-v2:16b
ollama pull llama3.1:8b

&lt;span class="c"&gt;# --- Sanity checks ---&lt;/span&gt;
hermes &lt;span class="nt"&gt;--version&lt;/span&gt;
hermes doctor &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
&lt;/span&gt;hermes config
ollama list
ollama ps

&lt;span class="c"&gt;# --- Smoke tests ---&lt;/span&gt;
ollama run qwen2.5-coder:7b &lt;span class="s2"&gt;"print('hello from local')"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
&lt;/span&gt;curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NVIDIA_BASE_URL&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;https&lt;/span&gt;://integrate.api.nvidia.com/v1&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/models"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NVIDIA_API_KEY&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;-me&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; 400 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# --- Switch models from CLI ---&lt;/span&gt;
&lt;span class="c"&gt;# hermes config set model nvidia/meta/llama-3.1-70b-instruct&lt;/span&gt;
&lt;span class="c"&gt;# hermes config set model nvidia/qwen/qwen2.5-coder-32b-instruct&lt;/span&gt;
&lt;span class="c"&gt;#&lt;/span&gt;
&lt;span class="c"&gt;# Local Ollama (provider=custom + base_url; bare 'ollama/...' isn't documented):&lt;/span&gt;
&lt;span class="c"&gt;# hermes config set model.provider custom&lt;/span&gt;
&lt;span class="c"&gt;# hermes config set model.base_url http://localhost:11434/v1&lt;/span&gt;
&lt;span class="c"&gt;# hermes config set model.default qwen2.5-coder:14b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the setup.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 6: Dotfiles and Wrap-up</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 11:55:31 +0000</pubDate>
      <link>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-6-dotfiles-and-wrap-up-54c0</link>
      <guid>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-6-dotfiles-and-wrap-up-54c0</guid>
      <description>&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Part 3&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Part 4&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h"&gt;Part 5&lt;/a&gt; | Part 6 of 6&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;A setup you can't reproduce is a setup you'll eventually lose. Hard drives fail, Macs get replaced, and without a dotfiles repo everything built in this series disappears with them. This final part covers packaging everything into a maintainable dotfiles repo and wrapping up the series.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Dotfiles Repo
&lt;/h2&gt;

&lt;p&gt;A dotfiles repo is a version-controlled collection of your configuration files. The goal is simple: clone the repo on a new machine, follow the checklist, and have a fully configured environment in under an hour.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; ~/dotfiles
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/dotfiles
git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy all configuration files in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Code&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; .claude/hooks
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.claude/statusline.sh .claude/statusline.sh
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.claude/hooks/notify-stop.sh .claude/hooks/notify-stop.sh
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.claude/hooks/notify-permission.sh .claude/hooks/notify-permission.sh
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.claude/settings.json .claude/settings.json

&lt;span class="c"&gt;# Terminal&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.tmux.conf .tmux.conf
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.zshrc .zshrc

&lt;span class="c"&gt;# Starship&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; .config
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/.config/starship.toml .config/starship.toml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Commit and push:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Initial dotfiles — Claude Code, tmux, starship, zsh"&lt;/span&gt;
gh repo create dotfiles &lt;span class="nt"&gt;--private&lt;/span&gt; &lt;span class="nt"&gt;--source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--push&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  File Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dotfiles/
├── README.md
├── .tmux.conf
├── .zshrc
├── .config/
│   └── starship.toml
└── .claude/
    ├── settings.json
    ├── statusline.sh
    └── hooks/
        ├── notify-stop.sh
        └── notify-permission.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
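&lt;p&gt;On a fresh machine the reverse step is getting these files back into place. The sketch below symlinks them from a checkout into a target directory. The file list mirrors the tree above, while the function name and the choice of symlinks over copies are illustrative, not part of the original setup.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Symlink tracked files from a dotfiles checkout into a target directory.
# Note: ln -sfn replaces whatever already exists at the destination path,
# so back up existing config files first.
link_dotfiles() {
  local repo="$1" target="$2" rel
  for rel in .tmux.conf .zshrc .config/starship.toml \
             .claude/settings.json .claude/statusline.sh \
             .claude/hooks/notify-stop.sh .claude/hooks/notify-permission.sh; do
    if [ -e "$repo/$rel" ]; then
      mkdir -p "$(dirname "$target/$rel")"
      ln -sfn "$repo/$rel" "$target/$rel"
      echo "linked $rel"
    else
      echo "skipped $rel (not in repo)"
    fi
  done
}

# Example: link_dotfiles "$HOME/dotfiles" "$HOME"
```

&lt;p&gt;Symlinks keep the repo as the single source of truth, so an edit to &lt;code&gt;~/.zshrc&lt;/code&gt; shows up in &lt;code&gt;git status&lt;/code&gt; straight away. Copying, as in the commands earlier, works too but has to be repeated after every change.&lt;/p&gt;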






&lt;h2&gt;
  
  
  Fresh Machine Checklist
&lt;/h2&gt;

&lt;p&gt;The README in the repo contains the full step-by-step setup guide. The checklist at the end covers every action in sequence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Install Homebrew&lt;/li&gt;
&lt;li&gt;[ ] Install core dependencies: &lt;code&gt;brew install git jq node fzf&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Install iTerm2 and set JetBrains Mono Nerd Font&lt;/li&gt;
&lt;li&gt;[ ] Set terminal type to &lt;code&gt;xterm-256color&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Import tokyo-night color theme&lt;/li&gt;
&lt;li&gt;[ ] Install &lt;code&gt;zsh-autosuggestions&lt;/code&gt; and &lt;code&gt;zsh-syntax-highlighting&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Run the fzf key bindings setup (fzf itself is installed with the core dependencies)&lt;/li&gt;
&lt;li&gt;[ ] Install starship and apply tokyo-night preset&lt;/li&gt;
&lt;li&gt;[ ] Install tmux and clone tpm&lt;/li&gt;
&lt;li&gt;[ ] Copy &lt;code&gt;.tmux.conf&lt;/code&gt; and install plugins with &lt;code&gt;Ctrl+B I&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Install Claude Code&lt;/li&gt;
&lt;li&gt;[ ] Copy &lt;code&gt;.claude/settings.json&lt;/code&gt;, &lt;code&gt;statusline.sh&lt;/code&gt;, and hook scripts&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;chmod +x&lt;/code&gt; the statusline and hook scripts&lt;/li&gt;
&lt;li&gt;[ ] Add plugin marketplaces in Claude Code&lt;/li&gt;
&lt;li&gt;[ ] Install all 9 plugins and reload&lt;/li&gt;
&lt;li&gt;[ ] Verify plugin counts: 9 plugins · 35 skills · 18 agents · 10 hooks · 2 plugin MCP servers · 1 plugin LSP server&lt;/li&gt;
&lt;li&gt;[ ] Verify claude-mem worker at &lt;code&gt;http://localhost:37777&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Test sound notifications with &lt;code&gt;afplay&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Test statusline with mock JSON input&lt;/li&gt;
&lt;li&gt;[ ] Create tmux sessions and save layout with &lt;code&gt;Ctrl+B Ctrl+S&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;If I were starting this setup from scratch with the knowledge I have now, I'd install &lt;code&gt;claude-mem&lt;/code&gt; and &lt;code&gt;pyright-lsp&lt;/code&gt; on day one. They have the highest ongoing return of anything in the stack — persistent memory and real-time type checking compound in value over time in a way that one-time tools don't.&lt;/p&gt;

&lt;p&gt;I'd also build the statusline earlier. Flying blind on token usage and rate limits for the first weeks of Pro plan use cost me more than the hour it took to write the script.&lt;/p&gt;

&lt;p&gt;The one thing I'd skip entirely is trying to get visual notification banners working. &lt;code&gt;osascript&lt;/code&gt; and &lt;code&gt;terminal-notifier&lt;/code&gt; are both unreliable on macOS Sequoia. &lt;code&gt;afplay&lt;/code&gt; is the right answer and I should have gone there first.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Terminal&lt;/td&gt;
&lt;td&gt;iTerm2 + tokyo-night&lt;/td&gt;
&lt;td&gt;True color, Nerd Font support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multiplexer&lt;/td&gt;
&lt;td&gt;tmux + resurrect&lt;/td&gt;
&lt;td&gt;Persistent sessions, 3-pane layout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt&lt;/td&gt;
&lt;td&gt;Starship tokyo-night&lt;/td&gt;
&lt;td&gt;Git, Python, time at a glance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shell&lt;/td&gt;
&lt;td&gt;zsh + autosuggestions + fzf&lt;/td&gt;
&lt;td&gt;Faster command entry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI IDE&lt;/td&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;Primary development tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Statusline&lt;/td&gt;
&lt;td&gt;Custom bash script&lt;/td&gt;
&lt;td&gt;Real-time token/cost/rate limit visibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hooks&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;afplay&lt;/code&gt; sound notifications&lt;/td&gt;
&lt;td&gt;Async task completion awareness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plugins&lt;/td&gt;
&lt;td&gt;9 curated plugins&lt;/td&gt;
&lt;td&gt;Code review, memory, type checking, workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Config&lt;/td&gt;
&lt;td&gt;Private dotfiles repo&lt;/td&gt;
&lt;td&gt;Reproducible setup in under an hour&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Series
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1 — The Problem and Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2 — Custom Statusline&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Part 3 — Sound Notification Hooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Part 4 — Plugin Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h"&gt;Part 5 — Terminal Environment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Part 6 — Dotfiles and Wrap-up &lt;em&gt;(this article)&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All scripts and configuration files are at &lt;a href="https://github.com/ai-with-avinash/claude-code-best-setup" rel="noopener noreferrer"&gt;https://github.com/ai-with-avinash/claude-code-best-setup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you build on this setup or find improvements, I'd genuinely like to know — leave a comment or open a PR on the dotfiles repo.&lt;/p&gt;

</description>
      <category>claude</category>
      <category>cli</category>
      <category>tooling</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 5: Terminal Environment</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 11:48:59 +0000</pubDate>
      <link>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h</link>
      <guid>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h</guid>
      <description>&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Part 3&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Part 4&lt;/a&gt; | Part 5 of 6&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The terminal environment around Claude Code matters as much as Claude Code itself. You spend hours in this environment — the less friction it has, the more thinking you can direct at the actual work.&lt;/p&gt;

&lt;p&gt;This part covers everything outside Claude Code: iTerm2, tmux, starship, fzf, and zsh plugins.&lt;/p&gt;




&lt;h2&gt;
  
  
  iTerm2
&lt;/h2&gt;

&lt;p&gt;Replace macOS Terminal with iTerm2. The reasons that matter for this setup specifically are true 24-bit color rendering (ANSI color gradients in the statusline render correctly), better escape code support (the blinking compaction warning needs this), and mouse support in tmux.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; iterm2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three settings to configure immediately after installing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Font&lt;/strong&gt; — install JetBrains Mono Nerd Font for starship icons and powerline segments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; font-jetbrains-mono-nerd-font
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in iTerm2: &lt;strong&gt;Settings → Profiles → Text → Font&lt;/strong&gt; → set to &lt;code&gt;JetBrainsMono Nerd Font&lt;/code&gt;, size 13.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terminal type&lt;/strong&gt; — in iTerm2: &lt;strong&gt;Settings → Profiles → Terminal → Report Terminal Type&lt;/strong&gt; → set to &lt;code&gt;xterm-256color&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Color theme&lt;/strong&gt; — download and import the tokyo-night theme:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-L&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; ~/Downloads/tokyo-night.itermcolors &lt;span class="s2"&gt;"https://raw.githubusercontent.com/folke/tokyonight.nvim/main/extras/iterm/tokyonight_night.itermcolors"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then &lt;strong&gt;Settings → Profiles → Colors → Color Presets → Import&lt;/strong&gt; → select the file → apply.&lt;/p&gt;




&lt;h2&gt;
  
  
  tmux
&lt;/h2&gt;

&lt;p&gt;tmux is a terminal multiplexer. For Claude Code development the value is twofold: persistent sessions that survive terminal restarts, and multiple panes visible simultaneously without switching tabs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;tmux
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My standard 3-pane layout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────┬──────────────┐
│                     │  Logs/Watch  │
│    Claude Code      ├──────────────┤
│                     │     Git      │
└─────────────────────┴──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code runs in the large left pane. Test output or log watching runs top right. Git and manual commands run bottom right. Everything visible at once — no tab switching mid-flow.&lt;/p&gt;
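
&lt;p&gt;One way to build this layout with a single keystroke is a binding in &lt;code&gt;~/.tmux.conf&lt;/code&gt;. This is a minimal sketch, not taken from the repository config: the key &lt;code&gt;T&lt;/code&gt; and the 35% width are arbitrary choices, and the &lt;code&gt;-l 35%&lt;/code&gt; form assumes tmux 3.1 or newer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical binding: prefix + T splits the current window into the 3-pane layout
bind-key T split-window -h -l 35% \; split-window -v \; select-pane -L
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first split creates the right-hand column, the second splits that column into top and bottom, and &lt;code&gt;select-pane -L&lt;/code&gt; returns focus to the large left pane.&lt;/p&gt;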

&lt;p&gt;&lt;strong&gt;Essential config&lt;/strong&gt; — add to &lt;code&gt;~/.tmux.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; mouse on              &lt;span class="c"&gt;# trackpad scrolling&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; history-limit 50000   &lt;span class="c"&gt;# large scrollback buffer&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-sg&lt;/span&gt; escape-time 0        &lt;span class="c"&gt;# no delay for escape key — important for Claude Code&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Session persistence&lt;/strong&gt;: install tmux-resurrect and tmux-continuum through TPM, the tmux plugin manager. Clone TPM first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/tmux-plugins/tpm ~/.tmux/plugins/tpm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add to &lt;code&gt;~/.tmux.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @plugin &lt;span class="s1"&gt;'tmux-plugins/tpm'&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @plugin &lt;span class="s1"&gt;'tmux-plugins/tmux-resurrect'&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @plugin &lt;span class="s1"&gt;'tmux-plugins/tmux-continuum'&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @continuum-restore &lt;span class="s1"&gt;'on'&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @resurrect-capture-pane-contents &lt;span class="s1"&gt;'on'&lt;/span&gt;
run &lt;span class="s1"&gt;'~/.tmux/plugins/tpm/tpm'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



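&lt;p&gt;Press &lt;code&gt;Ctrl+B then I&lt;/code&gt; (capital I) inside tmux to install the plugins. Save the layout with &lt;code&gt;Ctrl+B then Ctrl+S&lt;/code&gt;. After this, closing iTerm2 or restarting the machine no longer costs you the layout: pane positions and working directories come back, along with running programs on tmux-resurrect's restore list (by default a short list that includes &lt;code&gt;vim&lt;/code&gt;, &lt;code&gt;less&lt;/code&gt;, and &lt;code&gt;tail&lt;/code&gt;; others need the &lt;code&gt;@resurrect-processes&lt;/code&gt; option).&lt;/p&gt;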




&lt;h2&gt;
  
  
  Starship Prompt
&lt;/h2&gt;

&lt;p&gt;The default macOS prompt tells you almost nothing. Starship shows git branch, git status, Python version, and time directly in your prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;ocr-eval-framework on  main [x!?] via 🐍 v3.12.9  18:43
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;[x!?]&lt;/code&gt; git status indicators, assuming Starship's default symbols: &lt;code&gt;x&lt;/code&gt; (rendered as &lt;code&gt;✘&lt;/code&gt;) = deleted files, &lt;code&gt;!&lt;/code&gt; = unstaged modifications, &lt;code&gt;?&lt;/code&gt; = untracked files. Staged changes show up as &lt;code&gt;+&lt;/code&gt;. At a glance you know your repo state without running &lt;code&gt;git status&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;starship
starship preset tokyo-night &lt;span class="nt"&gt;-o&lt;/span&gt; ~/.config/starship.toml
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'eval "$(starship init zsh)"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.zshrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  fzf
&lt;/h2&gt;

&lt;p&gt;fzf replaces linear &lt;code&gt;Ctrl+R&lt;/code&gt; command history search with an interactive fuzzy finder. Type any fragment of a previous command and it filters in real time across your entire history. For long evaluation commands with specific flags you ran three sessions ago, this is invaluable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;fzf
&lt;span class="si"&gt;$(&lt;/span&gt;brew &lt;span class="nt"&gt;--prefix&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/opt/fzf/install  &lt;span class="c"&gt;# say y to all three prompts&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three shortcuts now available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Ctrl+R&lt;/code&gt; — fuzzy search command history&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Ctrl+T&lt;/code&gt; — fuzzy search files, paste path to prompt&lt;/li&gt;
&lt;li&gt;
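&lt;code&gt;Alt+C&lt;/code&gt;: fuzzy search directories and &lt;code&gt;cd&lt;/code&gt; into the selected one (in iTerm2 this needs the Option key set to &lt;code&gt;Esc+&lt;/code&gt; under Settings → Profiles → Keys)&lt;/li&gt;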
&lt;/ul&gt;




&lt;h2&gt;
  
  
  zsh Plugins
&lt;/h2&gt;

&lt;p&gt;Two additions that change daily terminal use immediately:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;zsh-autosuggestions&lt;/strong&gt; suggests previous commands from your history in grey as you type. Press the right arrow to accept a suggestion. After a day of use your muscle memory adapts and you stop retyping long paths from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;zsh-syntax-highlighting&lt;/strong&gt; colors commands green (valid) or red (invalid) as you type. Catches typos before you press Enter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;zsh-autosuggestions zsh-syntax-highlighting
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'source /opt/homebrew/share/zsh-autosuggestions/zsh-autosuggestions.zsh'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.zshrc
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'source /opt/homebrew/share/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.zshrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Full Visual Stack
&lt;/h2&gt;

&lt;p&gt;With everything configured, your terminal looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dark navy tokyo-night background in iTerm2&lt;/li&gt;
&lt;li&gt;JetBrains Mono Nerd Font rendering icons cleanly&lt;/li&gt;
&lt;li&gt;tmux status bar at the bottom showing session name and time&lt;/li&gt;
&lt;li&gt;Starship prompt showing directory, git branch, git status, Python version&lt;/li&gt;
&lt;li&gt;Claude Code statusline above the prompt showing tokens, cost, rate limits&lt;/li&gt;
&lt;li&gt;Grey autosuggestions completing commands as you type&lt;/li&gt;
&lt;li&gt;Green/red syntax highlighting on every command&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every layer of information has a deliberate place. Nothing is decorative.&lt;/p&gt;




&lt;p&gt;All configuration files are at &lt;a href="https://github.com/ai-with-avinash/claude-code-best-setup" rel="noopener noreferrer"&gt;https://github.com/ai-with-avinash/claude-code-best-setup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;← &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Back to Part 4&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-6-dotfiles-and-wrap-up-54c0"&gt;Continue to Part 6 → Dotfiles and Wrap-up&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>cli</category>
      <category>productivity</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 4: Plugin Stack</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 11:46:21 +0000</pubDate>
      <link>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044</link>
      <guid>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044</guid>
      <description>&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Part 3&lt;/a&gt; | Part 4 of 6&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Claude Code has a growing plugin ecosystem. The temptation is to install everything — more agents, more skills, more coverage. This is the wrong approach, especially on a Pro plan.&lt;/p&gt;

&lt;p&gt;Every plugin injects instructions into your session context at startup. A plugin with 38 agents and 156 skills adds a meaningful token overhead to every single session, whether you use those skills or not. On Pro, where every input token counts against your rate limits, a bloated plugin stack is a recurring tax on every conversation.&lt;/p&gt;

&lt;p&gt;I installed, evaluated, and removed several plugins before settling on 9 that earn their place. Here's the final stack and the reasoning behind each.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 9 Plugins
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;caveman&lt;/code&gt;&lt;/strong&gt; — strips filler from Claude's responses. No pleasantries, no hedging, just signal. The benchmarks show ~65% output token reduction on coding tasks. In practice the savings are real and the response style actually improves for tight coding loops — you get code and decisions, not explanations you didn't ask for. Activate with &lt;code&gt;/caveman&lt;/code&gt;, deactivate with &lt;code&gt;"normal mode"&lt;/code&gt; before switching to documentation or client-facing writing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;claude-mem&lt;/code&gt;&lt;/strong&gt; — persistent memory across sessions. This is the most impactful plugin in the stack. Without it every Claude Code session starts completely blank — no knowledge of what you built yesterday, no context on architectural decisions made last week. With it, the session opens with a compressed summary of relevant past work injected automatically. For multi-week projects this eliminates the re-establishment overhead that quietly consumes 10-15 minutes of every session. It runs a background worker on port 37777 with a web viewer for your observation history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;code-review&lt;/code&gt;&lt;/strong&gt; — 5 parallel Sonnet agents reviewing your code before pushes. Covers CLAUDE.md compliance, bug detection, historical context, PR history, and code comments simultaneously. Trigger with &lt;code&gt;/code-review&lt;/code&gt; after any meaningful change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;pr-review-toolkit&lt;/code&gt;&lt;/strong&gt; — deeper review covering tests, error handling, type design, code quality, and simplification. Use this before anything that goes into a production codebase or published whitepaper. Run with &lt;code&gt;/pr-review-toolkit:review-pr all&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;feature-dev&lt;/code&gt;&lt;/strong&gt; — three-agent workflow for new features: explore codebase → architect solution → review quality. Reserve it for moments where you'd naturally step back and think about design before writing code. Overkill for bug fixes, well-suited for new model wrappers or evaluation dimensions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;commit-commands&lt;/code&gt;&lt;/strong&gt; — auto-generates meaningful commit messages from staged changes. Replaces the cognitive overhead of writing commit messages at the end of a long session when you're tired and just want to push.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;context7&lt;/code&gt;&lt;/strong&gt; — pulls current SDK documentation into context automatically when you're working with external libraries. When your code references a LiteLLM function or a Boto3 call, context7 fetches the current docs rather than relying on Claude's training data which may be stale on specific SDK versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;taches-cc-resources&lt;/code&gt;&lt;/strong&gt; — a collection of workflow commands worth knowing: &lt;code&gt;/create-plans&lt;/code&gt; for structured project planning with PLAN.md, &lt;code&gt;/debug-like-expert&lt;/code&gt; for systematic debugging with evidence gathering, and &lt;code&gt;/ask-me-questions&lt;/code&gt; for requirement clarification before starting a large ambiguous task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;pyright-lsp&lt;/code&gt;&lt;/strong&gt;: Python language server via Pyright. Gives Claude real-time type errors, import resolution, and go-to-definition across your codebase. Without it, Claude reads files reactively when something breaks. With it, Claude sees type and import problems as they arise, not only after a failure.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Removed
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;everything-claude-code&lt;/code&gt;&lt;/strong&gt; — 38 agents, 156 skills, 72 legacy command shims. The context footprint is enormous and the coverage largely duplicates what the targeted plugins above already handle. Removed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installing the Stack
&lt;/h2&gt;

&lt;p&gt;Add the marketplaces first — these commands run &lt;strong&gt;inside a Claude Code session&lt;/strong&gt;, not in your terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add anthropics/claude-code
/plugin marketplace add anthropics/claude-plugins-official
/plugin marketplace add glittercowboy/taches-cc-resources
/plugin marketplace add thedotmack/claude-mem
/plugin marketplace add JuliusBrussee/caveman
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;code-review@claude-code-plugins
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;pr-review-toolkit@claude-code-plugins
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;feature-dev@claude-code-plugins
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;commit-commands@claude-code-plugins
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;context7@claude-plugins-official
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;pyright-lsp@claude-plugins-official
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;taches-cc-resources@taches-cc-resources
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;claude-mem@thedotmack
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;caveman@caveman
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload with &lt;code&gt;/reload-plugins&lt;/code&gt;. Expected output: &lt;code&gt;9 plugins · 35 skills · 18 agents · 10 hooks · 2 plugin MCP servers · 1 plugin LSP server&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Trigger&lt;/th&gt;
&lt;th&gt;When to use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;caveman&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/caveman&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All coding sessions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-mem&lt;/td&gt;
&lt;td&gt;Auto&lt;/td&gt;
&lt;td&gt;Always on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;code-review&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/code-review&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Before any meaningful push&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pr-review-toolkit&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/pr-review-toolkit:review-pr&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Before production or published code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;feature-dev&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/feature-dev&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;New features requiring design thinking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;commit-commands&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/commit-commands:commit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Every commit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;context7&lt;/td&gt;
&lt;td&gt;Auto&lt;/td&gt;
&lt;td&gt;Working with external SDKs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;taches-cc-resources&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/create-plans&lt;/code&gt;, &lt;code&gt;/debug-like-expert&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Planning and complex debugging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pyright-lsp&lt;/td&gt;
&lt;td&gt;Auto&lt;/td&gt;
&lt;td&gt;Always on for Python projects&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;← &lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Back to Part 3&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h"&gt;Continue to Part 5 → Terminal Environment&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>cli</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 3: Sound Notification Hooks</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 11:40:26 +0000</pubDate>
      <link>https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38</link>
      <guid>https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38</guid>
      <description>&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2&lt;/a&gt; | Part 3 of 6&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The problem is simple: Claude Code finishes a task while you're reading documentation, reviewing a PR, or staring out the window. You have no idea it's done. You check back 5 minutes later to find it waiting. Multiply this across a full workday and the lost time adds up significantly.&lt;/p&gt;

&lt;p&gt;The solution is a sound notification. Hear the sound, look at the screen. Simple.&lt;/p&gt;

&lt;p&gt;Getting there was less simple than I expected.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Tried First
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;osascript&lt;/code&gt;&lt;/strong&gt; is the standard macOS approach for sending notifications from bash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;osascript &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s1"&gt;'display notification "Task completed" with title "Claude Code ✅" sound name "Hero"'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On macOS Sequoia this runs silently and does nothing. No error, no notification. Apple tightened notification sandboxing in recent versions and &lt;code&gt;osascript&lt;/code&gt; notifications now require Script Editor to be registered in System Settings → Notifications — and on many machines it never appears there at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;terminal-notifier&lt;/code&gt;&lt;/strong&gt; is the community-standard alternative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;terminal-notifier
terminal-notifier &lt;span class="nt"&gt;-title&lt;/span&gt; &lt;span class="s2"&gt;"Claude Code ✅"&lt;/span&gt; &lt;span class="nt"&gt;-message&lt;/span&gt; &lt;span class="s2"&gt;"Task completed"&lt;/span&gt; &lt;span class="nt"&gt;-sound&lt;/span&gt; &lt;span class="s2"&gt;"Hero"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On my Apple Silicon Mac running Sequoia, &lt;code&gt;terminal-notifier&lt;/code&gt; 2.0.0 sends no notification and produces no error. I removed the Gatekeeper quarantine flags with &lt;code&gt;sudo xattr -dr com.apple.quarantine&lt;/code&gt;, tried the &lt;code&gt;.app&lt;/code&gt; bundle path directly, and verified the binary location, and still got nothing. On this setup the package was effectively unusable.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Works
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;afplay&lt;/code&gt; is a macOS command-line audio player. It ships with every Mac, requires zero setup, and plays system sounds reliably:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;afplay /System/Library/Sounds/Hero.aiff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No notification banner. Just sound. And for the actual use case — knowing when Claude is done without watching the screen — sound alone is sufficient. You're already in the terminal when you care about the visual output.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Hooks
&lt;/h2&gt;

&lt;p&gt;Claude Code fires hooks on specific events. I set up two hook scripts covering three scenarios:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Task complete&lt;/strong&gt; (&lt;code&gt;Stop&lt;/code&gt; event) → &lt;strong&gt;Hero sound&lt;/strong&gt;&lt;br&gt;
The most important hook. Fires when Claude finishes responding. Deep tone, clearly distinct from system sounds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permission needed&lt;/strong&gt; (&lt;code&gt;permission_prompt&lt;/code&gt;) → &lt;strong&gt;Glass sound&lt;/strong&gt;&lt;br&gt;
Fires when Claude needs your approval before proceeding. Higher pitch, slightly urgent. You hear this and know you need to look at the screen and make a decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Awaiting input&lt;/strong&gt; (&lt;code&gt;idle_prompt&lt;/code&gt;) → &lt;strong&gt;Ping sound&lt;/strong&gt;&lt;br&gt;
Fires when Claude is waiting for your next message. Softer, lower priority.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;notify-permission.sh&lt;/code&gt; script reads the &lt;code&gt;notification_type&lt;/code&gt; field from the JSON payload using &lt;code&gt;jq&lt;/code&gt; to distinguish between &lt;code&gt;permission_prompt&lt;/code&gt; and &lt;code&gt;idle_prompt&lt;/code&gt; and play the appropriate sound.&lt;/p&gt;
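
&lt;p&gt;The dispatch described above can be sketched in a few lines of bash. This is a hypothetical minimal version, not the script from the repository: it assumes the hook payload arrives as JSON on stdin and that &lt;code&gt;jq&lt;/code&gt; is installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash
# Hypothetical sketch of notify-permission.sh; the real script lives in the repo.
# jq reads the JSON hook payload from stdin; a missing field yields an empty string.
notification_type=$(jq -r '.notification_type // empty')

case "$notification_type" in
  permission_prompt) afplay /System/Library/Sounds/Glass.aiff ;;  # approval needed
  idle_prompt)       afplay /System/Library/Sounds/Ping.aiff ;;   # waiting for input
esac
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To try it without Claude Code, pipe a fake payload in: &lt;code&gt;echo '{"notification_type":"idle_prompt"}' | bash ~/.claude/hooks/notify-permission.sh&lt;/code&gt; should play the Ping sound. Any other notification type falls through the &lt;code&gt;case&lt;/code&gt; silently.&lt;/p&gt;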


&lt;h2&gt;
  
  
  Wiring Up the Hooks
&lt;/h2&gt;

&lt;p&gt;In &lt;code&gt;~/.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bash ~/.claude/hooks/notify-stop.sh"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Notification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"permission_prompt|idle_prompt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bash ~/.claude/hooks/notify-permission.sh"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also add the hook scripts to the permissions allow list so Claude Code doesn't prompt for approval every time they run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Bash(bash ~/.claude/hooks/notify-stop.sh)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Bash(bash ~/.claude/hooks/notify-permission.sh)"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hooks only activate for new sessions — restart Claude Code after updating &lt;code&gt;settings.json&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Testing the Sounds
&lt;/h2&gt;

&lt;p&gt;Before wiring up the hooks, verify all three sounds play on your machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;afplay /System/Library/Sounds/Hero.aiff
afplay /System/Library/Sounds/Glass.aiff
afplay /System/Library/Sounds/Ping.aiff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If any of these don't play, check System Settings → Sound → Output volume. &lt;code&gt;afplay&lt;/code&gt; respects system volume but is not affected by Do Not Disturb.&lt;/p&gt;




&lt;p&gt;Both hook scripts are at &lt;a href="https://github.com/ai-with-avinash/claude-code-best-setup" rel="noopener noreferrer"&gt;https://github.com/ai-with-avinash/claude-code-best-setup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;← &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Back to Part 2&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Continue to Part 4 → Plugin Stack&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>claudecode</category>
      <category>terminal</category>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 2: Custom Statusline</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 10:31:31 +0000</pubDate>
      <link>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj</link>
      <guid>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj</guid>
      <description>&lt;h1&gt;
  
  
  Building a Complete Developer Terminal Setup for Claude Code — Part 2: Custom Statusline
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead | &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Part 1&lt;/a&gt; | Part 2 of 6&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Note: This setup is macOS-specific.&lt;/strong&gt; All tools, commands, and configurations in this series are tested on macOS (Apple Silicon). Linux and Windows users will need to adapt certain steps, particularly around &lt;code&gt;afplay&lt;/code&gt;, &lt;code&gt;brew&lt;/code&gt;, iTerm2, and system font installation.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Claude Code supports a &lt;code&gt;statusLine&lt;/code&gt; configuration that pipes a live JSON object to a bash script on every update. Most developers ignore this. I spent time building it out properly and it's now the most-glanced piece of my development environment.&lt;/p&gt;

&lt;p&gt;Here's what my statusline shows in a real session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Sonnet 4.6 [Pro] | ⎇ feature/docling-eval | in:42k out:8k | ctx 61% | $0.0284 | +142/-38 | 5h 31% ⏱ 3h12m | 7d 18% ⏱ 4d6h
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each segment is deliberate. Let me walk through them.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Each Segment Shows
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model name&lt;/strong&gt; — &lt;code&gt;Claude Sonnet 4.6 [Pro]&lt;/code&gt;. Useful when switching between Sonnet and Opus mid-project. You always know what you're paying for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Git branch&lt;/strong&gt; — &lt;code&gt;⎇ feature/docling-eval&lt;/code&gt;. The script uses &lt;code&gt;workspace.current_dir&lt;/code&gt; from the JSON payload for accuracy in worktree setups, not the shell's current directory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token usage&lt;/strong&gt; — &lt;code&gt;in:42k out:8k&lt;/code&gt;. Input and output tokens in compact &lt;code&gt;k&lt;/code&gt; format. Abbreviated above 1000, raw below — so early in a session you see &lt;code&gt;in:340 out:89&lt;/code&gt; and it flips to &lt;code&gt;in:1k&lt;/code&gt; naturally as it grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context %&lt;/strong&gt; — &lt;code&gt;ctx 61%&lt;/code&gt; with color thresholds tuned for the Pro plan: green below 50%, yellow from 50%, orange from 60%, and red from 75%. These are tighter than the defaults because on Pro every token in a compacted context gets re-billed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt; — &lt;code&gt;$0.0284&lt;/code&gt;, shown to four decimal places. Rounding to three decimal places loses signal on sub-cent sessions. At four you can see the actual cost of a session clearly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lines changed&lt;/strong&gt; — &lt;code&gt;+142/-38&lt;/code&gt;. Only appears when there are actual file edits. Stays clean during pure Q&amp;amp;A work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5-hour rate limit&lt;/strong&gt; — &lt;code&gt;5h 31% ⏱ 3h12m&lt;/code&gt;. Usage percentage and countdown to reset. On Pro this is what tells you whether to push through a task or wrap up cleanly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7-day rate limit&lt;/strong&gt; — &lt;code&gt;7d 18% ⏱ 4d6h&lt;/code&gt;. The weekly ceiling is the one that bites during multi-day project sprints. Knowing you're at 18% on Wednesday is actionable in a way that daily usage alone isn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compaction warning&lt;/strong&gt; — at 75% context the display shows a blinking &lt;code&gt;⚠ COMPACT&lt;/code&gt;. This fires earlier than the default because on Pro, hitting auto-compaction mid-task reruns your entire context through the API.&lt;/p&gt;
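
&lt;p&gt;To make the formatting rules above concrete, here is a minimal sketch of the token abbreviation and the context color thresholds. The function names, the truncating division, the specific ANSI codes (including the 256-color orange), and the assumption that the percentage arrives as a whole number are all mine for illustration; the real script may do any of these differently.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: function names and ANSI color choices are assumptions,
# not taken from the author's script. Percentages are assumed to be integers.

# Compact token count: raw below 1000, truncated to whole thousands from 1000 up.
fmt_tokens() {
  local n=$1
  if [ "$n" -ge 1000 ]; then
    printf '%sk' "$(( n / 1000 ))"
  else
    printf '%s' "$n"
  fi
}

# Map a context percentage to a color name using the thresholds described above.
ctx_level() {
  local pct=$1
  if   [ "$pct" -ge 75 ]; then printf 'red'
  elif [ "$pct" -ge 60 ]; then printf 'orange'
  elif [ "$pct" -ge 50 ]; then printf 'yellow'
  else                         printf 'green'
  fi
}

# Translate a color name into an ANSI escape sequence.
ansi_for() {
  case $1 in
    red)    printf '\033[31m' ;;
    orange) printf '\033[38;5;208m' ;;
    yellow) printf '\033[33m' ;;
    *)      printf '\033[32m' ;;
  esac
}

# Render the context segment; append a blinking warning at 75% and above.
ctx_segment() {
  local pct=$1 level
  level=$(ctx_level "$pct")
  printf '%sctx %s%%\033[0m' "$(ansi_for "$level")" "$pct"
  if [ "$pct" -ge 75 ]; then
    printf ' \033[5m⚠ COMPACT\033[0m'
  fi
}
```

&lt;p&gt;With these definitions, &lt;code&gt;fmt_tokens 42000&lt;/code&gt; prints &lt;code&gt;42k&lt;/code&gt; and &lt;code&gt;ctx_level 61&lt;/code&gt; prints &lt;code&gt;orange&lt;/code&gt;. Keeping the threshold decision separate from the escape codes makes the thresholds easy to check without a terminal.&lt;/p&gt;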




&lt;h2&gt;
  
  
  The Implementation
&lt;/h2&gt;

&lt;p&gt;The script is pure bash with &lt;code&gt;jq&lt;/code&gt; as the only dependency. &lt;code&gt;jq&lt;/code&gt; is a command-line JSON parser — install it with &lt;code&gt;brew install jq&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Wire it up in &lt;code&gt;~/.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"statusLine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.claude/statusline.sh"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script reads from stdin (&lt;code&gt;input=$(cat)&lt;/code&gt;), extracts fields with &lt;code&gt;jq&lt;/code&gt;, applies color logic with ANSI escape codes, and outputs a single line. The JSON payload contains everything — model info, context window state, cost, rate limits, and workspace path.&lt;/p&gt;
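
&lt;p&gt;The read-extract-print flow can be sketched roughly as follows. This is not the author's script: &lt;code&gt;render_line&lt;/code&gt; is a hypothetical name, colors and the git, lines-changed, and rate-limit segments are left out, and the field paths are the ones that appear in the mock payload in the testing section of this article.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: render_line is a hypothetical name. Field paths follow the
# mock payload shown in this article; the fallback values are assumptions.

render_line() {
  local input model in_tok out_tok pct cost
  input=$(cat)   # the whole JSON payload arrives on stdin

  # Pull out each field with jq; the // operator supplies a fallback if it is missing.
  model=$(printf '%s' "$input" | jq -r '.model.display_name // "unknown"')
  in_tok=$(printf '%s' "$input" | jq -r '.context_window.total_input_tokens // 0')
  out_tok=$(printf '%s' "$input" | jq -r '.context_window.total_output_tokens // 0')
  pct=$(printf '%s' "$input" | jq -r '.context_window.used_percentage // 0')
  cost=$(printf '%s' "$input" | jq -r '.cost.total_cost_usd // 0')

  # Emit exactly one line, with the cost at four decimal places.
  printf '%s | in:%s out:%s | ctx %s%% | $%.4f\n' \
    "$model" "$in_tok" "$out_tok" "$pct" "$cost"
}
```

&lt;p&gt;Calling &lt;code&gt;jq&lt;/code&gt; once per field keeps the sketch readable. A single &lt;code&gt;jq&lt;/code&gt; invocation that emits all fields at once would start fewer processes per update.&lt;/p&gt;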




&lt;h2&gt;
  
  
  Testing Without a Live Session
&lt;/h2&gt;

&lt;p&gt;You don't need to start a Claude Code session to test the script. Pipe mock JSON directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'{"model":{"display_name":"Claude Sonnet 4.6"},"workspace":{"current_dir":"/your/project"},"context_window":{"total_input_tokens":12000,"total_output_tokens":3000,"used_percentage":45},"cost":{"total_cost_usd":0.0084,"total_lines_added":42,"total_lines_removed":7},"rate_limits":{"five_hour":{"used_percentage":38,"resets_at":'&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;7200&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s1"&gt;'},"seven_day":{"used_percentage":22,"resets_at":'&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;345600&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s1"&gt;'}}}'&lt;/span&gt; | ~/.claude/statusline.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each &lt;code&gt;resets_at&lt;/code&gt; value is computed as the current epoch time plus an offset in seconds (7200 for two hours, 345600 for four days), so the countdowns show realistic numbers.&lt;/p&gt;
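
&lt;p&gt;For reference, turning a &lt;code&gt;resets_at&lt;/code&gt; epoch timestamp into a countdown like &lt;code&gt;3h12m&lt;/code&gt; or &lt;code&gt;4d6h&lt;/code&gt; might look like the sketch below. Both helper names are hypothetical, and the choice to show days and hours above one day, and hours and minutes below it, is inferred from the sample statusline rather than taken from the script.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: fmt_countdown and remaining_until are hypothetical helpers.

# Format remaining seconds as "4d6h", "3h12m", or "12m".
fmt_countdown() {
  local secs=$1
  if [ "$secs" -le 0 ]; then printf '0m'; return; fi
  local d=$(( secs / 86400 ))
  local h=$(( secs % 86400 / 3600 ))
  local m=$(( secs % 3600 / 60 ))
  if   [ "$d" -gt 0 ]; then printf '%sd%sh' "$d" "$h"
  elif [ "$h" -gt 0 ]; then printf '%sh%sm' "$h" "$m"
  else                      printf '%sm' "$m"
  fi
}

# resets_at is an epoch timestamp, so the time left is resets_at minus now.
remaining_until() {
  local resets_at=$1 now
  now=$(date +%s)
  fmt_countdown "$(( resets_at - now ))"
}
```

&lt;p&gt;Here &lt;code&gt;fmt_countdown 11520&lt;/code&gt; prints &lt;code&gt;3h12m&lt;/code&gt; and &lt;code&gt;fmt_countdown 367200&lt;/code&gt; prints &lt;code&gt;4d6h&lt;/code&gt;, matching the two countdowns in the sample line at the top of the article.&lt;/p&gt;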




&lt;h2&gt;
  
  
  One Tradeoff Worth Knowing
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;used_percentage&lt;/code&gt; field is calculated from input tokens only — it does not include output tokens. So the context % reflects input-side pressure, which is the more meaningful signal for context window exhaustion, but it may differ slightly from what &lt;code&gt;/context&lt;/code&gt; reports. The script displays raw input and output token counts separately precisely for this reason.&lt;/p&gt;




&lt;p&gt;The full script is at &lt;a href="https://github.com/ai-with-avinash/claude-code-best-setup" rel="noopener noreferrer"&gt;https://github.com/ai-with-avinash/claude-code-best-setup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;← &lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8"&gt;Back to Part 1&lt;/a&gt; | &lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Continue to Part 3 → Sound Notification Hooks&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>claudecode</category>
      <category>terminal</category>
    </item>
    <item>
      <title>Building a Complete Developer Terminal Setup for Claude Code — Part 1: The Problem</title>
      <dc:creator>Avinash Seethalam</dc:creator>
      <pubDate>Sun, 26 Apr 2026 10:12:40 +0000</pubDate>
      <link>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8</link>
      <guid>https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-1-the-problem-5eb8</guid>
      <description>&lt;p&gt;&lt;em&gt;By Avinash, GenAI Practice Lead&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Note: This setup is macOS-specific.&lt;/strong&gt; All tools, commands, and configurations in this series are tested on macOS (Apple Silicon). Linux and Windows users will need to adapt certain steps, particularly around &lt;code&gt;afplay&lt;/code&gt;, &lt;code&gt;brew&lt;/code&gt;, iTerm2, and system font installation.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I lead a GenAI practice and manage a broad portfolio of active AI projects. A significant part of my day is spent inside Claude Code — building model evaluation frameworks, debugging pipelines, writing architecture documents, and reviewing code with my team. Claude Code is genuinely powerful, but after weeks of daily use I kept running into the same friction points.&lt;/p&gt;

&lt;p&gt;Sessions started completely blind every time. No memory of what we built yesterday, no context on decisions made last week. I'd spend the first 10 minutes of every session re-establishing context that Claude had already processed the day before.&lt;/p&gt;

&lt;p&gt;There was no visibility into token usage or cost while working. I'd hit a rate limit mid-task with no warning, losing momentum at the worst possible moment. On a Pro plan where every token has a cost, flying blind is expensive.&lt;/p&gt;

&lt;p&gt;Claude finishing a task while I was context-switched elsewhere meant I'd come back 5 minutes later to find it waiting. No notification, no signal — just a blinking cursor.&lt;/p&gt;

&lt;p&gt;And the terminal itself was a plain, low-information environment. No git context, no Python version, no time — just a prompt.&lt;/p&gt;

&lt;p&gt;These aren't complaints about Claude Code. They're gaps in the surrounding environment that any developer can close with the right setup. So I spent a day closing them.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Over a focused session I assembled a complete terminal environment optimised specifically for Claude Code development on macOS. Here's the full stack:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code layer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Custom bash statusline showing model, git branch, token usage, cost, context %, and rate limit countdowns&lt;/li&gt;
&lt;li&gt;Sound notification hooks using macOS &lt;code&gt;afplay&lt;/code&gt; for task completion, permission requests, and idle state&lt;/li&gt;
&lt;li&gt;9 curated plugins covering code review, persistent memory, Python type checking, and workflow automation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Terminal layer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;iTerm2 with tokyo-night color theme and JetBrains Mono Nerd Font&lt;/li&gt;
&lt;li&gt;tmux with a 3-pane layout and session persistence across restarts&lt;/li&gt;
&lt;li&gt;Starship prompt with tokyo-night preset showing git status and Python version&lt;/li&gt;
&lt;li&gt;fzf for fuzzy command history search&lt;/li&gt;
&lt;li&gt;zsh-autosuggestions and zsh-syntax-highlighting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is committed to a dotfiles repo at &lt;a href="https://github.com/ai-with-avinash/claude-code-best-setup" rel="noopener noreferrer"&gt;https://github.com/ai-with-avinash/claude-code-best-setup&lt;/a&gt; with a fresh machine checklist so the entire setup can be reproduced on a new Mac in under an hour.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Series
&lt;/h2&gt;

&lt;p&gt;This is Part 1 of 6. Each subsequent article covers one layer of the setup in detail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Part 2&lt;/a&gt;&lt;/strong&gt; — Custom Statusline: real-time token, cost, and rate limit visibility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/avinash431/-building-a-complete-developer-terminal-setup-for-claude-code-part-3-sound-notification-hooks-4i38"&gt;Part 3&lt;/a&gt;&lt;/strong&gt; — Sound Notification Hooks: knowing when Claude is done without watching the screen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-4-plugin-stack-1044"&gt;Part 4&lt;/a&gt;&lt;/strong&gt; — Plugin Stack: 9 plugins that earn their place and what I removed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-5-terminal-environment-184h"&gt;Part 5&lt;/a&gt;&lt;/strong&gt; — Terminal Environment: iTerm2, tmux, starship, fzf, and zsh&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-6-dotfiles-and-wrap-up-54c0"&gt;Part 6&lt;/a&gt;&lt;/strong&gt; — Dotfiles Repo: packaging everything for reproducibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each part is self-contained — you can read them in any order depending on what's most relevant to your setup. But if you're starting fresh, the sequence matters and Part 2 is where I'd begin.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://dev.to/avinash431/building-a-complete-developer-terminal-setup-for-claude-code-part-2-custom-statusline-4lmj"&gt;Continue to Part 2 → Custom Statusline&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>claude</category>
      <category>claudecode</category>
    </item>
  </channel>
</rss>
