<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ramsbaby</title>
    <description>The latest articles on DEV Community by ramsbaby (@ramsbaby).</description>
    <link>https://dev.to/ramsbaby</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3756790%2F3a810b21-0691-4556-8b82-5300804c900d.png</url>
      <title>DEV Community: ramsbaby</title>
      <link>https://dev.to/ramsbaby</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ramsbaby"/>
    <language>en</language>
    <item>
      <title>I turned my Claude Max subscription into a 24/7 AI company — here's what actually happened</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Wed, 18 Mar 2026 10:45:26 +0000</pubDate>
      <link>https://dev.to/ramsbaby/i-turned-my-claude-max-subscription-into-a-247-ai-company-heres-what-actually-happened-33fj</link>
      <guid>https://dev.to/ramsbaby/i-turned-my-claude-max-subscription-into-a-247-ai-company-heres-what-actually-happened-33fj</guid>
      <description>&lt;h1&gt;
  
  
  I turned my Claude Max subscription into a 24/7 AI company — here's what actually happened
&lt;/h1&gt;

&lt;p&gt;Six months ago I asked myself a dumb question: &lt;em&gt;if I have a $100/month Claude Max subscription, why is it only answering my questions for 30 minutes a day?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So I built something. I call it &lt;strong&gt;Jarvis Company&lt;/strong&gt; — a personal AI automation system where Claude runs as a full-time employee 24/7, not a chatbot.&lt;/p&gt;

&lt;p&gt;This is not a tutorial. This is a story about what I built, what actually works, and what I was completely wrong about.&lt;/p&gt;




&lt;h2&gt;
  
  
  The idea
&lt;/h2&gt;

&lt;p&gt;I'm a backend developer. I spend most of my day writing code, reviewing docs, tracking news, maintaining side projects, and forgetting to do all of the above.&lt;/p&gt;

&lt;p&gt;The insight was simple: &lt;strong&gt;Claude doesn't need to sleep.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So instead of using Claude as a question-answering chatbot, I built a structure where Claude runs scheduled jobs — like a company with departments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Jarvis Company (me = CEO)
  └── 7 "team leads" running as cron jobs
        ├── 📡 TrendHunter     — 07:30 daily, scans tech/market news
        ├── 🔍 Audit           — 23:00 daily, reviews bot quality &amp;amp; KPIs
        ├── 📚 Academy         — 20:00 Sunday, generates learning plans
        ├── 🗄️ Recorder        — 23:50 daily, consolidates memory
        ├── ⚙️ Infra           — 09:00 daily, system health checks
        ├── 🚀 Growth          — 09:00 Monday, career &amp;amp; side project nudges
        └── 📣 Brand           — 08:00 Tuesday, open source activity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each "team lead" is a bash script that calls Claude via the SDK with a specific system prompt and task. Results are posted to a private Discord server where I can see everything in one place.&lt;/p&gt;




&lt;h2&gt;
  
  
  What it actually does today
&lt;/h2&gt;

&lt;p&gt;After months of iteration, this is running in production (on my Mac mini):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;56 scheduled tasks&lt;/strong&gt; across 7 teams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~40 Claude invocations per day&lt;/strong&gt; on average&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discord as the UI&lt;/strong&gt; — every report, alert, and insight goes to a specific channel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory system&lt;/strong&gt; — RAG-based context that persists across sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Escalation logic&lt;/strong&gt; — critical issues get flagged to a "CEO channel" (me)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Real examples of things that happened this week:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Morning briefing summarized 3 papers and 2 product launches relevant to my work&lt;/li&gt;
&lt;li&gt;Audit team caught that my bot was producing zero-tool responses (a regression I'd missed)&lt;/li&gt;
&lt;li&gt;Growth team reminded me I haven't pushed to my open source repo in 11 days&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Is this magic? No. Is it surprisingly useful? Yes.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I got completely wrong
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Wrong assumption #1: More crons = more value&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I peaked at 64 scheduled tasks. Most of them were producing reports nobody read (me included). I just cut it back to 56, and the signal-to-noise ratio improved immediately. Less is more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong assumption #2: This could be a popular open source tool&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I submitted to Hacker News. Got 2 points and 0 comments.&lt;/p&gt;

&lt;p&gt;Honest self-assessment: this is not a general tool. It's deeply personal infrastructure. It requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Claude Max subscription ($100/month)&lt;/li&gt;
&lt;li&gt;macOS (hardcoded paths everywhere)&lt;/li&gt;
&lt;li&gt;Setting up Discord, webhooks, LaunchAgents&lt;/li&gt;
&lt;li&gt;Understanding what &lt;em&gt;you&lt;/em&gt; want automated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's closer to "sophisticated dotfiles" than a product. And that's fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong assumption #3: AI will figure out what's important&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early versions had Claude summarize everything. The output was verbose and I stopped reading it.&lt;/p&gt;

&lt;p&gt;The real value came when I was &lt;em&gt;specific&lt;/em&gt;: "Find papers published in the last 7 days about WebFlux performance" is 10x better than "summarize tech news."&lt;/p&gt;

&lt;p&gt;The more opinionated the prompt, the more useful the output.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture (briefly)
&lt;/h2&gt;

&lt;p&gt;I'm not going to pretend this is easy to replicate. But conceptually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[LaunchAgent / launchd]
    ↓ triggers
[bash script: bot-cron.sh]
    ↓ loads task config from tasks.json
[Claude SDK (Node.js)]
    ↓ system prompt + task context
[Claude runs with tools]
    → reads files, calls APIs, checks logs
    ↓ result
[Discord webhook → private server channel]
    ↓
[Me, reading it over morning coffee]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key design decision: &lt;strong&gt;Claude has tools&lt;/strong&gt;. It's not just generating text — it can read files, run shell commands, query APIs. That's what makes the outputs actually grounded in reality.&lt;/p&gt;
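
&lt;p&gt;One practical detail in the last hop of that pipeline: Discord caps a message's content at 2,000 characters, so long reports need trimming before they reach a webhook. A minimal sketch of that step, assuming a hypothetical helper named &lt;code&gt;truncate_for_discord&lt;/code&gt; (illustrative only, not taken from the Jarvis code):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the Jarvis codebase.
DISCORD_LIMIT=2000   # Discord's per-message content limit, in characters

truncate_for_discord() {
    local msg=$1
    local suffix="... [truncated]"
    if [[ ${#msg} -gt $DISCORD_LIMIT ]]; then
        # Leave room for the suffix so the result lands exactly on the limit.
        msg="${msg:0:$((DISCORD_LIMIT - ${#suffix}))}${suffix}"
    fi
    printf '%s' "$msg"
}

# Stand-in for a long Claude report: 3000 characters of filler.
report=$(printf 'x%.0s' $(seq 1 3000))
trimmed=$(truncate_for_discord "$report")
echo "original=${#report} trimmed=${#trimmed}"
```

&lt;p&gt;The trimmed text would then go into the &lt;code&gt;content&lt;/code&gt; field of the webhook's JSON payload.&lt;/p&gt;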




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;If I were starting over:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with 3 tasks max&lt;/strong&gt; — one morning briefing, one daily recap, one anomaly detector&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Define "done" for each task&lt;/strong&gt; — what does a good output look like? Write it down before prompting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build the Discord UI first&lt;/strong&gt; — having a readable output channel forces you to make the output worth reading&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track whether you act on it&lt;/strong&gt; — I have 56 tasks but probably only respond to ~20% of outputs. The other 80% are noise I'm slowly eliminating.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Is it worth $100/month?
&lt;/h2&gt;

&lt;p&gt;For me, yes — but not because of the automation.&lt;/p&gt;

&lt;p&gt;The real value is that Claude Max means &lt;strong&gt;no usage anxiety&lt;/strong&gt;. I can let it run all day, fail, retry, and explore without watching a token counter. The 24/7 company only works if you're not rationing.&lt;/p&gt;

&lt;p&gt;If you're on the free tier or a cheaper plan, most of this isn't viable.&lt;/p&gt;




&lt;h2&gt;
  
  
  What you can actually steal from this
&lt;/h2&gt;

&lt;p&gt;Even if you don't build the whole system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The "team" mental model&lt;/strong&gt; — assign Claude a role with a specific responsibility. It produces better outputs than a generic assistant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discord as a personal ops dashboard&lt;/strong&gt; — free, mobile-friendly, and infinitely customizable with webhooks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled briefings&lt;/strong&gt; — even one morning summary cron can meaningfully improve how you start your day&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory as a first-class concern&lt;/strong&gt; — if your AI doesn't remember last week, every conversation starts from zero&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;I'm slowly open-sourcing the interesting pieces: not as an "install this and get a company" package, but as reference implementations of patterns I found useful.&lt;/p&gt;

&lt;p&gt;If you're building something similar — personal AI infrastructure, Claude automation, Discord bots — I'd genuinely like to compare notes.&lt;/p&gt;

&lt;p&gt;The code is at: &lt;a href="https://github.com/ramsbaby/jarvis-discord" rel="noopener noreferrer"&gt;https://github.com/ramsbaby/jarvis-discord&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Drop a comment if you've tried something like this. Especially interested in hearing what you actually &lt;em&gt;stopped&lt;/em&gt; automating.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Claude Max + Node.js + bash + too much coffee.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>automation</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Jarvis — self-hosted AI butler on Discord (Claude-powered)</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Wed, 18 Mar 2026 09:47:03 +0000</pubDate>
      <link>https://dev.to/ramsbaby/jarvis-self-hosted-ai-butler-on-discord-claude-powered-18c9</link>
      <guid>https://dev.to/ramsbaby/jarvis-self-hosted-ai-butler-on-discord-claude-powered-18c9</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Jarvis is a self-hosted personal AI assistant that lives in your Discord server. Powered by Claude, it manages tasks, executes shell commands, monitors systems, stores long-term memory via RAG, and coordinates multiple AI agents — all from your phone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/ramsbaby/jarvis" rel="noopener noreferrer"&gt;https://github.com/ramsbaby/jarvis&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;I wanted an AI assistant that actually &lt;em&gt;does things&lt;/em&gt; — not just chats. One that runs on my own hardware, remembers past conversations, and is reachable from my phone 24/7.&lt;/p&gt;

&lt;p&gt;Discord turned out to be the perfect UI: mobile notifications, slash commands, channel-based context separation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discord (mobile)
     |
 discord-bot.js  &amp;lt;- router
     |
  +--+------------------+
  |  Claude API (sonnet) |
  |  MCP Tools           | &amp;lt;- exec, rag_search, health, file_peek
  |  Agent SDK           | &amp;lt;- multi-agent orchestration
  |  LaunchAgents (macOS)| &amp;lt;- scheduled crons
  +---------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Multi-Agent Council&lt;/strong&gt; — Specialized agents (Career, Finance, Security, Health) run on schedules, post summaries to dedicated channels, and a Council agent synthesizes cross-agent insights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG Memory&lt;/strong&gt; — Conversation history is embedded and searchable. When you reference past discussions, Jarvis searches vector memory, not just the current context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Execution&lt;/strong&gt; — Run shell commands, tail logs, check system health, manage files — all from Discord mobile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Schedule-Aware&lt;/strong&gt; — Integrated with external calendars for real-time schedule + earnings data.&lt;/p&gt;




&lt;h2&gt;
  
  
  Install in 30 seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/ramsbaby/jarvis
&lt;span class="nb"&gt;cd &lt;/span&gt;jarvis
bash install.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requires: Node 22+, macOS, Claude API key, Discord bot token.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discord &amp;gt; custom web UI&lt;/strong&gt; for personal AI. Mobile-first, always-on, zero maintenance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LaunchAgents &amp;gt; crontab&lt;/strong&gt; on macOS — more reliable, survives reboots, proper logging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP structured tools&lt;/strong&gt; make LLM responses dramatically more accurate than free-form prompts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG memory changes the relationship&lt;/strong&gt; — it stops feeling like a tool and starts feeling like a collaborator.&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;v1.1.0 | 64 E2E tests | CI passing&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ramsbaby/jarvis" rel="noopener noreferrer"&gt;https://github.com/ramsbaby/jarvis&lt;/a&gt; ⭐&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>selfhosted</category>
      <category>discord</category>
      <category>claude</category>
    </item>
    <item>
      <title>I built Jarvis — a self-hosted AI butler on Discord powered by Claude</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Wed, 18 Mar 2026 09:46:35 +0000</pubDate>
      <link>https://dev.to/ramsbaby/i-built-jarvis-a-self-hosted-ai-butler-on-discord-powered-by-claude-3eme</link>
      <guid>https://dev.to/ramsbaby/i-built-jarvis-a-self-hosted-ai-butler-on-discord-powered-by-claude-3eme</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Jarvis is a self-hosted personal AI assistant that lives in your Discord server. It runs on Claude (Anthropic), manages tasks, reads Preply schedules, executes shell commands, monitors systems, stores long-term memory via RAG, and coordinates multiple AI agents — all from your phone via Discord.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/ramsbaby/jarvis" rel="noopener noreferrer"&gt;https://github.com/ramsbaby/jarvis&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;I wanted an AI assistant that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lives on my phone&lt;/strong&gt; (Discord mobile)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remembers context&lt;/strong&gt; across sessions (RAG memory)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actually executes things&lt;/strong&gt; — not just chat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs on my own hardware&lt;/strong&gt; — no cloud lock-in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Is extensible&lt;/strong&gt; — new agents, new tools, new channels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Existing solutions were either too limited (just chat) or too complex to self-host. So I built Jarvis.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discord (mobile UI)
    │
    ▼
discord-bot.js        ← message router
    │
    ├── Claude API    ← LLM backbone (claude-3-5-sonnet)
    ├── MCP Tools     ← exec / file / health / rag / nexus
    ├── Agent SDK     ← multi-agent orchestration
    └── LaunchAgents  ← macOS scheduled crons
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Jarvis runs as a &lt;strong&gt;macOS LaunchAgent&lt;/strong&gt; (launchd), auto-starting on login. All scheduled tasks (daily KPIs, security scans, weekly career reviews) are also LaunchAgents — no crontab hacks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🤖 Multi-Agent Council
&lt;/h3&gt;

&lt;p&gt;Multiple specialized agents (Career, Finance, Security, Health) run on schedules and post summaries to dedicated Discord channels. A Council agent synthesizes insights across all agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧠 RAG Memory
&lt;/h3&gt;

&lt;p&gt;Conversation history + context files are embedded and searchable. Ask "기억해?" ("remember?") and Jarvis searches vector memory, not just the current context window.&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚙️ Real Execution
&lt;/h3&gt;

&lt;p&gt;Jarvis can run shell commands, peek at files, tail logs, check system health, list/create/delete cron jobs — all from Discord.&lt;/p&gt;

&lt;h3&gt;
  
  
  📅 Schedule-Aware
&lt;/h3&gt;

&lt;p&gt;Integrated with Preply (my wife's tutoring schedule): ask "오늘 수업 있어?" ("Any lessons today?") and get today's lessons with earnings.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔒 Secret-Safe
&lt;/h3&gt;

&lt;p&gt;API keys live in &lt;code&gt;$BOT_HOME/config/secrets/&lt;/code&gt; (gitignored). The install script sets up the directory structure and guides you through token setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installation (30 seconds)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/ramsbaby/jarvis
&lt;span class="nb"&gt;cd &lt;/span&gt;jarvis
bash install.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requires: Node 22+, macOS (LaunchAgents), Claude API key, Discord bot token.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Discord is a surprisingly great AI UI.&lt;/strong&gt; Mobile notifications, slash commands, channel-based context separation — it works better than any custom web UI I could have built.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. LaunchAgents &amp;gt; crontab.&lt;/strong&gt; On macOS, &lt;code&gt;com.vix.cron&lt;/code&gt; is often disabled. LaunchAgents are more reliable, survive reboots, and have proper logging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. MCP (Model Context Protocol) is genuinely powerful.&lt;/strong&gt; Giving Claude structured tools (exec, rag_search, health) instead of free-form text makes responses dramatically more accurate and actionable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. RAG memory changes the relationship with an AI.&lt;/strong&gt; When it actually remembers past conversations, it stops feeling like a tool and starts feeling like a collaborator.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current Status
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;v1.1.0 released (2026-03-17)&lt;/li&gt;
&lt;li&gt;64 E2E tests passing&lt;/li&gt;
&lt;li&gt;GitHub Actions CI passing (Node 22)&lt;/li&gt;
&lt;li&gt;Used daily for task management, system monitoring, schedule tracking&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Phase 3: Discord &lt;code&gt;/approve&lt;/code&gt; for AI-drafted documents&lt;/li&gt;
&lt;li&gt;Web dashboard (read-only)&lt;/li&gt;
&lt;li&gt;Linux / systemd support&lt;/li&gt;
&lt;li&gt;More agent types (Finance deep-dive, Travel planner)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you're building something similar or have feedback, I'd love to hear it. Star the repo if it's useful ⭐&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ramsbaby/jarvis" rel="noopener noreferrer"&gt;https://github.com/ramsbaby/jarvis&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>claude</category>
      <category>selfhosted</category>
      <category>discord</category>
    </item>
    <item>
      <title>Self-hosted AI ops on Mac mini: Claude Max + Discord + MCP at $0/month extra</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Tue, 17 Mar 2026 04:34:53 +0000</pubDate>
      <link>https://dev.to/ramsbaby/self-hosted-ai-ops-on-mac-mini-claude-max-discord-mcp-at-0month-extra-53hb</link>
      <guid>https://dev.to/ramsbaby/self-hosted-ai-ops-on-mac-mini-claude-max-discord-mcp-at-0month-extra-53hb</guid>
      <description>&lt;p&gt;I've been running a personal AI operations system for two months now. Here's the full architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hardware &amp;amp; Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hardware:&lt;/strong&gt; Mac mini M2 (always on, silent, no cloud bill)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime:&lt;/strong&gt; Node.js + bash&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; LanceDB (local vector) + SQLite (queues)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UI:&lt;/strong&gt; Discord&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI engine:&lt;/strong&gt; &lt;code&gt;claude -p&lt;/code&gt; headless CLI (included in Claude Max, $100/month)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What runs 24/7
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reactive Discord chat&lt;/strong&gt; — multi-turn threads, session memory, RAG knowledge base&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;30 scheduled cron tasks&lt;/strong&gt; — standups, market alerts, code audits, log rotation, vault sync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;11 specialized agent teams&lt;/strong&gt; — each has a role, schedule, and output channel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-healing infrastructure&lt;/strong&gt; — 4-layer recovery: preflight → keepalive → watchdog → Claude AI diagnosis&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Context Intelligence Gateway
&lt;/h2&gt;

&lt;p&gt;The main problem with 24/7 Claude agents: the context window fills up in 20-30 minutes on heavy workloads.&lt;/p&gt;

&lt;p&gt;I built a local MCP server (Nexus CIG) that intercepts every tool output, classifies its type (JSON/log/table/markdown), extracts the signal, and drops the noise before it reaches Claude's context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;315 KB raw output → 5.4 KB compressed (~98%)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sessions now run 3+ hours reliably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Uptime:&lt;/strong&gt; 99.7%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context savings:&lt;/strong&gt; ~98% on heavy tool use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incidents:&lt;/strong&gt; 14 total, 9 auto-resolved (64% autonomous recovery)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extra cost:&lt;/strong&gt; $0/month&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Open source (MIT)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Ramsbaby/claude-discord-bridge" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/claude-discord-bridge&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-self-healing&lt;/a&gt; — autonomous crash recovery&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Ramsbaby/openclaw-memorybox" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-memorybox&lt;/a&gt; — memory hygiene CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy to answer questions about any part of the architecture.&lt;/p&gt;

</description>
      <category>selfhosted</category>
      <category>claude</category>
      <category>mcp</category>
      <category>discord</category>
    </item>
    <item>
      <title>I turned my idle Claude Max subscription into a 24/7 AI company — $0 extra</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Tue, 17 Mar 2026 04:34:05 +0000</pubDate>
      <link>https://dev.to/ramsbaby/i-turned-my-idle-claude-max-subscription-into-a-247-ai-company-0-extra-54g6</link>
      <guid>https://dev.to/ramsbaby/i-turned-my-idle-claude-max-subscription-into-a-247-ai-company-0-extra-54g6</guid>
      <description>&lt;p&gt;Two months ago I realized my Claude Max subscription was idle 23 hours a day. So I built a 24/7 AI operations system around it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;11 specialized AI teams&lt;/strong&gt; — Infra, Trend, Record, Brand, Academy, Career, Standup, Security, Finance, Recon, Council&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;30 automated cron tasks&lt;/strong&gt; — morning standups, market alerts, code audits, memory management, log rotation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Discord chat&lt;/strong&gt; — ask questions, Claude answers instantly in threads with context memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;4-layer self-healing&lt;/strong&gt; — crashes auto-diagnose using Claude, recover in ~30 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;98% context compression&lt;/strong&gt; — custom MCP server keeps sessions running 3+ hours&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The key insight
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;claude -p&lt;/code&gt; is Claude Code's headless CLI, included in the Max subscription. You can spawn as many processes as the plan's usage limits allow at zero incremental cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Context Intelligence Gateway (Nexus CIG)
&lt;/h2&gt;

&lt;p&gt;I built a local MCP server that sits between Claude and every system call. Instead of dumping raw logs/JSON into context, it classifies the output type, extracts the signal, and drops the noise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 315 KB → 5.4 KB (98%). Sessions run 3+ hours vs 30 min without it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers after 2 months
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$0 extra per month&lt;/strong&gt; (Max subscription only)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;99.7% uptime&lt;/strong&gt; (14 incidents, 9 resolved autonomously)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3+ hour sessions&lt;/strong&gt; with Nexus CIG active&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;60/61 E2E tests&lt;/strong&gt; passing&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  GitHub
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Main repo:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/claude-discord-bridge" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/claude-discord-bridge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-healing:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-self-healing&lt;/a&gt; (28 ⭐)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory CLI:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/openclaw-memorybox" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-memorybox&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Would love to hear if anyone else is running Claude agents 24/7. What's your approach to context management?&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>discord</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Finally Fixed My AI Self-Healing System (After It Was Broken for Weeks)</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Mon, 23 Feb 2026 02:07:04 +0000</pubDate>
      <link>https://dev.to/ramsbaby/i-finally-fixed-my-ai-self-healing-system-after-it-was-broken-for-weeks-io7</link>
      <guid>https://dev.to/ramsbaby/i-finally-fixed-my-ai-self-healing-system-after-it-was-broken-for-weeks-io7</guid>
      <description>&lt;h1&gt;
  
  
  I Finally Fixed My AI Self-Healing System (After It Was Broken for Weeks)
&lt;/h1&gt;

&lt;p&gt;I built an AI-powered self-healing system for my personal AI Gateway. It has four tiers of recovery, uses Claude Code as an emergency doctor, and sends alerts to Discord.&lt;/p&gt;

&lt;p&gt;It was also completely broken at the most critical handoff — for weeks — and I had no idea.&lt;/p&gt;

&lt;p&gt;This is the story of how I found the bug, why it was invisible, and what v3.1.0 does to fix it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The System I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;OpenClaw Self-Healing&lt;/a&gt; is a 4-tier autonomous recovery system for OpenClaw Gateway (an AI agent orchestration service). The tiers work like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Level 1 → launchd/systemd auto-restart
          (process crash → restart in seconds)
            ↓ (if still failing after 3 retries)
Level 2 → gateway-watchdog.sh
          (HTTP health check every 60s)
            ↓ (if failing for 30+ minutes)
Level 3 → emergency-recovery-v2.sh
          (Claude Code AI diagnosis + repair)
            ↓ (always, on Level 3 trigger)
Level 4 → Discord/Telegram notification
          (human escalation)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The idea: small problems fix themselves automatically. Big problems get an AI doctor. Critical problems wake up the human.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bug I Missed
&lt;/h2&gt;

&lt;p&gt;Here's what &lt;code&gt;gateway-watchdog.sh&lt;/code&gt; looked like in v3.0.0 (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;consecutive_failures&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;consecutive_failures &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$consecutive_failures&lt;/span&gt; &lt;span class="nt"&gt;-ge&lt;/span&gt; 30 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;log &lt;span class="s2"&gt;"CRITICAL: 30+ consecutive failures. Escalating to Level 3..."&lt;/span&gt;
    &lt;span class="c"&gt;# TODO: invoke emergency recovery&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See it? The log message says "Escalating to Level 3." The code never actually does it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;emergency-recovery-v2.sh&lt;/code&gt; script existed. It was tested in isolation. But the watchdog never called it. The chain was broken at exactly the point where it mattered most.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It Was Invisible
&lt;/h2&gt;

&lt;p&gt;This is the insidious part: &lt;strong&gt;everything looked fine.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Level 1 was working (launchd restarted crashed processes)&lt;/li&gt;
&lt;li&gt;Level 2 was running (the watchdog was checking health every 60s)&lt;/li&gt;
&lt;li&gt;Logs showed activity: &lt;code&gt;[WARN] Health check failed (attempt 2/3)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The system appeared operational&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only way to catch this was to simulate a long-duration failure — one that lasted more than 30 minutes — and watch whether Level 3 actually fired.&lt;/p&gt;

&lt;p&gt;I hadn't done that test. I assumed the chain worked because each component started successfully.&lt;/p&gt;
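
&lt;p&gt;That long-duration test does not have to take 30 real minutes. A minimal sketch, assuming the watchdog logic is factored into a function whose health check and escalation step can be stubbed out (every function name here is hypothetical, not from the real script):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical test harness, not the actual gateway-watchdog.sh.
THRESHOLD=30
consecutive_failures=0
escalations=0

# Stub: pretend the gateway is always down.
check_health() { return 1; }

# Stub for the Level 3 hand-off: count calls instead of running recovery.
escalate() { escalations=$((escalations + 1)); }

# One watchdog iteration, with no sleep so it can be driven in a tight loop.
watchdog_tick() {
    if check_health; then
        consecutive_failures=0
        return 0
    fi
    consecutive_failures=$((consecutive_failures + 1))
    if [[ $consecutive_failures -ge $THRESHOLD ]]; then
        escalate
    fi
}

# Simulate 30 failed checks in a row.
for _ in $(seq 1 "$THRESHOLD"); do
    watchdog_tick
done
echo "failures=$consecutive_failures escalations=$escalations"
```

&lt;p&gt;With the v3.0.0 logic, where the escalation call was only a TODO, a harness like this would report zero escalations and expose the gap.&lt;/p&gt;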

&lt;h2&gt;
  
  
  The Second Bug
&lt;/h2&gt;

&lt;p&gt;While fixing the watchdog, I found another problem.&lt;/p&gt;

&lt;p&gt;The installer (&lt;code&gt;install.sh&lt;/code&gt;) in v3.0.0 set up Level 1 (the launchd plist) and Level 2 (the health check cron). But it never created &lt;code&gt;.env&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Without &lt;code&gt;.env&lt;/code&gt;, &lt;code&gt;emergency-recovery-v2.sh&lt;/code&gt; would fail silently — no &lt;code&gt;DISCORD_WEBHOOK_URL&lt;/code&gt;, no &lt;code&gt;CLAUDE_API_KEY&lt;/code&gt;. Level 3 would attempt to run and immediately exit. Level 4 notifications would never fire.&lt;/p&gt;

&lt;p&gt;Both ends of the chain were broken.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v3.1.0 Fixes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The Escalation Logic
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# gateway-watchdog.sh v4.1&lt;/span&gt;
&lt;span class="nv"&gt;consecutive_failures&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;consecutive_failures &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$consecutive_failures&lt;/span&gt; &lt;span class="nt"&gt;-ge&lt;/span&gt; 30 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;log &lt;span class="s2"&gt;"CRITICAL: 30+ consecutive failures. Invoking Level 3 emergency recovery..."&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EMERGENCY_RECOVERY_SCRIPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;bash &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EMERGENCY_RECOVERY_SCRIPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &amp;amp;
        &lt;span class="nv"&gt;emergency_pid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$!&lt;/span&gt;
        log &lt;span class="s2"&gt;"Emergency recovery launched (PID: &lt;/span&gt;&lt;span class="nv"&gt;$emergency_pid&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
    &lt;span class="k"&gt;else
        &lt;/span&gt;log &lt;span class="s2"&gt;"ERROR: Emergency recovery script not found at &lt;/span&gt;&lt;span class="nv"&gt;$EMERGENCY_RECOVERY_SCRIPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        notify_discord &lt;span class="s2"&gt;"Level 3 unavailable — manual intervention required"&lt;/span&gt;
    &lt;span class="k"&gt;fi
fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix is three lines. The diagnosis took two days.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Graceful .env Loading
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# emergency-recovery-v2.sh&lt;/span&gt;
&lt;span class="nv"&gt;ENV_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/.openclaw/.env"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ENV_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="c"&gt;# shellcheck source=/dev/null&lt;/span&gt;
    &lt;span class="nb"&gt;source&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ENV_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    log &lt;span class="s2"&gt;"Environment loaded from &lt;/span&gt;&lt;span class="nv"&gt;$ENV_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;log &lt;span class="s2"&gt;"WARNING: .env not found at &lt;/span&gt;&lt;span class="nv"&gt;$ENV_FILE&lt;/span&gt;&lt;span class="s2"&gt; — Discord alerts may fail"&lt;/span&gt;
    &lt;span class="c"&gt;# Continue anyway; recovery can still run if the variables are already exported&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script now handles missing &lt;code&gt;.env&lt;/code&gt; gracefully instead of dying silently.&lt;/p&gt;
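
&lt;p&gt;Loading the file is only half of it; the variables can still be missing or empty. One way to make that loud is to check them by name after loading. This is a minimal sketch: &lt;code&gt;missing_vars&lt;/code&gt; is a hypothetical helper, and the two variable names are the ones mentioned above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: prints the name of every listed variable that is unset or empty.
missing_vars() {
    local name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name.
        if [[ -z "${!name:-}" ]]; then
            printf '%s\n' "$name"
        fi
    done
}

missing=$(missing_vars DISCORD_WEBHOOK_URL CLAUDE_API_KEY)
if [[ -n "$missing" ]]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "WARNING: missing required variables:" $missing
fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;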

&lt;h3&gt;
  
  
  3. Installer Rewrite
&lt;/h3&gt;

&lt;p&gt;The old installer did this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create launchd plist for Level 1&lt;/li&gt;
&lt;li&gt;Set up health check cron for Level 2&lt;/li&gt;
&lt;li&gt;Done ✓&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The new installer does this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Interactive setup (collect API keys, webhook URLs)&lt;/li&gt;
&lt;li&gt;Create &lt;code&gt;.env&lt;/code&gt; with all required variables&lt;/li&gt;
&lt;li&gt;Create launchd plist for Level 1&lt;/li&gt;
&lt;li&gt;Set up health check cron for Level 2&lt;/li&gt;
&lt;li&gt;Wire Level 2 → Level 3 escalation path&lt;/li&gt;
&lt;li&gt;Verify the full chain is connected&lt;/li&gt;
&lt;li&gt;Run a test invocation of each level&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Step 6 is the important addition. The installer now prints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✅ Level 1: launchd plist loaded
✅ Level 2: watchdog cron active
✅ Level 3: emergency recovery script found + executable
✅ Level 4: Discord webhook reachable (HTTP 204)
✅ Chain verification: all levels connected
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If any level fails verification, the installer tells you what's wrong and how to fix it.&lt;/p&gt;
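
&lt;p&gt;A verification step like this can be sketched as a few small checks that each print a pass or fail line. This is an illustration, not the installer's actual code: &lt;code&gt;check&lt;/code&gt; and &lt;code&gt;verify_chain&lt;/code&gt; are hypothetical names, and the sketch only tests that files exist rather than probing launchd or the webhook.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical sketch of a chain check; not the real install.sh.
failures=0

check() {
    # usage: check "label" command [args...]
    local label=$1
    shift
    if "$@"; then
        echo "✅ $label"
    else
        echo "❌ $label"
        failures=$((failures + 1))
    fi
}

verify_chain() {
    local recovery_script=$1 env_file=$2
    failures=0
    check "Level 3: emergency recovery script found + executable" test -x "$recovery_script"
    check "Level 4: .env present" test -f "$env_file"
    if [[ $failures -eq 0 ]]; then
        echo "✅ Chain verification: all levels connected"
    else
        echo "❌ Chain verification: $failures check(s) failed"
        return 1
    fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It would be called as &lt;code&gt;verify_chain "$EMERGENCY_RECOVERY_SCRIPT" "$HOME/.openclaw/.env"&lt;/code&gt;, and a non-zero return tells the installer to stop and report.&lt;/p&gt;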

&lt;h2&gt;
  
  
  The Meta-Lesson
&lt;/h2&gt;

&lt;p&gt;I fell into a trap I see often in infrastructure work:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I have a monitoring/alerting/recovery system" ≠ "my monitoring/alerting/recovery chain is actually connected."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Testing that each component &lt;em&gt;starts&lt;/em&gt; is not the same as testing that the system &lt;em&gt;works end-to-end&lt;/em&gt;. A log message that says "escalating" is not the same as code that escalates.&lt;/p&gt;

&lt;p&gt;For any multi-tier recovery system, the tests that matter are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Simulate a failure that triggers Level N&lt;/li&gt;
&lt;li&gt;Verify Level N+1 actually runs&lt;/li&gt;
&lt;li&gt;Verify Level N+1's output reaches the next tier&lt;/li&gt;
&lt;li&gt;Repeat for every handoff&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's tedious. It's the only way to know.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install / Upgrade
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Fresh install&lt;/span&gt;
curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/install.sh | bash

&lt;span class="c"&gt;# Upgrade from v3.0.0&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/openclaw-self-healing
git pull origin main
bash install.sh &lt;span class="nt"&gt;--upgrade&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supports macOS (launchd) and Linux (systemd). Level 3 requires &lt;code&gt;CLAUDE_API_KEY&lt;/code&gt;. Levels 1, 2, and 4 work without it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-self-healing&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you've built something similar — or found a similar "looks fine, is broken" bug in your own infrastructure — I'd love to hear about it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>selfhosted</category>
      <category>automation</category>
    </item>
    <item>
      <title>Building a Self-Healing AI System with Claude Code: When Your AI Doctor Fixes Your AI Assistant</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Mon, 09 Feb 2026 11:54:48 +0000</pubDate>
      <link>https://dev.to/ramsbaby/building-a-self-healing-ai-system-with-claude-code-when-your-ai-doctor-fixes-your-ai-assistant-28h4</link>
      <guid>https://dev.to/ramsbaby/building-a-self-healing-ai-system-with-claude-code-when-your-ai-doctor-fixes-your-ai-assistant-28h4</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: 3 AM Wake-Up Calls
&lt;/h2&gt;

&lt;p&gt;You know that feeling when your home automation dies at 3 AM and you have to SSH into your server half-asleep? I got tired of it.&lt;/p&gt;

&lt;p&gt;I run &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt;, a Claude-powered AI assistant platform, on my home server. It's great when it works, but servers crash. Configs get corrupted. Things break.&lt;/p&gt;

&lt;p&gt;So I built a self-healing system that uses &lt;strong&gt;Claude Code as an emergency doctor&lt;/strong&gt; for my AI assistant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: 4 Levels of Recovery
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Level 1: Quick Restart (30 sec)
    ↓ if fails
Level 2: Claude API Diagnostics
    ↓ if fails
Level 3: Claude Code Deep Repair
    ↓ if fails
Level 4: Recovery Mode + Alerts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Level 1: The Watchdog
&lt;/h3&gt;

&lt;p&gt;A simple LaunchAgent that runs every 60 seconds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; pgrep &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"openclaw gateway"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;openclaw gateway start
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Gateway restarted at &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/openclaw-recovery.log
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches 90% of issues. Simple crashes? Fixed in 30 seconds.&lt;/p&gt;
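
&lt;p&gt;Because the LaunchAgent starts a fresh process on every run, counting failures toward the next level needs state that survives between runs. Here is a minimal sketch that keeps the count in a file. &lt;code&gt;COUNT_FILE&lt;/code&gt;, &lt;code&gt;bump_failures&lt;/code&gt;, and &lt;code&gt;reset_failures&lt;/code&gt; are hypothetical names, not part of OpenClaw:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical counter kept in a file so it survives between watchdog runs.
COUNT_FILE="${COUNT_FILE:-/tmp/openclaw-watchdog.failures}"

bump_failures() {
    local n=0
    if [[ -f "$COUNT_FILE" ]]; then
        n=$(cat "$COUNT_FILE")
    fi
    n=$((n + 1))
    # tee stores the new count in the file and also prints it for the caller.
    printf '%s\n' "$n" | tee "$COUNT_FILE"
}

reset_failures() {
    rm -f "$COUNT_FILE"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The watchdog would call &lt;code&gt;reset_failures&lt;/code&gt; after a successful restart and escalate once &lt;code&gt;bump_failures&lt;/code&gt; prints 3.&lt;/p&gt;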

&lt;h3&gt;
  
  
  Level 2: Intelligent Diagnostics
&lt;/h3&gt;

&lt;p&gt;When Level 1 fails 3 times, we call the Claude API to analyze the situation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
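&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# $ERROR_LOG must already be JSON-safe (no raw double quotes or newlines)&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"https://api.anthropic.com/v1/messages"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-api-key: &lt;/span&gt;&lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"anthropic-version: 2023-06-01"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"content-type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{
      "role": "user",
      "content": "Analyze this error and suggest fixes: '&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ERROR_LOG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s1"&gt;'"
    }]
  }'&lt;/span&gt;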
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude identifies patterns like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Config syntax errors&lt;/li&gt;
&lt;li&gt;Permission issues
&lt;/li&gt;
&lt;li&gt;Port conflicts&lt;/li&gt;
&lt;li&gt;Missing dependencies&lt;/li&gt;
&lt;/ul&gt;
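
&lt;p&gt;One caveat when sending logs this way: splicing a raw log into a JSON body breaks as soon as the log contains a double quote or a newline. Below is a minimal sketch of escaping it first in plain bash. &lt;code&gt;json_escape&lt;/code&gt; is a hypothetical helper and only covers the most common characters, so a tool such as &lt;code&gt;jq&lt;/code&gt; is the safer choice where it is installed.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: escapes backslashes, double quotes, and common
# whitespace control characters so the text can sit inside a JSON string.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}        # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}        # double quote
    s=${s//$'\n'/\\n}      # newline
    s=${s//$'\r'/\\r}      # carriage return
    s=${s//$'\t'/\\t}      # tab
    printf '%s' "$s"
}

SAFE_LOG=$(json_escape "${ERROR_LOG:-}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;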

&lt;h3&gt;
  
  
  Level 3: Claude Code as Emergency Doctor
&lt;/h3&gt;

&lt;p&gt;This is where it gets interesting. When diagnostics aren't enough, we unleash Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/opt/homebrew/bin/claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"You are an emergency recovery agent. The OpenClaw gateway is down.
   Error: &lt;/span&gt;&lt;span class="nv"&gt;$ERROR_LOG&lt;/span&gt;&lt;span class="s2"&gt;

   1. Analyze the root cause
   2. Attempt to fix it
   3. Validate the fix
   4. Report what you did"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read and edit config files&lt;/li&gt;
&lt;li&gt;Check system state&lt;/li&gt;
&lt;li&gt;Run diagnostic commands&lt;/li&gt;
&lt;li&gt;Apply fixes autonomously&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Level 4: Recovery Mode
&lt;/h3&gt;

&lt;p&gt;If all else fails, the system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sends Discord/Telegram alerts&lt;/li&gt;
&lt;li&gt;Creates a detailed incident report&lt;/li&gt;
&lt;li&gt;Enters safe mode with minimal config&lt;/li&gt;
&lt;/ol&gt;
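
&lt;p&gt;The Discord alert can be a single webhook call. This is a minimal sketch, assuming the webhook URL lives in &lt;code&gt;DISCORD_WEBHOOK_URL&lt;/code&gt;. &lt;code&gt;send_discord_alert&lt;/code&gt; is a hypothetical name, not the project's real function, and the message is assumed to contain nothing that needs JSON escaping.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical alert helper; assumes DISCORD_WEBHOOK_URL is exported.
send_discord_alert() {
    local message=$1
    if [[ -z "${DISCORD_WEBHOOK_URL:-}" ]]; then
        echo "send_discord_alert: DISCORD_WEBHOOK_URL is not set" &amp;gt;&amp;amp;2
        return 1
    fi
    # Discord webhooks accept a JSON body with a "content" field.
    curl -sS --max-time 10 \
        -H "Content-Type: application/json" \
        -d "{\"content\": \"$message\"}" \
        "$DISCORD_WEBHOOK_URL"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning non-zero when the URL is missing lets the caller log that the alert was never sent.&lt;/p&gt;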

&lt;h2&gt;
  
  
  Real-World Example
&lt;/h2&gt;

&lt;p&gt;Last week, a config corruption crashed my gateway. Here's what happened:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;19:15:00 - Gateway crashed
19:15:30 - Level 1: Restart attempted (failed - config error)
19:16:00 - Level 1: Retry (failed)
19:16:30 - Level 1: Retry (failed)
19:17:00 - Level 2: Claude API diagnosed "Invalid JSON in config"
19:17:05 - Level 3: Claude Code activated
19:17:30 - Claude Code: Found corrupted unicode character
19:17:45 - Claude Code: Fixed config, validated syntax
19:18:00 - Gateway restarted successfully
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Total downtime: ~3 minutes.&lt;/strong&gt; Without this system? I'd have woken up to broken automations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Lessons
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A lock mechanism is crucial&lt;/strong&gt; - Prevent race conditions between the watchdog and the recovery scripts
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;LOCK_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/tmp/openclaw-emergency-recovery.lock"&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Recovery in progress, skipping..."&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
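
&lt;p&gt;Checking whether a file exists leaves a small window in which two scripts can both see no lock and both proceed. One common alternative is &lt;code&gt;mkdir&lt;/code&gt;, which either creates the directory or fails in a single step. Here is a minimal sketch, with &lt;code&gt;LOCK_DIR&lt;/code&gt; and &lt;code&gt;acquire_lock&lt;/code&gt; as hypothetical names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical lock helper built on mkdir.
LOCK_DIR="${LOCK_DIR:-/tmp/openclaw-emergency-recovery.lock.d}"

acquire_lock() {
    # mkdir is atomic: only one caller can succeed in creating the directory.
    if mkdir "$LOCK_DIR" 2&amp;gt;/dev/null; then
        # Release the lock when this script exits, whatever the reason.
        trap 'rmdir "$LOCK_DIR"' EXIT
        return 0
    fi
    return 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A recovery script would start with &lt;code&gt;acquire_lock || exit 0&lt;/code&gt;, so a second copy exits instead of racing the first.&lt;/p&gt;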



&lt;ol start="2"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Absolute paths for LaunchAgents&lt;/strong&gt; - Limited PATH means you need full paths like &lt;code&gt;/opt/homebrew/bin/claude&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Graduated escalation&lt;/strong&gt; - Don't jump to heavy solutions; simple restarts solve most problems&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Get the Code
&lt;/h2&gt;

&lt;p&gt;The full implementation is open source:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;Ramsbaby/openclaw-self-healing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installation scripts&lt;/li&gt;
&lt;li&gt;LaunchAgent configurations
&lt;/li&gt;
&lt;li&gt;All recovery scripts&lt;/li&gt;
&lt;li&gt;Detailed documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;I'm working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pattern learning&lt;/strong&gt; - Remember past failures to prevent recurrence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictive health&lt;/strong&gt; - Detect issues before they cause crashes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-node support&lt;/strong&gt; - Coordinate recovery across distributed systems&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Have you built self-healing systems? I'd love to hear about your approaches in the comments!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post is part of my journey building autonomous AI infrastructure. Follow for more posts about Claude, home automation, and making computers fix themselves.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudeai</category>
      <category>selfhosted</category>
      <category>devops</category>
    </item>
    <item>
      <title>I Built a Self-Healing AI System Using Claude Code as Emergency Doctor</title>
      <dc:creator>ramsbaby</dc:creator>
      <pubDate>Sat, 07 Feb 2026 01:26:05 +0000</pubDate>
      <link>https://dev.to/ramsbaby/i-built-a-self-healing-ai-system-using-claude-code-as-emergency-doctor-183f</link>
      <guid>https://dev.to/ramsbaby/i-built-a-self-healing-ai-system-using-claude-code-as-emergency-doctor-183f</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;My OpenClaw AI gateway kept crashing at 3am. Every morning I'd wake up to a dead agent. Manual restarts were getting old.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: 4-Tier Self-Healing
&lt;/h2&gt;

&lt;p&gt;I built an autonomous recovery system with escalating levels:&lt;/p&gt;

&lt;h3&gt;
  
  
  Level 1: Watchdog (180s interval)
&lt;/h3&gt;

&lt;p&gt;Simple process monitoring. If the gateway process dies, restart it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Level 2: Health Check (5min interval)
&lt;/h3&gt;

&lt;p&gt;HTTP 200 verification with 3 retries. Catches zombie processes that are running but not responding.&lt;/p&gt;
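
&lt;p&gt;A check like this can be sketched with &lt;code&gt;curl&lt;/code&gt; alone. The URL, the retry delay, and the function name &lt;code&gt;gateway_healthy&lt;/code&gt; are assumptions for illustration. This post does not say which endpoint the real health check probes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical health check; endpoint and timings are placeholders.
HEALTH_URL="${HEALTH_URL:-http://localhost:8080/health}"
RETRIES=3
RETRY_DELAY="${RETRY_DELAY:-5}"   # seconds between attempts

gateway_healthy() {
    local attempt code
    for attempt in $(seq 1 "$RETRIES"); do
        # -w '%{http_code}' prints only the status code; the body is discarded.
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$HEALTH_URL")
        if [[ "$code" == "200" ]]; then
            return 0
        fi
        echo "[WARN] Health check failed (attempt $attempt/$RETRIES)"
        if [[ $attempt -lt $RETRIES ]]; then
            sleep "$RETRY_DELAY"
        fi
    done
    return 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A process that is alive but hung never returns 200 within the timeout, so all three attempts fail and the function returns non-zero.&lt;/p&gt;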

&lt;h3&gt;
  
  
  Level 3: Claude Code Doctor (30min session)
&lt;/h3&gt;

&lt;p&gt;Here's where it gets interesting. When Level 2 fails 3 times, the system spawns a &lt;strong&gt;Claude Code session in tmux&lt;/strong&gt;. Claude reads logs, diagnoses issues, and attempts autonomous repair.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude runs in PTY, has full system access&lt;/span&gt;
tmux new-session &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; emergency-recovery
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; emergency-recovery &lt;span class="s2"&gt;"claude --dangerously-skip-permissions"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Level 4: Discord Alert
&lt;/h3&gt;

&lt;p&gt;If even Claude can't fix it, humans get notified.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;This is &lt;strong&gt;AI fixing AI&lt;/strong&gt;. It is self-healing at the meta level: when your AI agent dies, another AI diagnoses and repairs it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Verified recovery on Feb 5, 2026&lt;/li&gt;
&lt;li&gt;30-minute autonomous diagnosis window&lt;/li&gt;
&lt;li&gt;Zero 3am wake-up calls since deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  One-Click Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Ramsbaby/openclaw-self-healing" rel="noopener noreferrer"&gt;https://github.com/Ramsbaby/openclaw-self-healing&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Built by Jarvis (an AI assistant) for ramsbaby. Questions welcome!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>selfhosted</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
