<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Prakersh Maheshwari</title>
    <description>The latest articles on DEV Community by Prakersh Maheshwari (@prakershm).</description>
    <link>https://dev.to/prakershm</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3804652%2Fc5022236-37f5-4afd-86fd-9414bd82e9cb.jpg</url>
      <title>DEV Community: Prakersh Maheshwari</title>
      <link>https://dev.to/prakershm</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/prakershm"/>
    <language>en</language>
    <item>
      <title>How I Built a &lt;50MB RAM Daemon in Go That Tracks 7 AI APIs - And Then Gave It a macOS Menu Bar</title>
      <dc:creator>Prakersh Maheshwari</dc:creator>
      <pubDate>Sun, 15 Mar 2026 20:22:48 +0000</pubDate>
      <link>https://dev.to/prakershm/how-i-built-a-50mb-ram-daemon-in-go-that-tracks-7-ai-apis-and-then-gave-it-a-macos-menu-bar-5367</link>
      <guid>https://dev.to/prakershm/how-i-built-a-50mb-ram-daemon-in-go-that-tracks-7-ai-apis-and-then-gave-it-a-macos-menu-bar-5367</guid>
      <description>&lt;p&gt;Last month I shipped a macOS menu bar companion for onWatch — my open-source CLI that tracks AI API quotas across 7 providers. Building it forced me to solve problems I had never touched: spawning a child process from a daemon, bridging Go and Objective-C with CGo, rendering a native popover backed by WKWebView, and coordinating state between two OS processes via SIGUSR1.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj88jey1a6jekzebh4kk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj88jey1a6jekzebh4kk.png" alt=" " width="724" height="1280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post walks through the architecture decisions, the trade-offs, and the code. Every snippet is from the actual codebase.&lt;/p&gt;

&lt;h2&gt;What onWatch Does&lt;/h2&gt;

&lt;p&gt;I use Claude Code, Codex CLI, GitHub Copilot, and several others daily. Each provider has its own dashboard with different billing cycles, different quota types, and different definitions of "usage." I got tired of opening 7 browser tabs every morning, so I built a single CLI that polls them all from one terminal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs3b3ckio6i50g624ztl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs3b3ckio6i50g624ztl.png" alt=" " width="800" height="747"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;onWatch runs as a background daemon. It polls 7 providers in parallel, stores history in SQLite, and serves a Material Design 3 web dashboard. Single binary, &amp;lt;50MB RAM, zero telemetry.&lt;/p&gt;

&lt;p&gt;The 7 providers: Anthropic (Claude Code), OpenAI Codex, GitHub Copilot, MiniMax, Antigravity, Synthetic, and Z.ai.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/onllm-dev/onwatch" rel="noopener noreferrer"&gt;github.com/onllm-dev/onwatch&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The Goroutine-per-Provider Pattern&lt;/h2&gt;

&lt;p&gt;Each provider runs as an independent &lt;code&gt;Agent&lt;/code&gt; in its own goroutine. If Anthropic's API is slow, Codex keeps polling. If Z.ai returns an error, Copilot is unaffected.&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;internal/agent/agent.go&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;       &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;
    &lt;span class="n"&gt;tracker&lt;/span&gt;      &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tracker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tracker&lt;/span&gt;
    &lt;span class="n"&gt;interval&lt;/span&gt;     &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;       &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;slog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Logger&lt;/span&gt;
    &lt;span class="n"&gt;sm&lt;/span&gt;           &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;SessionManager&lt;/span&gt;
    &lt;span class="n"&gt;notifier&lt;/span&gt;     &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;notify&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NotificationEngine&lt;/span&gt;
    &lt;span class="n"&gt;pollingCheck&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Agent started"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"interval"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sm&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Agent stopped"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;ticker&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewTicker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean shutdown via &lt;code&gt;context.Context&lt;/code&gt; cancellation. The &lt;code&gt;pollingCheck&lt;/code&gt; callback lets the dashboard toggle individual providers on/off at runtime without restarting the daemon.&lt;/p&gt;
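
&lt;p&gt;The excerpt above does not show where &lt;code&gt;pollingCheck&lt;/code&gt; is consulted, so here is a minimal sketch of how such a gate could work. The &lt;code&gt;gate&lt;/code&gt; type and &lt;code&gt;shouldPoll&lt;/code&gt; method are hypothetical names, and treating a nil callback as "always enabled" is an assumption, not something taken from the onWatch source.&lt;/p&gt;

```go
package main

import "fmt"

// gate is a hypothetical stand-in for the Agent's pollingCheck field.
type gate struct {
	pollingCheck func() bool
}

// shouldPoll reports whether the next poll cycle should run.
// A nil callback is treated as "always enabled" (an assumption).
func (g gate) shouldPoll() bool {
	if g.pollingCheck == nil {
		return true
	}
	return g.pollingCheck()
}

func main() {
	enabled := true
	// The closure reads the flag on every call, so a toggle takes effect
	// on the next tick without restarting anything.
	g := gate{pollingCheck: func() bool { return enabled }}
	fmt.Println(g.shouldPoll()) // true
	enabled = false
	fmt.Println(g.shouldPoll()) // false
}
```

&lt;p&gt;Because the callback is evaluated per tick, the ticker loop keeps running while a provider is switched off; it just skips the work.&lt;/p&gt;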

&lt;h2&gt;SQLite as the Only Dependency&lt;/h2&gt;

&lt;p&gt;No Postgres, no Redis, no message queue. One SQLite file with WAL mode. On my running instance, the database file is 55MB after weeks of continuous polling across all 7 providers.&lt;/p&gt;

&lt;p&gt;The key insight: per-cycle historical usage patterns are more valuable than point-in-time snapshots. Every provider shows you current usage. None of them show you which sessions burn quota fastest, which days you're heaviest, or when exactly your cycle resets.&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;internal/store/store.go&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMaxOpenConns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMaxIdleConns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;pragmas&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"PRAGMA journal_mode=WAL;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"PRAGMA synchronous=NORMAL;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"PRAGMA cache_size=-500;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c"&gt;// negative value = KiB: 500 KiB (~512 KB) page cache&lt;/span&gt;
    &lt;span class="s"&gt;"PRAGMA foreign_keys=ON;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"PRAGMA busy_timeout=5000;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The comment in the source explains why: "SQLite is single-writer anyway, and each connection allocates its own page cache (~2 MB with default settings). Limiting to 1 connection saves 2-4 MB RSS."&lt;/p&gt;

&lt;p&gt;The SQLite driver is &lt;code&gt;modernc.org/sqlite&lt;/code&gt; — a pure Go transpilation of SQLite. No CGO required for the main binary. This is what makes cross-compilation trivial and the Docker image tiny.&lt;/p&gt;

&lt;h2&gt;Memory Tuning: How &amp;lt;50MB Actually Works&lt;/h2&gt;

&lt;p&gt;My running instance uses ~34MB RSS with all 7 providers polling every 60 seconds. Here is exactly how:&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;main.go&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Memory tuning: GOMEMLIMIT triggers MADV_DONTNEED which actually shrinks RSS.&lt;/span&gt;
&lt;span class="c"&gt;// Without this, Go uses MADV_FREE on macOS - pages are reclaimable but still&lt;/span&gt;
&lt;span class="c"&gt;// counted in RSS, causing a permanent ratchet effect.&lt;/span&gt;
&lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMemoryLimit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;40&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// 40 MiB soft limit&lt;/span&gt;
&lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetGCPercent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="c"&gt;// GC at 50% heap growth&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without &lt;code&gt;GOMEMLIMIT&lt;/code&gt;, Go on macOS uses &lt;code&gt;MADV_FREE&lt;/code&gt; — the kernel marks pages as reclaimable but they still show up in RSS. Your process looks like it is leaking memory when it is not. Setting a memory limit switches Go to &lt;code&gt;MADV_DONTNEED&lt;/code&gt;, which actually returns pages to the OS. This one line is the difference between "34MB" and "looks like 80MB."&lt;/p&gt;

&lt;p&gt;Other contributing decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bounded queries everywhere: &lt;code&gt;LIMIT 200&lt;/code&gt; for cycles, &lt;code&gt;LIMIT 50&lt;/code&gt; for insights&lt;/li&gt;
&lt;li&gt;All static assets embedded via &lt;code&gt;embed.FS&lt;/code&gt;: the bytes live in the binary itself, so serving HTML/JS/CSS needs no file reads and no separate in-memory asset cache&lt;/li&gt;
&lt;li&gt;Every API client caps response reads: &lt;code&gt;io.ReadAll(io.LimitReader(resp.Body, 1&amp;lt;&amp;lt;16))&lt;/code&gt; — 64KB max per response, preventing memory exhaustion from malformed API payloads&lt;/li&gt;
&lt;li&gt;No ORM. Parameterized SQL only.&lt;/li&gt;
&lt;/ul&gt;
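
&lt;p&gt;The capped-read bullet above can be sketched as follows. &lt;code&gt;readCapped&lt;/code&gt; and &lt;code&gt;cappedLen&lt;/code&gt; are hypothetical helper names for illustration; only the &lt;code&gt;io.ReadAll(io.LimitReader(...))&lt;/code&gt; pattern and the 64KB cap come from the article.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// maxBody mirrors the 64KB cap from the article (64 * 1024 bytes).
const maxBody = 64 * 1024

// readCapped reads at most maxBody bytes from r and ignores the rest.
func readCapped(r io.Reader) ([]byte, error) {
	return io.ReadAll(io.LimitReader(r, maxBody))
}

// cappedLen is a hypothetical helper: it feeds an n-byte payload through
// readCapped and returns how many bytes were kept.
func cappedLen(n int) int {
	body, err := readCapped(strings.NewReader(strings.Repeat("x", n)))
	if err != nil {
		return -1
	}
	return len(body)
}

func main() {
	fmt.Println(cappedLen(1000))   // 1000
	fmt.Println(cappedLen(100000)) // 65536
}
```

&lt;p&gt;Note that an oversized body is truncated silently, so a real client would probably want to treat a read of exactly the cap as suspicious before parsing it.&lt;/p&gt;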

&lt;h2&gt;The Single Binary Trick&lt;/h2&gt;

&lt;p&gt;The entire application — HTML templates, JavaScript, CSS, service worker manifest, favicon, menubar frontend — is embedded using Go's &lt;code&gt;embed.FS&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;//go:embed templates/*.html&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;templatesFS&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FS&lt;/span&gt;

&lt;span class="c"&gt;//go:embed all:static/*&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;staticFS&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even the version string is embedded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;//go:embed VERSION&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;embeddedVersion&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No npm build step, no external files, no runtime file access. You download one ~15MB file and run it. The Docker distroless image is ~10MB because &lt;code&gt;-ldflags="-s -w"&lt;/code&gt; strips debug symbols.&lt;/p&gt;

&lt;h2&gt;The 7 Provider Clients&lt;/h2&gt;

&lt;p&gt;Each provider lives in &lt;code&gt;internal/api/&lt;/code&gt; with its own client file. Some of the implementation details are genuinely interesting:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic&lt;/strong&gt; auto-detects tokens from Claude Code's credential files (&lt;code&gt;~/.claude/.credentials.json&lt;/code&gt; or macOS Keychain). The agent calls &lt;code&gt;SetTokenRefresh()&lt;/code&gt; before each poll to proactively refresh OAuth tokens before they expire. No manual configuration needed if Claude Code is installed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Codex&lt;/strong&gt; has dual-endpoint fallback. &lt;code&gt;buildCodexFallbackBaseURL()&lt;/code&gt; tries &lt;code&gt;/backend-api/wham/usage&lt;/code&gt; first and falls back to &lt;code&gt;/api/codex/usage&lt;/code&gt; on a 404, because OpenAI serves different URL shapes in different environments. It also supports multi-account polling: the &lt;code&gt;CodexAgentManager&lt;/code&gt; runs per-account agents with per-account visibility toggles.&lt;/p&gt;
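
&lt;p&gt;A rough sketch of the 404 fallback idea, not the actual Codex client: &lt;code&gt;fetchWithFallback&lt;/code&gt; and the &lt;code&gt;getFunc&lt;/code&gt; transport hook are hypothetical, and the fake transport in &lt;code&gt;main&lt;/code&gt; stands in for real HTTP calls. Only the two paths are taken from the article.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// getFunc is a hypothetical transport hook: it returns the HTTP status
// code and body for a path.
type getFunc func(path string) (status int, body string)

// fetchWithFallback tries each path in order and moves to the next one
// only when the current path answers 404.
func fetchWithFallback(paths []string, get getFunc) (string, error) {
	for _, p := range paths {
		status, body := get(p)
		switch status {
		case 200:
			return body, nil
		case 404:
			continue // this URL shape is not served here; try the next one
		default:
			return "", fmt.Errorf("%s: unexpected status %d", p, status)
		}
	}
	return "", errors.New("no usage endpoint responded")
}

func main() {
	paths := []string{"/backend-api/wham/usage", "/api/codex/usage"}
	// Fake transport: only the second path exists in this environment.
	get := func(path string) (int, string) {
		if path == "/api/codex/usage" {
			return 200, `{"ok":true}`
		}
		return 404, ""
	}
	fmt.Println(fetchWithFallback(paths, get))
}
```

&lt;p&gt;Falling back only on 404 matters: an auth failure or a server error on the first path should surface as an error rather than being masked by a second request.&lt;/p&gt;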

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8af0cda1f4rvthu7ir0s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8af0cda1f4rvthu7ir0s.png" alt=" " width="800" height="1050"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Antigravity&lt;/strong&gt; has no static endpoint at all. The client calls &lt;code&gt;detectProcess()&lt;/code&gt; to find the Antigravity language server PID via &lt;code&gt;os/exec&lt;/code&gt;, then &lt;code&gt;discoverPorts()&lt;/code&gt; on that PID, then probes each port to find the Connect RPC endpoint. Zero configuration. It also sets &lt;code&gt;InsecureSkipVerify: true&lt;/code&gt; because the local server uses self-signed certificates — acceptable since it only ever talks to localhost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Z.ai&lt;/strong&gt; returns HTTP 200 with an error code in the body. The client parses &lt;code&gt;wrapper.Code == 401&lt;/code&gt; from the JSON response to detect auth failures rather than relying on HTTP status codes. This was fun to debug.&lt;/p&gt;
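
&lt;p&gt;To illustrate the "HTTP 200 with an error in the body" case, here is a small sketch. The &lt;code&gt;wrapper&lt;/code&gt; struct, its JSON field names (&lt;code&gt;code&lt;/code&gt;, &lt;code&gt;msg&lt;/code&gt;), and &lt;code&gt;isAuthFailure&lt;/code&gt; are assumptions made for the example; the real response shape may differ.&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wrapper is a hypothetical envelope; the JSON field names are assumed.
type wrapper struct {
	Code int    `json:"code"`
	Msg  string `json:"msg"`
}

// isAuthFailure inspects the body of an HTTP 200 response and reports
// whether the embedded status code signals an authentication failure.
func isAuthFailure(body []byte) (bool, error) {
	w := new(wrapper)
	if err := json.Unmarshal(body, w); err != nil {
		return false, err
	}
	return w.Code == 401, nil
}

func main() {
	fmt.Println(isAuthFailure([]byte(`{"code":401,"msg":"token expired"}`)))
	fmt.Println(isAuthFailure([]byte(`{"code":200,"msg":"ok"}`)))
}
```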

&lt;p&gt;&lt;strong&gt;MiniMax&lt;/strong&gt; has shared-quota detection: &lt;code&gt;IsSharedQuota()&lt;/code&gt; checks if all active models (M2, M2.1, M2.5) report the same total/remain values. When true, the dashboard renders a single merged card instead of per-model cards.&lt;/p&gt;
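
&lt;p&gt;The shared-quota check can be sketched like this. The &lt;code&gt;modelQuota&lt;/code&gt; type and the lowercase &lt;code&gt;isSharedQuota&lt;/code&gt; function are hypothetical, and requiring at least two models before calling a quota "shared" is my assumption rather than documented behavior.&lt;/p&gt;

```go
package main

import "fmt"

// modelQuota is a hypothetical per-model reading.
type modelQuota struct {
	Model  string
	Total  int64
	Remain int64
}

// isSharedQuota reports whether every model shows identical total and
// remaining values, which suggests they all draw from one pool.
func isSharedQuota(models []modelQuota) bool {
	switch len(models) {
	case 0, 1:
		return false // nothing to compare against (assumed behavior)
	}
	first := models[0]
	for _, m := range models[1:] {
		if m.Total != first.Total || m.Remain != first.Remain {
			return false
		}
	}
	return true
}

func main() {
	shared := []modelQuota{
		{"M2", 1000, 400}, {"M2.1", 1000, 400}, {"M2.5", 1000, 400},
	}
	split := []modelQuota{{"M2", 1000, 400}, {"M2.5", 500, 100}}
	fmt.Println(isSharedQuota(shared)) // true
	fmt.Println(isSharedQuota(split))  // false
}
```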

&lt;p&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt; tracks Premium Requests, Chat, and Completions quotas with monthly reset cycles. Auth is via GitHub PAT with the &lt;code&gt;copilot&lt;/code&gt; scope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synthetic&lt;/strong&gt; tracks Subscription, Search, and Tool Call quotas via its &lt;code&gt;/v2/quotas&lt;/code&gt; endpoint.&lt;/p&gt;

&lt;h2&gt;The macOS Menu Bar: A Separate Process, Not a Goroutine&lt;/h2&gt;

&lt;p&gt;This was the most architecturally interesting decision. The menu bar is not a goroutine in the daemon process — it is a separate OS process.&lt;/p&gt;

&lt;p&gt;Why? &lt;code&gt;systray.Run()&lt;/code&gt; from &lt;code&gt;fyne.io/systray&lt;/code&gt; must block the OS main thread on macOS (Cocoa requirement). If we ran it in the daemon process, Cocoa would take ownership of the main thread, and the HTTP server and pollers would all have to run on background goroutines around it. That is the wrong inversion.&lt;/p&gt;

&lt;p&gt;Instead, the daemon spawns itself with a different subcommand. The daemon calls &lt;code&gt;startMenubarCompanion()&lt;/code&gt;, which runs &lt;code&gt;exec.Command(exe, "menubar", "--port=...", "--db=...")&lt;/code&gt; — the same binary, re-invoked as a child process. It writes a PID file to &lt;code&gt;~/.onwatch/onwatch-menubar.pid&lt;/code&gt; and redirects logs to a rotating log file.&lt;/p&gt;
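
&lt;p&gt;Here is a minimal sketch of the self re-invocation, under stated assumptions: &lt;code&gt;companionCommand&lt;/code&gt; is a hypothetical helper, and the port and database path passed in &lt;code&gt;main&lt;/code&gt; are placeholder values. Only the &lt;code&gt;menubar&lt;/code&gt; subcommand and the &lt;code&gt;--port=&lt;/code&gt; and &lt;code&gt;--db=&lt;/code&gt; flag names come from the text above. The sketch builds the command but does not start it.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// companionCommand builds the command that re-invokes the given binary
// with the menubar subcommand. It does not start the process.
func companionCommand(exe string, port int, dbPath string) *exec.Cmd {
	return exec.Command(exe, "menubar",
		fmt.Sprintf("--port=%d", port),
		"--db="+dbPath,
	)
}

func main() {
	// os.Executable returns the path of the running binary, which is what
	// lets a daemon spawn "itself" with a different subcommand.
	exe, err := os.Executable()
	if err != nil {
		fmt.Println("cannot resolve own path:", err)
		return
	}
	cmd := companionCommand(exe, 9211, "onwatch.db") // placeholder values
	fmt.Println(cmd.Args[1:])
}
```

&lt;p&gt;A real launcher would then call &lt;code&gt;cmd.Start()&lt;/code&gt; and record &lt;code&gt;cmd.Process.Pid&lt;/code&gt; in the PID file.&lt;/p&gt;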

&lt;h2&gt;The SIGUSR1 Refresh Protocol&lt;/h2&gt;

&lt;p&gt;The two processes need to stay in sync. When the daemon polls a provider and gets new data, the menu bar should update immediately rather than waiting for its next tick.&lt;/p&gt;

&lt;p&gt;The solution is SIGUSR1:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;refreshCompanionSignal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;syscall&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SIGUSR1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After each successful quota poll, the daemon sends SIGUSR1 to the menu bar PID. The companion's signal goroutine calls &lt;code&gt;controller.refreshStatus()&lt;/code&gt; immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;runCompanion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;controller&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;trayController&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;signalChan&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Signal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Notify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signalChan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refreshCompanionSignal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signalChan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;signalChan&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;refreshStatus&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="n"&gt;systray&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;onReady&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;onExit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The menu bar also runs its own refresh loop as a fallback — if SIGUSR1 is missed (process paused, signal lost), the next tick picks it up.&lt;/p&gt;
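
&lt;p&gt;The snippets above show the receiving side. For the sending side, a sketch might look like the following. &lt;code&gt;parsePID&lt;/code&gt; and &lt;code&gt;notifyCompanion&lt;/code&gt; are hypothetical names, and I am assuming the PID file holds a plain decimal PID. This is Unix-only, since &lt;code&gt;syscall.SIGUSR1&lt;/code&gt; does not exist on Windows.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// parsePID extracts a positive PID from the contents of a PID file
// (assumed format: a decimal number, optionally surrounded by whitespace).
func parsePID(data []byte) (int, error) {
	pid, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return 0, err
	}
	if pid > 0 {
		return pid, nil
	}
	return 0, errors.New("pid must be positive")
}

// notifyCompanion reads the PID file and sends SIGUSR1 to that process.
func notifyCompanion(pidFile string) error {
	data, err := os.ReadFile(pidFile)
	if err != nil {
		return err
	}
	pid, err := parsePID(data)
	if err != nil {
		return err
	}
	proc, err := os.FindProcess(pid)
	if err != nil {
		return err
	}
	return proc.Signal(syscall.SIGUSR1)
}

func main() {
	fmt.Println(parsePID([]byte("4321\n")))
	// A missing PID file just means there is no companion to notify.
	fmt.Println(notifyCompanion("no-such-file.pid") != nil) // true
}
```

&lt;p&gt;Errors from the sender can reasonably be logged and ignored, because the companion's own refresh loop covers any signal that never arrives.&lt;/p&gt;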

&lt;h2&gt;The Native Popover: Go + Objective-C via CGo&lt;/h2&gt;

&lt;p&gt;Clicking the tray icon does not open a browser tab. It opens a native &lt;code&gt;NSPanel&lt;/code&gt; backed by &lt;code&gt;WKWebView&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;internal/menubar/popover_darwin.m&lt;/code&gt;, the popover is a 360x680px borderless panel. The &lt;code&gt;OnWatchBorderlessPanel&lt;/code&gt; subclass overrides three methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;canBecomeKeyWindow&lt;/code&gt; returns YES (keyboard accessible)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;canBecomeMainWindow&lt;/code&gt; returns NO (never steals focus from your editor)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;acceptsFirstMouse&lt;/code&gt; returns YES (responds to first click without requiring activation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Go layer in &lt;code&gt;webview_darwin.go&lt;/code&gt; wraps these Objective-C calls with &lt;code&gt;unsafe.Pointer&lt;/code&gt; handles. When CGo is unavailable (non-macOS builds), it falls back to opening a browser tab at &lt;code&gt;http://localhost:9211/menubar&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Build tag isolation keeps this clean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;//go:build menubar &amp;amp;&amp;amp; darwin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;menubar_stub.go&lt;/code&gt; provides no-ops for all other platforms. The standard binary compiles on Linux and Windows without any macOS dependencies.&lt;/p&gt;

&lt;h2&gt;Three Display Modes for the Tray Title&lt;/h2&gt;

&lt;p&gt;The tray icon shows live data in three configurable modes.&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;internal/menubar/tray_display.go&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TrayTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshot&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Snapshot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// normalized is derived from settings; that step is omitted from this excerpt.&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;normalized&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusDisplay&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mode&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;StatusDisplayIconOnly&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;StatusDisplayCriticalCount&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;snapshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Aggregate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WarningCount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;snapshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Aggregate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CriticalCount&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d ⚠"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;StatusDisplayMultiProvider&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;multiProviderMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalized&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusDisplay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;joinTrayParts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;multi_provider&lt;/strong&gt; — Live percentages per selected quota, separated by &lt;code&gt;│&lt;/code&gt;. Example: &lt;code&gt;67% │ 23% │ 91%&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;critical_count&lt;/strong&gt; — Count of warning + critical quotas. Example: &lt;code&gt;2 ⚠&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;icon_only&lt;/strong&gt; — No text, just the icon&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Users configure which quotas appear in the tray via Settings &amp;gt; Menubar. The &lt;code&gt;normalizeStatusSelections()&lt;/code&gt; function deduplicates entries, caps at 3, and strips empty values.&lt;/p&gt;
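
&lt;p&gt;The three rules (strip empties, deduplicate, cap at 3) can be sketched as below. This is not the real &lt;code&gt;normalizeStatusSelections()&lt;/code&gt;: the name &lt;code&gt;normalizeSelections&lt;/code&gt;, the plain string slice, and the choice to keep the first occurrence of each entry are assumptions for illustration.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// maxSelections is the cap mentioned in the article.
const maxSelections = 3

// normalizeSelections drops blank entries, removes duplicates while
// keeping first-seen order, and stops once maxSelections are collected.
func normalizeSelections(in []string) []string {
	seen := make(map[string]bool)
	out := make([]string, 0, maxSelections)
	for _, s := range in {
		s = strings.TrimSpace(s)
		if s == "" || seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
		if len(out) == maxSelections {
			break
		}
	}
	return out
}

func main() {
	in := []string{"claude", "", "claude", "codex", "  ", "copilot", "zai"}
	// Join with the same separator the tray title uses.
	fmt.Println(strings.Join(normalizeSelections(in), " │ "))
}
```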

&lt;h2&gt;Three View Presets for the Popover&lt;/h2&gt;

&lt;p&gt;The popover frontend (embedded via &lt;code&gt;//go:embed&lt;/code&gt;) renders three view presets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;minimal&lt;/strong&gt; — A single circular percentage ring with aggregate status across all providers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;standard&lt;/strong&gt; — Individual provider cards with circular quota meters (SVG &lt;code&gt;stroke-dasharray&lt;/code&gt; animation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;detailed&lt;/strong&gt; — Expanded cards with sparkline trend lines (SVG polyline charts showing usage over time)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same JavaScript works in both the native WKWebView and a regular browser — a &lt;code&gt;window.__ONWATCH_MENUBAR_BRIDGE__&lt;/code&gt; object is injected server-side to handle the differences.&lt;/p&gt;

&lt;h2&gt;Notifications: Push and Email&lt;/h2&gt;

&lt;p&gt;onWatch sends desktop push notifications (Web Push/VAPID) and email alerts (SMTP) when quotas cross configurable thresholds.&lt;/p&gt;

&lt;p&gt;Three notification types: Warning (default 80%), Critical (default 95%), and Reset (quota renewed). Each has a 30-minute cooldown to prevent alert fatigue. Per-quota overrides let you set different thresholds — or switch to absolute values instead of percentages.&lt;/p&gt;
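
&lt;p&gt;As a sketch of how the default thresholds and the cooldown interact (leaving out Reset alerts and per-quota overrides): &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;shouldSend&lt;/code&gt; are hypothetical functions, and the string levels are my own shorthand. The 80%, 95%, and 30-minute values are the defaults quoted above.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// Defaults quoted in the article.
const (
	warningPct  = 80.0
	criticalPct = 95.0
	cooldown    = 30 * time.Minute
)

// classify maps a usage percentage to a notification level.
// An empty string means no alert is warranted.
func classify(usedPct float64) string {
	switch {
	case usedPct >= criticalPct:
		return "critical"
	case usedPct >= warningPct:
		return "warning"
	}
	return ""
}

// shouldSend applies the cooldown: an alert goes out only if a level is
// set and at least cooldown has passed since the previous alert.
func shouldSend(level string, sinceLast time.Duration) bool {
	if level == "" {
		return false
	}
	return sinceLast >= cooldown
}

func main() {
	level := classify(96.5)
	fmt.Println(level, shouldSend(level, 10*time.Minute)) // critical false
	fmt.Println(level, shouldSend(level, 45*time.Minute)) // critical true
}
```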

&lt;p&gt;The VAPID key generation uses &lt;code&gt;ecdsa.GenerateKey(elliptic.P256(), rand.Reader)&lt;/code&gt; — standard Web Push. SMTP passwords are encrypted at rest with AES-256-GCM using HKDF-derived keys.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;p&gt;Some decisions I want to highlight because they are easy to get wrong:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constant-time credential comparison:&lt;/strong&gt; &lt;code&gt;subtle.ConstantTimeCompare&lt;/code&gt; for both password and session token checks. Timing attacks against local services are unlikely, but the fix is one import.&lt;/p&gt;
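
&lt;p&gt;For reference, a small sketch of the comparison. The helper name is hypothetical, and hashing both inputs first is my own addition: &lt;code&gt;subtle.ConstantTimeCompare&lt;/code&gt; returns immediately when the lengths differ, so equal-length digests avoid leaking the length of the expected value.&lt;/p&gt;

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// secureEqual compares two secrets without exiting early on the first
// differing byte. Both inputs are hashed so the compared slices always
// have the same length (32 bytes).
func secureEqual(given, expected string) bool {
	g := sha256.Sum256([]byte(given))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(g[:], e[:]) == 1
}

func main() {
	fmt.Println(secureEqual("session-token-abc", "session-token-abc"))
	fmt.Println(secureEqual("session-token-abc", "session-token-xyz"))
}
```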

&lt;p&gt;&lt;strong&gt;Password hashing:&lt;/strong&gt; bcrypt at &lt;code&gt;DefaultCost&lt;/code&gt; (10) for new passwords. Legacy SHA-256 hashes are migrated transparently on next login.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SMTP password encryption at rest:&lt;/strong&gt; AES-256-GCM with HKDF-derived keys. When the admin password changes, &lt;code&gt;ReEncryptAllData()&lt;/code&gt; re-encrypts all stored SMTP credentials with the new key. The info string &lt;code&gt;"onwatch-smtp-encryption"&lt;/code&gt; provides domain separation.&lt;/p&gt;
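
&lt;p&gt;Here is a rough, standard-library-only sketch of the scheme. It is not onWatch's code: the function names, the salt, and the choice to feed the admin password straight into the key derivation are assumptions. HKDF-SHA256 is written out by hand following RFC 5869 and limited to a single 32-byte output block, which is exactly one AES-256 key.&lt;/p&gt;

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// deriveKey performs HKDF-SHA256 extract-then-expand for one output block.
// The info string provides the domain separation mentioned in the text.
func deriveKey(secret, salt []byte, info string) []byte {
	extract := hmac.New(sha256.New, salt)
	extract.Write(secret)
	prk := extract.Sum(nil)

	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte(info))
	expand.Write([]byte{1}) // block counter for the first output block
	return expand.Sum(nil)
}

// newGCM builds an AES-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext || tag.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and verifies the tag before returning plaintext.
func decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	ns := gcm.NonceSize()
	if ns > len(data) {
		return nil, errors.New("ciphertext too short")
	}
	return gcm.Open(nil, data[:ns], data[ns:], nil)
}

func main() {
	// Secret and salt are placeholders for the example.
	key := deriveKey([]byte("admin-password"), []byte("per-install-salt"), "onwatch-smtp-encryption")
	sealed, err := encrypt(key, []byte("smtp-secret"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

&lt;p&gt;Because the key depends on the input secret, a password change yields a different key, which is why stored ciphertexts have to be decrypted with the old key and sealed again with the new one.&lt;/p&gt;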

&lt;p&gt;&lt;strong&gt;Login rate limiting:&lt;/strong&gt; At most 5 failed attempts per IP within a 5-minute window. After that, the IP is blocked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bounded response reading:&lt;/strong&gt; Every API client caps responses at 64KB via &lt;code&gt;io.LimitReader&lt;/code&gt;. A malicious or broken API cannot OOM the daemon.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;These are from the actual codebase (verified March 2026):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Go source lines&lt;/td&gt;
&lt;td&gt;~109,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test files&lt;/td&gt;
&lt;td&gt;98&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test functions&lt;/td&gt;
&lt;td&gt;2,473&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary size&lt;/td&gt;
&lt;td&gt;~15MB (unstripped)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime memory&lt;/td&gt;
&lt;td&gt;~34MB RSS (all 7 providers polling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provider clients&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Docker image&lt;/td&gt;
&lt;td&gt;~10MB (distroless, non-root)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQLite driver&lt;/td&gt;
&lt;td&gt;Pure Go (no CGO for main binary)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;CI runs &lt;code&gt;go test -race -coverprofile=coverage.out -covermode=atomic -count=1 ./...&lt;/code&gt; on every push. The &lt;code&gt;-race&lt;/code&gt; flag catches data races between the 7 concurrent polling goroutines. The menubar CI job on macOS runs an additional Playwright E2E suite.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;One-line install (macOS/Linux):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Homebrew:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;onllm-dev/tap/onwatch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Docker:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; .env.docker.example .env   &lt;span class="c"&gt;# add your API keys&lt;/span&gt;
docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open-source, GPL-3.0: &lt;a href="https://github.com/onllm-dev/onwatch" rel="noopener noreferrer"&gt;github.com/onllm-dev/onwatch&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy to answer questions about any of the architecture decisions. What would you do differently?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>claudeai</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Got Tired of Checking 6 AI Dashboards, So I Built a CLI That Checks Them All</title>
      <dc:creator>Prakersh Maheshwari</dc:creator>
      <pubDate>Tue, 03 Mar 2026 21:55:55 +0000</pubDate>
      <link>https://dev.to/prakershm/i-got-tired-of-checking-6-ai-dashboards-so-i-built-a-cli-that-checks-them-all-2llj</link>
      <guid>https://dev.to/prakershm/i-got-tired-of-checking-6-ai-dashboards-so-i-built-a-cli-that-checks-them-all-2llj</guid>
      <description>&lt;p&gt;Last month I realized I was opening 6 browser tabs every morning just to check my AI API quotas. Anthropic, Copilot, Codex, three more. Each with different billing cycles, different UI, different definitions of "usage."&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://github.com/onllm-dev/onwatch" rel="noopener noreferrer"&gt;onWatch&lt;/a&gt; — a CLI that polls all of them from one terminal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5glkvzsjfzhfab7uss71.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5glkvzsjfzhfab7uss71.png" alt=" " width="800" height="745"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;I use Claude Code, Antigravity, and Codex CLI daily. Each has its own dashboard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic tracks 5-hour limits, weekly all-model limits, and weekly Sonnet limits — three separate quotas with different reset windows&lt;/li&gt;
&lt;li&gt;Codex has 5-hour limits plus review request quotas&lt;/li&gt;
&lt;li&gt;Antigravity splits quotas across Claude+GPT, Gemini Pro, and Gemini Flash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add Copilot, Synthetic, and Z.ai, and that's 6 provider dashboards, each with multiple quota types, different billing cycles, and no cross-provider view.&lt;/p&gt;

&lt;p&gt;The real problem isn't seeing &lt;em&gt;current&lt;/em&gt; usage — every provider shows that. The problem is having no &lt;em&gt;historical&lt;/em&gt; data. Which sessions burn quota fastest? Which days am I heaviest? When exactly does my cycle reset? No provider dashboard answers these questions.&lt;/p&gt;

&lt;p&gt;The breaking point: I hit a rate limit mid-session with Claude Code. Lost my context, lost my momentum, and had no idea when the limit would reset.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I Built
&lt;/h3&gt;

&lt;p&gt;onWatch is a Go CLI that runs as a background daemon. It:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Polls 6 providers in parallel&lt;/strong&gt; — each provider runs as an independent agent with its own goroutine (&lt;code&gt;internal/agent/&lt;/code&gt;). Anthropic, Codex, Copilot, Antigravity, Synthetic, and Z.ai.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stores history in SQLite&lt;/strong&gt; — a single &lt;code&gt;*sql.DB&lt;/code&gt; connection with WAL mode. Currently my database is 55MB after weeks of continuous polling across all 6 providers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serves a Material Design 3 dashboard&lt;/strong&gt; — HTML templates and static assets embedded via Go's &lt;code&gt;embed.FS&lt;/code&gt;, served from the same binary. Dark and light mode.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight that drives everything: &lt;strong&gt;per-cycle, historical usage patterns are more valuable than point-in-time snapshots.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The dashboard shows things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your 5-hour Anthropic limit is at 72% utilization with 3h 9m until reset&lt;/li&gt;
&lt;li&gt;Your burn rate is 38.2%/hr — at this pace, quota exhausts in 43 minutes&lt;/li&gt;
&lt;li&gt;Your average 5-hour limit usage across 21 windows is 60%&lt;/li&gt;
&lt;li&gt;Recent usage is trending 16% lower than in earlier periods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are real numbers from my running instance, not mock data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical Decisions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Single binary (~13MB):&lt;/strong&gt; Everything (HTML templates, JavaScript, CSS, the service worker, and the web app manifest) is embedded using Go's &lt;code&gt;embed.FS&lt;/code&gt;. No npm build step, no external files. You download one file and run it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;//go:embed templates/*.html&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;templatesFS&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FS&lt;/span&gt;

&lt;span class="c"&gt;//go:embed all:static/*&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;staticFS&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&amp;lt;50MB RAM with 6 providers:&lt;/strong&gt; My running instance uses ~35MB RSS with all 6 providers polling every 60 seconds. Key decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single SQLite connection (not a pool): one &lt;code&gt;*sql.DB&lt;/code&gt; handle capped at one open connection&lt;/li&gt;
&lt;li&gt;Bounded queries: cycles capped at 200, insights at 50&lt;/li&gt;
&lt;li&gt;No ORM — parameterized SQL everywhere for both performance and injection prevention&lt;/li&gt;
&lt;li&gt;Embedded assets are served from data that already ships inside the binary, so static files need no disk reads and no separate cache&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The agent pattern:&lt;/strong&gt; Each provider is an independent &lt;code&gt;Agent&lt;/code&gt; struct that runs a polling loop in its own goroutine. Clean shutdown via &lt;code&gt;context.Context&lt;/code&gt; cancellation. If one provider's API is slow or errors, others aren't affected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;   &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;
    &lt;span class="n"&gt;tracker&lt;/span&gt;  &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tracker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tracker&lt;/span&gt;
    &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;   &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;slog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Logger&lt;/span&gt;
    &lt;span class="n"&gt;sm&lt;/span&gt;       &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;SessionManager&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Polls immediately, then at configured interval&lt;/span&gt;
    &lt;span class="c"&gt;// until context is cancelled&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Security:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;subtle.ConstantTimeCompare&lt;/code&gt; for credential checks (timing attack prevention)&lt;/li&gt;
&lt;li&gt;Parameterized SQL only, plus a &lt;code&gt;validateOrderByColumn&lt;/code&gt; allowlist function for ORDER BY clauses&lt;/li&gt;
&lt;li&gt;All data stays on your machine — zero telemetry, no cloud dependency&lt;/li&gt;
&lt;/ul&gt;
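
&lt;p&gt;The ORDER BY allowlist exists because SQL placeholders can bind values but not identifiers, so a column name has to be checked before it is spliced into the query. A sketch of the idea follows. The column names and the function are made up for illustration and are not the real &lt;code&gt;validateOrderByColumn&lt;/code&gt;.&lt;/p&gt;

```go
package main

import (
	"fmt"
)

// allowedOrderColumns lists the only identifiers that may be interpolated
// into an ORDER BY clause. These column names are hypothetical.
var allowedOrderColumns = map[string]bool{
	"captured_at": true,
	"provider":    true,
	"utilization": true,
}

// orderByClause returns a safe ORDER BY fragment, or an error when the
// requested column is not on the allowlist.
func orderByClause(column string, desc bool) (string, error) {
	if !allowedOrderColumns[column] {
		return "", fmt.Errorf("order by column %q is not allowed", column)
	}
	dir := "ASC"
	if desc {
		dir = "DESC"
	}
	return "ORDER BY " + column + " " + dir, nil
}

func main() {
	fmt.Println(orderByClause("utilization", true))
	fmt.Println(orderByClause("utilization; DROP TABLE snapshots", false))
}
```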

&lt;h3&gt;
  
  
  The Numbers
&lt;/h3&gt;

&lt;p&gt;These are from the actual codebase and running instance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Go source lines&lt;/td&gt;
&lt;td&gt;~47,600&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test functions&lt;/td&gt;
&lt;td&gt;486 across 33 test files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary size&lt;/td&gt;
&lt;td&gt;13MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime memory&lt;/td&gt;
&lt;td&gt;~35MB RSS (6 providers polling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database after weeks&lt;/td&gt;
&lt;td&gt;55MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provider agents&lt;/td&gt;
&lt;td&gt;6 (each with client, store, tracker, agent layers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static assets&lt;/td&gt;
&lt;td&gt;6 files embedded (app.js, style.css, sw.js, manifest.json, favicon.svg, theme-init.js)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Try It
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;One-line install (macOS/Linux):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Windows (PowerShell):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;irm&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.ps1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;iex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Docker:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; .env.docker.example .env   &lt;span class="c"&gt;# add your API keys&lt;/span&gt;
docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It self-daemonizes on macOS, uses systemd on Linux, and runs in the foreground in Docker (auto-detected via &lt;code&gt;IsDockerEnvironment()&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Open-source, GPL-3.0: &lt;a href="https://github.com/onllm-dev/onwatch" rel="noopener noreferrer"&gt;github.com/onllm-dev/onwatch&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What provider should we add next? MiniMax and Kimi are the most requested.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>antigravity</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
