
Chris Kesler

OpenClaw Model Manager: A GUI for the Power Users Who Hate Waiting

How I built a standalone web dashboard to tame OpenClaw's CLI—and what it taught me about AI infrastructure in the real world.


The Problem Nobody Talks About

OpenClaw is genuinely powerful. It runs a local AI gateway that routes your conversations through any combination of models—Anthropic, OpenRouter, Google, local Ollama models—with fallback chains, auth profiles, aliases, and session management baked right in. Once it's configured, it mostly just works.

But "mostly just works" hides a lot of friction.

Want to know if your gateway is running? openclaw gateway status. Want to change your primary model? Edit a JSON config file, then restart the gateway. Want to see which provider is in a rate-limit cooldown? Good luck—dig through auth-profiles.json manually. Want to check if your dual RTX 3060s can actually run that 34B parameter model? Open a calculator.

The tools are all there. They're just scattered, CLI-only, and invisible when you need them most.

That's why I built OpenClaw Model Manager.

What It Is

OpenClaw Model Manager is a standalone web dashboard that wraps OpenClaw's existing CLI and config files with a clean, dark-themed UI.

It runs as its own Express server on port 18800—completely independent of the OpenClaw gateway itself. That's intentional: the manager doesn't go down when the gateway does, and it's exactly how you bring the gateway back up when it crashes. It's not a replacement for OpenClaw; it's a control panel for it.

No build step. No framework. No dependencies beyond Express and ws. Just static HTML, CSS, and JavaScript backed by a thin Node server that shells out to the same openclaw CLI you'd use in your terminal. Bind it to 0.0.0.0:18800 and it's accessible over Tailscale or your local network from any device—your phone, a laptop on the couch, or a remote machine across the country.


The Features (And Why They Matter)

🚨 Provider Failover Panel

This one was born from real pain.

During development, switching primary models repeatedly triggered Anthropic's rate limiter. The gateway put Anthropic on a 5-minute cooldown—but the first fallback in my chain was also an Anthropic model. So both were blocked. The error messages were confusing, the fix required editing a JSON file, and there was no visibility into what was happening.

The Failover Panel solves this. It lives at the top of the Health tab so you see it immediately when something's wrong:

  • Red borders appear the moment any provider enters cooldown.
  • Countdown timers show exactly how long until each provider recovers.
  • "Switch To ⚡" button hot-swaps your primary model to any ready provider instantly—no gateway restart, no JSON editing, no waiting.
  • "Clear Cooldown" resets a provider's error state immediately if you know the rate limit has lifted.

Why it matters: When you're in the middle of a workflow and your primary model goes down, you need one click to fix it—not a terminal and a config file.

📊 Live System Health

The Health tab shows a plain-English summary of what's happening on your machine right now:

  • GPU VRAM bars: VRAM usage, utilization percentage, and temperature for each card, refreshed every 3 seconds via nvidia-smi.
  • RAM usage: Total and available memory.
  • Model offload detection: Plainly shows whether your current model is running entirely on GPU, split across GPU and CPU, or running in CPU-only mode.
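Polling nvidia-smi every few seconds comes down to querying its CSV output and parsing it. A sketch of the parsing half (the query flags are real nvidia-smi options; the shape of the returned objects is my own):

```javascript
// Parse the output of:
//   nvidia-smi --query-gpu=name,utilization.gpu,temperature.gpu,memory.used,memory.total
//              --format=csv,noheader,nounits
// Each line looks like: "NVIDIA GeForce RTX 3060, 34, 52, 5210, 12288"
function parseNvidiaSmi(csv) {
  return csv
    .trim()
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => {
      const [name, util, temp, used, total] = line.split(",").map((s) => s.trim());
      return {
        name,
        utilizationPct: Number(util),
        temperatureC: Number(temp),
        vramUsedMiB: Number(used),
        vramTotalMiB: Number(total),
      };
    });
}
```

One array entry per card, so a dual-3060 box yields two VRAM bars for free.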

🦙 Local Model Compatibility & The "12GB Trap"

The Local Models tab connects to your Ollama instance and lists every model you have installed, alongside an honest assessment of whether your hardware can run it:

  • ✅ Fits on GPU: Full VRAM available; runs fast.
  • ⚠️ Partial GPU: Model is larger than available VRAM; will split to system RAM (slower).
  • 💻 CPU only: Too large for GPU; will be incredibly slow.
  • ❌ Won't fit in RAM: Model exceeds your total system memory.

The Math: VRAM requirements are estimated as Model Size × 1.2 (accounting for a 20% overhead for the KV cache at short contexts). For a 24GB AI Lab (like my 2× RTX 3060 setup), this gives you a realistic picture of which 7B, 13B, and 34B models are actually usable day-to-day before you even try to load them.
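That estimate reduces to a few lines. Here's a sketch of the classification using the ×1.2 overhead from above; the category labels match the list, but the exact thresholds are my reading of it, not necessarily the project's logic:

```javascript
// Classify whether a local model fits on GPU, splits to RAM, or won't run.
// Heuristic from the article: required VRAM ≈ model size × 1.2
// (20% overhead for the KV cache at short contexts).
function classifyModel(modelSizeGB, freeVramGB, totalRamGB) {
  const requiredGB = modelSizeGB * 1.2;
  if (requiredGB <= freeVramGB) return "fits-gpu"; // full speed
  if (requiredGB <= freeVramGB + totalRamGB) {
    // Splits across GPU and system RAM, or runs entirely on CPU.
    return freeVramGB > 0 ? "partial-gpu" : "cpu-only";
  }
  return "wont-fit"; // exceeds total available memory
}
```

Feed it the on-disk size Ollama reports and the free VRAM from the health poll: on a 24GB setup a ~13GB model classifies as fits-gpu, while a ~34GB one lands in partial-gpu territory.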

🔗 Drag-and-Drop Fallback Chains

Your fallback chain is the safety net that keeps conversations going when your primary model fails. OpenClaw tries each model in order until one works.

The Fallbacks tab lets you manage it visually. Drag tiles to reorder them, see your balance of cloud vs. local models at a glance, and hit save. It writes directly to your openclaw.json config.

Why it matters: The right fallback order is the difference between a 500ms retry to OpenRouter and a 25-minute wait for Anthropic to recover.
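Persisting the drag-and-drop result is a read-modify-write on the config file. A hypothetical sketch — a top-level `fallbacks` array in openclaw.json is my assumption about the schema, not the documented format:

```javascript
// Apply a new fallback ordering to a parsed config object.
// Pure function: returns a new config and leaves the input untouched.
function withFallbackOrder(config, orderedModels) {
  // Keep only models already in the chain, in the new order.
  const known = new Set(config.fallbacks);
  const reordered = orderedModels.filter((m) => known.has(m));
  return { ...config, fallbacks: reordered };
}

// Writing it back is then just:
//   fs.writeFileSync(configPath, JSON.stringify(newConfig, null, 2));
```

Filtering against the existing chain means a stale drag event can't inject a model the gateway doesn't know about.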

🌐 Remote Connection Manager

If you run OpenClaw on more than one machine, the Connection Manager lets you add any instance—local or remote—and switch between them with a single dropdown. Switch connections over Tailscale, and every tab immediately updates to show data for that specific remote instance.


The Architecture

The architecture is built for resilience. The manager's server stays alive independently of the gateway. Local operations shell out to the openclaw CLI, while remote operations hit the remote Model Manager's HTTP API. No custom protocol, no agents, no magic—just HTTP all the way down.
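A sketch of what "just HTTP all the way down" means in practice: one dispatch point decides whether an operation runs against the local CLI or becomes a plain HTTP request to a remote manager on port 18800. The connection record shape and endpoint path here are illustrative, not the project's actual types:

```javascript
// Decide how to execute an operation for the active connection.
// Local connections shell out to the CLI; remote ones become plain
// HTTP requests to that machine's Model Manager on port 18800.
function resolveTarget(connection, apiPath) {
  if (connection.type === "local") {
    return { kind: "cli", binary: "openclaw" };
  }
  return {
    kind: "http",
    url: `http://${connection.host}:18800${apiPath}`,
  };
}
```

Because every tab goes through this one dispatch point, switching the connection dropdown is enough to repoint the whole UI at another instance over Tailscale.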

Getting Started

If you're running multiple OpenClaw instances, dealing with provider rate limits, or trying to figure out which Ollama models will actually run on your GPUs, this dashboard is for you.

It's free, it's open source, and it took one morning to build from scratch.

Quick Start

git clone https://github.com/chriskesler35/openclaw-model-manager
cd openclaw-model-manager
npm install
npm start
Open http://localhost:18800 in your browser.

(Built with Express, WebSockets, and stubbornness. Dark mode only.)

Check out the full repository here: github.com/chriskesler35/openclaw-model-manager
