
Adnan Sattar

Originally published at Medium

NemoClaw for the Enterprise: Installing NemoClaw and Bootstrapping the Sandbox (Part 2)

The substrate is ready. Now we move the agent into its cell and try not to bulldoze it on the way in.

In Part 1 we turned a fresh VPS into something an AI agent can safely live on: rootless user, Tailscale mesh, UFW, no public attack surface. That was the safe house. Empty.

This article puts the tenant inside.

The stack you're about to install is layered in a way that confuses people the first time they meet it, and the most expensive failure mode (running one perfectly innocent-looking command on the wrong day) quietly nukes everything you set up. So we're going to slow down on the mental model, install carefully, and treat the bootstrap as a one-shot operation. Because that's exactly what it is.

By the end of this guide you'll have:

  • The NemoClaw CLI installed and authenticated against an inference provider
  • A running NemoClaw sandbox (k3s + OpenShell + OpenClaw, all the way down)
  • A working nemoclaw connect shell into the sandboxed agent
  • The OpenClaw dashboard reachable from your laptop over Tailscale
  • A clear mental model of which command lives at which layer and which command will silently destroy your state if you run it twice

No Matrix yet. No skills, no policies.

Just a clean install with the failure modes labelled.

Stylised illustration of an AI agent confined inside a hardened sandbox cell, representing the rootless containerised execution environment
The Agent Cell

What You're Building

Here's the updated architecture, picking up where Part 1 left off:

┌────────────────────────────────────────────────────────────────────┐
│ Your Tailnet (Private)                                             │
│                                                                    │
│  [Your Laptop] ───SSH/HTTPS───▶ VPS (openclaw user)                │
│                                  │                                 │
│                                  │ Docker engine                   │
│                                  ▼                                 │
│                  ┌────────────────────────┐                        │
│                  │ openshell-cluster-     │                        │
│                  │ nemoclaw (Docker ctr)  │                        │
│                  │                        │                        │
│                  │ ┌────────────────────┐ │                        │
│                  │ │ k3s (single-node)  │ │                        │
│                  │ │                    │ │                        │
│                  │ │ ┌────────────────┐ │ │                        │
│                  │ │ │ NemoClaw       │ │ │                        │
│                  │ │ │ sandbox pod    │ │ │                        │
│                  │ │ │                │ │ │                        │
│                  │ │ │ OpenClaw ──────┼─┼─┼──▶ inference API       │
│                  │ │ │ agent          │ │ │                        │
│                  │ │ └────────────────┘ │ │                        │
│                  │ └────────────────────┘ │                        │
│                  └────────────────────────┘                        │
└────────────────────────────────────────────────────────────────────┘

Compact architecture diagram of the four-layer stack: Docker engine, openshell-cluster-nemoclaw container, k3s single-node cluster, and OpenClaw agent inside the sandbox pod
Architecture

Four layers, top to bottom: the Docker engine on your VPS, a single Docker container running a self-contained k3s cluster (openshell-cluster-nemoclaw), a Kubernetes pod inside that cluster running the OpenShell sandbox, and inside that pod the OpenClaw agent itself.

This sounds like overkill for a single-VPS deployment. It isn't.

The k3s layer is what gives you a cheap, repeatable sandbox lifecycle (create, destroy, reset, snapshot) without dragging Docker plumbing into agent execution. The OpenShell sandbox is what enforces the network and filesystem policies we'll write in Part 4. And the OpenClaw agent on top is just the workload.

The 60-Second Mental Model

Three commands, three layers. Internalise this before you type anything:

  • nemoclaw … — runs on the host. The orchestrator. Knows about Docker, k3s, and the sandbox lifecycle.
  • openshell … — runs on the host. Talks directly to the gateway and lets you manage policies, providers, port forwards. Lower-level than nemoclaw.
  • openclaw … — runs inside the sandbox. Manages plugins, skills, sessions. The agent's own CLI.

If you ever find yourself typing openclaw plugins install on the host, you're at the wrong layer. If you find yourself typing nemoclaw onboard twice, stop reading and go make coffee; we'll get to that one in a minute.
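If you want a seatbelt while the muscle memory forms, one cheap trick is to shadow the inside-layer command on the host so a layer slip prints a reminder instead of a confusing "command not found". This guard function is my own addition, not part of NemoClaw; drop it in your host shell's rc file if it helps:

```shell
# Hypothetical guard (not part of NemoClaw): on the HOST, 'openclaw' should
# never be typed directly, so shadow it with a reminder that fails loudly.
openclaw() {
  echo "openclaw runs inside the sandbox. Connect first: nemoclaw nemoclaw-sandbox connect" >&2
  return 1
}
```

Remove the function once the three-layer split is second nature.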

Three-layer command model showing nemoclaw and openshell on the host VPS, and openclaw running inside the sandboxed agent environment
60-Second Mental Model

Step 1: Install the NemoClaw CLI

NemoClaw ships as an npm package. From your openclaw user on the VPS:

npm install -g nemoclaw

If npm complains about permissions, you skipped a step in Part 1. npm install -g should not need sudo if your user owns its npm prefix. Fix that before continuing rather than papering over it with sudo, which will create root-owned files in places you'll regret later.
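A quick way to check that before installing is to confirm the global npm prefix lives under your home directory, which is the setup Part 1 leaves you with. The helper function and the `/usr/local` fallback here are illustrative, not part of any tooling:

```shell
# Returns 0 when the global npm prefix is under $HOME (no sudo ever needed).
user_owned_prefix() {
  case "$1" in
    "$HOME"/*) return 0 ;;   # e.g. ~/.npm-global from Part 1
    *)         return 1 ;;
  esac
}

# /usr/local stands in as a worst-case default if npm isn't on PATH yet.
if user_owned_prefix "$(npm config get prefix 2>/dev/null || echo /usr/local)"; then
  echo "OK: npm install -g will not need sudo"
else
  echo "WARNING: global npm prefix is not under \$HOME; fix it before installing" >&2
fi
```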

Verify the install:

nemoclaw -h

You should see the help banner showing version 0.1.x or newer, with sections for Sandbox Management, Policy Presets, Services, and Troubleshooting. If the version reads v0.0.x, upgrade. There are real lifecycle bugs in the early-zero releases that bit several of us during early-2026 deployments.

Step 2: Pick an Inference Provider

NemoClaw is provider-agnostic but nudges you toward NVIDIA's Nemotron family. In practice, the cleanest path for a fresh deployment is OpenRouter:

  • Single API key, dozens of models behind it
  • Per-token billing, no commitments
  • Direct access to nvidia/nemotron-3-super-120b-a12b, which is what NemoClaw is tuned around
  • If you later want to swap to Anthropic or OpenAI, it's a one-line change

Sign up at openrouter.ai, generate an API key, and keep it on your clipboard. We'll feed it to onboard in a moment.

If you're an enterprise with a direct NVIDIA NIM contract, point at your NIM endpoint instead. The flow is identical. onboard will ask which provider you want and what credentials to use.

Step 3: Onboard (and Why You'll Only Do This Once)

This is the section where I get to be slightly insufferable about a warning, because it has cost more than one of my evenings.

nemoclaw onboard is the bootstrap command. It does five things in one shot: configures your inference provider, generates the gateway's mTLS certificates, pulls the OpenShell container images, brings up the k3s cluster inside Docker, and creates the sandbox pod with default policies attached.

It is a create-from-scratch command. Not idempotent. Not a "rerun to update" command.

Hard rule. Never run nemoclaw onboard against an existing sandbox. It will recreate everything from zero: your provider config, your policies, your sessions, your installed skills, your Matrix tokens (Part 3), all of it. There is no confirmation prompt that adequately captures how destructive this is.

If you need to change a policy, use nemoclaw <name> policy-add or openshell policy set. If you need to change a provider, use openshell provider create and openshell inference set. Treat onboard like mkfs: useful exactly once.
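If you share the VPS with teammates (or with future-you at 1 a.m.), it's worth making the mkfs analogy mechanical. This wrapper is entirely my own addition, not a NemoClaw feature: it refuses to bootstrap when state already exists, on the assumption that `~/.nemoclaw` (where onboard writes credentials) is a reasonable tripwire. The `NEMOCLAW_STATE` override exists only to make the sketch testable:

```shell
# Hypothetical safety wrapper: never let 'nemoclaw onboard' run twice.
safe_onboard() {
  state="${NEMOCLAW_STATE:-$HOME/.nemoclaw}"
  if [ -e "$state" ]; then
    echo "Refusing: $state already exists; 'nemoclaw onboard' would recreate everything from zero." >&2
    return 1
  fi
  nemoclaw onboard
}
```

Alias `nemoclaw onboard` out of your fingers and call `safe_onboard` instead; the one time it saves you pays for the whole exercise.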

OK. With that out of the way, run it:

nemoclaw onboard

The CLI walks you through:

  1. Sandbox name — accept the default (nemoclaw-sandbox) unless you have a reason. The companion commands all default to it, and overriding the name buys you nothing but typing.
  2. Inference provider — pick openrouter (or nvidia, anthropic, openai per your choice in Step 2).
  3. Model — for OpenRouter + Nemotron: nvidia/nemotron-3-super-120b-a12b. NemoClaw will validate it exists by issuing a tiny test completion before continuing.
  4. API key — paste it. It gets written to ~/.nemoclaw/credentials.json with mode 600.

The CLI then hands off to OpenShell to bootstrap the gateway and sandbox. Don't kill the terminal. This takes anywhere from two to seven minutes depending on your VPS network.

Warning illustration of a kill switch labelled nemoclaw onboard, signalling that the command is destructive if run twice
Kill Switch

Step 4: Watch the Bootstrap

While onboard runs, what's actually happening:

  1. Docker pulls the OpenShell gateway image and starts the openshell-cluster-nemoclaw container.
  2. Inside that container, k3s comes up as a single-node cluster.
  3. OpenShell deploys its gateway pod into the cluster and waits for it to report healthy.
  4. NemoClaw applies the default network policy preset and creates the sandbox pod (nemoclaw-sandbox) in the openshell namespace.
  5. The sandbox pod pulls its OpenClaw image and starts the agent.

If you want to follow along live, open a second SSH session to the VPS and tail the relevant logs:

# Container-level: is k3s healthy?
docker logs -f openshell-cluster-nemoclaw

# Sandbox-level: is the agent coming up?
nemoclaw nemoclaw-sandbox logs --follow

The second command will fail until the sandbox pod exists, which is fine. Retry it once onboard reports the sandbox is created.

When onboard finishes, you'll see something like:

✓ Gateway running at https://127.0.0.1:8080
✓ Sandbox 'nemoclaw-sandbox' is healthy
✓ Default policies applied

Bootstrap sequence diagram showing Docker pulling images, k3s starting, gateway pod becoming healthy, and the sandbox pod coming online
Bootstrap

Step 5: Verify the Gateway and Recognise the False Alarm

Sanity-check the gateway:

nemoclaw status
openshell status

Both should show the gateway as running and the sandbox as healthy.

A wart worth knowing: on a slow VPS, nemoclaw connect immediately after onboard will sometimes greet you with:

Gateway process started but is not responding

This is almost always a timing race condition, not a real failure. OpenShell's health check fires before the gateway has finished initialising mTLS. Wait thirty seconds, retry, and it's there. If it's still failing after a minute, then break out the diagnostics:

openshell doctor check
openshell doctor logs --lines 200

doctor check validates that Docker, k3s, and the gateway are all in expected states. doctor logs pulls the gateway container's stdout. Between them, you'll see the actual cause of any genuine failure.
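The wait-thirty-seconds-and-retry dance can also be scripted. A hedged sketch: the polling function below is my own helper, not a NemoClaw command, and it takes the probe as a parameter (something like `wait_for_gateway "nemoclaw status"` is a reasonable choice). The interval is overridable purely so the sketch is easy to test:

```shell
# Poll until the probe command succeeds, instead of trusting the first
# health check. Default: 12 tries x 5s = one minute, matching the advice above.
wait_for_gateway() {
  probe="$1"; tries="${2:-12}"; interval="${GATEWAY_POLL_INTERVAL:-5}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if $probe >/dev/null 2>&1; then
      echo "gateway up after $((i * interval))s"
      return 0
    fi
    i=$((i + 1)); sleep "$interval"
  done
  echo "gateway still down; run 'openshell doctor check'" >&2
  return 1
}
```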

Step 6: Connect to the Sandbox

The moment of truth:

nemoclaw nemoclaw-sandbox connect

The first connection negotiates an SSH session over the gateway's mTLS tunnel and drops you into a shell inside the sandbox pod. The prompt changes; the hostname changes; you're now executing commands inside an isolated environment whose filesystem and network access are governed by OpenShell policies.

Inside the sandbox, run the obvious sanity checks:

whoami
pwd # /sandbox
ls -la
openclaw --version

You'll find yourself as a non-root user inside /sandbox, with the OpenClaw binary on your PATH and a bare home directory. This is intentional. The sandbox starts almost empty by design; skills, plugins, and credentials get layered in deliberately rather than inherited from the host.

Try poking at the network from inside:

curl -s -o /dev/null -w "%{http_code}\n" https://google.com

You'll likely get a connection-refused or DNS failure, depending on the default policy. That's the policy engine working as advertised. Part 4 covers how to write policies that grant the agent exactly the network access it needs and nothing more.
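If you want to spot-check a few destinations at once while you're still inside the sandbox, a small wrapper around that curl invocation works. Note the classification keys off curl's exit status rather than the HTTP code: a policy block surfaces as a connection or DNS failure, not as a 4xx. The domains are examples only, and the helper is my own sketch:

```shell
# Classify destinations as reachable or policy-blocked from inside the sandbox.
probe() {
  if curl -s -o /dev/null -m 5 "$1" 2>/dev/null; then
    echo "ALLOWED  $1"
  else
    echo "BLOCKED  $1"   # connection refused, DNS failure, or timeout
  fi
}

probe https://google.com
probe https://openrouter.ai
```

Under the default policy, expect a column of BLOCKED lines; Part 4 is where ALLOWED entries start appearing deliberately.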

Type exit to drop back to the host. The sandbox keeps running.

End-to-end workflow diagram showing the operator connecting from a laptop, through the Tailscale mesh, into the gateway, and into the sandboxed agent shell
End to End Workflow

Step 7: Reach the OpenClaw Dashboard

OpenClaw exposes a web UI for chatting with the agent and managing sessions. NemoClaw maps it to 127.0.0.1:18789 on the VPS. From the host it's local-only by design. There's no public listener, and there shouldn't be.

To reach it from your laptop, use your Tailscale-resolved hostname:

http://<your-vps-tailscale-hostname>:18789

If you enabled MagicDNS in Part 1, that's something like http://openclaw-staging:18789. From any device on your tailnet, the dashboard loads. From anywhere else on the internet, the connection won't even establish. The port isn't exposed past localhost, and the firewall would drop it anyway.

The first time you load the dashboard it'll ask for a gateway auth token. That brings us to the one operational wart you should know about now, before it surprises you in production.

Browser screenshot illustration showing the OpenClaw dashboard loading over a Tailscale hostname with a gateway authentication token prompt
Reach the OpenClaw Dashboard

A Known Wart: The Gateway Doesn't Survive Host Reboots Cleanly

If you reboot the VPS, lose network on the host long enough for the gateway to give up, or restart Docker, the gateway process drops and the dashboard refuses your existing session. The sandbox pod is fine. The agent state is fine. The OpenClaw container in k3s is fine. It's just that the gateway's auth token gets rotated on restart and the dashboard doesn't pick up the new one automatically.

The recovery is mechanical: pull the new token out of the sandbox config and paste it into the dashboard.

docker exec openshell-cluster-nemoclaw \
  kubectl exec -n openshell nemoclaw-sandbox -- \
  python3 -c "import json; print(json.load(open('/sandbox/.openclaw/openclaw.json'))['gateway']['auth']['token'])"

That command is a worked example of the four-layer mental model: Docker → k3s → sandbox pod → file inside the pod. Read it left to right and you'll see each exec peeling off one layer.

Copy the printed token, paste it into the dashboard's auth prompt, and you're back. Keep this command in a paste-buffer somewhere; you will need it more than once.
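If you want to convince yourself the JSON path is right before aiming that pipeline at the real pod, dry-run the extraction against a throwaway config. The structure here mirrors the `gateway.auth.token` path used above; the token value is a placeholder of my own invention:

```shell
# Dry run: same extraction one-liner, pointed at a sample config on disk.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
{"gateway": {"auth": {"token": "tok_placeholder"}}}
EOF

python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['gateway']['auth']['token'])" "$cfg"
# prints: tok_placeholder

rm -f "$cfg"
```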

A proper systemd-based autostart that watches for network changes and refreshes the dashboard auth automatically is doable, and it's on the roadmap for a later article in this series. For now, the manual recovery is twenty seconds and worth the explicitness. It forces you to notice when the gateway has restarted, which on a security-sensitive deployment is information you actually want.

Verification Checklist

Before moving on:

  • nemoclaw status shows the sandbox healthy and the gateway running
  • nemoclaw nemoclaw-sandbox connect drops you into a /sandbox shell
  • From inside the sandbox, openclaw --version returns a version string
  • From your laptop, http://<vps-tailscale-host>:18789 loads the OpenClaw dashboard
  • From anywhere not on your tailnet, the dashboard is unreachable
  • openshell doctor check reports green across the board
  • ~/.nemoclaw/credentials.json exists with mode 600

If any of those fail, fix them before Part 3. Matrix layered on top of a flaky bootstrap will produce confusing failures that look like Matrix problems and aren't.
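The host-side items on that checklist are scriptable, and a PASS/FAIL column beats eyeballing seven commands. A hedged sketch: the `verify` helper is my own, the command list mirrors the checklist above, and `stat -c` assumes GNU coreutils (i.e. a Linux VPS):

```shell
# Run each check, suppress its output, and print a one-line verdict.
verify() {
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then echo "PASS  $desc"; else echo "FAIL  $desc"; fi
}

verify "gateway + sandbox healthy"  nemoclaw status
verify "doctor all green"           openshell doctor check
verify "credentials file mode 600"  sh -c '[ "$(stat -c %a "$HOME/.nemoclaw/credentials.json" 2>/dev/null)" = 600 ]'
```

The tailnet-reachability and inside-the-sandbox items still need a human (or a second machine), which is fine; those are exactly the checks worth doing by hand once.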

Where You Are Now

You have a four-layer agent stack running on a hardened VPS that has no public attack surface. The agent is alive, sandboxed, reachable from your laptop over Tailscale, and gated behind mTLS. It can't yet be talked to over a real chat protocol, can't reach external APIs beyond what the default policy allows, and has no skills installed. That's deliberate. We're laying down each capability one article at a time, and verifying it works before piling the next one on top.

Stack diagram showing the private AI agent layered architecture: hardened VPS, rootless user, Docker, k3s, OpenShell sandbox, and OpenClaw agent on top
Private AI Agent Layer Stack

The thing worth pausing on: this is already a defensible deployment. Even with no extra hardening, an attacker who somehow compromised your VPS would still have to escape the rootless openclaw user, then escape the sandbox container, then escape the k3s namespace isolation, before they could touch the host kernel. Each layer is breakable in theory. Stacking them is what makes the practical attack vanishingly expensive.

Defense-in-depth illustration showing concentric security layers an attacker would have to breach: rootless user, sandbox container, k3s namespace, and host kernel
Defense in Depth

What's Next

Part 3. Matrix as the Control Channel. The default OpenClaw control surface is the dashboard you just loaded. That's fine for development. For a deployment where the messages flowing into the agent are, by definition, high-privilege instructions, you want the channel itself to be end-to-end encrypted and authenticated. Telegram doesn't cut it. Matrix does, but the install path has a few sharp edges around OIDC migration and device key conflicts that I'm going to walk through in detail, because the docs don't, and the failure modes are genuinely confusing the first time you hit them.

Part 4. Policy Engineering. OpenShell's policy engine is the actual reason to run NemoClaw rather than vanilla OpenClaw. We'll write per-domain network policies, set up filesystem allowlists, and walk through the live-update flow with openshell policy set --wait.

Part 5. Skills, Plugins, and Model Switching. ClawHub, the docker-cp-then-kubectl-cp pattern for getting skills into the sandbox safely, and how to flip between Nemotron and Claude Sonnet without a restart.

I'm collecting deployment war stories for Part 3's appendix. If this broke in a way I didn't cover (wrong CLI version, weird VPS provider, a doctor check red I haven't seen), drop the error in the comments. The Matrix article gets sharper for every one of these I see.
