ArshTechPro
NemoClaw: NVIDIA's Open Source Stack for Running AI Agents You Can Actually Trust

AI agents have crossed a threshold. They're no longer chatbots that answer questions and forget you exist. The new generation can remember context across sessions, spawn sub-agents, write their own code to learn new skills, and keep executing tasks long after you close your laptop. Tools like OpenClaw have made it possible for a single developer to spin up an autonomous assistant that works like a small team.

That's exciting. It's also terrifying if you think about it for more than five seconds.

A long-running agent with persistent shell access, live credentials, and the ability to rewrite its own tooling is a fundamentally different threat model than a stateless chatbot. Every prompt injection becomes a potential credential leak. Every third-party skill the agent installs is an unreviewed binary with filesystem access. Every sub-agent it spawns can inherit permissions it was never meant to have.

The agents are ready. The infrastructure to trust them has been missing — until now.

What Is NemoClaw?

NemoClaw is an open source stack from NVIDIA that wraps OpenClaw (the popular always-on AI assistant) with enterprise-grade privacy and security controls. It's built on top of two key NVIDIA projects:

OpenShell — an open source runtime (part of the NVIDIA Agent Toolkit) that acts as a governance layer between your agent and your infrastructure. Think of it as a browser sandbox, but for AI agents. It controls what the agent can see, do, and where its inference requests go.

Nemotron — NVIDIA's family of open models that can run locally on your own hardware for enhanced privacy and cost efficiency.

The whole point: you get the productivity of autonomous agents without giving up control.

Why Should You Care?

If you're building with or deploying AI agents, you've likely hit the "trust trilemma." You need three things simultaneously: safety, capability, and autonomy. With existing approaches, you can only reliably get two at a time.

  • Safe + autonomous, but not capable → the agent can't reach the tools and data it needs, so it can't finish the job.
  • Capable + safe, but not autonomous → every action is gated on your approval, so you're babysitting it.
  • Capable + autonomous, but not safe → full access for a long-running process policing itself, with guardrails living inside the same process they're supposed to guard.

That last scenario is the critical failure mode. Tools like Claude Code and Cursor ship with valuable internal guardrails, but those protections live inside the agent. A compromised agent can potentially override them.

NemoClaw solves this by moving the control point outside the agent entirely. The agent literally cannot override the security policies because they're enforced at the infrastructure level, not the prompt level.
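A toy model makes the difference concrete. In the sketch below (illustrative only, not NemoClaw's actual code), the policy lives in a broker the agent never holds a reference to; the agent receives only a `request()` callable, so even an agent that rewrites its own prompts and tools cannot touch the allowlist:

```python
# Toy model of out-of-process policy enforcement: the agent only gets a
# request() callable, never the policy itself. Names are hypothetical.
from types import MappingProxyType

def make_broker(allowed_hosts):
    # Policy lives in the broker's closure as a read-only mapping.
    policy = MappingProxyType({"allowed_hosts": frozenset(allowed_hosts)})

    def request(host):
        if host in policy["allowed_hosts"]:
            return f"connect {host}: allowed"
        return f"connect {host}: blocked, surfaced for operator approval"

    return request

agent_request = make_broker(["api.nvidia.com"])
print(agent_request("api.nvidia.com"))
print(agent_request("evil.example"))
```

In the real stack the separation is stronger still: enforcement happens in the kernel and in OpenShell, outside the agent's process entirely.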

How It Works Under the Hood

NemoClaw's architecture has four main components:

1. The Plugin (CLI)

A TypeScript CLI that orchestrates everything. You use nemoclaw commands on your host machine to launch, connect to, and manage sandboxed agents.

2. The Blueprint

A versioned Python artifact that handles sandbox creation, policy configuration, and inference setup. It follows a four-stage lifecycle: resolve the artifact → verify its digest → plan resources → apply through the OpenShell CLI.
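The four stages can be sketched roughly as follows. This is a hedged illustration of the lifecycle described above, not NemoClaw's real API; all function names and the plan structure are invented:

```python
# Illustrative resolve -> verify digest -> plan -> apply lifecycle.
import hashlib

def resolve(name, version):
    # Pretend registry lookup returning artifact bytes plus a pinned digest.
    artifact = f"{name}@{version}".encode()
    return artifact, hashlib.sha256(artifact).hexdigest()

def verify(artifact, expected_digest):
    # Refuse to proceed if the artifact doesn't match its pinned digest.
    if hashlib.sha256(artifact).hexdigest() != expected_digest:
        raise ValueError("digest mismatch: refusing to apply blueprint")

def plan(artifact):
    # Decide what resources the sandbox needs before touching the host.
    return {"sandbox": "my-assistant", "policies": ["network", "filesystem"]}

def apply_plan(plan_result):
    # In the real stack, this final step goes through the OpenShell CLI.
    return f"applied {plan_result['sandbox']} with {len(plan_result['policies'])} policies"

artifact, digest = resolve("my-blueprint", "1.0.0")
verify(artifact, digest)
print(apply_plan(plan(artifact)))
```

The digest check in stage two is what makes the blueprint versioned and tamper-evident: a modified artifact fails verification before any resources are touched.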

3. The Sandbox

This is where the magic happens. It's not generic container isolation — it's purpose-built for long-running, self-evolving agents. The sandbox provides:

  • Landlock + seccomp + network namespace isolation — the agent runs in a locked-down environment at the OS level.
  • Filesystem restrictions — the agent can only read/write inside /sandbox and /tmp. Everything else is off-limits.
  • Network egress control — unauthorized outbound connections are blocked. If the agent tries to reach an unlisted host, OpenShell blocks it and surfaces the request for your approval.
  • Process protection — privilege escalation and dangerous syscalls are blocked at sandbox creation time.
  • Live policy updates — network and inference policies can be hot-reloaded at runtime as you approve new permissions.
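The filesystem rule above is simple to state precisely. In NemoClaw it's enforced by Landlock at the kernel level; this pure-Python check merely illustrates the rule, assuming `/sandbox` and `/tmp` are the only writable roots:

```python
# Illustration of the filesystem policy: writes are only permitted
# under /sandbox and /tmp. Kernel-enforced in the real sandbox.
from pathlib import PurePosixPath

WRITABLE_ROOTS = (PurePosixPath("/sandbox"), PurePosixPath("/tmp"))

def write_allowed(path):
    p = PurePosixPath(path)
    # Allowed if the path is one of the roots or sits beneath one.
    return any(root == p or root in p.parents for root in WRITABLE_ROOTS)

print(write_allowed("/sandbox/skills/new_skill.py"))  # True
print(write_allowed("/etc/passwd"))                   # False
```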

4. The Inference Layer

Inference requests from the agent never leave the sandbox directly. OpenShell intercepts every call and routes it through a privacy router. This router decides — based on your policy, not the agent's preferences — whether a request goes to a local model (like Nemotron running on your GPU) or to a cloud-based frontier model.

The default setup routes through NVIDIA's cloud API using nvidia/nemotron-3-super-120b-a12b. Local inference via Ollama and vLLM is experimental but available.
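The routing decision itself is easy to picture. The sketch below is a hypothetical simplification: the request fields and the `keep_secrets_local` policy flag are invented for illustration, but the key property matches the description above — the policy, not the agent, picks the backend:

```python
# Hedged sketch of privacy routing: policy decides local vs. cloud.
def route(request, policy):
    # Sensitive requests stay on-device when the policy demands it.
    if request.get("contains_secrets") and policy["keep_secrets_local"]:
        return "local:nemotron (on-device)"
    return "cloud:nvidia/nemotron-3-super-120b-a12b"

policy = {"keep_secrets_local": True}
print(route({"prompt": "summarize this doc"}, policy))
print(route({"prompt": "rotate my keys", "contains_secrets": True}, policy))
```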

Getting Started (One Command, Seriously)

Prerequisites

You need a Linux machine (Ubuntu 22.04+) with Docker, Node.js 20+, and at least 8 GB of RAM (16 GB recommended). macOS is supported via Colima or Docker Desktop on Apple Silicon.

Install and Onboard

curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash

That's it. The script installs Node.js if needed, then runs a guided onboarding wizard that creates a sandbox, configures inference, and applies security policies. It'll prompt you for an NVIDIA API key (grab one free from their website).

When it finishes, you'll see something like:

──────────────────────────────────────────────────
Sandbox      my-assistant (Landlock + seccomp + netns)
Model        nvidia/nemotron-3-super-120b-a12b (NVIDIA Cloud API)
──────────────────────────────────────────────────
Run:         nemoclaw my-assistant connect
Status:      nemoclaw my-assistant status
Logs:        nemoclaw my-assistant logs --follow
──────────────────────────────────────────────────

Connect and Chat

# Open a shell inside the sandbox
nemoclaw my-assistant connect

# Launch the interactive TUI
sandbox@my-assistant:~$ openclaw tui

# Or use the CLI for a single message
sandbox@my-assistant:~$ openclaw agent --agent main --local -m "hello" --session-id test

Key Commands Cheat Sheet

| Command | What It Does |
| --- | --- |
| `nemoclaw onboard` | Interactive setup wizard |
| `nemoclaw <name> connect` | Shell into a sandbox |
| `nemoclaw <name> status` | Check sandbox health |
| `nemoclaw <name> logs --follow` | Stream logs |
| `nemoclaw start` / `nemoclaw stop` | Manage auxiliary services |
| `openshell term` | Launch the OpenShell TUI for monitoring |

The Protection Model Explained

Here's what's actually enforced — and when:

| Layer | What It Guards | When Applied |
| --- | --- | --- |
| Network | Blocks unauthorized outbound connections | Hot-reloadable at runtime |
| Filesystem | No reads/writes outside `/sandbox` and `/tmp` | Locked at sandbox creation |
| Process | Blocks privilege escalation and dangerous syscalls | Locked at sandbox creation |
| Inference | Reroutes model API calls to controlled backends | Hot-reloadable at runtime |

The critical design choice: filesystem and process restrictions are locked at creation time. The agent can't unlock them mid-session, even if compromised. Network and inference policies can be updated live — but only by you, from outside the sandbox.
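The creation-time vs. runtime split maps naturally onto immutable and mutable policy objects. This is a toy illustration of the design choice, not NemoClaw's data model — field names are invented:

```python
# Toy model of the policy split: creation-time policy is frozen;
# the network allowlist can be updated live, but only by the operator.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class CreationTimePolicy:
    writable_roots: tuple = ("/sandbox", "/tmp")
    allow_privilege_escalation: bool = False

@dataclass
class RuntimePolicy:
    allowed_hosts: set = field(default_factory=set)

    def approve_host(self, host):
        # Called from outside the sandbox, never by the agent.
        self.allowed_hosts.add(host)

frozen = CreationTimePolicy()
live = RuntimePolicy({"api.nvidia.com"})
live.approve_host("github.com")
print(sorted(live.allowed_hosts))
# Any attempt to mutate `frozen` raises dataclasses.FrozenInstanceError.
```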

When the agent hits a constraint, it can reason about why it's blocked and propose a policy update. You see the request in the OpenShell TUI and make the final call. Full audit trail of every allow and deny decision.

What Makes This Different from Just Using Docker?

Fair question. You could run an agent in a Docker container with restrictive policies. But NemoClaw/OpenShell gives you several things Docker alone doesn't:

  • Agent-aware policy engine — it evaluates actions at the binary, destination, method, and path level. An agent can install a verified skill but can't execute an unreviewed binary.
  • Privacy routing — inference is intercepted and routed based on policy, keeping sensitive context on-device when needed.
  • Live policy updates — approve new network destinations or inference providers without restarting the sandbox.
  • Skill verification — as the agent evolves and learns new capabilities, each new skill is subject to the same controls.
  • Operator approval flow — blocked actions surface in a TUI for human review, not just silent failures.
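What "agent-aware" means in practice: the decision keys on the specific binary and destination rather than a coarse container boundary. The rule shape below is hypothetical, but it shows how first-match evaluation with a default deny yields fine-grained answers:

```python
# Hypothetical agent-aware rule evaluation: first matching rule wins,
# with an explicit default-deny at the end.
RULES = [
    {"binary": "git", "destination": "github.com", "allow": True},
    {"binary": "*",   "destination": "*",          "allow": False},  # default deny
]

def evaluate(binary, destination):
    for rule in RULES:
        if rule["binary"] in (binary, "*") and rule["destination"] in (destination, "*"):
            return rule["allow"]
    return False

print(evaluate("git", "github.com"))    # True
print(evaluate("curl", "evil.example")) # False
```

A Docker network policy can only say "this container may reach that host"; a rule set like this can say "only `git` may reach it".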

Hardware Support

NemoClaw runs on a range of NVIDIA hardware:

  • GeForce RTX PCs/laptops — your everyday dev machine with a GPU
  • RTX PRO workstations — for heavier local inference workloads
  • DGX Spark — NVIDIA's compact AI workstation
  • DGX Station — for enterprise-scale local deployment

Current Limitations (It's Alpha)

The project is in alpha, and the README says so clearly. A few things to be aware of:

  • Interfaces, APIs, and behavior may change without notice.
  • The openclaw nemoclaw plugin commands are under active development — use the nemoclaw host CLI as the primary interface.
  • Local inference (Ollama, vLLM) is experimental, especially on macOS.
  • On machines with less than 8 GB RAM, the sandbox image (~2.4 GB compressed) can trigger the OOM killer during setup. Adding swap helps.
  • Setup may require manual workarounds on some platforms.

Where to Go from Here

The project is alpha and moving fast, so the README is the best place to track current setup steps, supported platforms, and breaking changes.