DEV Community

Ajeet Singh Raina

Running OpenClaw on NVIDIA Jetson Thor with Docker Model Runner: A Complete Guide

What if you could run your own AI-powered Discord bot completely locally on an NVIDIA Jetson Thor, with no cloud APIs and no subscription fees? That's exactly what we did. In this guide, I'll walk you through setting up OpenClaw, an open-source AI agent framework, powered by Docker Model Runner running Qwen3 8B locally on NVIDIA Jetson Thor.

The result? A fully functional Discord bot that responds to messages using a locally hosted LLM, with zero data leaving your network.

Prerequisites

Before we begin, make sure you have the following ready:

  • NVIDIA Jetson Thor with Docker Engine installed
  • Docker Model Runner plugin enabled
  • Node.js v22+ installed
  • A Discord account with server admin access
  • Basic familiarity with the terminal

Step 1: Install OpenClaw

OpenClaw provides a one-liner installer that detects your OS and sets everything up via npm:

curl -fsSL https://openclaw.ai/install.sh | bash

You should see output confirming the installation:

🦞 OpenClaw Installer
✓ Detected: linux
✓ Node.js v22.22.0 found
✓ npm configured for user installs
🦞 OpenClaw installed successfully (2026.2.21-2)!

During the installer's onboarding, choose Custom Provider when prompted for a model provider.

Please note: the base URL should look like http://localhost:12434/v1. We'll fine-tune this configuration later.

Step 2: Verify Docker Model Runner

Docker Model Runner lets you run LLMs locally as part of Docker's ecosystem. First, let's check what models are available:

docker model ls
MODEL NAME           PARAMETERS  QUANTIZATION     ARCHITECTURE  SIZE
llama3.2:3B-Q4_K_M   3.21 B      IQ2_XXS/Q4_K_M  llama         1.87 GiB
qwen3:8B-Q4_K_M      8.19 B      IQ2_XXS/Q4_K_M  qwen3         4.68 GiB
smollm2              361.82 M    IQ2_XXS/Q4_K_M  llama         256.35 MiB

We'll use Qwen3 8B as our primary model — it offers a solid balance of intelligence and performance for the Jetson Thor's capabilities.

Verify the API Endpoint

Docker Model Runner exposes an OpenAI-compatible API on port 12434:

curl -s http://localhost:12434/v1/models | jq .
{
  "object": "list",
  "data": [
    { "id": "ai/smollm2", "object": "model", "owned_by": "docker" },
    { "id": "ai/llama3.2:3B-Q4_K_M", "object": "model", "owned_by": "docker" },
    { "id": "ai/qwen3:8B-Q4_K_M", "object": "model", "owned_by": "docker" }
  ]
}

Test a Chat Completion

curl -s http://localhost:12434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3:8B-Q4_K_M",
    "messages": [{"role": "user", "content": "Hello, say hi in one sentence"}],
    "max_tokens": 500
  }' | jq '.choices[0].message.content'
"Hello! How can I assist you today?"

If you see a response, your model runner is working perfectly.
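If you'd rather script this check than use curl, here's a minimal Python sketch using only the standard library. It builds the same payload and posts it to the same endpoint shown above; calling `chat()` requires Docker Model Runner to be listening on port 12434:

```python
import json
import urllib.request

BASE_URL = "http://localhost:12434/v1"  # Docker Model Runner's OpenAI-compatible API

def build_chat_request(prompt, model="qwen3:8B-Q4_K_M", max_tokens=500):
    """Assemble the same JSON body the curl example sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt):
    """POST a chat completion and return the assistant's reply.
    Only works while Docker Model Runner is running locally."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# reply = chat("Hello, say hi in one sentence")  # needs the server running
```

`build_chat_request` alone needs no server, which makes it handy for inspecting exactly what gets sent.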

Important: Configure Context Size

By default, Docker Model Runner may use a 4096-token context window, which is too small for OpenClaw (minimum 16,000 tokens required). Bump it up:

docker model configure --context-size 32768 ai/qwen3:8B-Q4_K_M

Verify the configuration:

docker model configure show ai/qwen3:8B-Q4_K_M
[
  {
    "Backend": "llama.cpp",
    "Model": "ai/qwen3:8B-Q4_K_M",
    "Config": {
      "context-size": 32768
    }
  }
]

Step 3: A Note on Qwen3's Thinking Mode

Qwen3 has a built-in "thinking" mode that uses tokens for chain-of-thought reasoning before generating a visible response. If you set max_tokens too low (e.g., 50), you might get an empty content field because all tokens were consumed by reasoning_content.

The fix is simple: use a higher max_tokens value (500+), or disable thinking mode by adding /nothink as a system prompt. For OpenClaw usage with 32K+ context, this won't be an issue.
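You can guard against this case when parsing responses. The sketch below assumes the llama.cpp-style response shape in which chain-of-thought text arrives in a separate `reasoning_content` field (treat that field name as an assumption if your runtime differs):

```python
def extract_reply(choice: dict) -> str:
    """Return the visible reply from one chat-completion choice,
    flagging the case where Qwen3's thinking mode consumed the
    whole token budget before any visible content was produced.
    """
    message = choice["message"]
    content = message.get("content") or ""
    if content.strip():
        return content
    if message.get("reasoning_content"):
        # All tokens went to reasoning: raise max_tokens or send /nothink.
        return "(empty reply: thinking mode used the full token budget)"
    return "(empty reply)"

# Simulated response truncated at a low max_tokens: content is empty,
# reasoning_content holds the partial chain of thought.
truncated = {"message": {"content": "", "reasoning_content": "Okay, the user wants..."}}
print(extract_reply(truncated))
```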

Step 4: Configure OpenClaw

Run the setup wizard:

openclaw setup

OpenClaw should auto-detect Docker Model Runner. The key configuration in ~/.openclaw/openclaw.json should look like this:

{
  "models": {
    "mode": "merge",
    "providers": {
      "dmr": {
        "baseUrl": "http://localhost:12434/v1",
        "apiKey": "dmr-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "ai/qwen3:8B-Q4_K_M",
            "name": "Qwen3 8B (64K context)",
            "contextWindow": 65536,
            "maxTokens": 65536
          },
          {
            "id": "ai/llama3.2:3B-Q4_K_M",
            "name": "Llama 3.2 3B",
            "contextWindow": 32768,
            "maxTokens": 32768
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "dmr/ai/qwen3:8B-Q4_K_M"
      }
    }
  }
}

Pro Tip: Make sure the baseUrl uses /v1 (not /engines/v1). The /engines/v1 endpoint may report incorrect context window sizes, causing OpenClaw to reject the model with a "context window too small" error.

Please note: if you chose Discord during the OpenClaw installer, it will prompt you for a Discord bot token. Keep one ready before you proceed; Step 5 below walks through creating it.


Step 5: Create a Discord Bot

Now for the fun part — connecting OpenClaw to Discord.

Create the Application

  1. Go to the Discord Developer Portal
  2. Click New Application and name it (e.g., "OpenClaw Bot")
  3. Click Create

Configure the Bot

  1. Click Bot in the left sidebar
  2. Scroll to Privileged Gateway Intents
  3. Enable Message Content Intent — this is critical for the bot to read messages
  4. Click Save Changes
  5. Scroll up and click Reset Token to generate a bot token
  6. Copy the token — you'll need it next

Fix the Install Link (Avoid "Code Grant" Errors)

This is a common gotcha. Go to Installation in the left sidebar and set the Install Link to None. This prevents the dreaded "Integration requires code grant" error when trying to invite the bot.

Generate the Invite URL

  1. Go to OAuth2 → URL Generator
  2. Under Scopes, check only bot
  3. Under Bot Permissions, check: Send Messages, Read Message History, View Channels
  4. Copy the generated URL at the bottom
  5. Open it in your browser, select your server, and click Authorize

Or use this URL template directly (replace YOUR_CLIENT_ID):

https://discord.com/oauth2/authorize?client_id=YOUR_CLIENT_ID&permissions=68608&scope=bot
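That 68608 value isn't magic: it's just the bitwise OR of the three permission flags checked above. A quick sketch to derive it (flag bit positions are from Discord's permissions documentation):

```python
# Discord permission flags are fixed bit positions; the three the
# guide enables are:
VIEW_CHANNEL = 1 << 10          # 1024
SEND_MESSAGES = 1 << 11         # 2048
READ_MESSAGE_HISTORY = 1 << 16  # 65536

permissions = VIEW_CHANNEL | SEND_MESSAGES | READ_MESSAGE_HISTORY
print(permissions)  # 68608, the value baked into the invite URL

CLIENT_ID = "YOUR_CLIENT_ID"  # placeholder, as in the template above
invite = (f"https://discord.com/oauth2/authorize"
          f"?client_id={CLIENT_ID}&permissions={permissions}&scope=bot")
```

If you later grant more permissions in the portal, regenerate the URL rather than reusing the old number.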

Step 6: Add Discord to OpenClaw

Add the Discord channel with your bot token:

openclaw channels add --channel discord --token YOUR_DISCORD_BOT_TOKEN

Or edit the config directly:

nano ~/.openclaw/openclaw.json

Add the Discord section under channels:

"channels": {
  "discord": {
    "enabled": true,
    "token": "YOUR_DISCORD_BOT_TOKEN",
    "groupPolicy": "open",
    "streamMode": "off"
  }
}

Step 7: Start the Gateway

openclaw gateway --verbose

You should see the bot come online:

[gateway] agent model: dmr/ai/qwen3:8B-Q4_K_M
[gateway] listening on ws://127.0.0.1:18789
[discord] [default] starting provider (@OpenClaw Bot)
[discord] logged in to discord as 1475353419764994181

Step 8: Pair Your Discord Account

OpenClaw uses a pairing system for DMs. Send a direct message to your bot on Discord (e.g., "Hello"). The bot will respond with a pairing code:

OpenClaw: access not configured.
Your Discord user id: 663426992733159434
Pairing code: 9TRNA3AL

Approve the pairing in your terminal:

openclaw pairing approve --channel discord 9TRNA3AL

Now send another message — and watch Qwen3 respond through your Discord bot, running entirely on your Jetson Thor!

The Architecture

Here's what's happening under the hood:

Discord (your messages)
        │
        ▼
┌─────────────────┐
│  OpenClaw       │
│  Gateway        │
│  (WebSocket)    │
│  Port 18789     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Docker Model   │
│  Runner         │
│  Port 12434     │
│  (OpenAI API)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Qwen3 8B       │
│  (llama.cpp)    │
│  NVIDIA GPU     │
└─────────────────┘

Everything runs locally on the Jetson Thor. Your messages go from Discord → OpenClaw Gateway → Docker Model Runner → Qwen3 on the GPU, and the response flows back the same way. No cloud, no API keys (except Discord's bot token), no per-token costs.

Troubleshooting

Here are some issues we encountered and how to fix them:

"Model context window too small (4096 tokens)"

This happens when Docker Model Runner defaults to 4096 context. Fix it with:

docker model configure --context-size 32768 ai/qwen3:8B-Q4_K_M

Also ensure your OpenClaw config uses baseUrl: "http://localhost:12434/v1" (not /engines/v1).

"Integration requires code grant" when inviting the bot

Go to the Discord Developer Portal → Installation → set Install Link to None. Then use a clean invite URL with only the bot scope.

Empty responses from Qwen3

Qwen3's thinking mode can consume all tokens before generating a visible response. Increase max_tokens to 500+ or use /nothink as a system prompt.

Bot not appearing in Discord server

Make sure you've authorized the bot using the OAuth2 URL with the bot scope. Check the gateway logs for [discord] logged in to discord as ... to confirm the bot is connected.

What's Next?

With OpenClaw running on Jetson Thor, you now have a foundation for building powerful local AI agents. Some ideas to explore:

  • Add more channels: Connect Telegram, WhatsApp, or Slack alongside Discord
  • Install skills: Extend your bot with OpenClaw skills for image generation, web browsing, and more
  • Run as a service: Use systemctl to keep the gateway running 24/7
  • Try different models: Swap between Qwen3 8B and Llama 3.2 3B depending on your speed vs. quality needs
  • Build custom skills: Create modular capability packages that teach your bot new tricks
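The "run as a service" idea can be sketched as a systemd unit. The user and paths below are assumptions; point ExecStart at wherever the installer placed the openclaw binary:

```ini
# /etc/systemd/system/openclaw-gateway.service (hypothetical user and paths)
[Unit]
Description=OpenClaw gateway (Discord bot via Docker Model Runner)
After=network-online.target docker.service
Wants=network-online.target

[Service]
User=jetson
ExecStart=/usr/bin/env openclaw gateway
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now openclaw-gateway`, then follow logs with `journalctl -u openclaw-gateway -f`.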

The beauty of this setup is that it's entirely self-hosted. Your conversations stay on your hardware, your models run on your GPU, and you have complete control over the experience.


Conclusion

Running OpenClaw with Docker Model Runner on NVIDIA Jetson Thor demonstrates the power of edge AI. In under 30 minutes, we went from a bare Jetson Thor to a fully functional Discord bot powered by a locally running 8B parameter model. No cloud dependencies, no recurring API costs, and complete data privacy.

The combination of Docker's containerized model management, OpenClaw's multi-channel agent framework, and NVIDIA's GPU acceleration makes this setup both practical and powerful. Whether you're building a personal assistant, a community bot, or an edge AI prototype, this stack gives you everything you need.

Happy hacking! 🦞
