Running OpenClaw on NVIDIA Jetson Thor with Docker Model Runner: A Complete Guide

  • MyrinNew
    Senior Member
    • Feb 2024
    • 5168

    #1


    What if you could run your own AI-powered Discord bot — completely local, no cloud APIs, no subscription fees — on an NVIDIA Jetson Thor? That's exactly what we did. In this guide, I'll walk you through setting up OpenClaw, an open-source AI agent framework, powered by Docker Model Runner running Qwen3 8B locally on NVIDIA Jetson Thor.


    The result? A fully functional Discord bot that responds to messages using a locally hosted LLM, with no prompts sent to a cloud LLM API.


    Prerequisites

    Before we begin, make sure you have the following ready:
    • NVIDIA Jetson Thor with Docker Engine installed
    • Docker Model Runner plugin enabled
    • Node.js v22+ installed
    • A Discord account with server admin access
    • Basic familiarity with the terminal
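The checklist above can be turned into a quick preflight script. This is a minimal sketch: `node_version_ok` and `preflight` are hypothetical helper names, and it assumes `docker model status` is available as the way to ask whether Docker Model Runner is up.

```shell
# Succeeds if a Node.js version string such as "v22.22.0" has major version >= 22.
node_version_ok() {
  local version="${1#v}"        # drop the leading "v"
  local major="${version%%.*}"  # keep everything before the first dot
  [ "$major" -ge 22 ] 2>/dev/null
}

# Reports each missing prerequisite; returns non-zero if any check fails.
preflight() {
  local failed=0
  command -v docker >/dev/null 2>&1 || { echo "missing: docker"; failed=1; }
  docker model status >/dev/null 2>&1 || { echo "not ready: Docker Model Runner"; failed=1; }
  if command -v node >/dev/null 2>&1; then
    node_version_ok "$(node --version)" || { echo "Node.js too old: $(node --version)"; failed=1; }
  else
    echo "missing: node"
    failed=1
  fi
  return "$failed"
}
```

Run `preflight && echo "ready"` on the Jetson before starting Step 1.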


    Step 1: Install OpenClaw

    OpenClaw provides a one-liner installer that detects your OS and sets everything up via npm:






    curl -fsSL https://openclaw.ai/install.sh | bash







    You should see output confirming the installation:






    🦞 OpenClaw Installer
    ✓ Detected: linux
    ✓ Node.js v22.22.0 found
    ✓ npm configured for user installs
    🦞 OpenClaw installed successfully (2026.2.21-2)!
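If you would rather read the installer before running it, something like the following should work. It is only a sketch: `fetch_installer` and the default file name are illustrative, and the URL is the one from the one-liner above.

```shell
# Download the installer to a file so it can be inspected before it runs.
fetch_installer() {
  local dest="${1:-./openclaw-install.sh}"
  curl -fsSL https://openclaw.ai/install.sh -o "$dest" || return 1
  echo "saved to $dest"
}
```

Typical use: `fetch_installer ./openclaw-install.sh`, read the file with `less`, then run `bash ./openclaw-install.sh`.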



















    When the installer asks for a model provider, choose Custom Provider.

    Please note: the correct base URL looks like http://localhost:12434/v1. We will set it later, in Step 4.


    Step 2: Verify Docker Model Runner

    Docker Model Runner lets you run LLMs locally as part of Docker's ecosystem. First, let's check what models are available:






    docker model ls











    MODEL NAME          PARAMETERS  QUANTIZATION    ARCHITECTURE  SIZE
    llama3.2:3B-Q4_K_M  3.21 B      IQ2_XXS/Q4_K_M  llama         1.87 GiB
    qwen3:8B-Q4_K_M     8.19 B      IQ2_XXS/Q4_K_M  qwen3         4.68 GiB
    smollm2             361.82 M    IQ2_XXS/Q4_K_M  llama         256.35 MiB







    We'll use Qwen3 8B as our primary model — it offers a solid balance of intelligence and performance for the Jetson Thor's capabilities.
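If Qwen3 8B is not in your list yet, it has to be pulled first. A minimal sketch, assuming the model is published as `ai/qwen3:8B-Q4_K_M` and that `docker model ls` prints names without the `ai/` prefix, as in the listing above (`ensure_model` is a hypothetical helper):

```shell
# Pull a model only if `docker model ls` does not already list it.
# The check is a plain substring match on the listing, so it is approximate.
ensure_model() {
  local model="$1"
  local short="${model#ai/}"   # the listing shows names without the "ai/" prefix
  if docker model ls | grep -qF -- "$short"; then
    echo "present: $model"
  else
    docker model pull "$model"
  fi
}
```

Call it as `ensure_model ai/qwen3:8B-Q4_K_M`.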


    Verify the API Endpoint

    Docker Model Runner exposes an OpenAI-compatible API on port 12434:






    curl -s http://localhost:12434/v1/models | jq .











    {
      "object": "list",
      "data": [
        { "id": "ai/smollm2", "object": "model", "owned_by": "docker" },
        { "id": "ai/llama3.2:3B-Q4_K_M", "object": "model", "owned_by": "docker" },
        { "id": "ai/qwen3:8B-Q4_K_M", "object": "model", "owned_by": "docker" }
      ]
    }
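For scripting, it can help to check for one specific model id rather than reading the list by eye. The sketch below uses only `tr` and `grep`, so it works on the response shape shown above without depending on pretty-printing; `has_model_id` is a hypothetical helper name.

```shell
# Reads a /v1/models JSON response on stdin and succeeds if it lists the given id.
# Whitespace is stripped first so the match does not depend on formatting.
has_model_id() {
  local id="$1"
  tr -d '[:space:]' | grep -qF "\"id\":\"$id\""
}
```

Example: `curl -s http://localhost:12434/v1/models | has_model_id ai/qwen3:8B-Q4_K_M && echo "model is served"`.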







    Test a Chat Completion





    curl -s http://localhost:12434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "ai/qwen3:8B-Q4_K_M",
        "messages": [{"role": "user", "content": "Hello, say hi in one sentence"}],
        "max_tokens": 500
      }' | jq '.choices[0].message.content'











    "Hello! How can I assist you today?"







    If you see a response, your model runner is working perfectly.


    Important: Configure Context Size

    By default, Docker Model Runner may use a 4096-token context window, which is too small for OpenClaw (minimum 16,000 tokens required). Bump it up:






    docker model configure --context-size 32768 ai/qwen3:8B-Q4_K_M







    Verify the configuration:






    docker model configure show ai/qwen3:8B-Q4_K_M











    [
      {
        "Backend": "llama.cpp",
        "Model": "ai/qwen3:8B-Q4_K_M",
        "Config": {
          "context-size": 32768
        }
      }
    ]
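The same check can be automated. This sketch parses the `context-size` field out of the output shown above and compares it with the 16,000-token minimum mentioned earlier; `context_size_ok` is a hypothetical helper, and the parsing assumes the field appears as `"context-size": <number>`.

```shell
# Reads `docker model configure show <model>` output on stdin and succeeds
# if "context-size" is at least the minimum (default 16000).
context_size_ok() {
  local min="${1:-16000}"
  local size
  size="$(sed -n 's/.*"context-size"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1)"
  [ -n "$size" ] && [ "$size" -ge "$min" ]
}
```

Example: `docker model configure show ai/qwen3:8B-Q4_K_M | context_size_ok || echo "context window too small"`.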







    Step 3: A Note on Qwen3's Thinking Mode

    Qwen3 has a built-in "thinking" mode that uses tokens for chain-of-thought reasoning before generating a visible response. If you set max_tokens too low (e.g., 50), you might get an empty content field because all tokens were consumed by reasoning_content.


    The fix is simple: use a higher max_tokens value (500+), or disable thinking mode by putting Qwen3's /no_think switch in the system prompt. For OpenClaw usage with 32K+ context, this won't be an issue.
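To try the second option by hand, the request body needs a system message carrying the switch. The sketch below builds such a body; it uses `/no_think`, the spelling Qwen3 documents for its soft switch, and `json_escape` and `chat_payload` are hypothetical helpers that only handle quotes and backslashes.

```shell
# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Newlines and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a chat-completions request body with thinking disabled via the system prompt.
chat_payload() {
  local model="$1" prompt="$2" max_tokens="${3:-500}"
  printf '{"model":"%s","messages":[{"role":"system","content":"/no_think"},{"role":"user","content":"%s"}],"max_tokens":%s}\n' \
    "$(json_escape "$model")" "$(json_escape "$prompt")" "$max_tokens"
}
```

Example: `chat_payload ai/qwen3:8B-Q4_K_M "Hello" 200 | curl -s http://localhost:12434/v1/chat/completions -H "Content-Type: application/json" -d @-`.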


    Step 4: Configure OpenClaw

    Run the setup wizard:






    openclaw setup







    OpenClaw should auto-detect Docker Model Runner. The key configuration in ~/.openclaw/openclaw.json should look like this:






    {
      "models": {
        "mode": "merge",
        "providers": {
          "dmr": {
            "baseUrl": "http://localhost:12434/v1",
            "apiKey": "dmr-local",
            "api": "openai-completions",
            "models": [
              {
                "id": "ai/qwen3:8B-Q4_K_M",
                "name": "Qwen3 8B (64K context)",
                "contextWindow": 65536,
                "maxTokens": 65536
              },
              {
                "id": "ai/llama3.2:3B-Q4_K_M",
                "name": "Llama 3.2 3B",
                "contextWindow": 32768,
                "maxTokens": 32768
              }
            ]
          }
        }
      },
      "agents": {
        "defaults": {
          "model": {
            "primary": "dmr/ai/qwen3:8B-Q4_K_M"
          }
        }
      }
    }







    Pro Tip: Make sure the baseUrl uses /v1 (not /engines/v1). The /engines/v1 endpoint may report incorrect context window sizes, causing OpenClaw to reject the model with a "context window too small" error.
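A small guard for that rule, in case you script your setup. `base_url_ok` is a hypothetical helper; it only inspects the string and does not contact the server.

```shell
# Succeeds only if the URL ends in /v1 and does not go through /engines/v1.
base_url_ok() {
  local url="${1%/}"   # ignore a single trailing slash
  case "$url" in
    */engines/v1) return 1 ;;
    */v1)         return 0 ;;
    *)            return 1 ;;
  esac
}
```

Example: `base_url_ok "http://localhost:12434/v1" || echo "check baseUrl in ~/.openclaw/openclaw.json"`.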


    Please note: if you choose Discord during the OpenClaw installer, it may ask for your Discord bot token. Have it ready before you proceed (Step 5 shows how to create it).

















    Step 5: Create a Discord Bot

    Now for the fun part — connecting OpenClaw to Discord.


    Create the Application

    1. Go to the Discord Developer Portal
    2. Click New Application and name it (e.g., "OpenClaw Bot")
    3. Click Create


    Configure the Bot

    1. Click Bot in the left sidebar
    2. Scroll to Privileged Gateway Intents
    3. Enable Message Content Intent — this is critical for the bot to read messages
    4. Click Save Changes
    5. Scroll up and click Reset Token to generate a bot token
    6. Copy the token — you'll need it next


    Fix the Install Link (Avoid "Code Grant" Errors)

    This is a common gotcha. Go to Installation in the left sidebar and set the Install Link to None. This prevents the dreaded "Integration requires code grant" error when trying to invite the bot.


    Generate the Invite URL

    1. Go to OAuth2 → URL Generator
    2. Under Scopes, check only bot
    3. Under Bot Permissions, check: Send Messages, Read Message History, View Channels
    4. Copy the generated URL at the bottom
    5. Open it in your browser, select your server, and click Authorize


    Or use this URL template directly (replace YOUR_CLIENT_ID):

    https://discord.com/oauth2/authorize?client_id=YOUR_CLIENT_ID&scope=bot&permissions=68608
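The invite URL can also be built from the three permissions chosen in the generator. This sketch assumes Discord's standard authorize endpoint and its documented permission bits (View Channels = 1 << 10, Send Messages = 1 << 11, Read Message History = 1 << 16); the function names are illustrative.

```shell
# Combine the permission bits for View Channels, Send Messages and Read Message History.
invite_permissions() {
  echo $(( (1 << 10) | (1 << 11) | (1 << 16) ))
}

# Print a bot invite URL for the given application (client) id.
invite_url() {
  local client_id="$1"
  printf 'https://discord.com/oauth2/authorize?client_id=%s&scope=bot&permissions=%s\n' \
    "$client_id" "$(invite_permissions)"
}
```

Run `invite_url YOUR_CLIENT_ID` and open the printed link in a browser.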








    Step 6: Add Discord to OpenClaw

    Add the Discord channel with your bot token:






    openclaw channels add --channel discord --token YOUR_DISCORD_BOT_TOKEN







    Or edit the config directly:






    nano ~/.openclaw/openclaw.json







    Add the Discord section under channels:






    "channels": {
      "discord": {
        "enabled": true,
        "token": "YOUR_DISCORD_BOT_TOKEN",
        "groupPolicy": "open",
        "streamMode": "off"
      }
    }
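Since the bot token ends up in plain text in the config file, it is worth keeping it out of your shell history and restricting the file. A minimal sketch: `DISCORD_BOT_TOKEN` is an assumed environment variable name that you export yourself (OpenClaw may not read it on its own), and both function names are hypothetical.

```shell
# Restrict the config file to its owner, since it holds the bot token.
lock_down_config() {
  local config="${1:-$HOME/.openclaw/openclaw.json}"
  [ -f "$config" ] || { echo "no config at $config" >&2; return 1; }
  chmod 600 "$config"
}

# Register the Discord channel using a token taken from the environment.
add_discord_channel() {
  [ -n "${DISCORD_BOT_TOKEN:-}" ] || { echo "DISCORD_BOT_TOKEN is not set" >&2; return 1; }
  openclaw channels add --channel discord --token "$DISCORD_BOT_TOKEN"
}
```

Example: `add_discord_channel && lock_down_config`.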







    Step 7: Start the Gateway





    openclaw gateway --verbose







    You should see the bot come online:






    [gateway] agent model: dmr/ai/qwen3:8B-Q4_K_M
    [gateway] listening on ws://127.0.0.1:18789
    [discord] [default] starting provider (@OpenClaw Bot)
    [discord] logged in to discord as 1475353419764994181
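If you save the gateway output to a file, the login line can be checked from a script. This sketch assumes the log format shown above; `discord_login_id` is a hypothetical helper and reads a finished log rather than a live stream.

```shell
# Reads gateway log lines on stdin and prints the bot's user id from the first
# "[discord] logged in to discord as <id>" line; fails if no such line appears.
discord_login_id() {
  local id
  id="$(sed -n 's/^[[:space:]]*\[discord] logged in to discord as \([0-9][0-9]*\).*/\1/p' | head -n 1)"
  [ -n "$id" ] && echo "$id"
}
```

Example: run `openclaw gateway --verbose 2>&1 | tee gateway.log` in one terminal, then `discord_login_id < gateway.log` in another.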







    Step 8: Pair Your Discord Account

    OpenClaw uses a pairing system for DMs. Send a direct message to your bot on Discord (e.g., "Hello"). The bot will respond with a pairing code:






    OpenClaw: access not configured.
    Your Discord user id: 663426992733159434
    Pairing code: 9TRNA3AL







    Approve the pairing in your terminal:






    openclaw pairing approve --channel discord 9TRNA3AL







    Now send another message — and watch Qwen3 respond through your Discord bot, running entirely on your Jetson Thor!


    The Architecture

    Here's what's happening under the hood:






    Discord (your messages)
             │
             ▼
    ┌─────────────────┐
    │    OpenClaw     │
    │     Gateway     │
    │   (WebSocket)   │
    │   Port 18789    │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │  Docker Model   │
    │     Runner      │
    │   Port 12434    │
    │  (OpenAI API)   │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │    Qwen3 8B     │
    │   (llama.cpp)   │
    │   NVIDIA GPU    │
    └─────────────────┘







    Inference runs entirely on the Jetson Thor. Your messages go from Discord → OpenClaw Gateway → Docker Model Runner → Qwen3 on the GPU, and the response flows back the same way. No cloud LLM, no API keys (except Discord's bot token), no per-token costs.


    Troubleshooting

    Here are some issues we encountered and how to fix them:


    "Model context window too small (4096 tokens)"

    This happens when Docker Model Runner defaults to 4096 context. Fix it with:






    docker model configure --context-size 32768 ai/qwen3:8B-Q4_K_M







    Also ensure your OpenClaw config uses baseUrl: "http://localhost:12434/v1" (not /engines/v1).


    "Integration requires code grant" when inviting the bot

    Go to the Discord Developer Portal → Installation → set Install Link to None. Then use a clean invite URL with only the bot scope.


    Empty responses from Qwen3

    Qwen3's thinking mode can consume all tokens before generating a visible response. Increase max_tokens to 500+ or put Qwen3's /no_think switch in the system prompt.


    Bot not appearing in Discord server

    Make sure you've authorized the bot using the OAuth2 URL with the bot scope. Check the gateway logs for [discord] logged in to discord as ... to confirm the bot is connected.


    What's Next?

    With OpenClaw running on Jetson Thor, you now have a foundation for building powerful local AI agents. Some ideas to explore:
    • Add more channels: Connect Telegram, WhatsApp, or Slack alongside Discord
    • Install skills: Extend your bot with OpenClaw skills for image generation, web browsing, and more
    • Run as a service: Use systemctl to keep the gateway running 24/7
    • Try different models: Swap between Qwen3 8B and Llama 3.2 3B depending on your speed vs. quality needs
    • Build custom skills: Create modular capability packages that teach your bot new tricks
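For the "run as a service" idea, a plain systemd unit is one way to do it. This is a sketch under assumptions: the binary path and user depend on how OpenClaw was installed, the unit name `openclaw-gateway` is made up here, and OpenClaw may ship its own service tooling that is preferable.

```shell
# Print a systemd unit for the gateway. The binary path and user are parameters
# because both depend on the installation; the defaults are guesses.
render_gateway_unit() {
  local bin="${1:-/usr/local/bin/openclaw}" user="${2:-$(id -un)}"
  cat <<EOF
[Unit]
Description=OpenClaw gateway
After=network-online.target docker.service
Wants=network-online.target

[Service]
User=$user
ExecStart=$bin gateway
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}
```

Example: `render_gateway_unit "$(command -v openclaw)" | sudo tee /etc/systemd/system/openclaw-gateway.service`, then `sudo systemctl daemon-reload` and `sudo systemctl enable --now openclaw-gateway`.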


    The beauty of this setup is that it's entirely self-hosted. Your conversations stay on your hardware, your models run on your GPU, and you have complete control over the experience.


    Conclusion

    Running OpenClaw with Docker Model Runner on NVIDIA Jetson Thor demonstrates the power of edge AI. In under 30 minutes, we went from a bare Jetson Thor to a fully functional Discord bot powered by a locally running 8B parameter model. No cloud dependencies, no recurring API costs, and complete data privacy.


    The combination of Docker's containerized model management, OpenClaw's multi-channel agent framework, and NVIDIA's GPU acceleration makes this setup both practical and powerful. Whether you're building a personal assistant, a community bot, or an edge AI prototype, this stack gives you everything you need.


    Happy hacking! 🦞



