Agents are evolving from question-and-answer systems into long-running autonomous assistants that read files, call APIs, and drive multi-step workflows. However, deploying an agent that executes code and uses tools without proper isolation raises real risks, especially on third-party cloud infrastructure, where data privacy and control are out of your hands.
NVIDIA NemoClaw is an open-source reference stack that orchestrates NVIDIA OpenShell to run OpenClaw, a self-hosted gateway that connects messaging platforms to AI coding agents powered by open models like NVIDIA Nemotron. NemoClaw adds guided onboarding, lifecycle management, image hardening, and a versioned blueprint, providing a complete pipeline from model inference to more secure, interactive agent deployment.
This tutorial walks through a NemoClaw deployment on NVIDIA DGX Spark—from configuring the runtime environment and serving the model locally, to installing the NemoClaw stack and connecting it to Telegram for remote access. You’ll build a local, sandboxed AI assistant that runs on your hardware and is accessible from any Telegram client.
Quick links to the model and code
Access the following resources for the tutorial:
🧠 Software and models:
- NemoClaw with NVIDIA Nemotron 3 Super and Telegram on DGX Spark: An end-to-end guide for setting up NemoClaw with local inference.
- NVIDIA Nemotron 3 Super 120B on NVIDIA Build: The model used for the tutorial.
🛠️ Code and documentation:
- NVIDIA NemoClaw documentation: Complete reference for configuration, policies, and advanced deployment.
- NVIDIA NemoClaw on GitHub: Source code and community contributions.
- NVIDIA DGX Spark: Hardware specifications and developer resources.
Prerequisites
For full setup instructions, visit the DGX Spark Playbook for NemoClaw, or get started with no hardware needed.
If you intend to use another device, note that NemoClaw is tested and validated on the devices listed under alternative deployments in the documentation. Confirm that the device can serve the model through a compatible API (for example, via vLLM).
Before beginning setup, ensure the following requirements are met:
- Hardware: DGX Spark (GB10) system running Ubuntu 24.04 LTS with the latest NVIDIA drivers.
- Docker: Version 28.x or higher, with the NVIDIA container runtime configured (covered in the next section).
- Ollama: Installed as the local model-serving engine.
- Telegram bot token: Created through Telegram’s @BotFather (detailed in the Telegram integration section).
Estimated time: Approximately 20–30 minutes of active setup, plus 15–30 minutes for the initial model download (~87 GB), depending on network bandwidth.
The following commands verify system readiness:
head -n 2 /etc/os-release # Expected: Ubuntu 24.04
nvidia-smi # Expected: NVIDIA GB10 GPU
docker info --format '{{.ServerVersion}}' # Expected: 28.x+
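If you prefer a single pass/fail summary over eyeballing three outputs, a small wrapper can run the same checks. This is a convenience sketch, not part of NemoClaw:

```shell
#!/usr/bin/env bash
# Readiness sketch: wraps the three verification commands above and
# prints one PASS/FAIL line per requirement without aborting on failure.
check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
  fi
}

check "Ubuntu 24.04"       grep -q 'VERSION_ID="24.04"' /etc/os-release
check "NVIDIA GPU visible" nvidia-smi
check "Docker 28.x+"       bash -c 'docker info --format "{{.ServerVersion}}" | grep -Eq "^(2[89]|[3-9][0-9])\."'
```

Any FAIL line points at the prerequisite to fix before continuing.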
The NemoClaw components
Before building a sandboxed assistant, it’s important to understand the software used in this environment.
| Component | What it is | What it does | When to use it |
|---|---|---|---|
| NVIDIA NemoClaw | Reference stack with orchestration layer and installer | Installs OpenClaw and OpenShell with policies and inference. | Fastest way to create an always-on assistant in a more secure sandbox. |
| NVIDIA OpenShell | Security runtime and gateway | Enforces safety boundaries (sandboxing), manages credentials, and proxies network/API calls. | When you need a “walled garden” to run agents without exposing sensitive information or enabling unrestricted web access. |
| OpenClaw | Multi-channel agent framework | Lives inside the sandbox; manages chat platforms (Slack/Discord), memory, and tool integration. | When you need a long-lived agent connected to messaging apps and persistent memory. |
| NVIDIA Nemotron 3 Super 120B | Agent-optimized LLM (120B parameters) | Provides the “brain,” with strong instruction-following and multi-step reasoning capabilities. | For production-grade assistants that need to use tools and follow complex workflows. |
| NVIDIA NIM / Ollama | Inference deployments | Runs the Nemotron model locally. | When you have a GPU and want to run the LLM locally. |
Security note: While OpenShell provides robust isolation, remember that no sandbox offers complete protection against advanced prompt injection. Always deploy on isolated systems when testing new tools.
Let’s get started.
Configure the runtimes
DGX Spark requires several Docker configuration steps to support GPU-accelerated containers with the appropriate isolation settings. Start by registering the NVIDIA container runtime with Docker:
sudo nvidia-ctk runtime configure --runtime=docker
Next, set the cgroup namespace mode to host. This configuration is required for DGX Spark to work correctly with containerized workloads:
sudo python3 -c "
import json, os
path = '/etc/docker/daemon.json'
d = json.load(open(path)) if os.path.exists(path) else {}
d['default-cgroupns-mode'] = 'host'
json.dump(d, open(path, 'w'), indent=2)
"
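The one-liner above merges the key into any existing configuration rather than overwriting it. To convince yourself of that before touching the real file, you can dry-run the same logic against a scratch file (the runtimes entry below mimics what nvidia-ctk writes; this is only an illustration):

```shell
# Dry run: apply the same merge to a scratch copy instead of
# /etc/docker/daemon.json, and confirm existing keys survive.
tmp=$(mktemp)
echo '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}' > "$tmp"
python3 - "$tmp" <<'EOF'
import json, sys
path = sys.argv[1]
d = json.load(open(path))
d['default-cgroupns-mode'] = 'host'   # same key the real step sets
json.dump(d, open(path, 'w'), indent=2)
EOF
cat "$tmp"   # shows both the runtimes entry and the new key
rm "$tmp"
```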
Restart Docker to apply the changes and verify that the NVIDIA runtime is functioning:
sudo systemctl restart docker
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
The output should display the GB10 GPU. To avoid requiring sudo for subsequent Docker commands, add the current user to the Docker group:
sudo usermod -aG docker $USER
newgrp docker
Install Ollama
Ollama is a lightweight model-serving engine for running large language models locally. Install it using the official installer:
curl -fsSL https://ollama.com/install.sh | sh
By default, Ollama listens only on localhost. Because the NemoClaw agent runs inside a sandbox with its own network namespace, it must reach Ollama across network boundaries. Configure Ollama to listen on all interfaces:
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | \
sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload
sudo systemctl restart ollama
Verify that Ollama is running and reachable on all interfaces:
curl http://0.0.0.0:11434
Important: Only start Ollama through systemd. A manually started Ollama process doesn’t pick up the OLLAMA_HOST=0.0.0.0 override, and the NemoClaw sandbox won’t reach the inference server.
sudo systemctl restart ollama
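In scripts that restart Ollama and immediately call it, a short polling helper avoids racing the service startup. This is a convenience sketch; the port is Ollama's default:

```shell
# Poll a URL until it answers, or give up after a timeout (in seconds).
wait_for() {
  local url="$1" timeout="${2:-30}"
  local i=0
  while [ "$i" -lt "$timeout" ]; do
    curl -fsS "$url" >/dev/null 2>&1 && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}

wait_for http://127.0.0.1:11434 30 && echo "Ollama is up" || echo "Ollama not reachable yet"
```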
Next, pull the Nemotron 3 Super 120B model. The download is about 87 GB:
ollama pull nemotron-3-super:120b
Once the download completes, pre-load the model weights into GPU memory to avoid cold-start latency on the first agent interaction:
ollama run nemotron-3-super:120b
After the model loads and presents a prompt, exit the session with /bye. The weights will remain cached in memory. Confirm that the model is available:
ollama list
# You should see something like:
NAME                   ID            SIZE    MODIFIED
nemotron-3-super:120b  95acc78b3ffd  86 GB   2 weeks ago
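For scripted setups, the same check can gate later steps. A hypothetical guard, using the model name pulled above:

```shell
# Report whether the tutorial's model is already pulled, without
# failing the script if Ollama isn't installed yet.
if ollama list 2>/dev/null | grep -q 'nemotron-3-super:120b'; then
  echo "model present"
else
  echo "model missing; run: ollama pull nemotron-3-super:120b"
fi
```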
Install NemoClaw
With the foundation in place, install NemoClaw with a single command:
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
The installer provides Node.js dependencies, the OpenShell runtime, and the NemoClaw CLI, then launches an onboarding wizard. The wizard prompts for the following configuration choices:
- Sandbox name: Specify a lowercase alphanumeric name with hyphens (for example, my-assistant). This name is used in all subsequent commands.
- Inference provider: Select Local Ollama (option 7) to route inference to the local Ollama instance.
- Model: Select nemotron-3-super:120b (option 1).
- Policy presets: Press Y if you accept the default policies. These presets configure filesystem and network restrictions for the sandbox.
- Telegram integration: You can optionally configure your Telegram bot during step 5 of the onboarding wizard.
At the end of the onboarding process, the installer displays a tokenized Web UI URL in the format http://127.0.0.1:18789/#token=<long-token-here>. Record this URL, as it is required to access the web dashboard in the future and won’t be shown again.
If the nemoclaw command isn’t recognized after installation, reload the shell environment so the current session picks it up (new sessions do this automatically):
source ~/.bashrc
Verify the setup
Connect to the sandbox and verify that the agent can reach the inference backend:
nemoclaw my-assistant connect
This command returns model information confirming that the sandboxed environment can communicate with Ollama. Next, send a test message through the agent:
openclaw agent --agent main --local -m "hello" --session-id test
If the configuration is correct, NVIDIA Nemotron 3 Super generates a response. Note that inference with the 120B model typically takes 30–90 seconds per response—this is expected for a model of this size running local inference.
The interactive terminal UI provides a more conversational testing experience:
openclaw tui
Use Ctrl+C to exit the terminal UI when finished.
Access the Web UI
To access the web dashboard locally, exit the sandbox and open the tokenized URL recorded during onboarding:
exit
Then navigate to http://127.0.0.1:18789/#token=<long-token-here> in a browser.
Remote access from another machine
If you’re accessing DGX Spark over the network rather than directly, additional configuration is required. First, determine the Spark’s IP address:
hostname -I | awk '{print $1}'
Start port forwarding through the Spark’s terminal session:
openshell forward start 18789 my-assistant --background
From your remote machine, create an SSH tunnel to the Spark:
ssh -L 18789:127.0.0.1:18789 <your-user>@<your-spark-ip>
With the tunnel active, open http://127.0.0.1:18789/#token=<long-token-here> in a browser on the remote machine.
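If you open this tunnel often, a Host entry in ~/.ssh/config saves retyping the flags. The alias spark is arbitrary; fill in the same placeholders used above:

```
Host spark
    HostName <your-spark-ip>
    User <your-user>
    LocalForward 18789 127.0.0.1:18789
```

With this in place, running ssh spark establishes the same forward.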
Note: Use 127.0.0.1 in the URL. Using localhost instead may result in an “origin not allowed” error.
Connect to Telegram
Telegram integration extends the assistant beyond the local terminal, making it accessible from any device with a Telegram client.
Create the Telegram bot
Open Telegram and search for @BotFather to manage your bots. Start a conversation and use the /newbot command. @BotFather guides you through naming the bot and provides an API token upon completion. Save this token for the configuration step below.
Note: If you configured Telegram during the NemoClaw onboarding wizard, Telegram is already running inside the sandbox.
If you didn’t configure Telegram during onboarding, rerun the onboarding wizard with the token set. This rebuilds the sandbox with Telegram baked in. The bot token is registered with the OpenShell gateway and doesn’t enter the sandbox directly.
export TELEGRAM_BOT_TOKEN=<your-bot-token>
nemoclaw onboard
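To avoid rerunning the wizard without the token exported, a pre-flight guard is worth having in any wrapper script. This is a sketch; the variable name matches the export above:

```shell
# Refuse to proceed when the bot token isn't exported.
ensure_token() {
  if [ -z "${TELEGRAM_BOT_TOKEN:-}" ]; then
    echo "TELEGRAM_BOT_TOKEN is not set; export it first" >&2
    return 1
  fi
}
# Usage: ensure_token && nemoclaw onboard
```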
Verify the integration
Open Telegram, locate the bot, and send a message. On first contact, OpenClaw requires pairing. The bot will respond with a pairing code:
OpenClaw: access not configured.
Your Telegram user id: <your-id>
Pairing code: <CODE>
Approve the pairing from inside the sandbox:
nemoclaw my-assistant connect
openclaw pairing approve telegram <CODE>
exit
Send another message in Telegram. After the inference latency window, the bot should return a response generated by NVIDIA Nemotron 3 Super.
At this point, the deployment is complete. An AI assistant is running entirely on NVIDIA DGX Spark, sandboxed by OpenShell, powered by a 120B open model, and accessible remotely through Telegram. All inference occurs locally: no data leaves the device, and there are no external service dependencies at runtime.
What commands can I reference for deployment?
The following commands are useful for ongoing management of the NemoClaw deployment.
| Command | Description |
|---|---|
| nemoclaw my-assistant connect | Open a shell session inside the sandbox. |
| nemoclaw my-assistant status | Display sandbox status. |
| nemoclaw my-assistant logs --follow | Stream live sandbox logs. |
| nemoclaw list | List all configured sandboxes. |
| nemoclaw start / nemoclaw stop | Start or stop auxiliary services (Telegram bridge, etc.). |
| openshell forward start 18789 my-assistant --background | Enable port forwarding for remote Web UI access. |
Commands for a clean uninstall
For cleanup and uninstallation, NemoClaw provides an uninstaller at ~/.nemoclaw/source/uninstall.sh. Refer to the instructions page for details on cleanup flags and troubleshooting common issues.
Extending agent access with policy approvals
By default, the sandbox restricts the agent to a limited set of network endpoints. When you ask the agent to do something that requires an external service, such as fetching a webpage or calling a third-party API, OpenShell blocks the request and the agent reports that network access isn’t available.
To see this in action, open the OpenShell TUI in one terminal on the host:
openshell term
In a second terminal, connect to the sandbox and start a conversation:
openclaw tui
Ask the agent to do something like “use curl to fetch https://httpbin.org/get”.
The agent attempts the request, OpenShell blocks it, and the TUI displays the blocked connection with the destination host, port, and the binary that initiated it.
From the TUI, you can approve the request for the current session, or deny it to keep the endpoint blocked.
When you want to permanently add an endpoint, use a policy preset from the host:
nemoclaw my-assistant policy-add
This approval flow gives you real-time visibility and control over what the agent can access without modifying the base policy or restarting the sandbox.
Get started
Start building with NVIDIA NemoClaw today.
Stay up to date on NVIDIA NemoClaw by subscribing to NVIDIA news and following NVIDIA AI on LinkedIn, X, Discord, and YouTube.
Visit the NemoClaw page for resources to get started. Explore NemoClaw on GitHub and Playbook available on build.nvidia.com.
Engage with Nemotron livestreams, tutorials, and the developer community on the NVIDIA forum and Discord.