Agents are evolving from question-and-answer systems into long-running autonomous assistants that read files, call APIs, and drive multi-step workflows. However, deploying an agent that executes code and uses tools without proper isolation raises real risks, especially on third-party cloud infrastructure, where data privacy and control are out of your hands.
NVIDIA NemoClaw is an open-source reference stack that orchestrates NVIDIA OpenShell to run OpenClaw, a self-hosted gateway that connects messaging platforms to AI coding agents powered by open models like NVIDIA Nemotron. NemoClaw adds guided onboarding, lifecycle management, image hardening, and a versioned blueprint, providing a complete pipeline from model inference to more secure, interactive agent deployment.
This tutorial walks through a NemoClaw deployment on NVIDIA DGX Spark—from configuring the runtime environment and serving the model locally, to installing the NemoClaw stack and connecting it to Telegram for remote access. You’ll build a local, sandboxed AI assistant that runs on your hardware and is accessible from any Telegram client.
Quick links to the model and code
Access the following resources for the tutorial:
🧠 Software and models:
- NemoClaw with NVIDIA Nemotron 3 Super and Telegram on DGX Spark: An end-to-end guide for setting up NemoClaw with local inference.
- NVIDIA Nemotron 3 Super 120B on NVIDIA Build: The model used for the tutorial.
🛠️ Code and documentation:
- NVIDIA NemoClaw documentation: Complete reference for configuration, policies, and advanced deployment.
- NVIDIA NemoClaw on GitHub: Source code and community contributions.
- NVIDIA DGX Spark: Hardware specifications and developer resources.
Prerequisites
For full setup instructions, visit the DGX Spark Playbook for NemoClaw, or get started with no hardware needed.
If you intend to use another device, note that NemoClaw is tested and validated on the devices listed under alternative deployments in the documentation. Confirm that the device can serve the model through a compatible API (for example, via vLLM).
Before beginning setup, ensure the following requirements are met:
- Hardware: DGX Spark (GB10) system running Ubuntu 24.04 LTS with the latest NVIDIA drivers.
- Docker: Version 28.x or higher, with the NVIDIA container runtime configured (covered in the next section).
- Ollama: Installed as the local model-serving engine.
- Telegram bot token: Created through Telegram’s @BotFather (detailed in the Telegram integration section).
Estimated time: Approximately 20–30 minutes of active setup, plus 15–30 minutes for the initial model download (~87 GB), depending on network bandwidth.
The following commands verify system readiness:
head -n 2 /etc/os-release # Expected: Ubuntu 24.04
nvidia-smi # Expected: NVIDIA GB10 GPU
docker info --format '{{.ServerVersion}}' # Expected: 28.x+
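If you prefer a single pass/fail summary over eyeballing three outputs, a small wrapper can run the same checks. This is a convenience sketch, not part of NemoClaw:

```shell
#!/usr/bin/env bash
# Readiness sketch: wraps the three verification commands above and
# prints one PASS/FAIL line per requirement without aborting on failure.
check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
  fi
}

check "Ubuntu 24.04"       grep -q 'VERSION_ID="24.04"' /etc/os-release
check "NVIDIA GPU visible" nvidia-smi
check "Docker 28.x+"       bash -c 'docker info --format "{{.ServerVersion}}" | grep -Eq "^(2[89]|[3-9][0-9])\."'
```

Any FAIL line points at the prerequisite to fix before continuing.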
The NemoClaw components
Before building a sandboxed assistant, it’s important to understand the software used in this environment.
| Component | What it is | What it does | When to use it |
|---|---|---|---|
| NVIDIA NemoClaw | Reference stack with orchestration layer and installer | Installs OpenClaw and OpenShell with policies and inference. | Fastest way to create an always-on assistant in a more secure sandbox. |
| NVIDIA OpenShell | Security runtime and gateway | Enforces safety boundaries (sandboxing), manages credentials, and proxies network/API calls. | When you need a “walled garden” to run agents without exposing sensitive information or enabling unrestricted web access. |
| OpenClaw | Multi-channel agent framework | Lives inside the sandbox; manages chat platforms (Slack/Discord), memory, and tool integration. | When you need a long-lived agent connected to messaging apps and persistent memory. |
| NVIDIA Nemotron 3 Super 120B | Agent-optimized LLM (120B parameters) | Provides the “brain,” with strong instruction-following and multi-step reasoning capabilities. | For production-grade assistants that need to use tools and follow complex workflows. |
| NVIDIA NIM / Ollama | Inference deployments | Runs the Nemotron model locally. | When you have a GPU and want to run the LLM locally. |
Security note: While OpenShell provides robust isolation, remember that no sandbox offers complete protection against advanced prompt injection. Always deploy on isolated systems when testing new tools.
Let’s get started.
Configure the runtimes
DGX Spark requires several Docker configuration steps to support GPU-accelerated containers with the appropriate isolation settings. Start by registering the NVIDIA container runtime with Docker:
sudo nvidia-ctk runtime configure --runtime=docker
Next, set the cgroup namespace mode to host. This configuration is required for DGX Spark to work correctly with containerized workloads:
sudo python3 -c "
import json, os
path = '/etc/docker/daemon.json'
d = json.load(open(path)) if os.path.exists(path) else {}
d['default-cgroupns-mode'] = 'host'
json.dump(d, open(path, 'w'), indent=2)
"
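The one-liner above merges the key into any existing configuration rather than overwriting it. To convince yourself of that before touching the real file, you can dry-run the same logic against a scratch file (the runtimes entry below mimics what nvidia-ctk writes; this is only an illustration):

```shell
# Dry run: apply the same merge to a scratch copy instead of
# /etc/docker/daemon.json, and confirm existing keys survive.
tmp=$(mktemp)
echo '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}' > "$tmp"
python3 - "$tmp" <<'EOF'
import json, sys
path = sys.argv[1]
d = json.load(open(path))
d['default-cgroupns-mode'] = 'host'   # same key the real step sets
json.dump(d, open(path, 'w'), indent=2)
EOF
cat "$tmp"   # shows both the runtimes entry and the new key
rm "$tmp"
```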
Restart Docker to apply the changes and verify that the NVIDIA runtime is functioning:
sudo systemctl restart docker
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
The output should display the GB10 GPU. To avoid requiring sudo for subsequent Docker commands, add the current user to the Docker group:
sudo usermod -aG docker $USER
newgrp docker
Install Ollama
Ollama is a lightweight model-serving engine for running large language models locally. Install it using the official installer:
curl -fsSL https://ollama.com/install.sh | sh
By default, Ollama listens only on localhost. Because the NemoClaw agent runs inside a sandbox with its own network namespace, it must reach Ollama across network boundaries. Configure Ollama to listen on all interfaces:
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | \
sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload
sudo systemctl restart ollama
Verify that Ollama is running and reachable on all interfaces:
curl http://0.0.0.0:11434
Important: Only start Ollama through systemd. A manually started Ollama process doesn’t pick up the OLLAMA_HOST=0.0.0.0 override, and the NemoClaw sandbox won’t reach the inference server.
sudo systemctl restart ollama
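In scripts that restart Ollama and immediately call it, a short polling helper avoids racing the service startup. This is a convenience sketch; the port is Ollama's default:

```shell
# Poll a URL until it answers, or give up after a timeout (in seconds).
wait_for() {
  local url="$1" timeout="${2:-30}"
  local i=0
  while [ "$i" -lt "$timeout" ]; do
    curl -fsS "$url" >/dev/null 2>&1 && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}

wait_for http://127.0.0.1:11434 30 && echo "Ollama is up" || echo "Ollama not reachable yet"
```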
Next, pull the Nemotron 3 Super 120B model. The download is about 87 GB:
ollama pull nemotron-3-super:120b
Once the download completes, pre-load the model weights into GPU memory to avoid cold-start latency on the first agent interaction:
ollama run nemotron-3-super:120b
After the model loads and presents a prompt, exit the session with /bye. The weights will remain cached in memory. Confirm that the model is available:
ollama list
# You should see something like:
NAME                   ID            SIZE    MODIFIED
nemotron-3-super:120b  95acc78b3ffd  86 GB   2 weeks ago
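For scripted setups, the same check can gate later steps. A hypothetical guard, using the model name pulled above:

```shell
# Report whether the tutorial's model is already pulled, without
# failing the script if Ollama isn't installed yet.
if ollama list 2>/dev/null | grep -q 'nemotron-3-super:120b'; then
  echo "model present"
else
  echo "model missing; run: ollama pull nemotron-3-super:120b"
fi
```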
Install NemoClaw
With the foundation in place, install NemoClaw with a single command:
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
The installer provides Node.js dependencies, the OpenShell runtime, and the NemoClaw CLI, then launches an onboarding wizard. The wizard prompts for the following configuration choices:
- Sandbox name: Specify a lowercase alphanumeric name with hyphens (for example, my-assistant). This name is used in all subsequent commands.
- Inference provider: Select Local Ollama (option 7) to route inference to the local Ollama instance.
- Model: Select nemotron-3-super:120b (option 1).
- Policy presets: Press Y if you accept the default policies. These presets configure filesystem and network restrictions for the sandbox.
- Telegram integration: You can optionally configure your Telegram bot during step 5 of the onboarding wizard.
At the end of the onboarding process, the installer displays a tokenized Web UI URL in the format http://127.0.0.1:18789/#token=<long-token-here>. Record this URL, as it is required to access the web dashboard in the future and won’t be shown again.
If the nemoclaw command isn’t recognized after installation, reload the shell environment so the current session picks it up (new sessions do this automatically):
source ~/.bashrc
Verify the setup
Connect to the sandbox and verify that the agent can reach the inference backend:
nemoclaw my-assistant connect
This command returns model information confirming that the sandboxed environment can communicate with Ollama. Next, send a test message through the agent:
openclaw agent --agent main --local -m "hello" --session-id test
If the configuration is correct, NVIDIA Nemotron 3 Super generates a response. Note that inference with the 120B model typically takes 30–90 seconds per response—this is expected for a model of this size running local inference.
The interactive terminal UI provides a more conversational testing experience:
openclaw tui
Use Ctrl+C to exit the terminal UI when finished.
Access the Web UI
To access the web dashboard locally, exit the sandbox and open the tokenized URL recorded during onboarding:
exit
Then navigate to http://127.0.0.1:18789/#token=<long-token-here> in a browser.
Remote access from another machine
If you’re accessing DGX Spark over the network rather than directly, additional configuration is required. First, determine the Spark’s IP address:
hostname -I | awk '{print $1}'
Start port forwarding through the Spark’s terminal session:
openshell forward start 18789 my-assistant --background
From your remote machine, create an SSH tunnel to the Spark:
ssh -L 18789:127.0.0.1:18789 <your-user>@<your-spark-ip>
With the tunnel active, open http://127.0.0.1:18789/#token=<long-token-here> in a browser on the remote machine.
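If you open this tunnel often, a Host entry in ~/.ssh/config saves retyping the flags. The alias spark is arbitrary; fill in the same placeholders used above:

```
Host spark
    HostName <your-spark-ip>
    User <your-user>
    LocalForward 18789 127.0.0.1:18789
```

With this in place, running ssh spark establishes the same forward.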
Note: Use 127.0.0.1 in the URL. Using localhost instead may result in an “origin not allowed” error.
Connect to Telegram
Telegram integration extends the assistant beyond the local terminal, making it accessible from any device with a Telegram client.
Create the Telegram bot
Open Telegram and search for @BotFather to manage your bots. Start a conversation and use the /newbot command. @BotFather guides you through naming the bot and provides an API token upon completion. Save this token for the configuration step below.
Note: If you configured Telegram during the NemoClaw onboarding wizard, Telegram is already running inside the sandbox.
If you didn’t configure Telegram during onboarding, rerun the onboarding wizard with the token set. This rebuilds the sandbox with Telegram baked in. The bot token is registered with the OpenShell gateway and doesn’t enter the sandbox directly.
export TELEGRAM_BOT_TOKEN=<your-bot-token>
nemoclaw onboard
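To avoid rerunning the wizard without the token exported, a pre-flight guard is worth having in any wrapper script. This is a sketch; the variable name matches the export above:

```shell
# Refuse to proceed when the bot token isn't exported.
ensure_token() {
  if [ -z "${TELEGRAM_BOT_TOKEN:-}" ]; then
    echo "TELEGRAM_BOT_TOKEN is not set; export it first" >&2
    return 1
  fi
}
# Usage: ensure_token && nemoclaw onboard
```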
Verify the integration
Open Telegram, locate the bot, and send a message. On first contact, OpenClaw requires pairing. The bot will respond with a pairing code:
OpenClaw: access not configured.
Your Telegram user id: <your-id>
Pairing code: <CODE>
Approve the pairing from inside the sandbox:
nemoclaw my-assistant connect
openclaw pairing approve telegram <CODE>
exit
Send another message in Telegram. After the inference latency window, the bot should return a response generated by NVIDIA Nemotron 3 Super.
At this point, the deployment is complete. An AI assistant is running entirely on NVIDIA DGX Spark, sandboxed by OpenShell, powered by a 120B open model, and accessible remotely through Telegram. All inference occurs locally: no data leaves the device, and there are no external service dependencies at runtime.
What commands can I reference for deployment?
The following commands are useful for ongoing management of the NemoClaw deployment.
| Command | Description |
|---|---|
| nemoclaw my-assistant connect | Open a shell session inside the sandbox. |
| nemoclaw my-assistant status | Display sandbox status. |
| nemoclaw my-assistant logs --follow | Stream live sandbox logs. |
| nemoclaw list | List all configured sandboxes. |
| nemoclaw start / nemoclaw stop | Start or stop auxiliary services (Telegram bridge, etc.). |
| openshell forward start 18789 my-assistant --background | Enable port forwarding for remote Web UI access. |
Commands for a clean uninstall
For cleanup and uninstallation, NemoClaw provides an uninstaller at ~/.nemoclaw/source/uninstall.sh. Refer to the instructions page for details on cleanup flags and troubleshooting common issues.
Extending agent access with policy approvals
By default, the sandbox restricts the agent to a limited set of network endpoints. When you ask the agent to do something that requires an external service, such as fetching a webpage or calling a third-party API, OpenShell blocks the request and the agent reports that network access isn’t available.
To see this in action, open the OpenShell TUI in one terminal on the host:
openshell term
In a second terminal, connect to the sandbox and start a conversation:
openclaw tui
Ask the agent to do something like “use curl to fetch https://httpbin.org/get”.
The agent attempts the request, OpenShell blocks it, and the TUI displays the blocked connection with the destination host, port, and the binary that initiated it.
From the TUI, you can approve the request for the current session, or deny it to keep the endpoint blocked.
When you want to permanently add an endpoint, use a policy preset from the host:
nemoclaw my-assistant policy-add
This approval flow gives you real-time visibility and control over what the agent can access without modifying the base policy or restarting the sandbox.
Get started
Start building with NVIDIA NemoClaw today.
Stay up to date on NVIDIA NemoClaw by subscribing to NVIDIA news and following NVIDIA AI on LinkedIn, X, Discord, and YouTube.
Visit the NemoClaw page for resources to get started. Explore NemoClaw on GitHub and Playbook available on build.nvidia.com.
Engage with Nemotron livestreams, tutorials, and the developer community on the NVIDIA forum and Discord.