Optimize Supply Chain Decision Systems Using NVIDIA cuOpt Agent Skills

Modern supply chains operate under the constant pressures of fluctuating demand, volatile costs, constrained capacity, and interdependent decision-making. Traditionally, specialized operations research (OR) teams solved these problems by translating business questions into mathematical models. This process can take weeks and often produces fragile solutions that struggle to adapt when conditions change.

Agentic AI is changing this paradigm. Combining the reasoning capabilities of LLMs with the computational power of GPU-accelerated solvers, AI agents can interpret business problems expressed in natural language and translate them into rigorous, optimized decisions in seconds.

At the heart of this approach are agent skills—an open format for extending agents with specialized knowledge and workflows. Skills serve as a packaging mechanism, dynamically loading the correct procedural context and improving agent performance on specific tasks.

This post outlines the core NVIDIA cuOpt agent skills, why they matter, and how they work together to accelerate a multi-period supply chain planning use case by converting natural language business problems into mathematical models and solving them with the NVIDIA cuOpt decision optimization solver.

How to use the NVIDIA cuOpt agent skills  

NVIDIA cuOpt is a GPU-accelerated decision optimization engine that solves linear programming (LP), mixed-integer programming (MIP), and routing problems orders of magnitude faster than CPU-based solvers. By making cuOpt available as an agent skill, the LLM can hand off the mathematical heavy lifting to the GPU while it focuses on understanding the business problem, gathering data, and returning actionable results.
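For intuition, the following toy shows the problem class involved: a two-product production plan expressed as an LP. SciPy's CPU solver stands in here purely for illustration (cuOpt accepts the same kind of formulation at far larger scale), and the products, costs, and limits are invented for the example.

```python
# Toy production-planning LP: minimize cost subject to demand and capacity.
# SciPy's CPU-based solver is used only to illustrate the problem class
# that cuOpt accelerates on the GPU; the data is invented.
from scipy.optimize import linprog

costs = [2.0, 3.0]          # unit production cost for products A and B
A_ub = [
    [-1, 0],                # -x_A <= -10  (meet demand of 10 for A)
    [0, -1],                # -x_B <= -5   (meet demand of 5 for B)
    [1, 1],                 #  x_A + x_B <= 20 (shared plant capacity)
]
b_ub = [-10, -5, 20]

result = linprog(costs, A_ub=A_ub, b_ub=b_ub)
print(result.x, result.fun)  # optimal production quantities and total cost
```

The same formulation scales to thousands of products and periods; only the solver backend changes.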

The following steps outline how to set up and use the NVIDIA cuOpt supply chain agent reference workflow, which applies cuOpt agent skills in agent-driven workflows to perform GPU-accelerated supply chain optimization.

Step 1: Set up the environment

Provision a system with an NVIDIA GPU and install the NVIDIA Container Toolkit, which enables GPU access inside containerized workloads. Either run this on your own infrastructure or deploy a Brev Launchable for a preconfigured GPU environment in the cloud with NVIDIA CUDA, Docker, and other prerequisites already installed.

Then, install the cuOpt agent package along with its dependencies. The demo application is already containerized, ensuring reproducibility and simplifying deployment across development, staging, and production environments.

Step 2: Initialize the agent

The agent uses MiniMax M2.5 as its reasoning model. Use the publicly hosted endpoints or, for best performance, deploy the NVIDIA NIM locally. 

The rest of the deployment process is straightforward. Because the app is containerized, a single Docker Compose command launches the UI and the Phoenix tracing service on their respective ports, which you can open in separate browser tabs.

The source code includes a few skills that the agent can use. These skills act as well-defined function signatures that the LLM can invoke. Each encapsulates a specific optimization capability (e.g., production planning, inventory optimization, or route optimization) along with input/output schemas. Registering skills this way enables the LLM to discover and call them dynamically based on user intent.
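As a sketch, skill registration might look like the following. All class, field, and skill names here are hypothetical, chosen to illustrate the pattern of pairing a schema with a callable, not the cuOpt agent package's actual API.

```python
# Hypothetical skill registry: each skill pairs a description and an
# input schema (for LLM tool discovery) with a callable that does the work.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Skill:
    name: str
    description: str             # surfaced to the LLM so it can pick the skill
    input_schema: dict           # JSON-schema-style argument description
    run: Callable[[dict], dict]  # the optimization capability itself

REGISTRY: Dict[str, Skill] = {}

def register(skill: Skill) -> None:
    REGISTRY[skill.name] = skill

def plan_production(payload: dict) -> dict:
    # In the real workflow this would formulate an LP/MIP and call the solver.
    return {"status": "optimal", "plan": {}}

register(Skill(
    name="production_planning",
    description="Build and solve a multi-period production plan.",
    input_schema={"type": "object",
                  "properties": {"horizon_weeks": {"type": "integer"}}},
    run=plan_production,
))
```

Because the description and schema travel with the callable, the LLM can match user intent to a skill and construct a valid call without hardcoded routing.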

Step 3: Provide the supply chain data

Supply the agent with the domain-specific data required for optimization. For a multi-period planning problem, this typically includes:

  • Demand forecasts by product, region, and time period.
  • Production capacity and unit costs across facilities.
  • Inventory holding costs and storage limits.
  • Transportation costs and lead times.
  • Business constraints such as service-level agreements or minimum production runs.

In a production deployment, this data is pulled directly from planning systems. For demonstration purposes, the reference workflow uses mock datasets that mirror real-world structure.
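For illustration, inputs of this shape might look like the following in code. All values are mock numbers, and the actual field names in the reference workflow's dataset may differ.

```python
# Mock planning inputs shaped along the categories listed above.
# Values and field names are illustrative only.
data = {
    "demand": {                          # (product, region, period) -> units
        ("widget", "east", 1): 120,
        ("widget", "east", 2): 135,
    },
    "capacity": {"plant_a": 500, "plant_b": 300},     # units per period
    "unit_cost": {"plant_a": 4.25, "plant_b": 3.90},  # cost per unit
    "holding_cost": 0.15,                # cost per unit held per period
    "storage_limit": 800,                # maximum units in inventory
    "lead_time_weeks": {"plant_a": 1, "plant_b": 2},
    "constraints": {"service_level": 0.98, "min_production_run": 50},
}
```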

Step 4: Invoke the agent skills

Prompt the agent with a natural language operational goal, such as “Generate a 12-week production and inventory plan that minimizes total cost while meeting forecasted demand across all distribution centers.”

Under the hood, the workflow uses LangChain Deep Agents to spawn a hierarchy of sub-agents, each responsible for a portion of the workflow. The orchestrating agent reasons about the goal, decomposes it into steps, and delegates tasks. One sub-agent may extract and validate input data, another may formulate the mathematical model, and another may invoke the cuOpt skill. 
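A heavily stripped-down sketch of that delegation pattern follows. The reference workflow uses LangChain Deep Agents for this; the sub-agent names and the context-passing style below are invented for illustration.

```python
# Minimal orchestration sketch: each sub-agent transforms a shared context,
# and the orchestrator runs them in the order the decomposed goal requires.
# Names are illustrative, not the workflow's actual sub-agents.
SUB_AGENTS = {
    "validate_data": lambda ctx: {**ctx, "validated": True},
    "formulate_model": lambda ctx: {**ctx, "model": "multi_period_lp"},
    "solve": lambda ctx: {**ctx, "solution": "optimized_plan"},
}

def orchestrate(goal: str) -> dict:
    context = {"goal": goal}
    for step in ("validate_data", "formulate_model", "solve"):
        context = SUB_AGENTS[step](context)
    return context
```

In the real workflow, the step sequence is not fixed: the orchestrating agent reasons about which sub-agents to invoke and in what order.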

When the cuOpt skill is called, the agent passes a structured payload containing decision variables, objective function, and constraints to the cuOpt solver.
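A hypothetical shape for such a payload is sketched below; the actual cuOpt skill schema will differ, and the variable and constraint names are invented.

```python
# Illustrative solver payload: decision variables, an objective, and
# constraints in a structured, serializable form. Field names are
# hypothetical, not the cuOpt API.
payload = {
    "variables": [
        {"name": "produce[widget,week1]", "lower_bound": 0.0,
         "type": "continuous"},
    ],
    "objective": {
        "sense": "minimize",
        "terms": {"produce[widget,week1]": 4.25},   # cost coefficient
    },
    "constraints": [
        {"name": "demand[widget,week1]",
         "terms": {"produce[widget,week1]": 1.0},
         "op": ">=", "rhs": 120},
    ],
}
```

Keeping the payload structured this way means the LLM only has to emit well-formed data, while the solver handles all of the numerical work.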

Step 5: Retrieve and act on the solution

cuOpt executes the optimization on the GPU using massive parallelism to evaluate the solution space faster than traditional CPU solvers. Once a solution is found, the agent receives the optimized decision variables (e.g., how much of each product to produce in each period, how much inventory to carry, or where to ship it) and translates them back into a human-readable summary. This often includes key metrics such as total cost, capacity utilization, and constraint slack.
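Translating solver output back into a readable summary can be as simple as the following sketch; the solution fields are invented for the example.

```python
# Sketch: format optimized decision variables into the kind of summary
# the agent returns to the user. The solution dict's fields are invented.
def summarize(solution: dict) -> str:
    lines = [f"Total cost: ${solution['total_cost']:,.2f}"]
    for period in sorted(solution["production"]):
        lines.append(f"Week {period}: produce "
                     f"{solution['production'][period]} units")
    return "\n".join(lines)

print(summarize({"total_cost": 1234.5, "production": {1: 120, 2: 135}}))
```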

The result is an actionable plan that decision-makers can review, refine through follow-up prompts, or push directly into downstream execution systems. 

Watch the following tutorial to learn more about how to set up and run the cuOpt supply chain agent reference workflow using an NVIDIA Brev Launchable.

Video 1.  End-to-end supply chain decision optimization using NVIDIA cuOpt agent skills

Extensible agentic architecture

The cuOpt supply chain agent reference workflow is a simplified starting point. You can extend it with additional agent skills and orchestration patterns to better suit your production enterprise workloads. The architecture diagram below shows an extensible pattern for adding enterprise-grade coordination, governance, reliability, and robustness around the core agent workflow.

Get started with this cuOpt agent workflow on GitHub. Follow the quickstart guide to run the example locally, or use an NVIDIA Brev Launchable to spin up a GPU instance in the cloud with a pre-loaded Jupyter Notebook that guides you through deploying this example.

Get started

Deploy the NVIDIA cuOpt agent reference workflow using the NVIDIA NeMo Agent Toolkit and use the built-in optimization skills, or create your own. Run structured queries, integrate domain-specific constraints into workflows, and extend the cuOpt skills to benchmark metrics and optimize your own domain-specific use cases.

Stay up to date on NVIDIA cuOpt by subscribing to NVIDIA news and following NVIDIA AI on LinkedIn, X, Discord, and YouTube.

Visit the cuOpt get started page for resources on Google Colab, the NVIDIA API Catalog, GitHub, and NVIDIA AI Enterprise.

Engage with the developer community on the NVIDIA forum and Discord.
