NVIDIA Documentation Center

Welcome to the NVIDIA Documentation Center where you can explore the latest
technical information and product documentation.

Certified OEM Systems Documentation

NVIDIA tests and certifies partner systems that enable enterprises to confidently deploy hardware optimized for accelerated workloads—from desktop to data center to edge.

NVIDIA-Certified Systems

Systems certified by NVIDIA for accelerated computing from leading partners.


NGC-Ready Systems

Systems tested with previous generations of NVIDIA GPUs.


Cloudera Data Platform (CDP) Documentation

The integration of NVIDIA RAPIDS into the Cloudera Data Platform (CDP) provides transparent GPU acceleration of data analytics workloads using Apache Spark. This documentation describes the integration and suggested reference architectures for deployment.

Data Center Documentation

Documentation for managing and running containerized GPU applications in the data center using Kubernetes, Docker, and LXC.

NVIDIA Cloud-Native Technologies

NVIDIA cloud-native technologies enable developers to build and run GPU-accelerated containers using Docker and Kubernetes.


NVIDIA Data Center GPU Drivers

NVIDIA Data Center GPU drivers are used in Data Center GPU enterprise deployments for AI, HPC, and accelerated computing workloads. Documentation includes release notes, supported platforms, and cluster setup and deployment.



NVIDIA Data Center GPU Manager (DCGM)

NVIDIA Data Center GPU Manager (DCGM) is a suite of tools for managing and monitoring NVIDIA Data Center GPUs in cluster environments.


NVIDIA System Management

NVIDIA System Management is a software framework for monitoring server nodes, such as NVIDIA DGX servers, in a data center.



NVIDIA Topology-Aware GPU Selection (NVTAGS)

NVIDIA Topology-Aware GPU Selection (NVTAGS) intelligently and automatically assigns GPUs to Message Passing Interface (MPI) processes, which reduces overall GPU-to-GPU communication time for MPI applications.


Deep Learning Performance Documentation

GPUs accelerate machine learning operations by performing calculations in parallel. Many operations, especially those representable as matrix multiplies, see good acceleration right out of the box. Even better performance can be achieved by tuning operation parameters to use GPU resources efficiently. The performance documents present the tips that we think are most widely useful.
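One example of such a tweak: GEMM-based layers generally reach peak Tensor Core throughput when their dimensions are multiples of 8 for FP16 data, so padding a dimension (for example, a vocabulary size) up to the next multiple can pay off. A minimal sketch of the rounding arithmetic; the helper name and the sample value are ours, purely illustrative:

```python
def pad_to_multiple(dim, multiple=8):
    """Round a layer dimension up to the next multiple of `multiple`
    (8 is the FP16 Tensor Core alignment discussed in the perf docs)."""
    return ((dim + multiple - 1) // multiple) * multiple

# Example: padding an odd-sized vocabulary for Tensor Core efficiency.
padded = pad_to_multiple(33708)  # 33712
```

The same arithmetic applies to batch sizes, hidden sizes, and sequence lengths; the extra padded entries are simply masked or ignored downstream.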

GPU Management and Deployment Documentation

This documentation should be of interest to cluster admins and support personnel of enterprise GPU deployments. It includes monitoring and management tools and application programming interfaces (APIs), in-field diagnostics and health monitoring, and cluster setup and deployment.

Networking Documentation

Documentation for InfiniBand and Ethernet networking solutions to achieve faster results and insight by accelerating HPC, AI, Big Data, Cloud, and Enterprise workloads over NVIDIA Networking.

Networking Solutions

End-to-end networking solutions with smart adapters, switches, cables, and management software that reduce latency, increase efficiency, enhance security, and simplify data center automation so applications run faster.


Networking Ethernet Software

Ethernet networking software documentation for NVIDIA Cumulus Linux, NVIDIA NetQ, and NVIDIA Cumulus VX networking solutions.


NVIDIA AI Enterprise Documentation

NVIDIA AI Enterprise is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified, and supported by NVIDIA to run on VMware vSphere with NVIDIA-Certified Systems.

NVIDIA Base Command Platform Documentation

NVIDIA Base Command Platform is a world-class infrastructure solution for businesses and their data scientists who need a premium AI development experience.

NVIDIA Bright Cluster Manager Documentation

NVIDIA Bright Cluster Manager offers fast deployment and end-to-end management for heterogeneous HPC and AI server clusters at the edge, in the data center and in multi/hybrid-cloud environments. It automates provisioning and administration for clusters ranging in size from a single node to hundreds of thousands, supports CPU-based and NVIDIA GPU-accelerated systems, and orchestration with Kubernetes.

NVIDIA Clara Documentation

NVIDIA Clara is an open, scalable computing platform that enables developers to build and deploy medical imaging applications into hybrid (embedded, on-premises, or cloud) computing environments to create intelligent instruments and automate healthcare workflows.

NVIDIA Clara Holoscan

NVIDIA Clara Holoscan is a hybrid computing platform for medical devices that combines hardware systems for low-latency sensor and network connectivity, optimized libraries for data processing and AI, and core microservices to run surgical video, ultrasound, medical imaging, and other applications anywhere, from embedded to edge to cloud.


NVIDIA Clara Parabricks

NVIDIA Clara Parabricks brings next-generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google's DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when compared against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic, and RNA workflows.


NVIDIA Clara Viz

NVIDIA Clara Viz is a platform for visualization of 2D/3D medical imaging data. The core of this platform is the Clara Viz SDK, which is designed to enable developers to incorporate high performance volumetric visualization of medical images in applications with an easy-to-use API.


NVIDIA CloudXR SDK Documentation

CloudXR is NVIDIA's solution for streaming virtual reality (VR), augmented reality (AR), and mixed reality (MR) content from any OpenVR XR application on a remote server, whether desktop, cloud, data center, or edge.

NVIDIA CUDA Libraries Documentation

Documentation for CUDA Libraries, including cuBLAS, cuSOLVER, cuSPARSE, cuFFT, cuRAND, nvJPEG, and NPP.


NVIDIA cuBLAS Library

The cuBLAS library is an implementation of Basic Linear Algebra Subprograms (BLAS) on the NVIDIA CUDA runtime. It enables the user to access the computational resources of NVIDIA GPUs.


NVIDIA cuFFT Library

The NVIDIA CUDA Fast Fourier Transform (cuFFT) library consists of two components: cuFFT and cuFFTW. The cuFFT library provides high performance on NVIDIA GPUs, and the cuFFTW library is a porting tool to use the Fastest Fourier Transform in the West (FFTW) on NVIDIA GPUs.


NVIDIA cuFFTDx Library

The cuFFT Device Extensions (cuFFTDx) library enables you to perform Fast Fourier Transform (FFT) calculations inside your CUDA kernel. Fusing FFT with other operations can decrease the latency and improve the performance of your application.



NVIDIA cuRAND Library

The NVIDIA CUDA Random Number Generation (cuRAND) library provides an API for simple and efficient generation of high-quality pseudorandom and quasirandom numbers.



NVIDIA cuSOLVER Library

The cuSOLVER library is a high-level package based on the cuBLAS and cuSPARSE libraries. It provides Linear Algebra Package (LAPACK)-like features such as common matrix factorization and triangular solve routines for dense matrices.



NVIDIA cuSPARSE Library

The cuSPARSE library contains a set of basic linear algebra subroutines used for handling sparse matrices. It’s implemented on the NVIDIA CUDA runtime and is designed to be called from C and C++.



NVIDIA cuSPARSELt Library

The cuSPARSELt library provides high-performance structured sparse matrix-dense matrix multiplication functionality. cuSPARSELt allows users to exploit the computational resources of the latest NVIDIA GPUs.



NVIDIA cuTENSOR Library

The cuTENSOR library is a first-of-its-kind, GPU-accelerated tensor linear algebra library, providing high-performance tensor contraction, reduction, and element-wise operations. cuTENSOR is used to accelerate applications in the areas of deep learning training and inference, computer vision, quantum chemistry, and computational physics.



NVIDIA Performance Primitives (NPP)

NVIDIA Performance Primitives (NPP) is a library of functions for performing CUDA-accelerated 2D image and signal processing. This library is widely applicable for developers in these areas and is written to maximize flexibility while maintaining high performance.


nvJPEG Library

The nvJPEG Library provides high-performance, GPU-accelerated JPEG encoding and decoding functionality. This library is intended for image formats commonly used in deep learning and hyperscale multimedia applications.


nvJPEG2000 Library

The nvJPEG2000 library provides high-performance, GPU-accelerated JPEG2000 decoding functionality. This library is intended for JPEG2000 formatted images commonly used in deep learning, medical imaging, remote sensing, and digital cinema applications.


NVIDIA CUDA Toolkit Documentation

The NVIDIA CUDA Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications.


CUDA Toolkit Archive

Find archived online documentation for the CUDA Toolkit.


NVIDIA cuDNN Documentation

The NVIDIA CUDA Deep Neural Network (cuDNN) library is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration.

NVIDIA cuOpt Documentation

NVIDIA cuOpt is an Operations Research optimization API using AI to help developers create complex, real-time fleet routing workflows on NVIDIA GPUs.

NVIDIA DALI Documentation

The NVIDIA Data Loading Library (DALI) is a collection of highly optimized building blocks and an execution engine for accelerating the pre-processing of input data for deep learning applications. DALI provides both the performance and the flexibility to accelerate different data pipelines as a single library, which can then be easily integrated into different deep learning training and inference applications.

NVIDIA Data Science Workbench Documentation

NVIDIA Data Science Workbench is a productivity tool for GPU-enabled workstations to improve manageability, reproducibility, and usability for data scientists, data engineers, and AI developers. Users have fast and convenient access to a plethora of data science tools and CLIs while also benefiting from easy installation and updates of the software stack.

NVIDIA DeepStream NVAIE Documentation

NVIDIA AI Enterprise is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified, and supported by NVIDIA with NVIDIA-Certified Systems.

NVIDIA DeepStream SDK Documentation

The NVIDIA DeepStream SDK delivers a complete streaming analytics toolkit for situational awareness through computer vision, intelligent video analytics (IVA), and multi-sensor processing.

DGX Documentation

NVIDIA DGX is the integrated software and hardware system that supports the commitment to AI research with an optimized combination of compute power, software, and deep learning performance. It is purpose-built to meet the demands of enterprise AI and data science, delivering the fastest start in AI development, effortless productivity, and revolutionary performance—for insights in hours instead of months.


DGX Systems

DGX Systems provides integrated hardware, software, and tools for running GPU-accelerated applications such as deep learning, AI analytics, and interactive visualization.



DGX Zone

The DGX Zone is for DGX users and Ops teams to find supplemental information and instructions for configuring and using DGX Systems. It includes topics beyond those covered in the User's Guides.


NVIDIA DIGITS Documentation

The NVIDIA Deep Learning GPU Training System (DIGITS) can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation, and object-detection tasks. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best-performing model from the results browser for deployment.

NVIDIA DRIVE Platform Documentation

The NVIDIA DRIVE Platform provides a comprehensive software and hardware solution for the development of autonomous vehicles.


NVIDIA EGX Platform

The NVIDIA EGX platform delivers the power of accelerated AI computing to the edge with a cloud-native software stack (EGX stack), a range of validated servers and devices, Helm charts, and partners who offer EGX through their products and services.

NVIDIA Fleet Command

NVIDIA Fleet Command brings secure edge AI to enterprises of any size. Transform NVIDIA-certified servers into secure edge appliances and connect them to the cloud in minutes. From the cloud, deploy and manage applications from the NGC Catalog or your NGC Private Registry, update system software over-the-air and manage systems remotely with nothing but a browser and internet connection.

NVIDIA GameWorks Documentation

Documentation for GameWorks-related products and technologies, including libraries (NVAPI, OpenAutomate), code samples (DirectX, OpenGL), and developer tools (Nsight, NVIDIA System Profiler).

NVIDIA GPUDirect Storage (GDS) Documentation

NVIDIA GPUDirect Storage (GDS) enables the fastest data path between GPU memory and storage by avoiding copies to and from system memory, thereby increasing storage input/output (IO) bandwidth and decreasing latency and CPU utilization.

NVIDIA HPC SDK Documentation

The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries, and development tools used for developing HPC applications for the NVIDIA platform.

NVIDIA Isaac Documentation

NVIDIA Isaac is a developer toolbox for accelerating the development and deployment of AI-powered robots. The SDK includes Isaac applications, GEMs (robot capabilities), a Robot Engine, and NVIDIA Isaac Sim.

NVIDIA Jetson Software Documentation

The NVIDIA JetPack SDK, which is the most comprehensive solution for building AI applications, along with L4T and L4T Multimedia, provides the Linux kernel, bootloader, NVIDIA drivers, flashing utilities, sample filesystem, and more for the Jetson platform.


The JetPack SDK is the most comprehensive solution for building AI applications. The JetPack installer can be used to flash the Jetson Developer Kit with the latest OS image and to install developer tools, libraries and APIs, samples, and documentation.


Jetson Linux

NVIDIA Jetson Linux supports development on the Jetson platform.



L4T APIs

The L4T APIs provide additional functionality to support application development. The APIs enable flexibility by providing better control over the underlying hardware blocks.



Archives

This archives section provides access to previously released JetPack, L4T, and L4T Multimedia documentation versions.


NVIDIA LaunchPad Documentation

With NVIDIA LaunchPad, enterprises can get immediate, short-term access to NVIDIA AI running on private accelerated compute infrastructure to power critical AI initiatives.

NVIDIA Maxine Documentation

NVIDIA Maxine is a GPU-accelerated SDK with state-of-the-art AI features for developers to build virtual collaboration and content creation applications such as video conferencing and live streaming. Maxine’s AI SDKs, such as Video Effects, Audio Effects, and Augmented Reality (AR) are highly optimized and include modular features that can be chained into end-to-end pipelines to deliver the highest performance possible on GPUs, both on PCs and in data centers.

NVIDIA Modulus Documentation

NVIDIA Modulus is a Physics-Informed Neural Networks (PINNs) toolkit that enables you to get started with AI-driven physics simulations and leverage a powerful framework to implement your domain knowledge to solve complex nonlinear physics problems with real-world applications.

NVIDIA Morpheus Documentation

NVIDIA Morpheus is an open AI application framework that provides cybersecurity developers with a highly optimized AI pipeline and pre-trained AI capabilities and allows them to instantaneously inspect all IP traffic across their data center fabric.

NVIDIA NCCL Documentation

The NVIDIA Collective Communications Library (NCCL) is a library of multi-GPU collective communication primitives that are topology-aware and can be easily integrated into applications. Collective communication algorithms employ many processors working in concert to aggregate data. NCCL is not a full-blown parallel programming framework; rather, it’s a library focused on accelerating collective communication primitives.

NVIDIA NeMo Documentation

NVIDIA Neural Modules (NeMo) is a flexible, Python-based toolkit enabling data scientists and researchers to build state-of-the-art speech and language deep learning models composed of reusable building blocks that can be safely connected together for conversational AI applications.

NVIDIA NGC Documentation

NVIDIA NGC is the hub for GPU-optimized software for deep learning, machine learning, and HPC that provides containers, models, model scripts, and industry solutions so data scientists, developers and researchers can focus on building solutions and gathering insights faster.


A platform to accelerate AI, HPC, and visualization GPU workflows and thus accelerate time to solution.



NGC Catalog

The NGC Catalog is a curated set of GPU-optimized software. It consists of containers, pre-trained models, Helm charts for Kubernetes deployments, and industry-specific AI toolkits with software development kits (SDKs). The content provided by NVIDIA and third-party ISVs simplifies building, customizing, and integrating GPU-optimized software into workflows, accelerating time to solution for users.


Deploy Assets from NGC

NVIDIA tests NGC containers running AI, ML and DL workloads on NVIDIA GPUs on leading public clouds and on-prem servers through its NVIDIA certification programs. NVIDIA certified data center and edge servers, together with public cloud platforms, enable easy deployment of any NGC asset, in environments certified for performance and scalability by NVIDIA.


Private Registry

The NGC private registry provides you with a secure space to store and share custom containers, models, resources, and Helm charts within your enterprise. Take advantage of the deployment patterns you love from the Catalog, but with your bespoke assets.


NVIDIA NGX Documentation

NVIDIA NGX makes it easy to integrate pre-built, AI-based features into applications with the NGX SDK, NGX Core Runtime and NGX Update Module. The NGX infrastructure updates the AI-based features on all clients that use it.

NVIDIA Nsight Developer Tools Documentation

NVIDIA Nsight Developer Tools is a comprehensive tool suite spanning desktop and mobile targets that enables developers to build, debug, profile, and optimize class-leading, cutting-edge software that utilizes the latest visual computing hardware from NVIDIA.

NVIDIA Nsight Systems

NVIDIA Nsight Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you identify the largest opportunities to optimize, and tune to scale efficiently across any quantity or size of CPUs and GPUs, from a large server to our smallest SoC.


NVIDIA Nsight Compute

NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging via a user interface and command-line tool.


NVIDIA Nsight Graphics

NVIDIA Nsight Graphics is a standalone developer tool that enables you to debug, profile, and export frames built with Direct3D, Vulkan, OpenGL, OpenVR, and the Oculus SDK.


NVIDIA Nsight Deep Learning Designer

NVIDIA Nsight Deep Learning Designer is a tool with an integrated development environment that helps developers efficiently design and develop deep neural networks for in-app inference.


NVIDIA Nsight Visual Studio Edition (VSE)

NVIDIA Nsight Visual Studio Edition (VSE) is an application development environment for heterogeneous platforms that brings GPU computing into Microsoft Visual Studio.


NVIDIA Nsight Visual Studio Code Edition (VSCE)

NVIDIA Nsight Visual Studio Code Edition (VSCE) is an application development environment for heterogeneous platforms that brings CUDA development for GPUs into Microsoft Visual Studio Code.


NVIDIA Nsight Integration

NVIDIA Nsight Integration is a Visual Studio extension that enables you to access the power of Nsight Compute, Nsight Graphics, and Nsight Systems from within Visual Studio.


NVIDIA Nsight Eclipse Edition

NVIDIA Nsight Eclipse Edition is a unified CPU plus GPU integrated development environment (IDE) for developing CUDA® applications on Linux and Mac OS X for the x86, POWER and ARM platforms.


NVIDIA Nsight Perf SDK

The NVIDIA Nsight Perf SDK is a toolbox for collecting and analyzing GPU performance data, directly from application code.



NVIDIA CUDA-GDB

NVIDIA CUDA-GDB is a console-based debugging interface you can use from the command line on your local system or any remote system to which you have Telnet or SSH access.


NVIDIA Compute Sanitizer

NVIDIA Compute Sanitizer is a functional correctness checking tool suite included in the CUDA Toolkit. This suite contains multiple tools that can perform different types of checks.



NVIDIA CUPTI

NVIDIA CUPTI (CUDA Profiling Tools Interface) is a set of APIs that enables the creation of profiling and tracing tools that target CUDA applications.



NVIDIA Tools Extension (NVTX)

The NVIDIA Tools Extension (NVTX) library is a set of functions that a developer can use to provide additional information to tools, which use it to improve analysis and visualization of data.


Legacy Developer Tools

NVIDIA Nsight Tegra, Visual Studio Edition

NVIDIA Nsight Tegra, Visual Studio Edition brings the raw development power and efficiency of Microsoft Visual Studio to Android, giving you the right tools for the job.


NVIDIA System Profiler

NVIDIA System Profiler is a system trace and multi-core CPU call stack sampling profiler, providing an interactive view of the system behavior to help you optimize the application performance on Jetson devices.



NVIDIA PerfKit

NVIDIA PerfKit is a comprehensive suite of performance tools to help debug and profile OpenGL and Direct3D applications.


NVIDIA nvprof

The nvprof profiling tool enables you to collect and view profiling data from the command line, including a timeline of CUDA-related activities on both the CPU and GPU (kernel execution, memory transfers, memory sets, and CUDA API calls) as well as events and metrics for CUDA kernels.


NVIDIA Visual Profiler

The NVIDIA Visual Profiler is a graphical profiling tool that displays a timeline of your application's CPU and GPU activity. It includes an automated analysis engine to identify optimization opportunities.



CUDA-MEMCHECK

CUDA-MEMCHECK is a functional correctness checking suite included in the CUDA Toolkit. The memcheck tool is capable of precisely detecting and attributing out-of-bounds and misaligned memory access errors in CUDA applications.


NVIDIA Omniverse Documentation

NVIDIA Omniverse is a cloud-native, multi-GPU, real-time simulation and collaboration platform for 3D production pipelines based on Pixar's Universal Scene Description (USD) and NVIDIA RTX.

NVIDIA Optimized Frameworks Documentation

NVIDIA Optimized Frameworks such as Kaldi, NVIDIA Optimized Deep Learning Framework (powered by Apache MXNet), NVCaffe, PyTorch, and TensorFlow (which includes DLProf and TF-TRT) offer flexibility for designing and training custom deep neural networks (DNNs) for machine learning and AI applications.

NVIDIA RAPIDS Documentation

The RAPIDS data science framework is a collection of libraries for running end-to-end data science pipelines completely on the GPU. The interface is designed to have a familiar look and feel for those working in Python, but it utilizes optimized NVIDIA CUDA primitives and high-bandwidth GPU memory under the hood.

NVIDIA Ray-Tracing Documentation

Reference documentation, examples, and tutorials for the NVIDIA OptiX ray-tracing engine, the Iray rendering system, and the Material Definition Language (MDL).


NVIDIA Iray

NVIDIA Iray rendering technology represents a comprehensive approach to state-of-the-art rendering for design visualization.


NVIDIA Iray Server

NVIDIA Iray Server is a network-attached rendering solution for Iray-compatible applications.


NVIDIA Material Definition Language (MDL)

NVIDIA Material Definition Language (MDL) is a domain-specific language that describes the appearance of scene elements for a rendering process.



NVIDIA IndeX

NVIDIA IndeX is a 3D volumetric, interactive visualization SDK used by scientists and researchers to visualize and interact with massive datasets.



NVIDIA OptiX

The NVIDIA OptiX ray-tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures.


NVIDIA Riva Speech Skills Documentation

NVIDIA Riva is an SDK for building multimodal conversational systems. Riva is used for building and deploying AI applications that fuse vision, speech, sensors, and services together to achieve conversational AI use cases that are specific to a domain of expertise. It offers a complete workflow to build, train, and deploy AI systems that can use visual cues such as gestures and gaze along with speech in context.

NVIDIA TAO Toolkit Documentation

The NVIDIA TAO Toolkit eliminates the time-consuming process of building and fine-tuning DNNs from scratch for IVA applications.

NVIDIA TensorRT Documentation

NVIDIA TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. The core of NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. TensorRT takes a trained network, which consists of a network definition and a set of trained parameters, and produces a highly optimized runtime engine that performs inference for that network.

NVIDIA Transformer Engine Documentation

Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs. It provides better performance with lower memory utilization in both training and inference, along with an FP8 automatic mixed-precision-like API that can be used seamlessly with your model code.

NVIDIA Triton Inference Server Documentation

NVIDIA Triton Inference Server (formerly TensorRT Inference Server) provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.

NVIDIA Video Technologies Documentation

Reference documentation, APIs, and samples for NVIDIA video technology SDKs on Windows and Linux platforms.

NVIDIA Optical Flow SDK

The NVIDIA Optical Flow SDK provides a comprehensive set of APIs, samples, and documentation on Windows and Linux platforms for fully hardware-accelerated optical flow, which can be used for computing the relative motion of pixels between images.
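To illustrate the underlying idea (not the SDK's API, which runs this kind of search on dedicated optical-flow hardware): motion between two frames can be estimated by finding the shift that best aligns them, for example by minimizing the sum of absolute differences. A toy pure-Python sketch over a single block:

```python
def best_offset(prev, curr, max_shift=2):
    """Find the (dy, dx) shift that best aligns curr with prev by
    minimizing the mean sum of absolute differences (SAD).
    Toy global search -- real optical flow does this per block/pixel."""
    h, w = len(prev), len(prev[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost, count = 0, 0
            for y in range(h):
                for x in range(w):
                    sy, sx = y + dy, x + dx
                    if 0 <= sy < h and 0 <= sx < w:
                        cost += abs(curr[sy][sx] - prev[y][x])
                        count += 1
            if count and cost / count < best_cost:
                best_cost, best = cost / count, (dy, dx)
    return best

# A synthetic 6x6 "frame" and the same frame shifted down by one row.
prev = [[(y * 31 + x * 17) % 101 for x in range(6)] for y in range(6)]
curr = [[55] * 6] + prev[:-1]
flow = best_offset(prev, curr)  # (1, 0): everything moved down one pixel
```

Applied per block over a full image, this yields the dense motion field that the hardware engine computes at video rates.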


NVIDIA Video Codec SDK

The NVIDIA Video Codec SDK provides a comprehensive set of APIs, samples, and documentation for fully hardware-accelerated video encoding, decoding, and transcoding on Windows and Linux platforms.


NVIDIA Virtual GPU (vGPU) Software Documentation

NVIDIA virtual GPU (vGPU) software is a graphics virtualization platform that extends the power of NVIDIA GPU technology to virtual desktops and apps, offering improved security, productivity, and cost-efficiency.

NVIDIA Virtual Reality Capture and Replay (VCR) SDK Documentation

The NVIDIA Virtual Reality Capture and Replay (VCR) SDK enables developers and users to accurately capture and replay VR sessions for performance testing, scene troubleshooting, and more.

Unified Compute Framework Documentation

Unified Compute Framework (UCF) is a low-code framework for developing cloud-native, real-time, and multimodal AI applications. It features low-code design tools for microservices and applications, as well as a collection of optimized microservices and sample applications.