NVIDIA CUDA-X Libraries

NVIDIA CUDA-X™ Libraries, built on CUDA®, is a collection of libraries that deliver dramatically higher performance—compared to CPU-only alternatives—across application domains, including AI and high-performance computing.

NVIDIA libraries run everywhere from resource-constrained IoT devices to self-driving cars to the largest supercomputers on the planet. As a result, users receive highly optimized implementations of an ever-expanding set of algorithms. Whether building a new application or accelerating an existing application, developers can tap NVIDIA libraries for the easiest way to get started with GPU acceleration.

CUDA Math Libraries

GPU-accelerated math libraries lay the foundation for compute-intensive applications in areas such as molecular dynamics, computational fluid dynamics, computational chemistry, medical imaging, and seismic exploration.

cuBLAS

GPU-accelerated basic linear algebra (BLAS) library.

Learn More
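To make the cuBLAS entry above concrete, here is a minimal CPU-only sketch of the general matrix multiply (GEMM) operation that BLAS defines, C ← αAB + βC. It is a plain-Python reference for the arithmetic only and does not call cuBLAS; the function name `gemm` and the nested-list layout are assumptions made for illustration.

```python
def gemm(alpha, a, b, beta, c):
    """Reference GEMM: return alpha * (a @ b) + beta * c.

    Matrices are row-major nested lists. This is a hypothetical helper
    for illustration, not part of any NVIDIA API.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (m, n)")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, a, b, 0.5, c)
print(result)
```

Each output element is independent of the others, which is what makes this operation a natural fit for GPU execution.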

cuFFT

GPU-accelerated library for Fast Fourier Transform implementations.

Learn More
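As a reference for what a Fourier transform library such as cuFFT computes, the following sketch evaluates the discrete Fourier transform directly from its definition. It runs on the CPU in O(n²) time, whereas fast Fourier transform algorithms reach O(n log n); the function name `dft` is an assumption for illustration and is not a cuFFT call.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform, straight from the definition."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# One cycle of a cosine sampled at four points.
signal = [1.0, 0.0, -1.0, 0.0]
spectrum = dft(signal)
print([round(abs(value), 6) for value in spectrum])
```

The energy lands in bins 1 and 3, the positive and negative frequency of the single cosine cycle.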

cuRAND

GPU-accelerated random number generation.

Learn More

cuSOLVER

GPU-accelerated dense and sparse direct solvers.

Learn More

cuSPARSE

GPU-accelerated BLAS for sparse matrices.

Learn More
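Sparse BLAS routines operate on compressed storage rather than dense arrays. The sketch below shows a matrix-vector product over the compressed sparse row (CSR) layout in plain Python, as a CPU reference for the kind of operation cuSPARSE accelerates; the function and variable names are assumptions for illustration, not cuSPARSE identifiers.

```python
def csr_matvec(values, col_indices, row_offsets, x):
    """Multiply a CSR matrix by a dense vector x (hypothetical helper)."""
    y = []
    for row in range(len(row_offsets) - 1):
        start, end = row_offsets[row], row_offsets[row + 1]
        # Only the stored nonzeros of this row contribute to the dot product.
        y.append(sum(values[i] * x[col_indices[i]] for i in range(start, end)))
    return y


# CSR encoding of the dense matrix
# [[10,  0,  0],
#  [ 0,  0, 20],
#  [30,  0, 40]]
values = [10.0, 20.0, 30.0, 40.0]
col_indices = [0, 2, 0, 2]
row_offsets = [0, 1, 2, 4]

y = csr_matvec(values, col_indices, row_offsets, [1.0, 2.0, 3.0])
print(y)
```

Rows are independent of one another, so each can be processed by a separate thread or group of threads.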

cuTENSOR

GPU-accelerated tensor linear algebra library.

Learn More

cuDSS

GPU-accelerated direct sparse solver library.

Learn More

CUDA Math API

GPU-accelerated standard mathematical function APIs.

Learn More

AmgX

GPU-accelerated linear solvers for simulations and implicit unstructured methods.

Learn More

NVIDIA Math Libraries in Python

Enabling GPU-accelerated math operations for the Python ecosystem.

nvmath-python

nvmath-python (Beta) is an open source library that provides high-performance access to the core mathematical operations in the NVIDIA math libraries.

Learn More

Scientific Computing Library

For applications that require neural networks to respect mathematical symmetries, specifically equivariance under transformations of 3D geometric data such as molecular structures, proteins, and materials.

cuEquivariance

An open-source Python library designed to accelerate the construction and execution of equivariant neural networks, particularly those handling data rotations and translations in 3D space.

Learn More
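Equivariance means that transforming the input and then applying a function gives the same result as applying the function and then transforming the output. The following CPU-only sketch checks that property for a rotation about the z axis; all function names are hypothetical and chosen for illustration, and nothing here uses the cuEquivariance API.

```python
import math


def rotate_z(v, angle):
    """Rotate a 3D vector about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def scale_by_norm(v):
    """Equivariant map: the output rotates together with the input."""
    norm = math.sqrt(sum(component * component for component in v))
    return tuple(norm * component for component in v)


def square_first(v):
    """Non-equivariant map: it treats the x axis specially."""
    return (v[0] ** 2, v[1], v[2])


def is_equivariant(f, v, angle, tol=1e-9):
    """Check f(R v) == R f(v) for one vector and one rotation angle."""
    lhs = f(rotate_z(v, angle))
    rhs = rotate_z(f(v), angle)
    return all(abs(a - b) < tol for a, b in zip(lhs, rhs))


v = (1.0, 2.0, 3.0)
angle = math.pi / 3
print(is_equivariant(scale_by_norm, v, angle))
print(is_equivariant(square_first, v, angle))
```

A single vector and angle can only falsify equivariance, not prove it; a fuller check would sample many inputs and rotations.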

Parallel Algorithm Libraries

GPU-accelerated libraries of highly efficient parallel algorithms for common C++ operations and for graph analytics, used to study relationships in natural sciences, logistics, travel planning, and more.

Thrust

GPU-accelerated library of C++ parallel algorithms and data structures.

Learn More
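Prefix sums (scans) are a typical example of the parallel algorithms such a library provides. The sketch below simulates one textbook data-parallel formulation, the Hillis-Steele inclusive scan, on the CPU in Python. It is an illustration of the idea only, is not written against Thrust, and is not necessarily the algorithm Thrust uses internally.

```python
def inclusive_scan(values):
    """Simulate a Hillis-Steele inclusive prefix sum.

    Each pass builds a new list in which every element could be computed
    independently, which is the step a GPU would run in parallel.
    """
    data = list(values)
    offset = 1
    while offset < len(data):
        data = [
            data[i] + data[i - offset] if i >= offset else data[i]
            for i in range(len(data))
        ]
        offset *= 2
    return data


prefix_sums = inclusive_scan([3, 1, 4, 1, 5, 9, 2, 6])
print(prefix_sums)
```

The loop runs about log2(n) passes, trading extra additions for a short chain of dependent steps.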

Computational Lithography Library

Targeting the modern-day challenges of nanoscale computational lithography.

cuLitho

Library with optimized tools and algorithms to accelerate computational lithography and the manufacturing of semiconductors using GPUs.

Learn More

Quantum Libraries

Enabling simulation, HPC integration, and AI for quantum computing.

cuQuantum

NVIDIA cuQuantum is a set of highly optimized libraries for accelerating quantum computing simulations.

Get Started

cuPQC

SDK of optimized libraries for accelerating post-quantum cryptography (PQC) workflows.

Explore Docs

Data Processing Libraries

GPU-accelerated libraries that speed up data processing workflows for tabular, text, and image data.

RAPIDS cuDF

Accelerate tabular data processing in pandas and Polars with zero code changes.

Explore Docs

RAPIDS cuML

Speed up ML algorithms in scikit-learn, UMAP, and HDBSCAN with zero code changes.

Explore Docs

RAPIDS cuGraph

Scale up and speed up graph analytics with GPU-accelerated NetworkX.

Explore Docs

NVIDIA cuVS

Apply cuVS algorithms to accelerate vector search for data mining and semantic search applications – including world-class performance from the GPU-native nearest neighbors algorithm CAGRA.

Learn More
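For context on what vector search computes, here is a brute-force nearest-neighbor search in plain Python. It is the exact baseline that indexed search methods aim to match or approximate at much lower cost; the function name and the sample data are assumptions for illustration, and the code does not use the cuVS API.

```python
import math


def nearest_neighbors(query, vectors, k):
    """Return the indices of the k vectors closest to query (Euclidean)."""
    distances = [(math.dist(query, vector), index) for index, vector in enumerate(vectors)]
    return [index for _, index in sorted(distances)[:k]]


vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
neighbors = nearest_neighbors((1.0, 1.0), vectors, k=2)
print(neighbors)
```

The scan over every stored vector is what becomes expensive at scale, which is the motivation for dedicated search indexes and GPU acceleration.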

NeMo Curator

NVIDIA NeMo Curator improves generative AI model accuracy by processing text, image, and video data at scale for training and customization. It also provides pre-built pipelines for generating synthetic data to customize and evaluate generative AI systems.

Learn More

Morpheus

Open application framework that optimizes cybersecurity AI pipelines for analyzing large volumes of real-time data.

Learn More

GPUDirect Storage

NVIDIA GPUDirect® Storage creates a direct data path between local or remote storage, such as NVMe or NVMe over Fabrics (NVMe-oF), and GPU memory.

Learn More

Dask

Expand data science pipelines to multiple nodes with RAPIDS on Dask.

Go to GitHub

RAPIDS Accelerator for Apache Spark

Accelerate your existing Apache Spark applications with minimal code changes.

Go to GitHub

Image and Video Libraries

GPU-accelerated libraries for image and video decoding, encoding, and processing that use CUDA and specialized hardware components of GPUs.

RAPIDS cuCIM

Accelerate input/output (IO), computer vision, and image processing of n-dimensional, especially biomedical, images.

Explore Docs

CV-CUDA

Open-source library for high-performance, GPU-accelerated pre- and post-processing in vision AI pipelines.

Learn More

NVIDIA DALI

Portable, open-source library for decoding and augmenting images and videos to accelerate deep learning applications.

Learn More

nvJPEG

High-performance GPU-accelerated library for JPEG decoding.

Learn More

NVIDIA Performance Primitives

GPU-accelerated image, video, and signal processing functions.

Learn More

NVIDIA Video Codec SDK

Hardware-accelerated video encode and decode on Windows and Linux.

Learn More

NVIDIA Optical Flow SDK

Exposes the latest hardware capability of NVIDIA GPUs dedicated to computing the relative motion of pixels between images.

Learn More

Communication Libraries

Performance-optimized multi-GPU and multi-node communication primitives.

NVSHMEM

Communication library based on the OpenSHMEM standard for GPU memory, with extensions for improved performance on GPUs.

Learn More

NCCL

Open-source library for fast multi-GPU, multi-node communication that maximizes bandwidth while maintaining low latency.

Learn More

Deep Learning Core Libraries

GPU-accelerated libraries for deep learning applications that use CUDA and specialized hardware components of GPUs.

NVIDIA CUTLASS

Optimized abstractions for high-performance linear algebra and tensor operations across CUDA's execution and memory hierarchy.

Learn More

NVIDIA TensorRT-LLM

High-performance deep learning inference optimizer and runtime for production deployment.

Learn More

NVIDIA cuDNN

GPU-accelerated library of primitives for deep neural networks.

Learn More

Partner Libraries

OpenCV

GPU-accelerated open-source library for computer vision, image processing, and machine learning, now supporting real-time operation.

Learn More

FFmpeg

Open-source multimedia framework with a library of plug-ins for audio and video processing.

Learn More

ArrayFire

GPU-accelerated open-source library for matrix, signal, and image processing.

Learn More

MAGMA

GPU-accelerated linear algebra routines for heterogeneous architectures, by Magma.

Learn More

IMSL Fortran Numerical Library

GPU-accelerated Fortran library with functions for math, signal and image processing, and statistics, by Rogue Wave.

Learn More

Gunrock

Library for graph-processing designed specifically for the GPU.

Learn More

CHOLMOD

GPU-accelerated functions for sparse direct solvers, included in the SuiteSparse linear algebra package, authored by Prof.

Learn More

Triton Ocean SDK

Real-time visual simulation of oceans and other water bodies for games, simulation, and training applications, by Triton.

Learn More

CUVIlib

Primitives for accelerating imaging applications in medical, industrial, and defense domains.

Learn More

Resources

Documentation

Training

Community

Get Started

Members of the NVIDIA Developer Program get early access to all CUDA library releases and the NVIDIA online bug reporting and feature request system.

Join the Developer Program