NVIDIA CUDA-X Libraries

Built on NVIDIA® CUDA®, NVIDIA CUDA-X™ is a suite of libraries that delivers GPU acceleration across AI and high-performance computing use cases, from generative AI and autonomous machines to climate modeling and financial forecasting. Whether you're deploying on resource-constrained IoT devices or the world's largest supercomputers, CUDA-X libraries provide highly optimized implementations of complex algorithms designed to outperform CPU-only alternatives. For developers building or scaling applications, CUDA-X offers an efficient and accessible path to getting the most out of the hardware.


CUDA Math Libraries

GPU-accelerated math libraries lay the foundation for compute-intensive applications in areas such as molecular dynamics, computational fluid dynamics, computational chemistry, medical imaging, and seismic exploration.

cuBLAS

GPU-accelerated basic linear algebra (BLAS) library.


cuFFT

GPU-accelerated library for Fast Fourier Transform implementations.


cuRAND

GPU-accelerated random number generation.


cuSOLVER

GPU-accelerated dense and sparse direct solvers.


cuSPARSE

GPU-accelerated BLAS for sparse matrices.
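To make "BLAS for sparse matrices" concrete, here is a minimal CPU-only sketch of a sparse matrix-vector product over the compressed sparse row (CSR) layout. It is plain Python for illustration and does not call cuSPARSE; the function and variable names are assumptions chosen for this example.

```python
def csr_matvec(row_offsets, col_indices, values, x):
    """Compute y = A @ x where A is stored in CSR form.

    row_offsets[i]:row_offsets[i + 1] is the slice of `values` and
    `col_indices` that holds the nonzeros of row i.
    """
    y = []
    for row in range(len(row_offsets) - 1):
        start, end = row_offsets[row], row_offsets[row + 1]
        y.append(sum(values[k] * x[col_indices[k]] for k in range(start, end)))
    return y


# Dense form of the illustrative matrix:
# [[4, 0, 0],
#  [0, 0, 2],
#  [1, 0, 3]]
row_offsets = [0, 1, 2, 4]
col_indices = [0, 2, 0, 2]
values = [4.0, 2.0, 1.0, 3.0]

x = [1.0, 2.0, 3.0]
y = csr_matvec(row_offsets, col_indices, values, x)
```

Only the four nonzeros are stored and multiplied, which is the property sparse BLAS routines exploit at much larger scale.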


cuTENSOR

GPU-accelerated tensor linear algebra library.


cuDSS

GPU-accelerated direct sparse solver library.



CUDA Math API

GPU-accelerated standard mathematical function APIs.


AmgX

GPU-accelerated linear solvers for simulations and implicit unstructured methods.


nvmath-python

nvmath-python (Beta) is an open source library that brings GPU-accelerated math operations to the Python ecosystem, providing high-performance access to the core mathematical operations in the NVIDIA math libraries.


Scientific Computing Libraries

GPU-accelerated libraries for scientific computing, including applications that require neural networks to respect mathematical symmetries (specifically, equivariance with respect to transformations of 3D geometric data such as molecular structures, proteins, and materials).

cuEquivariance

An open source Python library designed to accelerate the construction and execution of geometry-aware neural networks, particularly those handling data rotations and translations in 3D space.
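As a rough illustration of what equivariance means here, the sketch below checks that a toy function commutes with rotation and ignores translation. It is plain CPU Python and does not use the cuEquivariance API; `offsets_from_centroid` is a hypothetical stand-in for a geometry-aware layer, and all names are assumptions made for this example.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def offsets_from_centroid(points):
    """Toy 'layer': map each point to its displacement from the centroid."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(p[i] - centroid[i] for i in range(3)) for p in points]


def is_close(a, b, tol=1e-9):
    """Compare two 3D vectors component by component."""
    return all(abs(u - v) <= tol for u, v in zip(a, b))


points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
angle = 0.7

# Rotation equivariance: rotating the input rotates the output the same way.
rotate_then_apply = offsets_from_centroid([rotate_z(p, angle) for p in points])
apply_then_rotate = [rotate_z(v, angle) for v in offsets_from_centroid(points)]
rotation_equivariant = all(
    is_close(a, b) for a, b in zip(rotate_then_apply, apply_then_rotate)
)

# Translation invariance: shifting every point leaves the output unchanged.
shifted = [(x + 5.0, y - 2.0, z + 1.0) for x, y, z in points]
translation_invariant = all(
    is_close(a, b)
    for a, b in zip(offsets_from_centroid(shifted), offsets_from_centroid(points))
)
```

An equivariant network keeps this property through every layer, so it does not have to learn each orientation of a molecule separately.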


NVIDIA ALCHEMI

A collection of domain-specific NVIDIA NIM™ microservices and a toolkit for accelerating chemical and materials discovery (e.g., battery materials, catalysts, OLEDs, beauty formulations).


cuLitho

Library of GPU-optimized tools and algorithms that accelerate nanoscale computational lithography and, with it, the manufacturing of semiconductors.


cuEST

Accelerates industrial-scale quantum chemistry via a flexible API and high-performance building blocks for first-principles electronic structure calculations on GPUs.


Physics Libraries

GPU-accelerated physics libraries and frameworks to speed up simulations across domains including computational physics, multiphysics, quantum physics, and weather modeling.

NVIDIA Warp

A purpose-built, open source Python framework that delivers GPU acceleration for computational physics, AI, and optimization workflows, enabling kernel-based programs for simulation AI, robotics, and ML.


NVIDIA PhysicsNeMo

An open source Python framework for building, training, and fine-tuning AI physics models at scale.


NVIDIA Earth-2

A comprehensive family of open models, libraries, and frameworks that democratize global access to professional-grade weather and climate AI.


Quantum Computing Libraries

Libraries that enable simulation, HPC integration, and AI for quantum computing.

cuQuantum

A set of highly optimized libraries for accelerating quantum computing simulations.


cuPQC

SDK of optimized libraries for accelerating post-quantum cryptography (PQC) workflows.


CUDA-Q QEC

Libraries for simulating and implementing noise-resilient quantum algorithms and error mitigation.


CUDA-Q Solvers

GPU-accelerated solvers for hybrid quantum-classical optimization and variational workloads.


Deep Learning Core Libraries

GPU-accelerated libraries for deep learning applications that use CUDA and specialized hardware components of GPUs.

NVIDIA cuDNN

GPU-accelerated library of deep neural network building blocks ("primitives") for deep learning.


NVIDIA TensorRT™ and TensorRT LLM

High-performance deep learning inference optimizer and runtime for production deployment.


CUTLASS

Modular C++ templates and Python DSLs for building high-performance kernels targeting NVIDIA Tensor Cores.


FlashInfer

GPU-accelerated kernel library for inference, accessible through a Python API, that optimizes attention, mixture-of-experts (MoE) layers, GEMMs, communication, and other neural network operations.


Parallel Algorithm Libraries

The CUDA Core Compute Libraries (CCCL) provide GPU-accelerated algorithms in C++ and Python. They provide optimized parallel primitives to solve complex challenges in natural sciences, logistics, travel planning, and more.

Thrust

Powerful data-parallel library based on the C++ STL that lets developers implement complex GPU-accelerated algorithms in a high-level API without sacrificing performance.


CUB

Cooperative primitives for CUDA kernel authoring, providing warp-wide, block-wide, and device-wide collective primitives across the CUDA programming model.


cuda.compute

Pythonic interface for high-performance, device-level CCCL algorithms, enabling developers to leverage CUDA parallel processing directly within Python workflows.


cuda.parallel

Standardized primitives for distributed and local parallel patterns such as sort, scan, and reduction, optimized for the latest NVIDIA GPU architectures.
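For readers new to these patterns, the following CPU-only sketch shows what reduction and scan compute, plus a pairwise tree reduction that mirrors the shape such primitives typically take on parallel hardware. It uses only the Python standard library and is not the cuda.parallel API; every function name here is an assumption made for illustration.

```python
from functools import reduce
from itertools import accumulate
import operator


def reduce_sum(values):
    """Reduction: combine all elements into a single value."""
    return reduce(operator.add, values, 0)


def inclusive_scan(values):
    """Inclusive scan: element i holds the sum of values[0..i]."""
    return list(accumulate(values, operator.add))


def exclusive_scan(values, init=0):
    """Exclusive scan: element i holds the sum of values[0..i-1]."""
    out, running = [], init
    for v in values:
        out.append(running)
        running += v
    return out


def tree_reduce(values):
    """Pairwise reduction: each pass halves the number of partial sums.

    The additions within one pass are independent of each other,
    which is what lets a parallel machine perform them simultaneously.
    """
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # carry the unpaired element forward
        level = nxt
    return level[0]


data = [3, 1, 4, 1, 5]
total = reduce_sum(data)
prefix_inclusive = inclusive_scan(data)
prefix_exclusive = exclusive_scan(data)
ordered = sorted(data)
```

The sequential and tree-shaped reductions return the same result for an associative operator such as integer addition; the tree form finishes in a number of passes logarithmic in the input length.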


Data Processing Libraries

GPU-accelerated libraries that speed up data processing workflows for tabular, text, and image data.

cuDF

Accelerate tabular data processing in pandas, Polars, and Apache Spark with zero code changes.


cuVS

Accelerate vector search for data mining and semantic search applications, including with CAGRA, a GPU-native nearest neighbors algorithm.


cuML

Speed up ML algorithms in scikit-learn, UMAP, HDBSCAN, and Apache Spark with zero code changes.


cuOpt

Open source, GPU-accelerated decision optimization engine designed to tackle large-scale problems with millions of variables and constraints, enabling accelerated decision-making.


cuGraph

Scale up and speed up graph analytics with GPU-accelerated NetworkX.


NeMo Curator

Improves generative AI model accuracy by processing text, image, and video data at scale for training and customization, with pre-built pipelines for generating synthetic data.


Morpheus

Open application framework that optimizes cybersecurity AI pipelines for analyzing large volumes of real-time data.


nvComp

High-throughput GPU-accelerated compression and decompression library that minimizes storage footprint and speeds up data transfer rates for AI training, HPC, data science, and analytics applications.


GPUDirect Storage

NVIDIA GPUDirect Storage creates a direct data path between local or remote storage, such as NVMe or NVMe over Fabrics (NVMe-oF), and GPU memory.


Dask

Expand data science pipelines to multiple nodes with NVIDIA RAPIDS on Dask.


Image and Video Libraries

GPU-accelerated libraries for image and video decoding, encoding, and processing that use CUDA and specialized hardware components of GPUs.

nvImageCodec

GPU-accelerated image codec library with a unified interface for high-throughput image encoding and decoding, built as an extensible framework for a wide array of codec plugins.


NVIDIA DALI

GPU-accelerated library for data loading and pre-processing to accelerate deep learning applications for image, video, and audio modalities.


CV-CUDA

Open source library for high-performance, GPU-accelerated pre- and post-processing in vision AI pipelines.


cuCIM

Open source, accelerated computer vision and image processing library for multidimensional images in biomedical, geospatial, material, healthcare, and remote sensing use cases.


NVIDIA Performance Primitives (NPP)

GPU-accelerated library of highly optimized primitives for CUDA-based 2D image and signal processing, including filtering, color conversion, and image manipulation.


NVIDIA Video Codec SDK

Hardware-accelerated video encode and decode on Windows and Linux.


NVIDIA Optical Flow SDK

Exposes the latest hardware capability of NVIDIA GPUs dedicated to computing the relative motion of pixels between images.


Communication Libraries

Performance-optimized multi-GPU and multi-node communication primitives.

NVSHMEM

Based on the OpenSHMEM one-sided communication model, NVSHMEM provides a partitioned global address space that spans GPU memories.


NCCL

Open source library for fast multi-GPU, multi-node communication that maximizes bandwidth while maintaining low latency.


NIXL

Low-latency inference transfer library, moving KV cache and tensors between GPUs, memory tiers, and storage.


Partner Libraries

OpenCV

GPU-accelerated open-source library for computer vision, image processing, and machine learning, now supporting real-time operation.


FFmpeg

Open-source multimedia framework with a library of plug-ins for audio and video processing.


ArrayFire

GPU-accelerated open-source library for matrix, signal, and image processing.


MAGMA

GPU-accelerated linear algebra routines for heterogeneous architectures, developed by the Innovative Computing Laboratory (ICL) at the University of Tennessee.


IMSL Fortran Numerical Library

GPU-accelerated Fortran library with functions for math, signal and image processing, and statistics, by Rogue Wave Software.


Gunrock

Library for graph-processing designed specifically for the GPU.


CHOLMOD

GPU-accelerated functions for sparse direct solvers, included in the SuiteSparse linear algebra package, authored by Prof.


Triton Ocean SDK

Real-time visual simulation of oceans and other water bodies for games, simulation, and training applications, by Triton.


CUVIlib

Primitives for accelerating imaging applications in medical, industrial, and defense domains.


CuPy

Open source array library for GPU-accelerated computing with Python, providing a NumPy/SciPy-compatible interface.



Get Started

Members of the NVIDIA Developer Program get early access to all CUDA library releases and the NVIDIA online bug reporting and feature request system.
