CUDA Primitives Power Data Science on GPUs
NVIDIA provides a suite of machine learning and analytics software libraries that accelerate end-to-end data science pipelines entirely on GPUs. This work builds on more than 15 years of CUDA development. The GPU-accelerated libraries wrap low-level CUDA primitives in higher-level interfaces, and libraries for linear algebra, advanced math, and parallel algorithms lay the foundation for an ecosystem of compute-intensive applications.
With NVIDIA’s libraries, you get highly efficient implementations of algorithms that are regularly extended and optimized. Whether you are building a new application or speeding up an existing one, NVIDIA’s libraries provide the easiest way to get started with GPUs. You can download NVIDIA CUDA-X AI libraries as part of the CUDA Toolkit and NVIDIA RAPIDS.
Linear Algebra and Math Libraries
cuBLAS
A fast GPU-accelerated implementation of the standard basic linear algebra subroutines (BLAS)
cuSPARSE
Provides GPU-accelerated basic linear algebra subroutines for sparse matrices
cuSOLVER
A collection of dense and sparse direct solvers to accelerate linear optimization applications and more
Parallel Algorithm Libraries
NCCL
Implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs
Thrust
Provides a flexible, high-level interface for GPU programming to enhance developer productivity
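To make the idea of a collective communication primitive concrete, the sketch below simulates the result of a sum all-reduce, one of the collectives NCCL provides. It is plain Python that runs on the CPU and does not call NCCL; the function name `all_reduce_sum` and the sample buffers are hypothetical and exist only for illustration.

```python
from typing import List


def all_reduce_sum(per_device: List[List[float]]) -> List[List[float]]:
    """Simulate a sum all-reduce across devices.

    Each inner list stands in for one device's buffer. After the
    collective, every device holds the element-wise sum of all buffers.
    This models the semantics only, not how NCCL moves the data.
    """
    if not per_device:
        return []
    length = len(per_device[0])
    if any(len(buf) != length for buf in per_device):
        raise ValueError("all buffers must have the same length")
    # Element-wise sum across devices.
    total = [sum(values) for values in zip(*per_device)]
    # Every device receives its own copy of the reduced result.
    return [list(total) for _ in per_device]


# Hypothetical per-device gradients, invented for illustration.
gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reduced = all_reduce_sum(gradients)
print(reduced)
```

In multi-GPU training, this pattern is commonly used to combine gradients so that every device applies the same update.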
RAPIDS
Much of the new data science development work focuses on hardening an open-source project called RAPIDS. RAPIDS, part of CUDA-X AI, relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
RAPIDS also focuses on common data preparation tasks for ETL, analytics, and machine learning. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms, accelerating pipelines end to end without paying typical serialization costs.
RAPIDS Workflow
Libraries
- ANALYTICS AND ETL - cuDF is a DataFrame manipulation library based on Apache Arrow that accelerates loading, filtering, and manipulation of data to prepare it for model training. Its Python bindings to the CUDA-accelerated DataFrame primitives mirror the pandas interface for seamless onboarding of pandas users.
- MACHINE LEARNING - cuML is a collection of GPU-accelerated machine learning libraries that aims to provide GPU versions of the machine learning algorithms available in scikit-learn.
- GRAPH ANALYTICS - cuGraph is a collection of graph analytics libraries that seamlessly integrate into the RAPIDS data science platform.
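As a rough illustration of the pandas-like interface described for cuDF, the sketch below filters, derives, and aggregates a small DataFrame. It assumes cuDF is installed and falls back to pandas when it cannot be imported, which works only because the calls used here exist in both libraries. The alias `xdf`, the sample data, and the column names are invented for illustration.

```python
# Assumption: use cuDF when it can be imported; otherwise pandas stands in.
try:
    import cudf as xdf  # GPU DataFrame library from RAPIDS
except Exception:  # cuDF not installed, or no usable GPU on this machine
    import pandas as xdf

# Hypothetical sample data, invented for illustration.
trips = xdf.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Boston", "Austin"],
    "distance_km": [2.5, 11.0, 7.5, 3.0, 12.0],
    "fare": [6.0, 24.0, 15.0, 8.0, 27.0],
})

# Filter rows, derive a column, and aggregate per group.
long_trips = trips[trips["distance_km"] > 5.0]
long_trips = long_trips.assign(
    fare_per_km=long_trips["fare"] / long_trips["distance_km"]
)
summary = long_trips.groupby("city")["fare"].mean().sort_index()

# cuDF objects offer to_pandas(); pandas objects are already on the host.
if hasattr(summary, "to_pandas"):
    summary = summary.to_pandas()

mean_fare_by_city = {city: float(fare) for city, fare in summary.items()}
print(mean_fare_by_city)
```

Only the import line differs between the two backends in this sketch; coverage of the wider pandas API should be checked against the cuDF documentation before porting real workloads.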
RAPIDS Features
Hassle-Free Integration
Accelerate your Python data science toolchain with minimal code changes and no new tools to learn.
Top Model Accuracy
Increase machine learning model accuracy by iterating on models faster and deploying them more frequently.
Reduced Training Time
Drastically improve your productivity with near-interactive data science.
Open Source
Customizable, extensible, interoperable - the open-source software is supported by NVIDIA and built on Apache Arrow.
Get Started
Experience accelerated machine learning and data science on GPUs with RAPIDS.
RAPIDS Webpage