CUDA PRIMITIVES POWER DATA SCIENCE ON GPUs

NVIDIA provides a suite of machine learning and analytics software libraries to accelerate end-to-end data science pipelines entirely on GPUs. This work is enabled by over 15 years of CUDA development. GPU-accelerated libraries abstract the strengths of low-level CUDA primitives. Libraries for linear algebra, advanced math, parallel algorithms, and more lay the foundation for an ecosystem of compute-intensive applications.

With NVIDIA’s libraries, you get highly efficient implementations of algorithms that are regularly extended and optimized. Whether you are building a new application or trying to speed up an existing application, NVIDIA’s libraries provide the easiest way to get started with GPUs. You can download NVIDIA libraries as part of the CUDA Toolkit.

Linear Algebra and Math libraries

cuBLAS

A fast GPU-accelerated implementation of the standard basic linear algebra subroutines (BLAS)

cuSPARSE

Provides GPU-accelerated basic linear algebra subroutines for sparse matrices

cuSOLVER

A collection of dense and sparse direct solvers to accelerate Linear Optimization applications and more
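
To make the dense versus sparse distinction concrete, here is a minimal CPU-only sketch in plain Python. It does not call cuBLAS or cuSPARSE; the function names (`dense_matvec`, `to_csr`, `csr_matvec`) are hypothetical, and the CSR (compressed sparse row) layout is an assumption chosen for illustration as one common sparse storage format. It shows only the kind of matrix-vector product that such libraries accelerate on the GPU.

```python
# CPU-only reference sketch: plain Python lists, no GPU and no cuBLAS/cuSPARSE calls.

def dense_matvec(a, x):
    """y = A @ x for a dense row-major matrix (the kind of operation BLAS covers)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def to_csr(a):
    """Convert a dense matrix to CSR arrays: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, a_ij in enumerate(row):
            if a_ij != 0:
                values.append(a_ij)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x touching only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y


A = [
    [4.0, 0.0, 0.0],
    [0.0, 0.0, 2.0],
    [1.0, 0.0, 3.0],
]
x = [1.0, 2.0, 3.0]

values, col_idx, row_ptr = to_csr(A)
y_dense = dense_matvec(A, x)
y_sparse = csr_matvec(values, col_idx, row_ptr, x)
print(y_dense, y_sparse)
```

Both routines produce the same vector, but the sparse version stores and multiplies only the four nonzero entries, which is where the savings come from on large, mostly empty matrices.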

Parallel Algorithm Libraries

NCCL

Implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs

Thrust

Provides a flexible, high-level interface for GPU programming to enhance developer productivity
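
As a rough illustration of what a collective communication primitive does, the sketch below simulates a sum all-reduce in plain Python. It is not NCCL code and makes no GPU or network calls; `all_reduce_sum` and the "rank" buffers are hypothetical names used only to show the semantics: every participant contributes a buffer, and every participant ends up holding the element-wise sum.

```python
def all_reduce_sum(rank_buffers):
    """Return what every rank holds after a sum all-reduce.

    rank_buffers[r] is the local buffer of hypothetical rank r; all buffers
    must have the same length.
    """
    length = len(rank_buffers[0])
    if any(len(buf) != length for buf in rank_buffers):
        raise ValueError("all ranks must contribute buffers of equal length")
    # Element-wise sum across ranks.
    total = [sum(column) for column in zip(*rank_buffers)]
    # Every rank receives its own copy of the reduced result.
    return [list(total) for _ in rank_buffers]


# Three simulated GPUs, each holding a local buffer of the same shape.
local_buffers = [
    [1.0, 2.0],
    [10.0, 20.0],
    [100.0, 200.0],
]
reduced = all_reduce_sum(local_buffers)
print(reduced)
```

A real multi-GPU library performs the same logical operation without gathering everything in one place, exchanging partial results directly between devices.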


RAPIDS

Much of the new data science developer work is focused on hardening an open-source project called RAPIDS. RAPIDS relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs.

RAPIDS Workflow

LIBRARIES

  • ANALYTICS - cuDF is a DataFrame manipulation library based on Apache Arrow that accelerates loading, filtering, and manipulation of data for model training data preparation. The Python bindings of the core-accelerated CUDA DataFrame manipulation primitives mirror the pandas interface for seamless onboarding of pandas users.
  • MACHINE LEARNING - cuML is a collection of GPU-accelerated machine learning libraries that will provide GPU versions of all machine learning algorithms available in scikit-learn.
  • GRAPH ANALYTICS - nvGRAPH is a collection of graph analytics libraries that seamlessly integrate into the RAPIDS data science platform.
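
The pandas-style interface described for cuDF can be sketched as follows. This is a minimal example under stated assumptions: it assumes cuDF is importable as `cudf` on a machine with RAPIDS and an NVIDIA GPU, and it falls back to pandas otherwise so the same lines still run on a CPU. The column names and values are made up for illustration.

```python
# Assumption: cuDF is importable as `cudf` when RAPIDS is installed on a GPU
# machine; otherwise fall back to pandas so the sketch still runs on CPU.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

df = xdf.DataFrame(
    {
        "key": ["a", "b", "a", "b", "c"],
        "value": [1.0, 2.0, 3.0, 4.0, 0.5],
    }
)

# Filter rows, then aggregate per key, using pandas-style calls.
filtered = df[df["value"] > 0.75]
totals = filtered.groupby("key")["value"].sum()

# cuDF results live in GPU memory; copy to host before building a dict.
if hasattr(totals, "to_pandas"):
    totals = totals.to_pandas()
totals_by_key = totals.to_dict()
print(totals_by_key)
```

Because the DataFrame calls are identical in both branches, only the import line changes when moving this kind of data preparation code from CPU to GPU.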

RAPIDS Features

Hassle-Free Integration

Accelerate your Python data science toolchain with minimal code changes and no new tools to learn.

Top Model Accuracy

Increase machine learning model accuracy by iterating on models faster and deploying them more frequently.

Reduced Training Time

Drastically improve your productivity with near-interactive data science.

Open Source

Customizable, extensible, interoperable - the open-source software is supported by NVIDIA and built on Apache Arrow.


RAPIDS RECOMMENDED HARDWARE CONFIGURATIONS

| RAPIDS Deployment Stage | Recommended GPU Configuration | Minimum CPU Cores | Minimum Main Memory | Boot Drive | Local Data Storage | Networking Connections |
| --- | --- | --- | --- | --- | --- | --- |
| Experimentation | Pascal or later GPU | 4 | 2 x GPU memory | 500GB SSD | Optional | 1GbE |
| Experimentation | 1 x Quadro GV100 | 6 | 64 GB | 500GB SSD | 1TB SSD | 1GbE / 10GbE |
| Development | 2 x Quadro GV100 & NVLINK | 10 | 128 GB | 500GB SSD | 2TB SSD | 1GbE / 10GbE |
| Development & Work Group | 4 x V100 & NVLINK (DGX Station) | 20 | 256 GB | 500GB SSD | 4TB SSD | 1GbE / 10GbE |
| Work Group & Production | 4 x V100 SXM2 & NVLINK | 20 | 256 GB | 500GB SSD | 4TB SSD | 10GbE / 100GbE/IB |
| Production | 8 x V100 SXM2 & NVLINK | 40 | 0.5 TB | 500GB SSD | 4TB SSD or NVMe | 10GbE / 100GbE/IB |
| Large Scale Production | 16 x V100 SXM3 & NVSWITCH | 56 | 1 TB | 900GB SSD | 10TB SSD or NVMe | 40GbE / 100GbE/IB |
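
The "2 x GPU memory" guideline in the first row of the hardware table can be worked through with a little arithmetic. The sketch below is only an illustration: `min_host_memory_gb` is a hypothetical helper, and the 32 GB per-GPU figure is an assumption, not a value stated in the table. Under that assumption, the results (64, 128, 256, 512, and 1024 GB) appear consistent with the minimum main memory listed for the 1, 2, 4, 8, and 16 GPU configurations.

```python
def min_host_memory_gb(num_gpus, gpu_memory_gb):
    """Minimum main memory under the table's '2 x GPU memory' guideline."""
    return 2 * num_gpus * gpu_memory_gb


# Assumption for illustration: every GPU carries 32 GB of memory.
GPU_MEMORY_GB = 32

for num_gpus in (1, 2, 4, 8, 16):
    needed = min_host_memory_gb(num_gpus, GPU_MEMORY_GB)
    print(f"{num_gpus:>2} GPU(s): at least {needed} GB of main memory")
```
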

| CSP Instance | Recommended Configuration |
| --- | --- |
| Amazon EC2 | p3.2xlarge, p3.8xlarge, p3.16xlarge |
| Microsoft Azure | NC6, NC12, NC24, NC24r, ND |
| Google Cloud Platform | NVIDIA Tesla V100 (1, 2, 4, or 8 GPUs) |
| Oracle Cloud | VM GPU 3.1, 3.2, 3.4; BM GPU 3.8 |

Get Started

Experience accelerated machine learning and data science on GPUs with RAPIDS.

RAPIDS Webpage