NVIDIA cuPyNumeric

Bringing zero-code-change scaling to multi-GPU and multi-node (MGMN) accelerated computing

Python is a powerful and user-friendly programming language widely adopted by researchers and scientists for data science, machine learning (ML), and productive numerical computing. NumPy is the de facto standard math and matrix library, providing a simple and easy-to-use programming model with interfaces that correspond closely to the mathematical needs of scientific applications.

As data sizes and computational complexities grow, CPU-based Python and NumPy programs need help meeting the speed and scale demanded by cutting-edge research.

Distributed accelerated computing offers an infrastructure for efficiently solving and testing hypotheses in data-driven problems. Whether analyzing data generated by recording the scattering of high-energy electron beams, building new methodologies to solve complex computational fluid dynamics problems, or building ML models, researchers are increasingly seeking ways to effortlessly scale their programs.

NVIDIA cuPyNumeric aspires to be a drop-in replacement library for NumPy, bringing distributed and accelerated computing on the NVIDIA platform to the Python community. It allows researchers and scientists to write their research programs productively using native Python language and familiar tools without having to worry about parallel computing or distributed computing. cuPyNumeric and Legate can then easily scale their programs from single-CPU computers to MGMN supercomputers, without changing the code.
Download the latest release of cuPyNumeric today.

Download the latest release of cuPyNumeric beta today.

Download Now

Legate

Legate is an abstraction layer that runs on top of the CUDA® runtime system, together providing scalable implementations of popular domain-specific APIs. NVIDIA cuPyNumeric layers on top of Legate, like many other libraries.

Legate democratizes computing by making it possible for all programmers to leverage the power of large clusters of CPUs and GPUs by running the same code that runs on a desktop or a laptop at scale. Using this technology, scientists and researchers can develop and test programs on moderately sized datasets on local machines and then immediately scale up to larger datasets deployed on many nodes in the cloud or on a supercomputer without any code modifications.

Getting Started With Legate

Key Benefits

The NVIDIA cuPyNumeric library on Legate:

Supports native Python language and NumPy interface without constraints
Transparently accelerates and scales existing NumPy workflows
Provides a seamless drop-in replacement for NumPy
Provides automatic parallelism and acceleration for multiple nodes across CPUs and GPUs
Scales from one CPU up to thousands of GPUs optimally
Requires little to no code changes, allowing faster completion of scientific tasks
Is freely available. Get started with the installation guide and tutorial.

cuPyNumeric Performance

Weak Scaling of Richard-Lucy Devonvolution on NVIDIA DGX SuperPOD — Weak Scaling of Richard-Lucy Deconvolution on DGX SuperPOD

Processing 10TB Microscopy Image Data as a Single NumPy Array

This multi-view lattice light-sheet microscopy example produces tens of terabytes (TB) of raw image data per day. Up until now, all processing has happened offline, after all the data has been collected. By moving all the preprocessing and reconstruction operations to GPUs and using cuPyNumeric on Legate, the data can be visualized in real time as it’s processed.

Get started with cuPyNumeric today.

Download Now