Legate

Legate is a framework and runtime, with an ecosystem of libraries, that democratizes distributed accelerated computing. It's designed to enable all users, including researchers, scientists, data engineers, and AI innovators, to scale their applications or scientific projects effortlessly without having to deal with the complexity of parallel and distributed computing. With Legate and Legate libraries, users can write sequential programs on a laptop and run that very same code, unchanged, on a workstation with GPUs, on GPU instances in public clouds, or on multi-GPU/multi-node supercomputers. Using this technology, AI/ML innovators, scientific computing researchers, and data scientists can develop and test programs on moderately sized datasets on local machines. They can then immediately scale up to larger datasets deployed on many nodes in the cloud, or on a supercomputer, without any code modifications.
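
For example, a program written with cuPyNumeric (Legate's NumPy-compatible library) looks exactly like ordinary NumPy code; only the import changes, and the launch configuration decides where it runs. A minimal sketch follows, with illustrative launcher commands in the comments (the flag names are assumptions and may differ between Legate versions):

    # The same script runs unchanged on a laptop CPU, a multi-GPU workstation, or a cluster.
    # Illustrative launch commands (assumed flags; consult the Legate documentation):
    #   legate my_program.py                       # single node, default resources
    #   legate --gpus 2 my_program.py              # single node, two GPUs
    #   legate --nodes 4 --gpus 8 my_program.py    # multi-node, multi-GPU
    import cupynumeric as np  # drop-in replacement for "import numpy as np"

    a = np.arange(1_000_000, dtype=float)
    b = np.sqrt(a) + 1.0
    print(float(b.sum()))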


Benefits

Productivity

Boost researcher and developer productivity with effortless implicit parallelism, eliminating the learning curve of GPU programming, parallel processing, and distributed computing.

Scalability

Scale from a single laptop to a distributed parallel architecture at data center scale with thousands of nodes and tens of thousands of CPUs and GPUs.

Flexibility

Transparently extend popular tools like NumPy, JAX, and RAPIDS™ on a common foundation that remains compatible and composable. The common core also enables third parties to build their own libraries on top of it.


Key Features

Implicit Parallelism

Implicitly convert simple sequential programs into parallel tasks.

cuPyNumeric provides three levels of implicit parallelism (a short sketch follows this list):

  1. Asynchronous execution of independent computations (tasks).

  2. Tasks that can be broken into sub-tasks and executed by different processes.

  3. Kernel-level parallelism using CUDA kernels.
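
A minimal sketch of the first two levels using cuPyNumeric (the runtime decides the actual schedule and partitioning):

    import cupynumeric as np

    x = np.arange(1_000_000, dtype=float)
    y = np.arange(1_000_000, dtype=float)

    # Level 1: the two reductions below are independent, so the runtime is free
    # to execute them asynchronously and overlap them.
    sx = x.sum()
    sy = y.sum()

    # Level 2: a single large element-wise expression is partitioned into
    # sub-tasks that run on different processors.
    z = x * y + 1.0

    # Level 3: on GPUs, each sub-task ultimately executes as CUDA kernels.
    print(float(sx), float(sy), float(z.sum()))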

Unified Data Abstraction

The Legate framework offers a unified data abstraction for Legate libraries.

The Legate runtime manages data partitioning, distribution, and synchronization, providing a strong guarantee of data coherence.

Different libraries are designed to share in-memory buffers of data without unnecessary copying.
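
As an illustration (a sketch using cuPyNumeric), the snippet below updates one slice of a distributed array and then reduces over the whole array; the runtime tracks the dependency and guarantees the update is visible before the reduction, without any explicit synchronization in user code:

    import cupynumeric as np

    a = np.zeros((8192, 8192))
    a[:4096, :] = 1.0   # an update that may touch only some partitions/devices
    total = a.sum()     # the runtime orders the reduction after the update
    print(float(total))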

Composable Libraries

Use cuPyNumeric and Legate Sparse for scientific computing.

Legate IO for distributed, high-throughput I/O.

Legate Rapids for data scientists and data engineers.

A Legate backend for JAX, for AI/ML and large language models (LLMs).

All libraries share a unified and coherent data representation.

Friendly User Interface 

A single command starts a distributed launch on supercomputers. Run within Jupyter Notebooks and on Kubernetes for cloud-native infrastructure. A Dask bootstrap is available for running on Dask clusters.
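
For example, a cluster run can typically be started with a single launcher invocation along the lines of legate --nodes 16 --gpus 8 my_program.py (the flag names here are illustrative and version-dependent), and the same script can also be driven interactively from a Jupyter notebook.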

Flexible Infrastructure

CPU and GPU, x86 and Arm.
Workstations, supercomputers, and public clouds.

Efficient Communication

GPUDirect for communication over NVLink, InfiniBand, Slingshot, and EFA.
Legate automatically chooses the fastest datapath.

Profiling & Debugging

Dataflow and event graphs generated for each run.

Timelines of the program's execution.
Nsight Systems support.
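
For example, the legate launcher can record a profile of a run (via an option along the lines of --profile; consult the documentation for the exact flag), producing an execution timeline that can be examined alongside Nsight Systems traces.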

SDK to Extend

Legate provides C++ and Python APIs that developers can use to build composable libraries on Legate.


Product Overview

As datasets and applications continue to increase in size and complexity, there's a growing need to harness computational resources far beyond what a single CPU-only system can provide. Legate provides an abstraction layer and a runtime system that together enable scalable implementations of popular domain-specific APIs. It offers an API similar to Apache Arrow but guarantees stronger data coherence and synchronization to aid library developers.

Legate enables major use cases, including AI/ML, scientific computing, and advanced data analytics, to run efficiently on large-scale, multi-GPU/multi-node infrastructure.

Legate distributed framework and runtime for accelerated computing

Scientific Computing

Legate provides augmented alternatives to popular scientific computing libraries like NumPy, SciPy, HDF5, and others, seamlessly adding multi-GPU, multi-node support. You can use familiar language tools to write research programs and let Legate scale them from CPU to GPU to multi-GPU, multi-node systems without changing any code.
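
For instance, a stencil-style research code written with standard NumPy slicing runs unchanged under cuPyNumeric; the sketch below is illustrative, and the grid size and iteration count are arbitrary:

    import cupynumeric as np  # swap back to "import numpy as np" to run with plain NumPy

    # A simple Jacobi-style relaxation expressed with ordinary NumPy slicing;
    # the Legate runtime partitions and distributes the work automatically.
    n = 4096
    grid = np.zeros((n, n))
    grid[0, :] = 1.0  # boundary condition along one edge

    for _ in range(100):
        grid[1:-1, 1:-1] = 0.25 * (
            grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
        )

    print(float(grid.mean()))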

Data Science & Analytics

Legate Dataframe helps data scientists scale their applications to larger and text-heavy datasets in demanding workloads, enhancing existing CPU- and GPU-accelerated data science and AI libraries.

AI/ML (coming soon…)

NVIDIA's Legate backend for JAX supports popular and future models. Whether you're building on LLMs like Llama 3, MoE architectures, or new models, the Legate backend for JAX enables flexible parallelism strategies for AI/DL training, achieving state-of-the-art performance on thousands of GPUs.


GTC Videos on Demand

Legate: A Productive Programming Framework for Composable, Scalable, Accelerated Libraries

GTC24 Talk Session

How to Create a Distributed GPU-Accelerated Python Library

GTC23 Talk Session

Legate: Scaling the Python Ecosystem

GTC21 Talk Session

Get Started with Legate

Legate Core Documentation

Full technical documentation with details about the Legate architecture, API calls, and developer tools can be found here.

Legate on GitHub

Legate software and documentation can be found in this GitHub repository. Check periodically for the latest updates and news.

cuPyNumeric Library

Learn more about the cuPyNumeric library for Python developers, providing scalable performance on multi-GPU/multi-node configurations.

Resources