Enhancing Data Movement and Access for GPUs

Whether you are exploring mountains of data, researching scientific problems, training neural networks, or modeling financial markets, you need a computing platform with the highest data throughput. GPUs consume data much faster than CPUs and as the GPU computing horsepower increases, so does the demand for IO bandwidth.

NVIDIA GPUDirect® is a family of technologies, part of Magnum IO, that enhances data movement and access for NVIDIA data center GPUs. Using GPUDirect, network adapters and storage drives can directly read and write to/from GPU memory, eliminating unnecessary memory copies, decreasing CPU overheads and reducing latency, resulting in significant performance improvements. These technologies - including GPUDirect Storage, GPUDirect Remote Direct Memory Access (RDMA), GPUDirect Peer to Peer (P2P) and GPUDirect Video - are presented through a comprehensive set of APIs.

GPUDirect Storage

A Direct Path Between Storage and GPU Memory

As AI, HPC, and data analytics datasets continue to increase in size, the time spent loading data begins to impact application performance. Fast GPUs are increasingly starved by slow IO - the process of loading data from storage to GPU memory for processing.

GPUDirect Storage enables a direct data path between local or remote storage, such as NVMe or NVMe over Fabric (NVMe-oF), and GPU memory. It avoids extra copies through a bounce buffer in the CPU’s memory, enabling a direct memory access (DMA) engine near the NIC or storage to move data on a direct path into or out of GPU memory — all without burdening the CPU.



Direct Communication between NVIDIA GPUs

Remote direct memory access (RDMA) enables peripheral PCIe devices direct access to GPU memory. Designed specifically for the needs of GPU acceleration, GPUDirect RDMA provides direct communication between NVIDIA GPUs in remote systems. This eliminates the system CPUs and the required buffer copies of data via the system memory, resulting in 10X better performance.

GPUDirect RDMA is available in CUDA Toolkit. Contact networking vendors to download any third-party drivers that provide support for GPUDirect RDMA.


GPUDirect Storage Resources

Other Innovations in GPUDirect

GPUDirect for Video

Offers an optimized pipeline for frame-based devices such as frame grabbers, video switchers, HD-SDI capture, and CameraLink devices to efficiently transfer video frames in and out of NVIDIA GPU memory, on Windows only.


GPUDirect Peer to Peer

Enables GPU-to-GPU copies as well as loads and stores directly over the memory fabric (PCIe, NVLink). GPUDirect Peer to Peer is supported natively by the CUDA Driver. Developers should use the latest CUDA Toolkit and drivers on a system with two or more compatible devices.

For more information, please contact gpudirect@nvidia.com.

Join the Developer Program.