Magnum IO networking provides both point-to-point functions, such as send and receive, and collectives, such as AllReduce, for deep learning training at scale. The collective APIs hide low-level optimizations like topology detection, peer-to-peer copy, and multi-threading, which simplifies deep learning training. Send/receive lets users accelerate giant deep learning models that are too big to fit in one GPU’s memory. GPUDirect Storage can also help alleviate IO bottlenecks from local or remote storage by bypassing bounce buffers on the CPU host.
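To make the AllReduce collective concrete, here is a minimal single-process sketch of what a sum all-reduce computes. Every rank contributes a buffer and every rank ends up with the element-wise sum. It uses the well-known ring schedule (reduce-scatter followed by all-gather) purely as an illustration. The function name `ring_all_reduce` and the list-of-lists "ranks" are hypothetical, and this is not Magnum IO code or a description of how any NVIDIA library implements the collective.

```python
def ring_all_reduce(buffers):
    """Simulate a sum all-reduce over a ring of len(buffers) ranks.

    buffers[r] is the local vector held by rank r (illustrative stand-in
    for a GPU buffer). Returns a new list in which every rank holds the
    element-wise sum of all input vectors.
    """
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must contribute buffers of equal length")

    data = [list(b) for b in buffers]  # work on copies, leave inputs intact
    # Split each vector into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]

    def chunk(c):
        return slice(bounds[c], bounds[c + 1])

    # Phase 1, reduce-scatter: after n-1 steps, rank r holds the fully
    # reduced chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages first so sends are "simultaneous".
        msgs = [((r - step) % n, data[r][chunk((r - step) % n)]) for r in range(n)]
        for r, (c, payload) in enumerate(msgs):
            dst = (r + 1) % n
            sl = chunk(c)
            data[dst][sl] = [a + b for a, b in zip(data[dst][sl], payload)]

    # Phase 2, all-gather: circulate the reduced chunks around the ring so
    # every rank ends up with every reduced chunk.
    for step in range(n - 1):
        msgs = [((r + 1 - step) % n, data[r][chunk((r + 1 - step) % n)]) for r in range(n)]
        for r, (c, payload) in enumerate(msgs):
            dst = (r + 1) % n
            data[dst][chunk(c)] = payload

    return data


if __name__ == "__main__":
    grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    for rank, buf in enumerate(ring_all_reduce(grads)):
        print(rank, buf)  # each rank prints [111, 222, 333, 444]
```

In data-parallel training, the per-rank vectors would typically be gradients, and the summed result is what each GPU uses for its optimizer step. A production collective library performs the same logical operation but chooses the communication pattern based on the detected topology.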

High-Performance Computing

To unlock next-generation discoveries, scientists rely on simulation to better understand complex molecules for drug discovery, physics for new sources of energy, and atmospheric data to better predict extreme weather patterns. Magnum IO exposes hardware-level acceleration engines and smart offloads, such as RDMA, GPUDirect, and NVIDIA SHARP, while building on the 400Gb/s bandwidth and ultra-low latency of NVIDIA Quantum-2 InfiniBand networking.

With multi-tenancy, user applications may be unaware of indiscriminate interference from neighboring application traffic. Magnum IO, on the latest NVIDIA Quantum-2 InfiniBand platform, features new and improved capabilities for mitigating the negative impact of that interference on a user’s performance. This delivers optimal results, as well as the most efficient high-performance computing (HPC) and machine learning deployments at any scale.