Gil Bloch, Mellanox
GTC-DC 2019
We’ll demonstrate how to build a scalable, high-performance, data-centric GPU cluster for artificial intelligence. Mellanox is a leader in high-performance, low-latency network interconnects for both InfiniBand and Ethernet. We’ll present state-of-the-art techniques for distributed machine learning and explain the special requirements they impose on the system. We’ll also give an overview of the interconnect technologies used to scale and accelerate distributed machine learning, including RDMA, NVIDIA’s GPUDirect technology, and the in-network computing platform used to accelerate large-scale deployments in HPC and artificial intelligence.