Developer Blog: How to Run NGC Deep Learning Containers with Singularity


New scientific breakthroughs are being made possible by the convergence of HPC and AI. It is now necessary to deploy both HPC and AI workloads on the same system.  

The complexity of the software environments needed to support HPC and AI workloads is huge. Application software depends on many interdependent software packages. Just getting a successful build can be a challenge, let alone ensuring the build is optimized to take advantage of the very latest hardware and software capabilities.

Figure 1: ResNet-50 architecture (source)

Containers are a widely adopted method of taming the complexity of deploying HPC and AI software. The entire software environment, from the deep learning framework itself down to the math and communication libraries necessary for performance, is packaged into a single bundle. Since workloads inside a container always use the same environment, the performance is reproducible and portable.

NGC, a registry of GPU-optimized software, has been enabling scientists and researchers by providing regularly updated and validated containers of HPC and AI applications. NGC recently announced beta support for using the deep learning containers with the Singularity container runtime, starting with version 19.11. This substantially eases the adoption of AI methodologies by HPC sites using Singularity.

This developer blog illustrates how NGC and Singularity dramatically simplify the deployment of deep learning workloads on HPC systems.
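As a rough illustration of that workflow, the sketch below assembles the two Singularity commands typically involved: one to convert an NGC image into a local image file, and one to run a command inside it with GPU access. The image name and tag (`nvidia/tensorflow:19.11-tf1-py3`), the `.sif` filename, and the helper function names are assumptions chosen for the example, not taken from this post. The script only prints the commands and does not execute them.

```python
import shlex

# NGC's container registry host.
NGC_REGISTRY = "nvcr.io"


def ngc_docker_uri(org, name, tag):
    """Return the docker:// URI that Singularity can use to pull an NGC image."""
    return f"docker://{NGC_REGISTRY}/{org}/{name}:{tag}"


def build_command(sif_path, uri):
    """Command that converts the registry image into a local Singularity image file."""
    return ["singularity", "build", sif_path, uri]


def exec_command(sif_path, *args):
    """Command that runs args inside the container.

    The --nv flag makes the host's NVIDIA GPUs and driver libraries
    available inside the container.
    """
    return ["singularity", "exec", "--nv", sif_path, *args]


if __name__ == "__main__":
    # Assumed example image: a TensorFlow container from the 19.11 release.
    uri = ngc_docker_uri("nvidia", "tensorflow", "19.11-tf1-py3")
    sif = "tensorflow-19.11-tf1-py3.sif"

    # Print shell-quoted command lines rather than running them,
    # so the sketch works on machines without Singularity installed.
    print(shlex.join(build_command(sif, uri)))
    print(shlex.join(exec_command(sif, "nvidia-smi")))
```

Keeping the commands as argument lists means they can be passed to `subprocess.run` unchanged on a system where Singularity is installed. Pulling from NGC may also require registry credentials, which this sketch leaves out.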

Read the blog in its entirety here.