NVIDIA Arm HPC Developer Kit

The NVIDIA Arm HPC Developer Kit is an integrated hardware and software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications on a heterogeneous GPU- and CPU-accelerated computing system. The kit includes an Arm CPU, an NVIDIA A100 Tensor Core GPU server, and the NVIDIA HPC SDK suite of tools.


Key Benefits

The validated platform provides quick and easy bring-up and a stable environment for accelerated code execution and evaluation, performance analysis, system experimentation, and system characterization.

  • Delivers a validated system for quick and easy bring-up in familiar HPC environments
  • Offers a stable hardware and software platform for development and performance analysis of accelerated HPC, AI, and scientific computing applications
  • Enables experimentation and characterization of high-performance, NVIDIA-accelerated, Arm server-based system architectures

Platform Support

Hardware Specification

Model: GIGABYTE G242-P32, 2U server
CPU: 1x Ampere Altra Q80-30 (Arm processor)
Memory: 512 GB DDR4 memory
Storage: 6 TB SAS/SATA 3.5″
Network: 2x NVIDIA® BlueField®-2 E-Series DPU, 200GbE/HDR single-port QSFP56, PCIe Gen4 x16, secure boot enabled, crypto disabled, 16 GB on-board DDR, 1GbE OOB management

Supported Software

  • CUDA: A parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs).
  • TensorRT: An SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that deliver low latency and high throughput for inference applications.
  • PyTorch: A GPU-accelerated tensor computational framework. Functionality can be extended with common Python libraries such as NumPy and SciPy.
  • TensorFlow: An open source platform for machine learning, providing comprehensive tools and libraries in a flexible architecture that allows easy deployment across a variety of platforms and devices.
  • RAPIDS: A suite of software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs.

NVIDIA Arm HPC Developer Kit FAQ

Yes. The NVIDIA Arm HPC Developer Kit, based on the GIGABYTE G242-P32, is a 2U rackmount server and comes with rails. The platform has 2x 1600W PSUs and uses a C13 plug.

The GIGABYTE G242-P32 server design does not have enough mechanical space to install the NVLink bridge.

The NVIDIA Arm HPC Developer Kit is a fixed platform configuration. It supports only the Ampere Computing Altra CPU currently shipped with the system.

NVIDIA BlueField-2 and NVIDIA ConnectX-6 network adapters can switch between the InfiniBand and Ethernet protocols. The DPU included in the NVIDIA Arm HPC Developer Kit supports Virtual Protocol Interconnect (VPI), which allows the port to be set to either InfiniBand or Ethernet. Instructions on how to safely manage the Port Type Management/VPI Cards Configuration can be found in the “Using mlxconfig to Set VPI Parameters” section of the NVIDIA Firmware Tools (MFT) package user guide for NVIDIA networking products.
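As a rough illustration of the mlxconfig step mentioned above, the following minimal Python sketch only assembles the command lines and prints them; it does not change any firmware setting. It assumes the `LINK_TYPE_P1` parameter with values 1 (InfiniBand) and 2 (Ethernet) as described in the MFT user guide. The device path `/dev/mst/DEVICE` is a placeholder, and the helper names are hypothetical. Please follow the MFT user guide for the authoritative procedure, including any reboot or firmware reset needed for a change to take effect.

```python
import shlex

# Assumption: LINK_TYPE_P<n> values as described in the MFT user guide.
LINK_TYPES = {"infiniband": 1, "ethernet": 2}


def vpi_query_command(device):
    """Build the command that prints the current firmware configuration."""
    return ["mlxconfig", "-d", device, "query"]


def vpi_set_command(device, protocol, port=1):
    """Build the command that sets the link type of one port.

    `device` is a placeholder such as a path under /dev/mst; `protocol`
    must be "infiniband" or "ethernet".
    """
    if protocol not in LINK_TYPES:
        raise ValueError(f"unknown protocol: {protocol!r}")
    setting = f"LINK_TYPE_P{port}={LINK_TYPES[protocol]}"
    return ["mlxconfig", "-d", device, "set", setting]


if __name__ == "__main__":
    device = "/dev/mst/DEVICE"  # placeholder, replace with your own device
    # Print the commands for review instead of running them.
    print(shlex.join(vpi_query_command(device)))
    print(shlex.join(vpi_set_command(device, "ethernet")))
```

Printing the command for review, rather than executing it directly, is a deliberate choice here because changing the port type affects network connectivity.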

For anything related to hardware, please contact the OEM or solution integrator from which you purchased the NVIDIA Arm HPC Developer Kit.

The NVIDIA Arm HPC Developer Kit has been internally tested and qualified using Ubuntu 20.04 and RHEL 8.4 operating systems. We also recommend using GCC 10 or newer.
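One way to check whether a system matches one of the qualified operating systems is to read the standard `/etc/os-release` file. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual `ID` and `VERSION_ID` values reported by Ubuntu 20.04 (`ubuntu`, `20.04`) and RHEL 8.4 (`rhel`, `8.4`).

```python
# Operating systems named in the answer above, as (ID, VERSION_ID) pairs.
QUALIFIED = {("ubuntu", "20.04"), ("rhel", "8.4")}


def parse_os_release(text):
    """Parse KEY=VALUE lines from os-release content into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_qualified(fields):
    """Return True if the parsed fields match a qualified OS release."""
    return (fields.get("ID"), fields.get("VERSION_ID")) in QUALIFIED


if __name__ == "__main__":
    try:
        with open("/etc/os-release") as handle:
            info = parse_os_release(handle.read())
    except OSError:
        info = {}
    print("qualified OS" if is_qualified(info) else "not on the qualified list")
```

A result of "not on the qualified list" only means the release was not part of the internal testing described above; it does not by itself mean the system will not work.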

The NVIDIA Arm HPC Developer Kit requires NVIDIA driver 470.57.02 (or later), NVIDIA CUDA Toolkit 11.4 (or later), and NVIDIA HPC SDK 21.7 (or later) to work correctly.
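The minimum versions above can be checked with a simple numeric comparison. The following sketch is a hedged example rather than an official tool: the helper names are hypothetical, and it assumes `nvidia-smi` is installed and supports the `--query-gpu=driver_version` query. Only the driver version is read automatically; the CUDA Toolkit and HPC SDK versions would need to be supplied by other means.

```python
import subprocess

# Minimum versions quoted in the answer above.
MINIMUMS = {"driver": "470.57.02", "cuda": "11.4", "hpc_sdk": "21.7"}


def parse_version(text):
    """Turn '470.57.02' into (470, 57, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '11.4' and '11.4.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    driver = installed_driver_version()
    if driver is None:
        print("could not query the NVIDIA driver version")
    else:
        ok = meets_minimum(driver, MINIMUMS["driver"])
        print(f"driver {driver}: {'OK' if ok else 'older than required'}")
```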

Please contact your preferred solution partner to understand if they support this particular Arm-based platform.

Additional resources on this topic can be found at:

Several HPC parallel file systems are available, and Arm Ltd has collected working instructions on how to build and deploy them on Arm-based systems:

We are not aware of a GPFS / IBM Spectrum Scale client working on Arm. Please contact IBM if interested.

The NVIDIA HPC SDK, available free of charge on our Developer website, comes with bundled CUDA runtime libraries as well as a pre-built OpenMPI. The NVIDIA C/C++/Fortran compilers are designed to accept the same flags on any platform. For CUDA C/C++ applications, it is possible to use nvcc with several host compilers, such as the GNU, NVC/NVC++, or Arm HPC C/C++ compilers.
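To make the host compiler choice concrete, the sketch below assembles an nvcc command line for each option using nvcc's `-ccbin` flag. It is an illustration only: the executable names `g++`, `nvc++`, and `armclang++` are assumed to be on the PATH, and the file names `saxpy.cu` and `saxpy` are hypothetical.

```python
# Assumption: these host compiler executables are installed and on the PATH.
HOST_COMPILERS = {"gnu": "g++", "nvidia": "nvc++", "arm": "armclang++"}


def nvcc_command(source, output, host="gnu"):
    """Build an nvcc command that uses the chosen host compiler via -ccbin."""
    if host not in HOST_COMPILERS:
        raise ValueError(f"unknown host compiler: {host!r}")
    return ["nvcc", "-ccbin", HOST_COMPILERS[host], "-o", output, source]


if __name__ == "__main__":
    # Print one command line per host compiler for a hypothetical source file.
    for host in HOST_COMPILERS:
        print(" ".join(nvcc_command("saxpy.cu", "saxpy", host)))
```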

NVIDIA, in collaboration with Arm Ltd, selected a list of ~20 applications and tested them on a single-node NVIDIA Arm HPC Developer Kit. The results were presented at GTC Spring 2021 in the session “HPC Applications on ARM+NVIDIA A100 [S32758]”. The recording is available on the NVIDIA On-Demand platform; free registration to the NVIDIA Developer Program is required.

Arm Ltd maintains a public wiki page containing a list of HPC applications tested on Arm-based HPC systems as well as key software packages required to build and operate a clustered HPC system. If you are a developer of a commercial HPC application and you are interested in enabling and validating it on an Arm-based system, please get in touch with your main NVIDIA contact.

The entire portfolio of NVIDIA Developer Tools (Nsight Systems, Nsight Compute, Nsight Graphics) is supported on Arm-based systems. We currently provide a “CLI only” package which provides the necessary command line utilities and libraries to run the tools. Once a report is generated, it can be exported to a different machine and inspected.
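The command-line workflow described above could look roughly like the following sketch, which builds an `nsys profile` command with the `-o` option and runs it only if `nsys` is found on the PATH. The helper names, the report name `my_report`, and the application `./my_app` are hypothetical.

```python
import shutil
import subprocess


def nsys_profile_command(report_name, app_command):
    """Build an `nsys profile` command that writes its report to report_name."""
    return ["nsys", "profile", "-o", report_name, *app_command]


def run_profile(report_name, app_command):
    """Run the profile if nsys is on the PATH.

    Returns the exit code, or None when nsys is not installed.
    """
    if shutil.which("nsys") is None:
        return None
    return subprocess.run(nsys_profile_command(report_name, app_command)).returncode


if __name__ == "__main__":
    # Print the command for a hypothetical application instead of running it.
    command = nsys_profile_command("my_report", ["./my_app", "--size", "1024"])
    print(" ".join(command))
```

The report file written on the DevKit can then be copied to another machine and opened there for inspection.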

We have engaged with Arm Ltd to provide DevKit customers access to Allinea Forge tools, Arm HPC Compilers and Arm Performance Libraries. These software products require a valid license that will be provided by Arm Ltd. We will be able to facilitate an introduction to an appropriate point of contact at Arm.

We run several community support forums accessible by registered developers (registration is free) at https://forums.developer.nvidia.com/. For example:

Contact GIGABYTE for hardware pricing information.


Get started and download the HPC SDK now.