NVIDIA EGX Stack
From the enterprise to the edge, the NVIDIA EGX™ stack delivers a cloud-native platform for GPU-accelerated machine learning, deep learning, and high-performance computing (HPC). Use the EGX stack to quickly and painlessly run GPU-optimized NGC™ containers on NVIDIA-Certified servers.
The NVIDIA EGX stack is an optimized software stack that includes NVIDIA drivers, a Kubernetes plug-in, a container runtime, and containerized AI frameworks and applications, including NVIDIA® TensorRT™, NVIDIA Triton™ Inference Server, and the NVIDIA DeepStream SDK. The EGX stack is optimized for NVIDIA-Certified systems. Helm charts for installation are available in the NVIDIA NGC catalog.
Cloud-native technologies include microservices, containerization, and declarative automation. The EGX stack is built with cloud-native technologies and is engineered to run GPU-optimized NGC containers, enabling developer-focused workflows and software-lifecycle agility.
Open-source software fuels technology innovation. NVIDIA contributes to open-source projects and communities, including container runtimes, Kubernetes extensions, and monitoring tools. Open-source software is the foundation for the EGX stack.
Performance and Scale
From edge to enterprise, the EGX stack delivers performance at scale. It supports NVIDIA A100 Tensor Core GPUs, based on the NVIDIA Ampere architecture, for best performance, including the EGX converged accelerators, which let performance scale across multiple nodes.
NVIDIA GPU Operator
The NVIDIA GPU Operator uses the Kubernetes operator framework to automate the management of all NVIDIA software components needed to provision GPUs. These components include NVIDIA drivers to enable CUDA®, a Kubernetes device plug-in for GPUs, the NVIDIA container runtime, automatic node labeling, and an NVIDIA Data Center GPU Manager (DCGM)-based monitoring agent.
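Once the device plug-in is running, a workload typically asks for GPUs through the `nvidia.com/gpu` extended resource in its pod spec. The following is a minimal sketch of that request, not an official EGX example: the helper name `gpu_pod_manifest`, the pod name, and the image reference `example.com/cuda-base:latest` are hypothetical placeholders, and it assumes a cluster where the GPU Operator has already been deployed.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes Pod manifest that requests whole GPUs.

    `nvidia.com/gpu` is the resource name advertised by the NVIDIA
    Kubernetes device plug-in. The name and image are supplied by the
    caller; the values used below are hypothetical placeholders.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumes the image ships nvidia-smi; used here only
                    # as a quick check that a GPU is visible in the pod.
                    "command": ["nvidia-smi"],
                    # GPUs are requested under limits, in whole units.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.com/cuda-base:latest")
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(manifest, indent=2))
```

Assuming the script is saved under a name such as `gpu_pod.py`, its output can be piped to `kubectl apply -f -`. If the pod is scheduled and the container can see a GPU, that suggests the drivers, container runtime, and device plug-in are all in place on that node.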
NVIDIA Network Operator
The NVIDIA Network Operator uses Kubernetes custom resource definitions (CRDs) and the Operator SDK to enable fast networking, remote direct memory access (RDMA), and GPUDirect® in a Kubernetes cluster. It supports RDMA-shared devices, the Mellanox Kubernetes device plug-in, and the GPUDirect RDMA NVIDIA peer memory driver.
The EGX stack architecture is supported by leading hybrid-cloud platform partners, ensuring that clusters built on it are configured for GPU workloads and perform optimally and consistently.
Build cloud-native applications on the EGX stack.