NVSwitch
Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ
Nov 01, 2024
3x Faster AllReduce with NVSwitch and TensorRT-LLM MultiShot
Deploying generative AI workloads in production environments where user numbers can fluctuate from hundreds to hundreds of thousands – and where input...
5 MIN READ
Sep 26, 2024
Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance
Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...
8 MIN READ
Aug 12, 2024
NVIDIA NVLink and NVIDIA NVSwitch Supercharge Large Language Model Inference
Large language models (LLMs) are getting larger, increasing the amount of compute required to process inference requests. To meet real-time latency requirements...
8 MIN READ
Mar 18, 2024
NVIDIA GB200 NVL72 Delivers Trillion-Parameter LLM Training and Real-Time Inference
What is the interest in trillion-parameter models? We know many of the use cases today and interest is growing due to the promise of an increased capacity for:...
9 MIN READ
Nov 08, 2023
Setting New Records at Data Center Scale Using NVIDIA H100 GPUs and NVIDIA Quantum-2 InfiniBand
Generative AI is rapidly transforming computing, unlocking new use cases and turbocharging existing ones. Large language models (LLMs), such as OpenAI’s GPT...
19 MIN READ
Aug 23, 2022
Upgrading Multi-GPU Interconnectivity with the Third-Generation NVIDIA NVSwitch
Increasing demands in AI and high-performance computing (HPC) are driving a need for faster, more scalable interconnects with high-speed communication between...
13 MIN READ
Dec 07, 2021
GPU Operator 1.9 Adds Support for DGX A100 with DGX OS
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". NVIDIA GPU Operator...
3 MIN READ
Jul 15, 2021
Kubernetes for Network Engineers
Using the same orchestration on-premises and on the public cloud allows a high level of agility and ease of operations. You can use the same API across bare...
11 MIN READ
Dec 12, 2018
NVIDIA Captures Top Spots on World’s First Industry-Wide AI Benchmark
Today, the MLPerf consortium published its first results for the seven tests that currently comprise this new industry-standard benchmark for machine learning....
10 MIN READ
Mar 27, 2018
NVSwitch: Leveraging NVLink to Maximum Effect
GPUs have been PCIe devices for many generations in client systems, and more recently in servers. The rapid growth in deep learning workloads has driven the...
5 MIN READ