NVSwitch

Jun 16, 2026

NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

NVIDIA delivered a clean sweep in MLPerf Training v6.0, the latest edition of industry-standard AI training benchmarks developed by the MLCommons consortium....

11 MIN READ

Feb 04, 2026

Build with Kimi K2.5 Multimodal VLM Using NVIDIA GPU-Accelerated Endpoints

Kimi K2.5 is the newest open vision language model (VLM) from the Kimi family of models. Kimi K2.5 is a general-purpose multimodal model that excels in current...

4 MIN READ

Feb 03, 2026

Accelerating Long-Context Model Training in JAX and XLA

Large language models (LLMs) are rapidly expanding their context windows, with recent models supporting sequences of 128K tokens, 256K tokens, and beyond....

9 MIN READ

Dec 02, 2025

AWS Integrates AI Infrastructure with NVIDIA NVLink Fusion for Trainium4 Deployment

As demand for AI continues to grow, hyperscalers are looking for ways to accelerate deployment of specialized AI infrastructure with the highest performance....

5 MIN READ

Jan 24, 2025

Optimize AI Inference Performance with NVIDIA Full-Stack Solutions

The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing...

9 MIN READ

Nov 19, 2024

Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs

Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...

6 MIN READ

Nov 01, 2024

3x Faster AllReduce with NVSwitch and TensorRT-LLM MultiShot

Deploying generative AI workloads in production environments where user numbers can fluctuate from hundreds to hundreds of thousands – and where input sequence...

5 MIN READ

Sep 26, 2024

Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance

Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...

8 MIN READ

Aug 12, 2024

NVIDIA NVLink and NVIDIA NVSwitch Supercharge Large Language Model Inference

Large language models (LLM) are getting larger, increasing the amount of compute required to process inference requests. To meet real-time latency requirements...

8 MIN READ

An image of the GB200 NVL72 and NVLink spine.

Mar 18, 2024

NVIDIA GB200 NVL72 Delivers Trillion-Parameter LLM Training and Real-Time Inference

What is the interest in trillion-parameter models? We know many of the use cases today and interest is growing due to the promise of an increased capacity for:...

9 MIN READ

Nov 08, 2023

Setting New Records at Data Center Scale Using NVIDIA H100 GPUs and NVIDIA Quantum-2 InfiniBand

Generative AI is rapidly transforming computing, unlocking new use cases and turbocharging existing ones. Large language models (LLMs), such as OpenAI’s GPT...

19 MIN READ

Aug 23, 2022

Upgrading Multi-GPU Interconnectivity with the Third-Generation NVIDIA NVSwitch

Increasing demands in AI and high-performance computing (HPC) are driving a need for faster, more scalable interconnects with high-speed communication between...

13 MIN READ

GPU Operator 1.9 includes support for NVIDIA DGX A100 systems and streamlined installation processes.

Dec 07, 2021

GPU Operator 1.9 Adds Support for DGX A100 with DGX OS

Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". NVIDIA GPU Operator...

3 MIN READ

Jul 15, 2021

Kubernetes for Network Engineers

Using the same orchestration on-premise and on the public cloud allows a high level of agility and ease of operations. You can use the same API across bare...

11 MIN READ

Dec 12, 2018

NVIDIA Captures Top Spots on World’s First Industry-Wide AI Benchmark

Today, the MLPerf consortium published its first results for the seven tests that currently comprise this new industry-standard benchmark for machine learning....

10 MIN READ

Mar 27, 2018

NVSwitch: Leveraging NVLink to Maximum Effect

GPUs have been PCIe devices for many generations in client systems, and more recently in servers. The rapid growth in deep learning workloads has driven the...

5 MIN READ