InfiniBand
Apr 01, 2026
NVIDIA Platform Delivers Lowest Token Cost Enabled by Extreme Co-Design
Co-designed hardware, software, and models are key to delivering the highest AI factory throughput and lowest token cost. Measuring this goes far beyond peak...
10 MIN READ
Apr 01, 2026
Accelerate Token Production in AI Factories Using Unified Services and Real-Time AI
In today’s AI factory environment, performance is not theoretical. It is economic, competitive, and existential. A 1% drop in usable GPU time can mean...
8 MIN READ
Feb 02, 2026
Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel
In LLM training, Expert Parallel (EP) communication for hyperscale mixture-of-experts (MoE) models is challenging. EP communication is essentially all-to-all,...
11 MIN READ
Jan 05, 2026
Inside the NVIDIA Vera Rubin Platform: Six New Chips, One AI Supercomputer
Update March 16, 2026: The NVIDIA Vera Rubin platform now has a seventh chip. Learn more about NVIDIA Groq 3 LPX: The Low-Latency Inference Accelerator for the...
63 MIN READ
Dec 11, 2025
NVIDIA Blackwell Enables 3x Faster Training and Nearly 2x Training Performance Per Dollar than Previous-Gen Architecture
AI innovation continues to be driven by three scaling laws: pre-training, post-training, and test-time scaling. Training is foundational to building smarter...
7 MIN READ
Oct 30, 2025
Streamline AI Infrastructure with NVIDIA Run:ai on Microsoft Azure
Modern AI workloads, ranging from large-scale training to real-time inference, demand dynamic access to powerful GPUs. However, Kubernetes environments have...
9 MIN READ
Aug 26, 2025
How Industry Collaboration Fosters NVIDIA Co-Packaged Optics
NVIDIA is reshaping the landscape of data-center connectivity by seamlessly integrating optical and electrical components. But it’s not doing it alone....
8 MIN READ
Aug 18, 2025
Scaling AI Factories with Co-Packaged Optics for Better Power Efficiency
As artificial intelligence redefines the computing landscape, the network has become the critical backbone shaping the data center of the future. Large language...
8 MIN READ
Jul 14, 2025
Enabling Fast Inference and Resilient Training with NCCL 2.27
As AI workloads scale, fast and reliable GPU communication becomes vital, not just for training, but increasingly for inference at scale. The NVIDIA Collective...
9 MIN READ
Jul 10, 2025
InfiniBand Multilayered Security Protects Data Centers and AI Workloads
In today’s data-driven world, security isn't just a feature—it's the foundation. With the exponential growth of AI, HPC, and hyperscale cloud computing, the...
6 MIN READ
Nov 21, 2024
Advancing Ansys Workloads with NVIDIA Grace and NVIDIA Grace Hopper
Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires...
10 MIN READ
Nov 13, 2024
NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1
As models grow larger and are trained on more data, they become more capable, making them more useful. To train these models quickly, more performance,...
8 MIN READ
Oct 25, 2024
Advancing Performance with NVIDIA SHARP In-Network Computing
AI and scientific computing applications are great examples of distributed computing problems. The problems are too large and the computations too intensive to...
7 MIN READ
Oct 15, 2024
Powering Next-Generation AI Networking with NVIDIA SuperNICs
In the era of generative AI, accelerated networking is essential to build high-performance computing fabrics for massively distributed AI workloads. NVIDIA...
6 MIN READ
Sep 06, 2024
Enhancing Application Portability and Compatibility across New Platforms Using NVIDIA Magnum IO NVSHMEM 3.0
NVSHMEM is a parallel programming interface that provides efficient and scalable communication for NVIDIA GPU clusters. Part of NVIDIA Magnum IO and based on...
7 MIN READ
Jan 23, 2024
Simplifying Network Operations for AI with NVIDIA Quantum InfiniBand
A common technological misconception is that performance and complexity are directly linked. That is, the highest-performance implementation is also the most...
4 MIN READ