Hopper
Nov 21, 2024
Advancing Ansys Workloads with NVIDIA Grace and NVIDIA Grace Hopper
Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires...
10 MIN READ
Nov 19, 2024
NVIDIA cuDSS Library Removes Barriers to Optimizing the US Power Grid
In the wake of ever-growing power demands, power systems optimization (PSO) of power grids is crucial for ensuring efficient resource management,...
7 MIN READ
Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ
Nov 14, 2024
Exploring the Case of Super Protocol with Self-Sovereign AI and NVIDIA Confidential Computing
Confidential and self-sovereign AI is a new approach to AI development, training, and inference where the user’s data is decentralized, private, and...
15 MIN READ
Nov 11, 2024
Developing a 172B LLM with Strong Japanese Capabilities Using NVIDIA Megatron-LM
Generative AI has the ability to create entirely new content that traditional machine learning (ML) methods struggle to produce. In the field of natural...
6 MIN READ
Oct 09, 2024
Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch
The continued growth of LLMs capability, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...
8 MIN READ
Sep 26, 2024
Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance
Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...
8 MIN READ
Sep 24, 2024
NVIDIA GH200 Grace Hopper Superchip Delivers Outstanding Performance in MLPerf Inference v4.1
In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding...
7 MIN READ
Aug 28, 2024
NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1
Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...
13 MIN READ
Aug 20, 2024
NVIDIA GH200 Superchip Delivers Breakthrough Energy Efficiency and Node Consolidation for Apache Spark
With the rapid growth of generative AI, CIOs and IT leaders are looking for ways to reclaim data center resources to accommodate new AI use cases that promise...
8 MIN READ
Aug 15, 2024
Bringing Confidentiality to Vector Search with Cyborg and NVIDIA cuVS
In the era of generative AI, vector databases have become indispensable for storing and querying high-dimensional data efficiently. However, like all databases,...
7 MIN READ
Aug 02, 2024
Revolutionizing Data Center Efficiency with the NVIDIA Grace Family
The exponential growth in data processing demand is projected to reach 175 zettabytes by 2025. This contrasts sharply with the slowing pace of CPU performance...
16 MIN READ
Jul 02, 2024
Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and NVIDIA TensorRT-LLM
As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...
9 MIN READ
Jul 01, 2024
How Cutting-Edge Computer Chips are Speeding Up the AI Revolution
Featured in Nature, this post delves into how GPUs and other advanced technologies are meeting the computational challenges posed by AI.
1 MIN READ
Jun 12, 2024
Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates
The latest release of NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...
7 MIN READ
Jun 12, 2024
NVIDIA Sets New Generative AI Performance and Scale Records in MLPerf Training v4.0
Generative AI models have a variety of uses, such as helping write computer code, crafting stories, composing music, generating images, producing videos, and...
11 MIN READ