H100
Oct 08, 2024
Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy
This post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
7 MIN READ
Oct 02, 2024
AI Uses Zero-Shot Learning to Find Existing Drugs for Treating Rare Diseases
A groundbreaking drug-repurposing AI model could bring new hope to doctors and patients trying to treat diseases with limited or no existing treatment options....
3 MIN READ
Sep 25, 2024
Deploying Accelerated Llama 3.2 from the Edge to the Cloud
Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...
6 MIN READ
Sep 24, 2024
NVIDIA GH200 Grace Hopper Superchip Delivers Outstanding Performance in MLPerf Inference v4.1
In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding...
7 MIN READ
Aug 28, 2024
NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1
Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...
13 MIN READ
Aug 22, 2024
Jamba 1.5 LLMs Leverage Hybrid Architecture to Deliver Superior Reasoning and Long Context Handling
AI21 Labs has unveiled their latest and most advanced Jamba 1.5 model family, a cutting-edge collection of large language models (LLMs) designed to excel in a...
4 MIN READ
Jul 12, 2024
Boosting Mathematical Optimization Performance and Energy Efficiency on the NVIDIA Grace CPU
Mathematical optimization is a powerful tool that enables businesses and people to make smarter decisions and reach any number of goals—from improving...
4 MIN READ
Jul 02, 2024
Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and NVIDIA TensorRT-LLM
As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...
9 MIN READ
Jul 02, 2024
Advancing Security for Large Language Models with NVIDIA GPUs and Edgeless Systems
Edgeless Systems introduced Continuum AI, the first generative AI framework that keeps prompts encrypted at all times with confidential computing by combining...
6 MIN READ
Jun 12, 2024
Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates
The latest release of NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...
7 MIN READ
Jun 10, 2024
Confidential and Self-Sovereign AI: Best Practices for Enhancing Security and Autonomy
Join the webinar on June 11th with NVIDIA and Super Protocol to learn about the benefits of Confidential Computing for Web3 AI.
1 MIN READ
Apr 25, 2024
Announcing Confidential Computing General Access on NVIDIA H100 Tensor Core GPUs
NVIDIA launched the initial release of the Confidential Computing (CC) solution in private preview for early access in July 2023 through NVIDIA LaunchPad....
3 MIN READ
Feb 28, 2024
Optimizing OpenFold Training for Drug Discovery
Predicting 3D protein structures from amino acid sequences has been an important long-standing question in bioinformatics. In recent years, deep...
7 MIN READ
Feb 22, 2024
Benchmarking NVIDIA Spectrum-X for AI Network Performance, Now Available from Supermicro
NVIDIA Spectrum-X is swiftly gaining traction as the leading networking platform tailored for AI in hyperscale cloud infrastructures. Spectrum-X networking...
6 MIN READ
Jan 25, 2024
Advancing Production AI with NVIDIA AI Enterprise
While harnessing the potential of AI is a priority for many of today’s enterprises, developing and deploying an AI model involves time and effort. Often,...
7 MIN READ
Dec 14, 2023
Achieving Top Inference Performance with the NVIDIA H100 Tensor Core GPU and NVIDIA TensorRT-LLM
Best-in-class AI performance requires an efficient parallel computing architecture, a productive tool stack, and deeply optimized algorithms. NVIDIA released...
4 MIN READ