Benchmark

Mar 11, 2025
Efficient ETL with Polars and Apache Spark on NVIDIA Grace CPU
The NVIDIA Grace CPU Superchip delivers outstanding performance and best-in-class energy efficiency for CPU workloads in the data center and in the cloud. The...
7 MIN READ

Feb 14, 2025
Optimizing Qwen2.5-Coder Throughput with NVIDIA TensorRT-LLM Lookahead Decoding
Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents,...
7 MIN READ

Jan 16, 2025
NVIDIA JetPack 6.2 Brings Super Mode to NVIDIA Jetson Orin Nano and Jetson Orin NX Modules
The introduction of the NVIDIA Jetson Orin Nano Super Developer Kit sparked a new age of generative AI for small edge devices. The new Super Mode delivered an...
12 MIN READ

Jan 16, 2025
Accelerating Time Series Forecasting with RAPIDS cuML
Time series forecasting is a powerful data science technique used to predict future values based on data points from the past. Open source Python libraries like...
4 MIN READ

Dec 20, 2024
Taking Computational Fluid Dynamics to the Next Level with the NVIDIA H200 Tensor Core GPU
Computational fluid dynamics (CFD) is used in industry and academia to address a wide range of use cases, including external aerodynamics, internal flows, heat...
5 MIN READ

Dec 19, 2024
RAPIDS 24.12 Introduces cuDF on PyPI, CUDA Unified Memory for Polars, and Faster GNNs
RAPIDS 24.12 introduces cuDF packages to PyPI, speeds up groupby aggregations and reading files from AWS S3, enables larger-than-GPU memory queries in the...
8 MIN READ

Dec 17, 2024
Boost Llama 3.3 70B Inference Throughput 3x with NVIDIA TensorRT-LLM Speculative Decoding
Meta's Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only...
8 MIN READ

Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ

Nov 15, 2024
Streamlining AI Inference Performance and Deployment with NVIDIA TensorRT-LLM Chunked Prefill
In this blog post, we take a closer look at chunked prefill, a feature of NVIDIA TensorRT-LLM that increases GPU utilization and simplifies the deployment...
4 MIN READ

Nov 13, 2024
NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1
As models grow larger and are trained on more data, they become more capable, making them more useful. To train these models quickly, more performance,...
8 MIN READ

Nov 08, 2024
5x Faster Time to First Token with NVIDIA TensorRT-LLM KV Cache Early Reuse
In our previous blog post, we demonstrated how reusing the key-value (KV) cache by offloading it to CPU memory can accelerate time to first token (TTFT) by up...
5 MIN READ

Nov 06, 2024
State-of-the-Art Multimodal Generative AI Model Development with NVIDIA NeMo
Generative AI has rapidly evolved from text-based models to multimodal capabilities. These models perform tasks like image captioning and visual question...
6 MIN READ

Oct 31, 2024
Even Faster and More Scalable UMAP on the GPU with RAPIDS cuML
UMAP is a popular dimension reduction algorithm used in fields like bioinformatics, NLP topic modeling, and ML preprocessing. It works by creating a k-nearest...
12 MIN READ

Oct 28, 2024
NVIDIA GH200 Superchip Accelerates Inference by 2x in Multiturn Interactions with Llama Models
Deploying large language models (LLMs) in production environments often requires making hard trade-offs between enhancing user interactivity and increasing...
7 MIN READ

Oct 08, 2024
Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy
This post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
7 MIN READ

Oct 03, 2024
New Reward Model Helps Improve LLM Alignment with Human Preferences
Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...
4 MIN READ