Nsight Tools – Compute
Aug 08, 2024
Improving GPU Performance by Reducing Instruction Cache Misses
GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming...
11 MIN READ
Aug 07, 2024
Optimizing llama.cpp AI Inference with CUDA Graphs
The open-source llama.cpp code base was originally released in 2023 as a lightweight but efficient framework for performing inference on Meta Llama models....
8 MIN READ
Aug 02, 2024
Just Released: Nsight Compute 2024.3
Nsight Compute 2024.3 improves selectively exporting results into a new report, kernel name logging to debug empty reports, and profiling green contexts.
1 MIN READ
Aug 01, 2024
Just Released: CUDA Toolkit 12.6
The release supports GB100 capabilities and new library enhancements to cuBLAS, cuFFT, cuSOLVER, cuSPARSE, as well as the release of Nsight Compute 2024.3.
1 MIN READ
May 22, 2024
Just Released: Nsight Compute 2024.2
Nsight Compute 2024.2 adds Python syntax highlighting and call stacks, a redesigned report header, and source page statistics to make CUDA optimization easier.
1 MIN READ
Apr 19, 2024
Measuring the GPU Occupancy of Multi-stream Workloads
NVIDIA GPUs are becoming increasingly powerful with each new generation. This increase generally comes in two forms. Each streaming multi-processor (SM), the...
11 MIN READ
Mar 27, 2024
Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools
NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications....
14 MIN READ
Mar 25, 2024
Building High-Performance Applications in the Era of Accelerated Computing
AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements...
6 MIN READ
Mar 14, 2024
Powerful Shader Insights: Using Shader Debug Info with NVIDIA Nsight Graphics
As ray tracing becomes the predominant rendering technique in modern game engines, a single GPU RayGen shader can now perform most of the light simulation of a...
7 MIN READ
Mar 06, 2024
CUDA Toolkit 12.4 Enhances Support for NVIDIA Grace Hopper and Confidential Computing
The latest release of CUDA Toolkit, version 12.4, continues to push accelerated computing performance using the latest NVIDIA GPUs. This post explains the new...
9 MIN READ
Nov 16, 2023
Unlock the Power of NVIDIA Grace and NVIDIA Hopper Architectures with Foundational HPC Software
High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern...
7 MIN READ
Nov 01, 2023
CUDA Toolkit 12.3 Delivers New Features for Accelerated Computing
The latest release of CUDA Toolkit continues to push the envelope of accelerated computing performance using the latest NVIDIA GPUs. New features of this...
4 MIN READ
Oct 13, 2023
Advanced API Performance: Debugging
NVIDIA offers a large suite of tools for graphics debugging, including NVIDIA Nsight System for CPU debugging, and Nsight Graphics for GPU debugging. Nsight...
7 MIN READ
Oct 12, 2023
Workshop: Model Parallelism: Building and Deploying Large Neural Networks
Learn how to train the largest neural networks and deploy them to production.
1 MIN READ
Sep 28, 2023
NVIDIA H100 System for HPC and Generative AI Sets Record for Financial Risk Calculations
Generative AI is taking the world by storm, from large language models (LLMs) to generative pretrained transformer (GPT) models to diffusion models. NVIDIA is...
7 MIN READ
Sep 25, 2023
New Video Series: CUDA Developer Tools Tutorials
GPU acceleration is enabling faster and more intelligent applications than ever before, and the CUDA Toolkit is key to harnessing acceleration on NVIDIA GPUs....
3 MIN READ