Nsight Tools – Compute

Jan 31, 2025
CUDA Toolkit Now Available for NVIDIA Blackwell
The latest release of the CUDA Toolkit, version 12.8, continues to push accelerated computing performance in data sciences, AI, scientific computing, and...
9 MIN READ

Aug 08, 2024
Improving GPU Performance by Reducing Instruction Cache Misses
GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming...
12 MIN READ

Aug 07, 2024
Optimizing llama.cpp AI Inference with CUDA Graphs
The open-source llama.cpp code base was originally released in 2023 as a lightweight but efficient framework for performing inference on Meta Llama models....
8 MIN READ

Aug 02, 2024
Just Released: Nsight Compute 2024.3
Nsight Compute 2024.3 improves selective export of results into a new report, kernel name logging for debugging empty reports, and profiling of green contexts.
1 MIN READ

Aug 01, 2024
Just Released: CUDA Toolkit 12.6
The release supports GB100 capabilities and new library enhancements to cuBLAS, cuFFT, cuSOLVER, cuSPARSE, as well as the release of Nsight Compute 2024.3.
1 MIN READ

May 22, 2024
Just Released: Nsight Compute 2024.2
Nsight Compute 2024.2 adds Python syntax highlighting and call stacks, a redesigned report header, and source page statistics to make CUDA optimization easier.
1 MIN READ

Apr 19, 2024
Measuring the GPU Occupancy of Multi-stream Workloads
NVIDIA GPUs are becoming increasingly powerful with each new generation. This increase generally comes in two forms. Each streaming multi-processor (SM), the...
11 MIN READ

Mar 27, 2024
Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools
NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications....
14 MIN READ

Mar 25, 2024
Building High-Performance Applications in the Era of Accelerated Computing
AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements...
6 MIN READ

Mar 14, 2024
Powerful Shader Insights: Using Shader Debug Info with NVIDIA Nsight Graphics
As ray tracing becomes the predominant rendering technique in modern game engines, a single GPU RayGen shader can now perform most of the light simulation of a...
7 MIN READ

Mar 06, 2024
CUDA Toolkit 12.4 Enhances Support for NVIDIA Grace Hopper and Confidential Computing
The latest release of CUDA Toolkit, version 12.4, continues to push accelerated computing performance using the latest NVIDIA GPUs. This post explains the new...
9 MIN READ

Nov 16, 2023
Unlock the Power of NVIDIA Grace and NVIDIA Hopper Architectures with Foundational HPC Software
High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern...
7 MIN READ

Nov 01, 2023
CUDA Toolkit 12.3 Delivers New Features for Accelerated Computing
The latest release of CUDA Toolkit continues to push the envelope of accelerated computing performance using the latest NVIDIA GPUs. New features of this...
4 MIN READ

Oct 13, 2023
Advanced API Performance: Debugging
NVIDIA offers a large suite of tools for graphics debugging, including NVIDIA Nsight Systems for CPU debugging, and Nsight Graphics for GPU debugging. Nsight...
7 MIN READ

Oct 12, 2023
Workshop: Model Parallelism: Building and Deploying Large Neural Networks
Learn how to train the largest neural networks and deploy them to production.
1 MIN READ

Sep 28, 2023
NVIDIA H100 System for HPC and Generative AI Sets Record for Financial Risk Calculations
Generative AI is taking the world by storm, from large language models (LLMs) to generative pretrained transformer (GPT) models to diffusion models. NVIDIA is...
7 MIN READ