Technical Walkthrough

Leveling up CUDA Performance on WSL2 with New Enhancements

This post focuses on the current state of CUDA performance on WSL2, the various performance-centric optimizations that have been… 16 MIN READ
Technical Walkthrough

Boosting Productivity and Performance with the NVIDIA CUDA 11.2 C++ Compiler

The 11.2 CUDA C++ compiler incorporates features and enhancements aimed at improving developer productivity and the performance of GPU-accelerated applications. 21 MIN READ
Technical Walkthrough

Improving GPU Application Performance with NVIDIA CUDA 11.2 Device Link Time Optimization

CUDA 11.2 features powerful link time optimization (LTO) for device code in GPU-accelerated applications. Device LTO brings the performance… 14 MIN READ
Technical Walkthrough

Aiming Faster in Games with Low Computer System Latency

Figure 1. A screenshot from our experimental FPS game, called First Person Science. Players must aim at and click on the green targets to eliminate them. 6 MIN READ
Technical Walkthrough

Maximizing Deep Learning Inference Performance with NVIDIA Model Analyzer

You’ve built your deep learning inference models and deployed them to NVIDIA Triton Inference Server to maximize model performance. How can you speed up the… 8 MIN READ
Technical Walkthrough

Int4 Precision for AI Inference

INT4 precision can bring an additional 59% speedup compared to INT8. If there’s one constant in AI and deep learning, it’s never-ending optimization to wring… 5 MIN READ