Tag: Performance

AI / Deep Learning

Boosting Productivity and Performance with the NVIDIA CUDA 11.2 C++ Compiler

The 11.2 CUDA C++ compiler incorporates features and enhancements aimed at improving developer productivity and the performance of GPU-accelerated applications. 21 MIN READ
AI / Deep Learning

Improving GPU Application Performance with NVIDIA CUDA 11.2 Device Link Time Optimization

CUDA 11.2 introduces link time optimization (LTO) for device code in GPU-accelerated applications. Device LTO brings the performance… 14 MIN READ
Graphics / Simulation

Aiming Faster in Games with Low Computer System Latency

Figure 1. A screenshot from our experimental FPS game, called First Person Science. Players must aim at and click on the green targets to eliminate them. 6 MIN READ
AI / Deep Learning

Maximizing Deep Learning Inference Performance with NVIDIA Model Analyzer

You’ve built your deep learning inference models and deployed them to NVIDIA Triton Inference Server to maximize model performance. How can you speed up the… 8 MIN READ
AI / Deep Learning

Int4 Precision for AI Inference

INT4 Precision Can Bring an Additional 59% Speedup Compared to INT8. If there’s one constant in AI and deep learning, it’s never-ending optimization to wring… 5 MIN READ
AI / Deep Learning

MLPerf Inference: NVIDIA Innovations Bring Leading Performance

New TensorRT 6 Features Combine with Open-Source Plugins to Further Accelerate Inference. Inference is where AI goes to work. Identifying diseases. 7 MIN READ