Tag: C++

Data Science

NVIDIA Tools Extension API: An Annotation Tool for Profiling Code in Python and C/C++

As PyData leverages much of the static-language world for speed, including CUDA, we need tools that not only profile and measure across languages but also… 9 MIN READ
AI / Deep Learning

Fast, Flexible Allocation for NVIDIA CUDA with RAPIDS Memory Manager

When I joined the RAPIDS team in 2018, NVIDIA CUDA device memory allocation was a performance problem. RAPIDS cuDF allocates and deallocates memory at high… 24 MIN READ

Detecting Divergence Using PCAST to Compare GPU to CPU Results

Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. 14 MIN READ

Accelerating Standard C++ with GPUs Using stdpar

Historically, accelerating your C++ code with GPUs has not been possible in Standard C++ without using language extensions or additional libraries: CUDA C++… 19 MIN READ
AI / Deep Learning

How to Speed Up Deep Learning Inference Using TensorRT

An introduction to creating accelerated inference engines using TensorRT and C++, with code samples and tutorial links. 22 MIN READ
Accelerated Computing

Accelerated Ray Tracing in One Weekend in CUDA

Recent announcements of NVIDIA’s new Turing GPUs, RTX technology, and Microsoft’s DirectX Ray Tracing have spurred a renewed interest in ray tracing. 20 MIN READ