Tag: C/C++


Detecting Divergence Using PCAST to Compare GPU to CPU Results

Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. 14 MIN READ
AI / Deep Learning

Pedestrian-Following Service Robots Made Possible with CUDA Acceleration

A team of researchers from Seoul National University built a pedestrian-following service robot to drive smart shopping carts and other autonomous helpers. 2 MIN READ
AI / Deep Learning

NVIDIA to Benefit from Shift to GPU-powered Deep Learning

Wired discusses Google’s announcement that it is open sourcing its TensorFlow machine learning system – noting the system uses GPUs to both train and run… 2 MIN READ

Performance Portability for GPUs and CPUs with OpenACC

New PGI compiler release includes support for C++ and Fortran applications to run in parallel on multi-core CPUs or GPU accelerators. < 1 MIN READ
Accelerated Computing

CUDA Pro Tip: Optimize for Pointer Aliasing

A CUDA pro tip about pointer aliasing and how to use the restrict keyword to avoid performance problems due to aliasing in C and C++ code on CPUs and GPUs. 6 MIN READ
Accelerated Computing

CUDA Pro Tip: Occupancy API Simplifies Launch Configuration

CUDA 6.5 includes several new runtime functions to assist in configuring kernel launches to achieve maximum GPU occupancy. 4 MIN READ