The PAPI CUDA Component is a hardware performance counter measurement technology for the NVIDIA CUDA platform that provides access to the hardware counters inside the GPU. It is based on the CUDA Performance Tools Interface (CUPTI) support in the NVIDIA driver library. In any environment where a CUPTI-enabled driver is installed, the PAPI CUDA Component can provide detailed performance counter information about the execution of GPU kernels.
The PAPI CUDA Component is distributed with the latest releases of PAPI:

Several performance measurement tools build on PAPI and integrate CUPTI features to support NVIDIA GPUs. These include TAU and HPCToolkit.
For more information on PAPI please visit: