Tag: CUDA Fortran

Accelerated Computing

Pro Tip: Pinpointing Runtime Errors in CUDA Fortran

We’ve all been there. Your CUDA Fortran code is humming along and suddenly you get a runtime error, usually accompanied by a message in all caps. In many cases… 4 MIN READ

Customize CUDA Fortran Profiling with NVTX

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated using tools such as the… 5 MIN READ

3 Versatile OpenACC Interoperability Techniques

OpenACC is a high-level programming model for accelerating applications with GPUs and other devices using compiler directives to specify… 8 MIN READ

10 Ways CUDA 6.5 Improves Performance and Productivity

CUDA 6.5 adds support for ARM64 systems, callback functions in the cuFFT library, improved developer tools for CUDA Fortran, performance optimizations, and much more. 7 MIN READ

Unified Memory: Now for CUDA Fortran Programmers

Unified Memory is a CUDA feature that we've talked a lot about on Parallel Forall. CUDA 6 introduced Unified Memory, which dramatically simplifies GPU… 3 MIN READ

CUDA Pro Tip: How to Call Batched cuBLAS routines from CUDA Fortran

When dealing with small arrays and matrices, one method of exposing parallelism on the GPU is to execute the same cuBLAS call on multiple independent systems… 7 MIN READ