CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran.
Technical Walkthrough

Pro Tip: Pinpointing Runtime Errors in CUDA Fortran

We’ve all been there. Your CUDA Fortran code is humming along and suddenly you get a runtime error, usually accompanied by a message in all caps. In many cases… 4 MIN READ
Technical Walkthrough

Customize CUDA Fortran Profiling with NVTX

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated using tools such as the… 5 MIN READ
Technical Walkthrough

3 Versatile OpenACC Interoperability Techniques

OpenACC is a high-level programming model for accelerating applications with GPUs and other devices using compiler directives to specify… 8 MIN READ
Technical Walkthrough

10 Ways CUDA 6.5 Improves Performance and Productivity

CUDA 6.5 adds support for ARM64 systems, callback functions in the cuFFT library, improved developer tools for CUDA Fortran, performance optimizations, and much more. 7 MIN READ
Technical Walkthrough

Unified Memory: Now for CUDA Fortran Programmers

Unified Memory is a CUDA feature that we've talked a lot about on Parallel Forall. CUDA 6 introduced Unified Memory, which dramatically simplifies GPU… 3 MIN READ
GPU Pro Tip
Technical Walkthrough

CUDA Pro Tip: How to Call Batched cuBLAS routines from CUDA Fortran

When dealing with small arrays and matrices, one method of exposing parallelism on the GPU is to execute the same cuBLAS call on multiple independent systems… 7 MIN READ