Technical Walkthrough

CUDA Pro Tip: The Fast Way to Query Device Properties

CUDA applications often need to know the maximum available shared memory per block or to query the number of multiprocessors in the active GPU. One way to do… 3 MIN READ
Technical Walkthrough

Pro Tip: Improved GLSL Syntax for Vulkan DescriptorSet Indexing

Sometimes the evolution of programming languages creates situations where "simple" tasks take a bit more complexity to express. Syntax annoyance slows down… 4 MIN READ
CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran.
Technical Walkthrough

Pro Tip: Pinpointing Runtime Errors in CUDA Fortran

We’ve all been there. Your CUDA Fortran code is humming along and suddenly you get a runtime error, usually accompanied by a message in all caps. In many cases… 4 MIN READ
Figure 1: Server-side analysis and visualization of thermal operating bounds in vehicle design, using Intelligent Light’s FieldView.
Technical Walkthrough

Pro Tip: Linking OpenGL for Server-Side Rendering

Visualization is a great tool for understanding large amounts of data, but transferring the data from an HPC system or from the cloud to a local workstation for… 6 MIN READ
Technical Walkthrough

Pro Tip: cuBLAS Strided Batched Matrix Multiply

There’s a new computational workhorse in town. For decades, general matrix-matrix multiply, known as GEMM in Basic Linear Algebra Subprograms (BLAS) libraries… 10 MIN READ
Technical Walkthrough

Customize CUDA Fortran Profiling with NVTX

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated using tools such as the… 5 MIN READ