DEVELOPER BLOG

Tag: Pro Tip

HPC

CUDA Pro Tip: The Fast Way to Query Device Properties

CUDA applications often need to know the maximum available shared memory per block or to query the number of multiprocessors in the active GPU. One way to do… 3 MIN READ
Graphics / Simulation

Pro Tip: Improved GLSL Syntax for Vulkan DescriptorSet Indexing

Sometimes the evolution of programming languages creates situations where "simple" tasks take a bit more complexity to express. Syntax annoyance slows down… 4 MIN READ
Accelerated Computing

Pro Tip: Pinpointing Runtime Errors in CUDA Fortran

We’ve all been there. Your CUDA Fortran code is humming along and suddenly you get a runtime error, usually accompanied by an all-caps message. In many cases… 4 MIN READ
Design & Visualization

Pro Tip: Linking OpenGL for Server-Side Rendering

Visualization is a great tool for understanding large amounts of data, but transferring the data from an HPC system or from the cloud to a local workstation for… 6 MIN READ
Accelerated Computing

Pro Tip: cuBLAS Strided Batched Matrix Multiply

There’s a new computational workhorse in town. For decades, general matrix-matrix multiply, known as GEMM in Basic Linear Algebra Subprograms (BLAS) libraries… 10 MIN READ
HPC

Customize CUDA Fortran Profiling with NVTX

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated using tools such as the… 5 MIN READ