# DEVELOPER BLOG

## Tag: Shared Memory

HPC
Mar 17, 2015

### GPU Pro Tip: Fast Histograms Using Shared Atomics on Maxwell

Histograms are an important data representation with many applications in computer vision, data analytics and medical imaging. A histogram is a graphical…

**9 MIN READ**

HPC
Feb 03, 2014

### CUDA Pro Tip: Do The Kepler Shuffle

When writing parallel programs, you will often need to communicate values between parallel threads. The typical way to do this in CUDA programming is to use…

**2 MIN READ**

Accelerated Computing
Jan 01, 2014

### Peer-to-Peer Multi-GPU Transpose in CUDA Fortran (Book Excerpt)

This post is an excerpt from Chapter 4 of the book CUDA Fortran for Scientists and Engineers, by Gregory Ruetsch and Massimiliano Fatica. In this excerpt we…

**12 MIN READ**

Accelerated Computing
Apr 08, 2013

### Finite Difference Methods in CUDA C++, Part 2

In the previous CUDA C++ post we dove into 3D finite difference computations in CUDA C/C++, demonstrating how to implement the x derivative part of the…

**6 MIN READ**

Accelerated Computing
Apr 01, 2013

### Finite Difference Methods in CUDA Fortran, Part 2

In the last CUDA Fortran post we dove into 3D finite difference computations in CUDA Fortran, demonstrating how to implement the x derivative part of the…

**6 MIN READ**

Accelerated Computing
Mar 04, 2013

### Finite Difference Methods in CUDA C/C++, Part 1

In the previous CUDA C/C++ post we investigated how we can use shared memory to optimize a matrix transpose, achieving roughly an order of magnitude improvement…

**9 MIN READ**