GTC Silicon Valley 2019, ID S9734: High-Performance GPGPU Implementation of 2D Histogramming

Mark Roulo (KLA-Tencor)
We'll discuss in depth the design choices that went into a high-performance, high-throughput generator of large 2D histograms (equivalently, empirical joint probability distributions). These histograms can be useful for analyzing multiple simultaneous phenomena. We'll describe the design options, which include optimizing memory choices for access patterns, reuse, and write locations, and we'll show benchmark results from alternative designs and implementations.
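
For readers unfamiliar with the term, the following is a minimal CPU reference sketch of what a 2D histogram computes and how normalizing it gives an empirical joint distribution. It is not the GPU implementation presented in the talk; the function names, the uniform binning, and the half-open ranges are all assumptions made for illustration.

```python
def histogram_2d(xs, ys, x_bins, y_bins, x_range, y_range):
    """Count (x, y) pairs into an x_bins-by-y_bins grid (hypothetical helper).

    Assumes uniform bins over half-open ranges [lo, hi); samples outside
    either range are skipped.
    """
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    counts = [[0] * y_bins for _ in range(x_bins)]
    for x, y in zip(xs, ys):
        if not (x_lo <= x < x_hi and y_lo <= y < y_hi):
            continue
        # Map each coordinate to a bin index; clamp to guard against
        # floating-point rounding at the upper edge.
        i = min(int((x - x_lo) * x_bins / (x_hi - x_lo)), x_bins - 1)
        j = min(int((y - y_lo) * y_bins / (y_hi - y_lo)), y_bins - 1)
        # This read-modify-write is the step that becomes contended when
        # many GPU threads update the same bin concurrently.
        counts[i][j] += 1
    return counts


def joint_distribution(counts):
    """Normalize bin counts so that all entries sum to 1 (hypothetical helper)."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * len(row) for row in counts]
    return [[c / total for c in row] for row in counts]


if __name__ == "__main__":
    xs = [0.1, 0.1, 0.6, 0.9]
    ys = [0.2, 0.2, 0.7, 0.1]
    counts = histogram_2d(xs, ys, 2, 2, (0.0, 1.0), (0.0, 1.0))
    print(counts)
    print(joint_distribution(counts))
```

In a parallel version, the single increment in the inner loop is where the abstract's concerns about access patterns and write locations would apply, since concurrent updates to one bin generally need atomic operations or per-thread-group private copies that are merged afterward.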

