Using CUDA Warp-Level Primitives

Features, Cooperative Groups, CUDA, Volta

Nadeem Mohammad, posted Jan 15 2018

NVIDIA GPUs execute groups of threads known as warps in SIMT (Single Instruction, Multiple Thread) fashion. Many CUDA programs achieve high performance by taking advantage of warp execution.

Hybridizer: High-Performance C# on GPUs

Accelerated Computing, CUDA

Nadeem Mohammad, posted Dec 15 2017

Hybridizer is a compiler from Altimesh that lets you program GPUs and other accelerators from C# code or .NET assemblies.

Hybridizer: High-Performance C# on GPUs

Features, .NET, C++, CUDA

Nadeem Mohammad, posted Dec 13 2017

Hybridizer is a compiler from Altimesh that lets you program GPUs and other accelerators from C# code or .NET assemblies.

NVIDIA TITAN V Transforms the PC into AI Supercomputer

Artificial Intelligence, Features, CUDA, cuDNN, GeForce, Higher Education/Academia, Machine Learning & Artificial Intelligence

Nadeem Mohammad, posted Dec 08 2017

NVIDIA introduced TITAN V, the world’s most powerful GPU for the PC, driven by the world’s most advanced GPU architecture, NVIDIA Volta.

NVIDIA SDK Updated With New Releases of TensorRT, CUDA, and More

Accelerated Computing, Artificial Intelligence, Features, Robotics, Cloud, CUDA, cuDNN, Higher Education/Academia, Machine Learning & Artificial Intelligence, TensorRT, Tesla

Nadeem Mohammad, posted Dec 06 2017

At NIPS 2017, NVIDIA announced new software releases for deep learning and HPC developers. The latest SDK updates include new capabilities and performance optimizations for TensorRT and the CUDA Toolkit, as well as the new CUTLASS library.

CUTLASS: Fast Linear Algebra in CUDA C++

Features, C++, CUBLAS, CUDA, Deep Learning, Libraries, Linear Algebra

Nadeem Mohammad, posted Dec 05 2017

Matrix multiplication is a key computation within many scientific applications, particularly those in deep learning. Many operations in modern deep neural networks are either defined as matrix multiplications or can be cast as such.

NVIDIA Deep Learning Inference Platform Performance Study

Artificial Intelligence, Cloud, Cluster/Supercomputing, CUDA, Machine Learning & Artificial Intelligence, TensorRT, Tesla

Nadeem Mohammad, posted Dec 04 2017

The NVIDIA deep learning platform spans from the data center to the network’s edge.

How Jet Built a GPU-Powered Fulfillment Engine with F# and CUDA

Features, .NET, Alea GPU, CUDA, F#

Nadeem Mohammad, posted Nov 30 2017

Have you ever looked at your shopping list and tried to optimize your trip based on things like distance to store, price, and number of items you can buy at each store?

How Jet.com Built a GPU-Powered Fulfillment Engine with F# and CUDA

Accelerated Computing, CUDA, Retail/Etail

Nadeem Mohammad, posted Nov 30 2017

Have you ever looked at your shopping list and tried to optimize your trip based on things like distance to store, price, and number of items you can buy at each store?

Deep Learning Helps Reconstruct and Improve Optical Microscopy

Artificial Intelligence, CUDA, cuDNN, GeForce, Healthcare & Life Sciences, Higher Education/Academia, Image Recognition, Machine Learning & Artificial Intelligence

Nadeem Mohammad, posted Nov 22 2017

Researchers from UCLA developed a deep learning approach that could quickly produce more accurate images to aid diagnostic medicine.
