Pro Tip: cuBLAS Strided Batched Matrix Multiply

CUDA Pro Tip, CUBLAS, CUDA, CUDA 8, Deep Learning, Linear Algebra, Machine Learning, Tensors

Nadeem Mohammad, posted Feb 27 2017

There’s a new computational workhorse in town. For decades, general matrix-matrix multiply—known as GEMM in Basic Linear Algebra Subroutines (BLAS) libraries—has been a standard benchmark for computational performance. GEMM is possibly the most optimized and widely used routine in scientific computing. Expert implementations are available for every architecture and quickly achieve the peak performance of […]


Nsight Visual Studio Edition 5.3 at GDC 2017

Nsight Visual Studio Edition, GameWorks Expert Developer, GameWorks

Robert Bischof, posted Feb 27 2017

NVIDIA Nsight™ Visual Studio Edition 5.3, which is being shown at GDC 2017, will soon be available for download in the NVIDIA Registered Developer Program.

This release adds OpenVR support alongside the current Oculus SDK support for virtual reality development in Direct3D and OpenGL applications. Vive and Oculus demos will be on display at the NVIDIA booth.

We’ll also be showing off Nsight’s frame debugging and profiling on laptops using Windows 10 Hybrid mode.


NVIDIA GDC Vulkan Driver available now!

Vulkan, VRWorks, GameWorks Expert Developer, Maxwell, Pascal, GDC17

Mathias Schott, posted Feb 27 2017

We are happy to announce the immediate availability of the NVIDIA GDC Vulkan developer driver. It supports not only the extensions that Khronos just released, but also a set of Vulkan extensions that expose the multi-projection functionality of our Maxwell and Pascal GPU architectures, which is the foundation for technologies such as VRWorks, fast voxelization, and single pass cubemap rendering […]


Create Realistic Synthetic Faces That Look Older With Deep Learning

Research, CUDA, cuDNN, Higher Education / Academia, Image Recognition, Machine Learning & Artificial Intelligence, Tesla

Nadeem Mohammad, posted Feb 24 2017

Developers from Orange Labs in France built a deep learning system that can quickly make young faces look older and older faces look younger. A number of techniques already exist, but they are expensive and time-consuming. Using CUDA, Tesla K40 GPUs and cuDNN for the deep learning work, they trained their neural network on […]


Self-Taught AI Bot Beat Professional Players at Super Smash Bros

Features, News, CUDA, GeForce, Higher Education / Academia, Image Recognition, Machine Learning & Artificial Intelligence, Media & Entertainment, Tesla

Nadeem Mohammad, posted Feb 24 2017

Students from MIT and New York University developed an AI bot that taught itself in two weeks to beat professional gamers during the Genesis 4 Super Smash Bros tournament last month. The AI, nicknamed Phillip, was originally trained with CUDA, Tesla K20/TITAN X GPUs and the TensorFlow deep learning framework – but the […]
