Unified Memory for CUDA Beginners

Features, Beginner, CUDA C++, Kepler, Pascal, Unified Memory

Nadeem Mohammad, posted Jun 19 2017

My previous introductory post, “An Even Easier Introduction to CUDA C++”, introduced the basics of CUDA programming by showing how to write a simple program that allocated two arrays of numbers in memory accessible to the GPU and then added them together on the GPU. To do this, I introduced you to Unified Memory, which […]

GOAI: Open GPU-Accelerated Data Analytics

Features, Analytics, Big Data, Database, Machine Learning, Python

Nadeem Mohammad, posted Jun 12 2017

Recently, Continuum Analytics, H2O.ai, and MapD announced the formation of the GPU Open Analytics Initiative (GOAI). GOAI—also joined by BlazingDB, Graphistry and the Gunrock project from the University of California, Davis—aims to create open frameworks that allow developers and data scientists to build applications using standard data formats and APIs on GPUs. Bringing standard analytics data […]

Explaining How End-to-End Deep Learning Steers a Self-Driving Car

Features, Autonomous Vehicles, Deep Learning, Drive PX, Visualization

Nadeem Mohammad, posted May 23 2017

As part of a complete software stack for autonomous driving, NVIDIA has created a deep-learning-based system, known as PilotNet, which learns to emulate the behavior of human drivers and can be deployed as a self-driving car controller. PilotNet is trained using road images paired with the steering angles generated by a human driving a data-collection car. It […]

CUDA 9 Features Revealed: Volta, Cooperative Groups and More

Features, Cooperative Groups, cuBLAS, CUDA, CUDA 9, Deep Learning, Libraries, Tensor Cores

Nadeem Mohammad, posted May 11 2017

Today at the GPU Technology Conference NVIDIA announced CUDA 9, the latest version of CUDA’s powerful parallel computing platform and programming model. In this post I’ll provide an overview of the awesome new features of CUDA 9: support for the Volta GPU architecture, including the new Tesla V100 accelerator; Cooperative Groups, a new programming model […]

Inside Volta: The World’s Most Advanced Data Center GPU

Features, Deep Learning, GPU, Tensor Cores, Tesla V100, Volta

Nadeem Mohammad, posted May 10 2017

Today at the 2017 GPU Technology Conference in San Jose, NVIDIA CEO Jen-Hsun Huang announced the new NVIDIA Tesla V100, the most advanced accelerator ever built. From recognizing speech to training virtual personal assistants to converse naturally; from detecting lanes on the road to teaching autonomous cars to drive; data scientists are taking on increasingly […]
