GPU Accelerated Computing with C and C++

Using the CUDA Toolkit you can accelerate your C or C++ applications by updating the computationally intensive portions of your code to run on GPUs. To accelerate your applications, you can call functions from drop-in libraries as well as develop custom applications using languages including C, C++, Fortran and Python. Below you will find some resources to help you get started using CUDA.
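As a sketch of the drop-in-library approach, the short program below calls cuBLAS (the CUDA BLAS library) to compute y = αx + y on the GPU with `cublasSaxpy`. This is a minimal illustration, not taken from this page: the data sizes are arbitrary and error checking is omitted for brevity. Compile with `nvcc saxpy.c -lcublas`.

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    const int n = 4;
    float hx[] = {1, 2, 3, 4}, hy[] = {10, 20, 30, 40};
    float *dx, *dy;

    // Allocate device memory and copy the inputs over
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, n * sizeof(float), cudaMemcpyHostToDevice);

    // One library call runs the computation on the GPU: y = alpha*x + y
    cublasHandle_t handle;
    cublasCreate(&handle);
    const float alpha = 2.0f;
    cublasSaxpy(handle, n, &alpha, dx, 1, dy, 1);
    cublasDestroy(handle);

    // Copy the result back and print it: 12 24 36 48
    cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("%g ", hy[i]);
    printf("\n");

    cudaFree(dx);
    cudaFree(dy);
    return 0;
}
```

No kernel code is written here; the library supplies the GPU implementation, which is what makes this a "drop-in" acceleration path.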


Install the free CUDA Toolkit on a Linux, Mac or Windows system with one or more CUDA-capable GPUs. Follow the instructions in the CUDA Quick Start Guide to get up and running quickly.

Or, watch the short video below and follow along.

If you do not have a GPU, you can access one of the thousands of GPUs available from cloud service providers, including Amazon Web Services (AWS), Microsoft Azure and IBM SoftLayer. The NVIDIA-maintained CUDA Amazon Machine Image (AMI) on AWS, for example, comes pre-installed with CUDA and is available for use today.

For more detailed installation instructions, refer to the CUDA installation guides. For help with troubleshooting, browse and participate in the CUDA Setup and Installation forum.


You are now ready to write your first CUDA program. The article, Even Easier Introduction to CUDA, introduces key concepts through simple examples that you can follow along with.

The video below walks through how to write a simple program that adds two vectors.

The Programming Guide in the CUDA Documentation introduces key concepts covered in the video, including the CUDA programming model, important APIs, and performance guidelines.
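The vector-add program mentioned above can be sketched as follows. This minimal version follows the pattern used in the Even Easier Introduction to CUDA: a `__global__` kernel in which each GPU thread handles one element, Unified Memory via `cudaMallocManaged` so the same pointers work on CPU and GPU, and a `<<<blocks, threads>>>` launch. Compile with `nvcc add.cu`; error checking is omitted to keep the sketch short.

```cuda
#include <cuda_runtime.h>
#include <math.h>
#include <stdio.h>

// Kernel: each thread adds one pair of elements
__global__ void add(int n, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = x[i] + y[i];
}

int main(void) {
    const int n = 1 << 20;  // 1M elements
    float *x, *y;

    // Unified Memory: accessible from both the CPU and the GPU
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements
    int blockSize = 256;
    int numBlocks = (n + blockSize - 1) / blockSize;
    add<<<numBlocks, blockSize>>>(n, x, y);
    cudaDeviceSynchronize();  // wait for the GPU to finish

    // Verify: every element of y should now be 3.0f
    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i) maxErr = fmaxf(maxErr, fabsf(y[i] - 3.0f));
    printf("Max error: %f\n", maxErr);

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The `if (i < n)` guard matters because the grid is rounded up to a whole number of blocks, so some threads in the last block may fall past the end of the arrays.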


NVIDIA provides hands-on training in CUDA through a collection of self-paced and instructor-led courses. The self-paced online training, powered by GPU-accelerated workstations in the cloud, guides you step-by-step through editing and execution of code along with interaction with visual tools. All you need is a laptop and an internet connection to access the complete suite of free courses and certification options.

The CUDA C Best Practices Guide presents established parallelization and optimization techniques, and explains programming approaches that can greatly simplify the development of GPU-accelerated applications.

Additional Resources

Code Samples


The CUDA Toolkit is a free download from NVIDIA and is supported on Windows, Mac, and most standard Linux distributions.

So, now you’re ready to deploy your application?
Register today for free access to NVIDIA Tesla GPUs in the cloud.

Latest News


At GTC Silicon Valley in San Jose, NVIDIA released CUDA-X AI, a collection of NVIDIA’s GPU acceleration libraries that accelerate deep learning, machine learning, and data analysis.

GTC 2019 Silicon Valley Preview: CUDA Training and Posters

At GTC, NVIDIA DLI offers an array of self-paced courses and instructor-led workshops for developers, data scientists, and researchers looking to solve the world’s most challenging problems with accelerated computing.

Top 5 AI Stories of the Week: 3/1

From an AI algorithm that can predict earthquakes to a system that can decode rodent chatter - here are the top 5 AI stories of the week.

CUDA 10.1 Now Available

CUDA 10.1 is now available for download. This version includes a new lightweight GEMM library, new functionality and performance updates to existing libraries, and improvements to the CUDA Graphs API.

Blogs: Parallel ForAll

Tips and Tricks: Ray Tracing Best Practices

This post presents best practices for implementing ray tracing in games and other real-time graphics applications. We present these as briefly as possible to help you quickly find key ideas.

Ignacio Llamas Interview: Unearthing Ray Tracing

We spoke with Ignacio Llamas, Director of Real Time Ray Tracing Software at NVIDIA about the introduction of real-time ray tracing in consumer GPUs. Q: How did you get started on ray tracing at NVIDIA?

Using the RAPIDS VM Image for Google Cloud Platform

NVIDIA’s Ty McKercher and Google’s Viacheslav Kovalevskyi and Gonzalo Gasca Meza jointly authored a post on using the new RAPIDS VM Image for Google Cloud Platform. Following is a short summary.

Pruning Models with NVIDIA Transfer Learning Toolkit

When using a deep learning model in production, it’s important for the model to make accurate predictions.