GPU Accelerated Computing with C and C++

Using the CUDA Toolkit, you can accelerate your C or C++ applications by updating the computationally intensive portions of your code to run on GPUs. You can call functions from drop-in libraries, or develop custom applications in languages including C, C++, Fortran, and Python. Below you will find some resources to help you get started using CUDA.


Install the free CUDA Toolkit on a Linux, Mac, or Windows system with one or more CUDA-capable GPUs. Follow the instructions in the CUDA Quick Start Guide to get up and running quickly.

Or, watch the short video below and follow along.

If you do not have a GPU, you can access one of the thousands of GPUs available from cloud service providers including Amazon AWS, Microsoft Azure and IBM SoftLayer. The NVIDIA-maintained CUDA Amazon Machine Image (AMI) on AWS, for example, comes pre-installed with CUDA and is available for use today.

For more detailed installation instructions, refer to the CUDA installation guides. For help with troubleshooting, browse and participate in the CUDA Setup and Installation forum.
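
Once the toolkit is installed, a quick way to confirm that the runtime can see a GPU is to query the device count. The sketch below is an illustration, not part of the official guides; the helper name `visible_gpu_count` is hypothetical, and only the CUDA runtime calls (`cudaGetDeviceCount`, `cudaGetDeviceProperties`, `cudaGetErrorString`) come from the toolkit.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: returns the number of CUDA devices the runtime can see,
// or -1 if the runtime reports an error (for example, no driver installed).
int visible_gpu_count()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA runtime error: %s\n", cudaGetErrorString(err));
        return -1;
    }

    // Print the name and compute capability of each visible device.
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            std::printf("GPU %d: %s (compute capability %d.%d)\n",
                        i, prop.name, prop.major, prop.minor);
        }
    }
    return count;
}
```

Calling this helper from a `main` function in a `.cu` file compiled with `nvcc` should list each GPU. A return value of -1 or 0 usually points to a driver or installation problem worth checking against the installation guides.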


You are now ready to write your first CUDA program. The article An Even Easier Introduction to CUDA introduces key concepts through simple examples that you can follow along with.

The video below walks through how to write a program that adds two vectors.
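
For readers who prefer text, a vector addition in CUDA C++ might look roughly like the sketch below. This is a minimal illustration and not the code from the video: the kernel name `vector_add`, the host wrapper `add_on_gpu`, and the block size of 256 threads are all assumptions chosen for the example. Each GPU thread adds one pair of elements, and the host wrapper handles allocation and the copies to and from device memory.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Kernel: each thread computes one element of c = a + b.
__global__ void vector_add(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may contain threads past the end
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: copies the inputs to the GPU, launches the
// kernel, and copies the result back. Returns the first CUDA error seen.
cudaError_t add_on_gpu(const float* a, const float* b, float* c, int n)
{
    if (n <= 0) return cudaSuccess;

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;

    cudaError_t err = cudaMalloc(&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_c, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up to cover n
        vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();                        // catches launch errors
    }

    // This copy waits for the kernel to finish before reading the result.
    if (err == cudaSuccess) err = cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);

    // cudaFree accepts null pointers, so cleanup is safe on any error path.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return err;
}
```

A caller passes three host arrays of length `n` and checks that the return value is `cudaSuccess` before reading the output. Explicit `cudaMemcpy` calls are used here to make the data movement visible; unified memory via `cudaMallocManaged` is a common alternative that hides those copies.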

The Programming Guide in the CUDA Documentation introduces key concepts covered in the video, including the CUDA programming model, important APIs, and performance guidelines.


NVIDIA provides hands-on training in CUDA through a collection of self-paced and instructor-led courses. The self-paced online training, powered by GPU-accelerated workstations in the cloud, guides you step by step through editing and executing code and working with visual tools. All you need is a laptop and an internet connection to access the complete suite of free courses and certification options.

The CUDA C Best Practices Guide presents established parallelization and optimization techniques and explains programming approaches that can greatly simplify programming GPU-accelerated applications.

Additional Resources

Code Samples


The CUDA Toolkit is a free download from NVIDIA and is supported on Windows, Mac, and most standard Linux distributions.

So, now you’re ready to deploy your application?
Register today for free access to NVIDIA Tesla GPUs in the cloud.

Latest News

Upgrade to the newest versions of NVIDIA CUDA-X AI libraries

Learn what’s new in the latest releases of cuDNN, CUDA, TensorRT, DALI, and Nsight Compute.

PGI Community Edition 19.4 Now Available

Features include NVIDIA V100 Tensor Cores, Full C++17 Language, PCAST Directives, Full OpenACC 2.6, New OS including macOS Mojave & more

NVIDIA Webinars: Hello AI World and Learn with JetBot

We recently announced two upcoming webinars about the new Jetson Nano. Each presentation will be followed by a live Q&A session where you can ask the NVIDIA Jetson team questions in real time. We look forward to you joining us!

Using MATLAB and TensorRT on NVIDIA GPUs

MathWorks recently released MATLAB R2018b, which integrates with NVIDIA TensorRT through GPU Coder. With this integration, scientists and engineers can achieve faster inference performance on GPUs from within MATLAB.

Blogs: Parallel ForAll

Video Series: Path Tracing for Quake II in Two Months

You wouldn’t know Quake II is now more than 20 years old when looking at the new RTX version. Path-traced reflections, shadows, and dynamic light sources bring the game’s cavernous environments to life.

Job Statistics with NVIDIA Data Center GPU Manager and SLURM

Resource management software, such as SLURM, PBS, and Grid Engine, manages access for multiple users to shared computational resources.

GPU Support for AI Workloads in Red Hat OpenShift 4

Red Hat OpenShift is an enterprise-grade Kubernetes platform for managing Kubernetes clusters at scale, developed and supported by Red Hat.

Using HashiCorp Nomad to Schedule GPU Workloads

HashiCorp Nomad 0.9 introduces device plugins, which support an extensible set of devices for scheduling and deploying workloads.