CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

In GPU-accelerated applications, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance, while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python and MATLAB, and express parallelism through extensions in the form of a few basic keywords.
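As a rough illustration of that split, the sketch below adds two arrays in CUDA C++. It is a minimal example under stated assumptions, not a reference implementation: the names `vectorAdd` and `addOnGpu` are hypothetical, and the block size of 256 threads is an arbitrary but common choice. The `__global__` keyword and the `<<<blocks, threads>>>` launch syntax are the language extensions; everything else is ordinary C++ calling the CUDA runtime.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// __global__ marks a kernel: a function that executes on the GPU
// and is launched from code running on the CPU.
__global__ void vectorAdd(const float* a, const float* b, float* out, int n)
{
    // Each GPU thread computes one output element.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {                     // guard: the last block may be partly idle
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host-side helper: the sequential part that runs on the CPU.
// Copies the inputs to the GPU, launches the kernel, and copies the result back.
// Returns true on success.
bool addOnGpu(const float* hostA, const float* hostB, float* hostOut, int n)
{
    if (n <= 0) return false;

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *devA = nullptr, *devB = nullptr, *devOut = nullptr;

    bool ok = cudaMalloc(&devA, bytes) == cudaSuccess
           && cudaMalloc(&devB, bytes) == cudaSuccess
           && cudaMalloc(&devOut, bytes) == cudaSuccess
           && cudaMemcpy(devA, hostA, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(devB, hostB, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up to cover n
        vectorAdd<<<blocks, threads>>>(devA, devB, devOut, n);

        // The device-to-host copy waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts null pointers, so cleanup is safe on every path.
    cudaFree(devA);
    cudaFree(devB);
    cudaFree(devOut);
    return ok;
}
```

Saved in a `.cu` file and compiled with `nvcc`, the helper can be called from an ordinary `main` with three host arrays of length `n`. Explicit `cudaMalloc` and `cudaMemcpy` calls are used here to keep the CPU and GPU roles visible; a real application might prefer one of the toolkit's libraries or managed memory.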

The CUDA Toolkit from NVIDIA provides everything you need to develop GPU-accelerated applications: GPU-accelerated libraries, a compiler, development tools and the CUDA runtime.

Applications Developed with CUDA

Thousands of applications developed with CUDA have been deployed to GPUs in embedded systems, workstations, datacenters and in the cloud.

CUDA for all NVIDIA GPU Families

CUDA serves as a common platform across all NVIDIA GPU families so you can deploy and scale your application across GPU configurations.

Desktop Development: TITAN X

Data Center Solutions: NVIDIA DGX systems

Embedded Applications: Jetson TX2

GPU-Accelerated Cloud: GPU-accelerated cloud solutions

The first GPUs were designed as graphics accelerators and became more programmable during the 1990s, culminating in NVIDIA's first GPU in 1999. Researchers and scientists rapidly began to apply the excellent floating-point performance of this GPU to general-purpose computing. In 2003, a team of researchers led by Ian Buck unveiled Brook, the first widely adopted programming model to extend C with data-parallel constructs. Ian Buck later joined NVIDIA and led the launch of CUDA in 2006, the world's first solution for general computing on GPUs.

Since its inception, the CUDA ecosystem has grown rapidly to include software development tools, services and partner-based solutions. The CUDA Toolkit includes libraries, debugging and optimization tools, a compiler and a runtime library to deploy your application. You'll also find code samples, programming guides, user manuals, API references and other documentation to help you get started.


cuRAND library


NPP library


Math Library

cuFFT library


nvGRAPH library


NCCL library


Tools and Integrations

Nsight tool


Visual Profiler tool

CUDA-GDB debugging tool


CUDA-MEMCHECK tool

OpenACC directive-based programming model


CUDA Profiling Tools Interface

Domains with CUDA-Accelerated Applications

CUDA accelerates applications across a wide range of domains, from image processing and deep learning to numerical analytics and computational science.

Get Started with CUDA

Get started with CUDA by downloading the CUDA Toolkit and exploring introductory resources including videos, code samples, hands-on labs and webinars.
