CUDA Toolkit
Develop, Optimize and Deploy GPU-Accelerated Apps
The NVIDIA® CUDA® Toolkit provides a development environment for creating high-performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms, and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to deploy your application.
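As a rough illustration of how the compiler and runtime library fit together, the sketch below adds two vectors on the GPU and checks the result on the CPU. The kernel name `vector_add` and the helpers `check` and `vector_add_matches_reference` are hypothetical names chosen for this example; only the `cuda*` calls come from the CUDA runtime API. It assumes a machine with at least one CUDA-capable GPU.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}

// Hypothetical helper: abort with a message if a runtime call fails.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: runs the addition on the GPU and compares it with
// the same sum computed on the CPU. Returns true when every element matches.
bool vector_add_matches_reference(int n) {
    std::vector<float> a(n), b(n), out(n);
    for (int i = 0; i < n; ++i) { a[i] = 0.5f * i; b[i] = 2.0f * i; }

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    check(cudaMalloc(&d_a, bytes), "cudaMalloc(a)");
    check(cudaMalloc(&d_b, bytes), "cudaMalloc(b)");
    check(cudaMalloc(&d_out, bytes), "cudaMalloc(out)");

    check(cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice), "copy a");
    check(cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice), "copy b");

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);
    check(cudaGetLastError(), "kernel launch");

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost), "copy out");

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);

    for (int i = 0; i < n; ++i) {
        if (std::fabs(out[i] - (a[i] + b[i])) > 1e-5f) return false;
    }
    return true;
}

int main() {
    bool ok = vector_add_matches_reference(1 << 20);
    std::printf("vector_add %s\n", ok ? "matches CPU reference" : "MISMATCH");
    return ok ? 0 : 1;
}
```

Saved as a `.cu` file (for example `vector_add.cu`, a file name assumed here), this should build with the toolkit's compiler via `nvcc vector_add.cu -o vector_add`.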
Using built-in capabilities for distributing computations across multi-GPU configurations, scientists and researchers can develop applications that scale from single-GPU workstations to cloud installations with thousands of GPUs.
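One simple way to spread work across the GPUs in a single machine is to enumerate them with the runtime API and hand each device a slice of the data. The sketch below does that sequentially for clarity. It is not necessarily the mechanism the paragraph above has in mind, and a production version would likely use streams and asynchronous copies so the devices work concurrently. The names `scale` and `scale_on_all_gpus` are hypothetical.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: multiply every element of a chunk by a constant factor.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: splits `host` into one contiguous chunk per visible
// GPU and scales each chunk in place on its own device.
// Returns the number of GPUs used, or -1 if no device is usable or a call fails.
int scale_on_all_gpus(std::vector<float>& host, float factor) {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        return -1;
    }

    int n = static_cast<int>(host.size());
    int chunk = (n + device_count - 1) / device_count;  // ceiling division
    int used = 0;

    for (int dev = 0; dev < device_count; ++dev) {
        int begin = dev * chunk;
        if (begin >= n) break;  // more devices than data
        int count = std::min(chunk, n - begin);
        size_t bytes = static_cast<size_t>(count) * sizeof(float);

        // Subsequent allocations and launches target this device.
        if (cudaSetDevice(dev) != cudaSuccess) return -1;

        float* d = nullptr;
        if (cudaMalloc(&d, bytes) != cudaSuccess) return -1;

        cudaError_t err = cudaMemcpy(d, host.data() + begin, bytes,
                                     cudaMemcpyHostToDevice);
        if (err == cudaSuccess) {
            scale<<<(count + 255) / 256, 256>>>(d, count, factor);
            err = cudaGetLastError();
        }
        if (err == cudaSuccess) {
            // Blocking copy; also waits for the kernel on this device.
            err = cudaMemcpy(host.data() + begin, d, bytes,
                             cudaMemcpyDeviceToHost);
        }
        cudaFree(d);
        if (err != cudaSuccess) return -1;
        ++used;
    }
    return used;
}

int main() {
    std::vector<float> data(1 << 20, 1.0f);
    int used = scale_on_all_gpus(data, 3.0f);
    if (used < 0) {
        std::printf("No usable CUDA device found\n");
        return 1;
    }
    bool ok = std::all_of(data.begin(), data.end(),
                          [](float v) { return v == 3.0f; });
    std::printf("GPUs used: %d, result %s\n", used, ok ? "ok" : "mismatch");
    return ok ? 0 : 1;
}
```

On a single-GPU workstation the loop runs once and the whole array goes to device 0, so the same code covers both ends of the range on one machine. Scaling across many nodes would need additional communication layers that this sketch does not cover.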
CUDA 12 Features
CUDA 12 introduces support for the NVIDIA Hopper and Ada Lovelace architectures, Arm server processors, Lazy Module and Kernel Loading, revamped Dynamic Parallelism APIs, enhancements to the CUDA graphs API, performance-optimized libraries, and new developer tool capabilities.
Support for the NVIDIA Hopper architecture includes next-generation Tensor Cores and Transformer Engine, the high-speed NVLink Switch system, mixed-precision modes, second-generation Multi-Instance GPU (MIG), advanced memory management, and standard C++/Fortran/Python parallel language constructs.
GTC Digital Webinars
Dive deeper into the latest CUDA features.
Customer Stories
See how developers, scientists, and researchers are using CUDA today.
CUDA Ecosystem
Explore the top compute and graphics packages with built-in CUDA integration.






