The NVIDIA® CUDA® Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications. The CUDA Toolkit includes a compiler for NVIDIA GPUs, math libraries, and tools for debugging and optimizing the performance of your applications. You’ll also find programming guides, user manuals, API reference, and other documentation to help you get started quickly accelerating your application with GPUs.

Check out the CUDA 7 Features and Overview Webinar Recording, the CUDA 7 Performance Report and Webinar Recording, and the Thrust 1.8 in CUDA 7 Webinar Recording.

Productivity and Performance Improvements

C++11 support makes it easier for C++ developers to accelerate their applications
  • Write less code with ‘auto’ and ‘lambda’, especially when using the Thrust template library.
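
As a rough illustration of the "write less code" point, the host-only C++11 sketch below uses `auto` and a lambda with standard-library algorithms. It is a stand-in rather than CUDA code: Thrust algorithms such as `thrust::transform` and `thrust::reduce` follow the same iterator-based call shape, but running a lambda on the device has extra requirements that are not shown here. The helper names `saxpy` and `sum` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: y <- a * x + y, written with a C++11 lambda
// instead of a hand-written functor struct.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    std::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
}

// Sum of all elements; 'auto' avoids spelling out the iterator types.
float sum(const std::vector<float>& v) {
    auto first = v.begin();
    auto last = v.end();
    return std::accumulate(first, last, 0.0f);
}
```

Before C++11, the body of the lambda would typically live in a separate functor class with its own constructor and `operator()`, which is the boilerplate this feature removes.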
New cuSOLVER library of dense and sparse direct solvers
  • Significant acceleration for Computer Vision, CFD, Computational Chemistry, and Linear Optimization applications.
  • Key LAPACK dense solvers 3-6x faster than MKL.
    • Dense solvers include Cholesky, LU, SVD, and QR.
  • Sparse direct solvers 2-14x faster than CPU-only equivalents.
    • Sparse solvers include direct solvers and eigensolvers.
Runtime Compilation enables highly optimized kernels to be generated at runtime.
  • Improve performance by removing conditional logic and only evaluating special cases when necessary.
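
One way to picture the "removing conditional logic" idea: when an option is only known at run time, the host can generate kernel source with that decision already made, so the compiled kernel carries no branch for it. The host-side C++ sketch below only builds such a source string. The function name `make_scale_kernel_source`, the kernel name `scale`, and its parameters are hypothetical, and the step of handing the string to the runtime compilation library (NVRTC) is not shown.

```cpp
#include <string>

// Hypothetical generator: returns CUDA C++ source for a kernel that is
// specialized on an option known only at run time. The clamp statement is
// emitted only when requested, so the generated kernel never tests a flag.
std::string make_scale_kernel_source(bool clamp_negative) {
    std::string src =
        "extern \"C\" __global__ void scale(float* data, float factor, int n) {\n"
        "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        "    if (i < n) {\n"
        "        float v = data[i] * factor;\n";
    if (clamp_negative) {
        // Special case: present in the source only when the caller asks for it.
        src += "        v = fmaxf(v, 0.0f);\n";
    }
    src += "        data[i] = v;\n"
           "    }\n"
           "}\n";
    return src;
}
```

In a full program, the returned string would presumably be passed to NVRTC entry points such as `nvrtcCreateProgram` and `nvrtcCompileProgram`, and the resulting code loaded and launched from the host.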
All developers can download the latest CUDA 7 Toolkit today.
Members of the CUDA Registered Developer Program are notified of the latest developments, get access to pre-release software, and can report issues and bugs.
Learn More

Learn more about the GPU-accelerated libraries and development tools included in the CUDA Toolkit.

GPU-Accelerated Libraries
  • cuFFT – Fast Fourier Transforms Library
  • cuBLAS – Complete BLAS library
  • cuSOLVER – Collection of dense and sparse direct solvers
  • cuSPARSE – Sparse Matrix library
  • cuRAND – Random Number Generator
  • NPP – Thousands of Performance Primitives for Image & Video Processing
  • Thrust – Templated Parallel Algorithms & Data Structures
  • CUDA Math Library – high performance math routines
Development Tools

If you develop applications in languages other than C or C++, please review the Getting Started Page for a language solution that meets your needs. The CUDA Toolkit complements and fully supports programming with OpenACC directives.


The latest version of the CUDA Toolkit is always available at

CUDA Registered Developers get early access to the next CUDA Toolkit release, and access to NVIDIA’s online bug reporting and feature request system.