CUDA Toolkit 3.2 Downloads

Download Quick Links [ Windows ] [ Linux ] [ MacOS ]

Individual code samples from the SDK are also available.

Release Highlights

New and Improved CUDA Libraries

  • CUBLAS performance improved 50% to 300% on Fermi architecture GPUs, for matrix multiplication of all datatypes and transpose variations
  • CUFFT performance tuned for radix-3, -5, and -7 transform sizes on Fermi architecture GPUs, now 2x to 10x faster than MKL
  • New CUSPARSE library of GPU-accelerated sparse matrix routines for sparse/sparse and dense/sparse operations delivers 5x to 30x faster performance than MKL
  • New CURAND library of GPU-accelerated random number generation (RNG) routines, supporting Sobol quasi-random and XORWOW pseudo-random routines at 10x to 20x faster than similar routines in MKL
  • H.264 encode/decode libraries now included in the CUDA Toolkit
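To give a feel for the XORWOW pseudo-random generator named in the CURAND item above, here is a minimal host-side C++ sketch of the textbook xorwow recurrence (a five-word xorshift combined with an additive counter, as described by Marsaglia). It is not CURAND code and does not use the CURAND API. The names XorwowState, xorwow_make, xorwow_next and xorwow_uniform are hypothetical, and the way CURAND seeds and skips its state is not shown on this page, so the seeding here is purely an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state layout: five xorshift words plus an additive counter.
struct XorwowState {
    std::uint32_t x, y, z, w, v;
    std::uint32_t d;
};

// Build a state from caller-supplied words (illustrative seeding only,
// not the procedure CURAND uses).
XorwowState xorwow_make(std::uint32_t x, std::uint32_t y, std::uint32_t z,
                        std::uint32_t w, std::uint32_t v, std::uint32_t d) {
    // The xorshift part must not be all zero, or it would stay zero forever.
    assert((x | y | z | w | v) != 0u);
    return XorwowState{x, y, z, w, v, d};
}

// One step of the xorwow recurrence: returns the next 32-bit output.
std::uint32_t xorwow_next(XorwowState& s) {
    std::uint32_t t = s.x ^ (s.x >> 2);
    // Shift the window of state words down by one.
    s.x = s.y;
    s.y = s.z;
    s.z = s.w;
    s.w = s.v;
    s.v = (s.v ^ (s.v << 4)) ^ (t ^ (t << 1));
    // Additive counter with a fixed odd increment; wraps modulo 2^32.
    s.d += 362437u;
    return s.d + s.v;
}

// Map the top 24 bits of the next output to a float in [0, 1).
float xorwow_uniform(XorwowState& s) {
    return static_cast<float>(xorwow_next(s) >> 8) * (1.0f / 16777216.0f);
}
```

A caller would create one state per stream with xorwow_make and then draw values with xorwow_next or xorwow_uniform. On the GPU, CURAND keeps a state of this kind per thread, which is the part this host-only sketch leaves out.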

CUDA Driver & CUDA C Runtime

  • Support for new 6GB Quadro and Tesla products
  • New support for enabling high performance Tesla Compute Cluster (TCC) mode on Tesla GPUs in Windows desktop workstations

Development Tools

  • Multi-GPU debugging support for both cuda-gdb and Parallel Nsight
  • Expanded cuda-memcheck support for all Fermi architecture GPUs
  • NVCC support for Intel C Compiler (ICC) v11.1 on 64-bit Linux distros
  • Support for debugging GPUs with more than 4GB device memory

Miscellaneous

  • Support for memory management using malloc() and free() in CUDA C compute kernels
  • New NVIDIA System Management Interface (nvidia-smi) support for reporting % GPU busy, and several GPU performance counters

New GPU Computing SDK Code Samples

  • Several code samples demonstrating how to use the new CURAND library, including MonteCarloCURAND, EstimatePiInlineP, EstimatePiInlineQ, EstimatePiP, EstimatePiQ, SingleAsianOptionP, and randomFog
  • Conjugate Gradient Solver, demonstrating the use of CUBLAS and CUSPARSE in the same application
  • Function Pointers, a sample that shows how to use function pointers to implement the Sobel Edge Detection filter for 8-bit monochrome images
  • Interval Computing, demonstrating the use of interval arithmetic operators using C++ templates and recursion
  • Simple Printf, demonstrating best practices for using both printf and cuprintf in compute kernels
  • Bilateral Filter, an edge-preserving non-linear smoothing filter for image recovery and denoising implemented in CUDA C with OpenGL rendering
  • SLI with Direct3D Texture, a simple example demonstrating the use of SLI and Direct3D interoperability with CUDA C
  • cudaEncode, showing how to use the NVIDIA H.264 Encoding Library using YUV frames as input
  • Vflocking Direct3D/CUDA, which simulates and visualizes the flocking behavior of birds in flight
  • simpleSurfaceWrite, demonstrating how CUDA kernels can write to 2D surfaces on Fermi GPUs
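As a rough illustration of the idea behind the Function Pointers sample listed above, the following host-side C++ sketch selects a per-pixel operator through a function pointer and applies it to an 8-bit monochrome image. It is not the SDK sample and contains no CUDA code. The names Window, PixelOp, sobel_magnitude, pass_through and apply_filter are hypothetical, and using |Gx| + |Gy| clamped to 255 as the edge magnitude is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// 3x3 neighborhood around a pixel; p[1][1] is the center.
struct Window {
    std::uint8_t p[3][3];
};

// A per-pixel operator chosen at run time through a function pointer.
using PixelOp = std::uint8_t (*)(const Window&);

// Sobel edge magnitude, approximated as |Gx| + |Gy| and clamped to 8 bits.
std::uint8_t sobel_magnitude(const Window& n) {
    int gx = (n.p[0][2] + 2 * n.p[1][2] + n.p[2][2])
           - (n.p[0][0] + 2 * n.p[1][0] + n.p[2][0]);
    int gy = (n.p[2][0] + 2 * n.p[2][1] + n.p[2][2])
           - (n.p[0][0] + 2 * n.p[0][1] + n.p[0][2]);
    int mag = std::abs(gx) + std::abs(gy);
    return static_cast<std::uint8_t>(mag > 255 ? 255 : mag);
}

// Trivial alternative operator: return the center pixel unchanged.
std::uint8_t pass_through(const Window& n) {
    return n.p[1][1];
}

// Apply `op` to every interior pixel of a row-major image.
// Border pixels are left at 0 because they lack a full 3x3 neighborhood.
std::vector<std::uint8_t> apply_filter(const std::vector<std::uint8_t>& img,
                                       int width, int height, PixelOp op) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<std::uint8_t> out(img.size(), 0);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            Window n;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    n.p[dy + 1][dx + 1] = img[(y + dy) * width + (x + dx)];
                }
            }
            out[y * width + x] = op(n);
        }
    }
    return out;
}
```

Passing sobel_magnitude or pass_through as the last argument switches the filter without touching the loop, which is the pattern the sample applies inside GPU code on Fermi-class hardware.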

Windows developers should be sure to check out the new debugging and profiling features in Parallel Nsight v1.5 for Visual Studio at www.nvidia.com/ParallelNsight.

Please refer to the Release Notes and Getting Started Guides for more information.

In CUDA Toolkit 3.2 and the accompanying release of the CUDA driver, some important changes have been made to the CUDA Driver API to support large memory access for device code and to enable further system calls such as malloc and free.  Please refer to the CUDA Toolkit 3.2 Readiness Tech Brief for a summary of these changes.

Note: The developer driver packages below provide baseline support for the widest range of NVIDIA products in the smallest number of installers. More recent production driver packages for developers and end users may be available at www.nvidia.com/drivers.

For additional tools and solutions for Windows, Linux, and Mac OS, such as CUDA Fortran, CULA, and CUDA-GDB, please visit our Tools and Ecosystem page.

Windows XP, Windows Vista, Windows 7

Developer Drivers for WinXP (263.06)
  Link to Binaries: 32-bit, 64-bit

Developer Drivers for WinVista and Win7 (263.06)
  Link to Binaries: 32-bit, 64-bit

Notebook Developer Drivers for WinXP (260.99)
  Link to Binaries: 32-bit, 64-bit

Notebook Developer Drivers for WinVista and Win7 (260.99)
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit
  • C/C++ compiler
  • Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • GPU-accelerated Sparse Matrix library
  • GPU-accelerated RNG library
  • Additional tools and documentation
  Link to Binaries: 32-bit, 64-bit
  Documents:
  • Windows Getting Started Guide
  • Release Notes
  • Release Notes Errata
  • CUDA C Programming Guide
  • CUDA C Best Practices Guide
  • OpenCL Programming Guide
  • OpenCL Best Practices Guide
  • OpenCL Implementation Notes
  • CUDA Reference Manual (pdf)
  • CUDA Reference Manual (chm)
  • API Reference
  • PTX ISA 2.2
  • Visual Profiler User Guide
  • Visual Profiler Release Notes
  • Fermi Compatibility Guide
  • Fermi Tuning Guide
  • CUBLAS User Guide
  • CUFFT User Guide
  • CUSPARSE User Guide
  • CURAND User Guide
  • CUDA Developer Guide for Optimus Platforms
  • License

CUDA Toolkit Build Rules Patch for Windows
  Link to Binaries: download
  Documents: README

NVIDIA Performance Primitives (NPP) library
  Link to Binaries: 32-bit, 64-bit
  Documents:
  • NPP Release Notes
  • NPP License

GPU Computing SDK code samples
  Link to Binaries: 32-bit, 64-bit
  Documents:
  • OpenCL Release Notes
  • CUDA C/C++ Release Notes
  • DirectCompute Release Notes
  • CUDA Occupancy Calculator
  • License

NVIDIA OpenCL Extensions
  Documents:
  • Compiler_Options
  • D3D9 Sharing
  • D3D10 Sharing
  • D3D11 Sharing
  • Device Attribute Query
  • Pragma Unroll

Linux

Developer Drivers for Linux (260.19.26)
  Link to Binaries: 32-bit, 64-bit
  Documents: README_Linux.txt

CUDA Toolkit
  • C/C++ compiler
  • cuda-gdb debugger
  • Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • GPU-accelerated Sparse Matrix library
  • GPU-accelerated RNG library
  • Additional tools and documentation
  Documents:
  • Linux Getting Started Guide
  • Release Notes
  • Release Notes Errata
  • CUDA C Programming Guide
  • CUDA C Best Practices Guide
  • OpenCL Programming Guide
  • OpenCL Best Practices Guide
  • OpenCL Implementation Notes
  • CUDA Reference Manual (pdf)
  • CUDA Reference Manual (chm)
  • API Reference
  • PTX ISA 2.2
  • CUDA-GDB User Manual
  • Visual Profiler User Guide
  • Visual Profiler Release Notes
  • Fermi Compatibility Guide
  • Fermi Tuning Guide
  • CUBLAS User Guide
  • CUFFT User Guide
  • CUSPARSE User Guide
  • CURAND User Guide
  • CUDA Developer Guide for Optimus Platforms
  • License

CUDA Toolkit for Fedora 13
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit for RedHat Enterprise Linux 5.5
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit for Ubuntu Linux 10.04
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit for RedHat Enterprise Linux 4.8
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit for OpenSUSE 11.2
  Link to Binaries: 32-bit, 64-bit

CUDA Toolkit for SUSE Linux Enterprise Desktop 11 SP1
  Link to Binaries: 32-bit, 64-bit

NVIDIA Performance Primitives (NPP) library
  Link to Binaries: 32-bit, 64-bit
  Documents:
  • NPP Release Notes
  • NPP License

GPU Computing SDK code samples
  Link to Binaries: download
  Documents:
  • CUDA C/C++ Release Notes
  • CUDA Occupancy Calculator
  • License

NVIDIA OpenCL Extensions
  Documents:
  • Compiler_Options
  • D3D9 Sharing
  • D3D10 Sharing
  • D3D11 Sharing
  • Device Attribute Query
  • Pragma Unroll

MacOS

Developer Drivers for MacOS
  Link to Binaries: download

CUDA Toolkit
  • C/C++ compiler
  • Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • GPU-accelerated Sparse Matrix library
  • GPU-accelerated RNG library
  • Additional tools and documentation
  Link to Binaries: download
  Documents:
  • Mac Getting Started Guide
  • Release Notes
  • Release Notes Errata
  • CUDA C Programming Guide
  • CUDA C Best Practices Guide
  • CUDA Reference Manual (pdf)
  • CUDA Reference Manual (chm)
  • API Reference
  • PTX ISA 2.2
  • Visual Profiler User Guide
  • Visual Profiler Release Notes
  • Fermi Compatibility Guide
  • Fermi Tuning Guide
  • CUBLAS User Guide
  • CUFFT User Guide
  • CUSPARSE User Guide
  • CURAND User Guide
  • CUDA Developer Guide for Optimus Platforms
  • License

CUDA Toolkit: GFEC patch for MacOS
  Link to Binaries: download
  Documents: README

GPU Computing SDK code samples
  Link to Binaries: download
  Documents:
  • CUDA C/C++ Release Notes
  • CUDA Occupancy Calculator
  • License