CUDA Toolkit 3.0 (March 2010)

 


Note: a more recent release is now available.

 

Release Highlights

  • Support for the new Fermi architecture, with:
    • Native 64-bit GPU support
    • Multiple Copy Engine support
    • ECC reporting
    • Concurrent Kernel Execution
    • Fermi HW debugging support in cuda-gdb
    • Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler
  • C++ Class Inheritance and Template Inheritance support for increased programmer productivity
  • A new unified interoperability API for Direct3D and OpenGL, with support for:
    • OpenGL texture interop
    • Direct3D 11 interop support
  • CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries implemented using the CUDA C Runtime such as CUFFT and CUBLAS.
  • CUBLAS now supports all BLAS1, 2, and 3 routines including those for single and double precision complex numbers
  • Up to 100x performance improvement while debugging applications with cuda-gdb
  • cuda-gdb hardware debugging support for applications that use the CUDA Driver API
  • cuda-gdb support for JIT-compiled kernels
  • New CUDA Memory Checker reports misalignment and out-of-bounds errors, available both as a stand-alone utility and as a debugging mode within cuda-gdb
  • CUDA Toolkit libraries are now versioned, enabling applications to require a specific version, support multiple versions explicitly, etc.
  • CUDA C/C++ kernels are now compiled to standard ELF format
  • Support for device emulation mode has been packaged in a separate version of the CUDA C Runtime (CUDART), and is deprecated in this release. Now that more sophisticated hardware debugging tools are available and more are on the way, NVIDIA will be focusing on supporting these tools instead of the legacy device emulation functionality.
    • On Windows, use the new Parallel Nsight development environment for Visual Studio, with integrated GPU debugging and profiling tools (formerly code-named "Nexus"). Please see www.nvidia.com/nsight for details.
    • On Linux, use cuda-gdb and cuda-memcheck, and check out the solutions from Allinea and TotalView that will be available soon.
  • Support for all the OpenCL features in the latest R195 production driver package:
    • Double Precision
    • Graphics Interoperability with OpenCL, Direct3D9, Direct3D10, and Direct3D11 for high performance visualization
    • Query for Compute Capability, so you can target optimizations for GPU architectures (cl_nv_device_attribute_query)
    • Ability to control compiler optimization settings via support for pragma unroll in OpenCL kernels and an extension that allows programmers to set compiler flags (cl_nv_compiler_options)
    • OpenCL Images support, for better/faster image filtering
    • 32-bit global and local atomics for fast, convenient data manipulation
    • Byte Addressable Stores, for faster video/image processing and compression algorithms
    • Support for the latest OpenCL spec revision 1.0.48 and latest official Khronos OpenCL headers as of 2010-02-17
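
As a rough illustration of the C++ class inheritance and template inheritance highlight above, here is a minimal sketch. The names `Scaled`, `ScaledOffset`, and `HOST_DEVICE` are hypothetical and chosen only for this example. It assumes that, when built with nvcc, the members are marked `__host__ __device__` so a kernel could call them; with a plain C++ compiler the macro expands to nothing and the same code runs on the host.

```cpp
// When compiled by nvcc (__CUDACC__ defined), mark members as callable from
// both host and device code; otherwise the qualifier expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical base class template: holds a scale factor.
template <typename T>
struct Scaled {
    T factor;
    HOST_DEVICE explicit Scaled(T f) : factor(f) {}
    HOST_DEVICE T scale(T x) const { return factor * x; }
};

// Template inheritance: the derived template reuses Scaled<T>::scale
// and adds an offset on top of it.
template <typename T>
struct ScaledOffset : public Scaled<T> {
    T offset;
    HOST_DEVICE ScaledOffset(T f, T o) : Scaled<T>(f), offset(o) {}
    // this-> is required because scale() lives in a dependent base class.
    HOST_DEVICE T apply(T x) const { return this->scale(x) + offset; }
};
```

For example, `ScaledOffset<float>(2.0f, 1.0f).apply(3.0f)` evaluates 2 * 3 + 1. The sketch deliberately avoids virtual functions, since the highlight mentions only class and template inheritance.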

Please review the release notes for additional important information about this release.

For more information on general purpose computing features of the Fermi architecture, see: www.nvidia.com/fermi.

A rapidly growing list of libraries and tools for CUDA architecture GPUs is available here.

Note: The developer driver packages below provide baseline support for the widest number of NVIDIA products in the smallest number of installers. More recent production driver packages for end users are available at www.nvidia.com/drivers.

 

Windows

 

Developer Drivers for WinXP (197.13): 32-bit, 64-bit
Developer Drivers for WinVista & Win7 (197.13): 32-bit, 64-bit
Notebook Developer Drivers for WinXP: 32-bit, 64-bit
Notebook Developer Drivers for WinVista & Win7: 32-bit, 64-bit
 

CUDA Toolkit

  • C/C++ compiler
  • CUDA Visual Profiler
  • OpenCL Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • Additional tools and documentation
32-bit
64-bit
Getting Started Guide for Windows
Release Notes 
CUDA C Programming Guide 
CUDA C Best Practices Guide
OpenCL Programming Guide 
OpenCL Best Practices Guide
OpenCL Implementation Notes 
CUDA Reference Manual 
API Reference 
PTX ISA 2.0 
Visual Profiler User Guide 
Visual Profiler Release Notes 
Fermi Compatibility Guide 
Fermi Tuning Guide 
CUBLAS User Guide
CUFFT User Guide 
License  
NVIDIA Performance Primitives (NPP) library: 32-bit, 64-bit
 
CULA: GPU-accelerated LAPACK libraries download more info
NVIDIA Parallel Nsight for Visual Studio   more info
CUDA Fortran from PGI download more info
GPU Computing SDK code samples: 32-bit, 64-bit
Release Notes for CUDA C 
Release Notes for DirectCompute 
Release Notes for OpenCL 
CUDA Occupancy Calculator 
License  
NVIDIA OpenCL Extensions: Compiler_Options, D3D9 Sharing, D3D10 Sharing, D3D11 Sharing, Device Attribute Query, Pragma Unroll

Linux

 

Developer Drivers for Linux (195.36.15): 32-bit, 64-bit
 

CUDA Toolkit

  • C/C++ compiler
  • cuda-gdb debugger
  • CUDA Visual Profiler
  • OpenCL Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • Additional tools and documentation
Getting Started Guide for Linux
Release Notes for Linux 
CUDA C Programming Guide 
CUDA C Best Practices Guide
OpenCL Programming Guide 
OpenCL Best Practices Guide
OpenCL Implementation Notes 
CUDA Reference Manual 
API Reference 
PTX ISA 2.0 
CUDA-GDB User Manual
Profiler User Guide
Visual Profiler Release Notes 
Fermi Compatibility Guide 
Fermi Tuning Guide 
CUBLAS User Guide
CUFFT User Guide 
License
CUDA Toolkit for Fedora 10: 32-bit, 64-bit
CUDA Toolkit for RedHat Enterprise Linux 5.3: 32-bit, 64-bit
CUDA Toolkit for Ubuntu Linux 9.04: 32-bit, 64-bit
CUDA Toolkit for RedHat Enterprise Linux 4.8: 32-bit, 64-bit
CUDA Toolkit for OpenSUSE 11.1: 32-bit, 64-bit
CUDA Toolkit for SUSE Linux Enterprise Desktop 11: 32-bit, 64-bit
NVIDIA Performance Primitives (NPP) library: 32-bit, 64-bit
 
CULA: GPU-accelerated LAPACK libraries download more info
CUDA Fortran from PGI download more info
GPU Computing SDK code samples and more: download
Release Notes for CUDA C
Release Notes for OpenCL 
CUDA Occupancy Calculator 
License  
NVIDIA OpenCL Extensions: Compiler_Options, D3D9 Sharing, D3D10 Sharing, D3D11 Sharing, Device Attribute Query, Pragma Unroll

MacOS

 

Developer Drivers for MacOS download  

CUDA Toolkit

  • C/C++ compiler
  • CUDA Visual Profiler
  • OpenCL Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • Additional tools and documentation
download

Getting Started Guide for Mac
Release Notes for Mac 
CUDA C Programming Guide 
CUDA C Best Practices Guide
OpenCL Programming Guide 
OpenCL Best Practices Guide
OpenCL Implementation Notes 
CUDA Reference Manual 
API Reference 
PTX ISA 2.0 
Visual Profiler User Guide
Visual Profiler Release Notes 
Fermi Compatibility Guide 
Fermi Tuning Guide 
CUBLAS User Guide
CUFFT User Guide 
License

NVIDIA Performance Primitives (NPP) library download  
CULA: GPU-accelerated LAPACK libraries download more info
PGI CUDA Fortran download more info
GPU Computing SDK code samples and more: download
Release Notes for CUDA C
Release Notes for OpenCL 
CUDA Occupancy Calculator 
License