CUDA Toolkit 4.0

Release Highlights

Easier Application Porting

Share GPUs across multiple threads

Use all GPUs in the system concurrently from a single host thread

No-copy pinning of system memory, a faster alternative to cudaMallocHost()

C++ new/delete and support for virtual functions

Support for inline PTX assembly

Thrust library of templated performance primitives such as sort, reduce, etc.

NVIDIA Performance Primitives (NPP) library for image/video processing

Layered Textures for working with same size/format textures at larger sizes and higher performance
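
As a rough illustration of two of the porting features listed above, the sketch below pins an existing host buffer in place with cudaHostRegister() and then uses Thrust to sort and reduce the data on the GPU. The function name sort_and_sum is hypothetical and not part of the toolkit. The 4.0 release is understood to require the registered pointer and size to be page-aligned, so the sketch treats pinning as optional and falls back to ordinary pageable memory if registration fails.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper: sorts `host` in ascending order on the GPU and
// returns the sum of its elements.
int sort_and_sum(std::vector<int>& host)
{
    if (host.empty()) return 0;

    const std::size_t bytes = host.size() * sizeof(int);

    // Try to pin the existing allocation in place (no staging copy).
    // Registration can fail, for example on an unaligned buffer; in that
    // case we clear the error and continue with pageable memory.
    const bool pinned =
        cudaHostRegister(&host[0], bytes, cudaHostRegisterDefault) == cudaSuccess;
    if (!pinned) cudaGetLastError();

    // Copy to the device, then sort and reduce there with Thrust.
    thrust::device_vector<int> dev(host.begin(), host.end());
    thrust::sort(dev.begin(), dev.end());
    const int sum = thrust::reduce(dev.begin(), dev.end(), 0);

    // Copy the sorted values back into the caller's buffer.
    thrust::copy(dev.begin(), dev.end(), host.begin());

    if (pinned) cudaHostUnregister(&host[0]);
    return sum;
}
```

Calling sort_and_sum on a vector holding 5, 3, 8, 1 leaves it as 1, 3, 5, 8 and returns 17. Whether pinning actually speeds up the transfer depends on the buffer size and the platform, so it is worth measuring before relying on it.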

Faster Multi-GPU Programming

Unified Virtual Addressing

GPUDirect v2.0 support for Peer-to-Peer Communication
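
The multi-GPU features above can be sketched with a few runtime calls: check whether one GPU can address another's memory, enable peer access if so, and copy directly between the two devices from a single host thread. The helper name copy_between_gpus is hypothetical, and error handling is kept minimal. This is one possible arrangement under those assumptions, not the only way to use these calls.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: copies `bytes` from `src` (allocated on device
// `src_dev`) to `dst` (allocated on device `dst_dev`) and returns the
// status of the copy.
cudaError_t copy_between_gpus(void* dst, int dst_dev,
                              const void* src, int src_dev,
                              std::size_t bytes)
{
    int previous = 0;
    cudaGetDevice(&previous);

    // Ask whether dst_dev can directly address memory on src_dev.
    int can_access = 0;
    if (cudaDeviceCanAccessPeer(&can_access, dst_dev, src_dev) != cudaSuccess) {
        can_access = 0;
        cudaGetLastError();  // clear the error and fall back to a staged copy
    }

    if (can_access) {
        // Peer access is enabled from the device that will do the accessing.
        cudaSetDevice(dst_dev);
        cudaError_t status = cudaDeviceEnablePeerAccess(src_dev, 0);
        if (status == cudaErrorPeerAccessAlreadyEnabled) cudaGetLastError();
        cudaSetDevice(previous);
    }

    // cudaMemcpyPeer works with or without peer access; without it the
    // runtime is expected to stage the transfer through host memory.
    return cudaMemcpyPeer(dst, dst_dev, src, src_dev, bytes);
}
```

On systems where Unified Virtual Addressing is available, a plain cudaMemcpy() with cudaMemcpyDefault can also be used for the final copy, since the runtime can infer where each pointer lives.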

New & Improved Developer Tools

Automated Performance Analysis in Visual Profiler

C++ debugging in CUDA-GDB for Linux and MacOS

GPU binary disassembler for Fermi architecture (cuobjdump)

Parallel Nsight 2.0 now available for Windows developers with new debugging and profiling features.

Watch the CUDA Toolkit 4.0 Feature and Overview Webinar (or just the slides) to learn about some of the exciting new features of this release.

Check out the NEW CUDA 4.0 Math Library Performance Review

Find all the latest versions of other Libraries and Tools on our Tools & EcoSystem Page

Please download the latest CUDA Toolkit 4.0 Errata Update.

The latest released NVIDIA Drivers are always available at www.nvidia.com/drivers

For previous releases, see the CUDA Toolkit Release Archive

Get fully trained: check out the latest CUDA Webinars

Become a CUDA Registered Developer, report bugs, engage with NVIDIA engineering

Jump to: [Windows] [Linux] [MacOS]

Windows 7, Vista, Windows XP Downloads

Developer Drivers for WinXP (270.81): 32-bit, 64-bit

Support for XP on notebooks is being phased out and is not available for this release. See Release Notes and Getting Started Guides for more information.

Developer Drivers for WinVista and Win7 (270.81): 32-bit, 64-bit

Notebook Developer Drivers for WinVista and Win7 (275.33): 32-bit, 64-bit

CUDA Toolkit: 32-bit, 64-bit, documentation

C/C++ compiler

Visual Profiler

GPU-accelerated BLAS library

GPU-accelerated FFT library

GPU-accelerated Sparse Matrix library

GPU-accelerated RNG library

Additional tools and documentation

*NEW* CUDA Toolkit 4.0 Build Customization BUG FIX Update: download

Fixes error message "$(CUDABuildTasksPath) property is not valid"

GPU Computing SDK, complete package including all code samples: 32-bit, 64-bit, browse online

Parallel Nsight 2.0: download

Learn about additional tools, libraries, and more: CUDA Ecosystem

CUDA Tools SDK (APIs for 3rd party performance analysis tools and cluster management solutions): 32-bit, 64-bit