With the CUDA Toolkit from NVIDIA, you can accelerate your C or C++ code by moving its computationally intensive portions to an NVIDIA GPU. In addition to drop-in library acceleration, you can efficiently access the massively parallel power of a GPU by using a few new syntactic elements and calling functions from the CUDA Runtime API.
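As a rough illustration of those "few new syntactic elements" and Runtime API calls, here is a minimal sketch of a SAXPY computation (y = a*x + y) moved to the GPU. The names `saxpy`, `run_saxpy`, and `CUDA_CHECK` are chosen for this example only and are not part of the toolkit; the block size of 256 threads is likewise an assumption, not a recommendation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Illustrative helper macro (not part of CUDA): abort on any Runtime API error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",          \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// __global__ marks a kernel: a function that runs on the GPU,
// executed once by every thread in the launch.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    // Built-in variables give each thread a unique global index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical host wrapper: fills x with 1 and y with 2,
// computes y = a*x + y on the GPU, and returns y.
std::vector<float> run_saxpy(int n, float a)
{
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Allocate device memory and copy the inputs to the GPU.
    float *d_x = nullptr, *d_y = nullptr;
    CUDA_CHECK(cudaMalloc(&d_x, bytes));
    CUDA_CHECK(cudaMalloc(&d_y, bytes));
    CUDA_CHECK(cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice));

    // The <<<blocks, threads>>> syntax launches the kernel.
    const int threads = 256;                        // assumed block size
    const int blocks = (n + threads - 1) / threads; // round up to cover n
    saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
    CUDA_CHECK(cudaGetLastError());

    // Copying back waits for the kernel to finish before reading d_y.
    CUDA_CHECK(cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_x));
    CUDA_CHECK(cudaFree(d_y));
    return y;
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> y = run_saxpy(n, 2.0f);

    // Every element should equal 2*1 + 2 = 4.
    float max_err = 0.0f;
    for (float v : y) max_err = std::fmax(max_err, std::fabs(v - 4.0f));
    std::printf("max error: %f\n", max_err);
    return 0;
}
```

Saved as, say, `saxpy.cu`, this should build with `nvcc saxpy.cu -o saxpy` on a machine with the toolkit installed and a CUDA-capable GPU. Only the `__global__` qualifier, the built-in index variables, and the `<<<...>>>` launch are new syntax; the rest is ordinary C++ plus Runtime API calls.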
The CUDA Toolkit from NVIDIA is free and includes:
Read through the Introduction to CUDA C/C++ series on Mark Harris’ Parallel Forall blog.
Try CUDA by taking a self-paced lab on nvidia.qwiklab.com. These labs require only a supported web browser and a network that allows Web Sockets. Click here to verify that your network and system support Web Sockets: in the "Web Sockets (Port 80)" section, all check marks should be green.
So, now you’re ready to deploy your application?
You can register today for free access to NVIDIA Tesla K40 GPUs.
Develop your code on the fastest accelerator in the world. Try a Tesla K40 GPU and accelerate your development.
The CUDA Toolkit is a free download from NVIDIA and is supported on Windows, Mac, and most standard Linux distributions.
Starting with CUDA 5.5, CUDA also supports the ARM architecture.
For the host-side code in your application, the nvcc compiler will use your default host compiler.