GPU-Accelerated Computing with Python
NVIDIA’s CUDA Python provides a driver and runtime API for existing toolkits and libraries to simplify GPU-based accelerated processing. Python is one of the most popular programming languages for science, engineering, data analytics, and deep learning applications. However, as an interpreted language, it’s been considered too slow for high-performance computing.
Set Up CUDA Python
To run CUDA Python, you’ll need the CUDA Toolkit installed on a system with CUDA-capable GPUs. Use this guide to install CUDA. If you don’t have a CUDA-capable GPU, you can access one of the thousands of GPUs available from cloud service providers, including Amazon AWS, Microsoft Azure, and IBM SoftLayer. The NVIDIA-maintained CUDA Amazon Machine Image (AMI) on AWS, for example, comes pre-installed with CUDA and is available for use today.
NVIDIA AMIs on AWS Download CUDATo get started with Numba, the first step is to download and install the Anaconda Python distribution that includes many popular packages (Numpy, SciPy, Matplotlib, iPython, etc.) and “conda,” a powerful package manager. Once you have Anaconda installed, install the required CUDA packages by typing conda install numba cudatoolkit pyculib.
Getting Started Blogs
Numba: High-Performance Python with CUDA Acceleration
Numba provides Python developers with an easy entry into GPU-accelerated computing and a path for using increasingly sophisticated CUDA code with a minimum of new syntax and jargon.
Numba Tutorial for CUDA
Check out the Numba tutorial for CUDA on the ContinuumIO github repository.
Seven Things You Might Not Know About Numba
Numba runs inside the standard Python interpreter, so you can write CUDA kernels directly in Python syntax and execute them on the GPU. Dive deeper into several aspects of using Numba on the GPU that are often overlooked.
GPU-Accelerated Graph Analytics in Python with Numba
Numba provides Python developers with an easy entry into GPU-accelerated computing and a path for using increasingly sophisticated CUDA code with a minimum of new syntax and jargon.