Comprehensive GPU function library, including functions for math, signal processing, image processing, statistics, and more. Provides interfaces for C, C++, Fortran, and Python, and integrates with any CUDA application. Contains an array-based API for easy programmability, along with the popular "GFOR" for-loop, which runs all loop iterations simultaneously on the GPU. Designed for use on the full range of systems, from single-GPU systems to large multi-GPU supercomputers. Free for use on a single GPU.

Sign up for Gear Up For Speedup: Delivering 4x Speedups In Science And Engineering Code With ArrayFire

Find out what ArrayFire can do for you. Watch the webinar recording (MP4)

Key Features

  • It contains GPU implementations of hundreds of matrix, signal, and image processing routines that enable it to outperform CPU libraries such as IPP, MKL, Eigen, and Armadillo.
  • It is optimized for any CUDA-enabled GPU. The same code will run on laptops, desktops, or servers.
  • It includes thousands of lines of highly-tuned device code.
  • It performs run-time analysis of your code to increase arithmetic intensity and memory throughput while avoiding unnecessary temporary allocations.
  • It combines and enhances all the best CUDA libraries available, including the fastest FFT, BLAS, and LAPACK implementations.
  • It offers a simple array notation you can learn in minutes.
  • A few lines of ArrayFire code accomplish what would take 10 to 100 times as many lines in raw CUDA.
  • It is easier than templated programming and goes farther than simple directive-based approaches (and outperforms those approaches too).
  • It supports easily scaling to take advantage of multiple GPUs.
  • It can be used in C/C++ applications by itself or integrated with your existing CUDA code.
  • It has hundreds of functions for making your code faster, including arithmetic, linear algebra, statistics, signal processing, image processing, and related algorithms.
  • It supports single- and double-precision floating-point values, complex numbers, and booleans.
  • It supports manipulating vectors, matrices, and N-dimensional arrays.
  • It can execute loop iterations in parallel with gfor.
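
The gfor feature listed above depends on loop iterations being independent of one another. As a rough CPU-side illustration of that property, the sketch below is plain standard C++ and not ArrayFire code; `column_sums` is a hypothetical helper invented for this example. Each pass of the outer loop reads only its own column and writes only its own output slot, which is assumed here to be the kind of loop that a construct like gfor can run as one batched operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ArrayFire): sums each column of a
// rows x cols matrix stored column by column in a flat vector.
std::vector<double> column_sums(const std::vector<double>& m,
                                std::size_t rows, std::size_t cols) {
    // The flat storage must hold exactly rows * cols values.
    assert(m.size() == rows * cols);

    std::vector<double> out(cols, 0.0);
    // Outer loop: iteration c touches only column c and out[c], so no
    // iteration depends on the result of another. That independence is
    // what allows such a loop to be executed in parallel.
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            out[c] += m[c * rows + r];
        }
    }
    return out;
}
```

For a 2 x 3 matrix stored as {1, 2, 3, 4, 5, 6}, the columns are {1, 2}, {3, 4}, and {5, 6}, so the function returns 3, 7, and 11. A loop whose iterations read values written by earlier iterations would not have this property.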

Developers have seen speedups ranging from 2X to 100X, depending on the data-parallelism inherent in the application.
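
One common way to reason about why the achievable speedup depends on the data-parallel share of an application is Amdahl's law. The sketch below is a generic estimate in standard C++, not an ArrayFire function and not a measurement; `estimated_speedup` and its parameter names are hypothetical, and the inputs mentioned afterwards are illustrative only.

```cpp
#include <cassert>

// Hypothetical helper: Amdahl's law estimate of overall speedup.
//   parallel_fraction: share of the original run time that is data-parallel
//                      and can be moved to the GPU (0.0 to 1.0).
//   kernel_speedup:    how much faster that share runs once accelerated.
double estimated_speedup(double parallel_fraction, double kernel_speedup) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(kernel_speedup > 0.0);
    // The serial share is unchanged; the parallel share shrinks by
    // kernel_speedup. Overall speedup is old time divided by new time.
    return 1.0 / ((1.0 - parallel_fraction)
                  + parallel_fraction / kernel_speedup);
}
```

With an assumed 100X faster parallel portion, an application that spends half its time in data-parallel work gains just under 2X overall, while one that spends 99% of its time there gains about 50X.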

Availability

Free version available on the AccelerEyes website.
