The downloads accessible via this portal are confidential and are provided exclusively to Members of the NVIDIA Developer Program. Please do not re-distribute or post the links to these files or the files themselves.

If you are not logged in or do not have the correct membership, you will be prompted to register.

Double Precision Boys Function Implementation

This is a double-precision implementation of the Boys function of orders 0 through 50 for molecular computations with Gaussian basis functions. It requires CUDA 6.0 or later and compute capability 2.0 or higher.
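For orientation, the Boys function of order n is conventionally defined as F_n(x) = integral from 0 to 1 of t^(2n) * exp(-x * t^2) dt. The sketch below is not the code in this download; it is a minimal reference evaluation in Python using the standard ascending series, and the function name `boys` and its parameters are assumptions made for illustration. The series is practical only for small to moderate x, and a production implementation would likely switch methods depending on the argument range.

```python
import math

def boys(n, x, tol=1e-16, max_terms=1000):
    """Reference evaluation of the Boys function F_n(x) for x >= 0.

    Uses the series
        F_n(x) = exp(-x) * sum_{i>=0} (2x)^i / ((2n+1)(2n+3)...(2n+2i+1)),
    whose terms are all positive, so there is no cancellation.
    Convergence is slow for large x; this is a sketch, not tuned code.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    term = 1.0 / (2 * n + 1)          # i = 0 term
    total = term
    for i in range(1, max_terms):
        term *= 2.0 * x / (2 * n + 2 * i + 1)
        total += term
        if term < tol * total:        # remaining terms are negligible
            break
    return math.exp(-x) * total
```

Two checks are useful for a sketch like this: order 0 has the closed form F_0(x) = 0.5 * sqrt(pi / x) * erf(sqrt(x)), and neighbouring orders satisfy (2n + 1) * F_n(x) = 2x * F_(n+1)(x) + exp(-x).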


Maxwell Compute Architecture: Compatibility Guide

The Maxwell Compute Architecture will be the foundation of many of NVIDIA's GPUs. This document discusses code compatibility.


Maxwell Compute Architecture: Tuning Guide

The Maxwell Compute Architecture will be the foundation of many of NVIDIA's GPUs. This document discusses how to tune your code to optimize for this new architecture.


AMGX Solver Library Trial

AMGX is a high-performance, state-of-the-art solver library. It includes a flexible solver composition system that lets users easily construct complex nested solvers and preconditioners.


CUDA SIMD-within-a-word functions

An include file containing a collection of inline functions for processing byte and half-word data packed into 32-bit words. The functions are hardware accelerated on Kepler platforms. Efficient emulation code is provided for earlier platforms, so the functions are portable across all compute capabilities. The functionality should be useful for image processing tasks, among many other application areas.
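To illustrate what software emulation of such a function can look like, here is a minimal Python sketch of a wrapping per-byte addition on four bytes packed into a 32-bit word. It is not taken from the include file: the names `add_bytes` and `add_bytes_reference` are hypothetical, and the masking trick shown is a common textbook technique rather than necessarily the one the download uses.

```python
MASK32 = 0xFFFFFFFF
HIGH_BITS = 0x80808080   # top bit of each byte lane
LOW_BITS = 0x7F7F7F7F    # low seven bits of each byte lane

def add_bytes(a, b):
    """Add four packed unsigned bytes lane by lane, wrapping modulo 256."""
    # Adding only the low 7 bits keeps every carry inside its own lane.
    low = (a & LOW_BITS) + (b & LOW_BITS)
    # The top bit of each lane is a7 XOR b7 XOR carry-in, with no carry-out.
    return (low ^ ((a ^ b) & HIGH_BITS)) & MASK32

def add_bytes_reference(a, b):
    """Straightforward lane-by-lane version, used to check add_bytes."""
    out = 0
    for shift in (0, 8, 16, 24):
        lane = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        out |= (lane & 0xFF) << shift
    return out
```

The point of the packed version is that one word-wide sequence of operations replaces four separate byte operations, which is the benefit hardware acceleration then takes further.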


CUDA Double-Double Precision Arithmetic (Updated April 2013)

Source code implementing arithmetic functions on double-double precision operands, exposed through a simple C-style interface. Functions include:

  • Negation
  • Addition
  • Subtraction
  • Multiplication
  • Division
  • Square Root

Any developer who requires precision beyond double precision will find this download very useful.
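As background on the technique, a double-double value stores a number as an unevaluated sum of two doubles (a high part and a low part), and arithmetic is built from error-free transformations. The Python sketch below shows negation, addition, subtraction and multiplication in that style. It is an illustration only, not the interface of this download: all names (`two_sum`, `dd_add`, and so on) are assumptions, and the addition shown is the simpler, less accurate of the commonly described variants.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def quick_two_sum(a, b):
    """Like two_sum, intended for the case |a| >= |b|."""
    s = a + b
    return s, b - (s - a)

def split(a):
    """Veltkamp split: a == hi + lo, with halves short enough to multiply exactly."""
    c = 134217729.0 * a            # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Return (p, e) with p = fl(a * b) and a * b == p + e (barring overflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, e

def dd_neg(x):
    return (-x[0], -x[1])

def dd_add(x, y):
    """Simple double-double addition; x and y are (hi, lo) pairs."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return quick_two_sum(s, e)

def dd_sub(x, y):
    return dd_add(x, dd_neg(y))

def dd_mul(x, y):
    """Double-double multiplication, dropping the lo * lo term."""
    p, e = two_prod(x[0], y[0])
    e += x[0] * y[1] + x[1] * y[0]
    return quick_two_sum(p, e)
```

For example, `dd_add((1.0, 0.0), (1e-20, 0.0))` keeps the small addend in the low part, whereas the plain double sum `1.0 + 1e-20` rounds back to `1.0`.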


CUDA Accelerated Linpack

Download this code to run GPU-accelerated Linpack on your Tesla cluster. For 64-bit Linux and Fermi-class GPUs.


CUDA Batch Solver (Updated June 2013)

This code provides an efficient solver and matrix inversion for small matrices, using partial pivoting.
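The underlying method can be sketched on the host as follows: each small system is solved independently by Gaussian elimination, and at every step the row with the largest absolute entry in the current column is swapped into the pivot position. This Python sketch is an assumption-laden illustration, not the download's code; `solve_small` and `solve_batch` are hypothetical names, and a GPU version would typically assign each matrix to its own thread or thread block instead of looping.

```python
def solve_small(a, b):
    """Solve a x = b for one small dense system using partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] as floats, so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the column below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def solve_batch(matrices, rhs_vectors):
    """Solve many independent small systems, one after another."""
    return [solve_small(a, b) for a, b in zip(matrices, rhs_vectors)]
```

A matrix such as `[[0, 2], [1, 1]]` shows why pivoting matters: without the row swap, the first elimination step would divide by zero. Inversion can be obtained from the same routine by solving against each column of the identity matrix.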


CUDA Compiler SDK

Use the CUDA Compiler SDK to add GPU support to new languages. Based on LLVM, the SDK provides all the essential files and documentation you need. It is now part of the CUDA Toolkit and is not available as a separate download.
