Hybridizer: High-Performance C# on GPUs

Features, .NET, C++, CUDA

Nadeem Mohammad, posted Dec 13 2017

Hybridizer is a compiler from Altimesh that lets you program GPUs and other accelerators from C# code or .NET assemblies.
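As a rough illustration of the programming model the post describes, the sketch below shows the kind of data-parallel C# such a compiler targets: the Parallel.For body is ordinary .NET code whose independent iterations can be mapped onto GPU threads. The mapping to CUDA (and any entry-point attribute or runner wrapper Hybridizer uses) follows Altimesh's samples rather than anything in this teaser, so treat those details as assumptions; as written, the code simply runs on the CPU.

```csharp
using System;
using System.Threading.Tasks;

class VectorAddSample
{
    // Ordinary data-parallel C#: each iteration is independent, which is the
    // pattern a GPU compiler such as Hybridizer can map onto CUDA threads.
    // (In Hybridizer's own samples this method would carry an entry-point
    // attribute and be called through a generated wrapper -- assumed here.)
    public static void Add(int n, double[] a, double[] b)
    {
        Parallel.For(0, n, i => { a[i] += b[i]; });
    }

    static void Main()
    {
        const int N = 1 << 20;
        var a = new double[N];
        var b = new double[N];
        for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2.0 * i; }

        Add(N, a, b);             // plain .NET execution on the CPU here
        Console.WriteLine(a[42]); // 42 + 84 = 126
    }
}
```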

Read more

Fast INT8 Inference for Autonomous Vehicles with TensorRT 3

Features, Autonomous Vehicles, DP4A, Inference, Mixed Precision, TensorRT

Nadeem Mohammad, posted Dec 11 2017

Autonomous driving demands safety and a high-performance computing solution that can process sensor data with extreme accuracy.

Read more

CUTLASS: Fast Linear Algebra in CUDA C++

Features, C++, CUBLAS, CUDA, Deep Learning, Libraries, Linear Algebra

Nadeem Mohammad, posted Dec 05 2017

Matrix multiplication is a key computation within many scientific applications, particularly those in deep learning. Many operations in modern deep neural networks are either defined as matrix multiplications or can be cast as such.
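As a concrete instance of that claim, a fully connected layer is literally a GEMM: for a batch of $N$ input activations of width $K$ and a weight matrix of size $K \times M$,

$$Y = XW + b, \qquad X \in \mathbb{R}^{N \times K},\; W \in \mathbb{R}^{K \times M},\; Y \in \mathbb{R}^{N \times M},$$

so the layer's forward pass reduces to a single $N \times K \times M$ matrix multiplication plus a bias broadcast, and convolutions can likewise be lowered to GEMM via an im2col transformation.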

Read more

RESTful Inference with the TensorRT Container and NVIDIA GPU Cloud

Features, Containers, Docker, Inference, NVIDIA GPU Cloud, REST, TensorRT

Nadeem Mohammad, posted Dec 05 2017

You’ve built, trained, tweaked and tuned your model. Finally, you have a Caffe, ONNX or TensorFlow model that meets your requirements.

Read more

TensorRT 3: Faster TensorFlow Inference and Volta Support

Features, Deep Learning, Inference, TensorFlow, TensorRT, Volta

Nadeem Mohammad, posted Dec 04 2017

NVIDIA TensorRT™ is a high-performance deep learning inference optimizer and runtime that delivers low-latency, high-throughput inference for deep learning applications.

Read more