Millions of people now share photos, video, music, and spoken words on the web every day. GPU-powered microservices can process this data quickly to deliver great visual experiences and intelligent capabilities based on deep learning.


The NVIDIA GPU REST Engine (GRE) is a critical component for developers building low-latency web services. GRE includes a multi-threaded HTTP server that presents a RESTful web service and schedules requests efficiently across multiple NVIDIA GPUs. Overall response time depends on how much processing each request needs, but GRE itself adds very little overhead: it can process null requests (requests that do no processing) in as little as 10 microseconds.
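
To make the idea of a multi-threaded HTTP server that schedules requests across several GPUs more concrete, here is a minimal Python sketch of the general pattern. It is not GRE's implementation or API: the names `DevicePool`, `process`, `make_handler`, and `serve`, the device IDs, and the port are all hypothetical, and the "GPU work" is a placeholder that only reports which device a request was assigned to. The sketch assumes one in-flight request per device, with each handler thread blocking until a device is free.

```python
import queue
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DevicePool:
    """Hypothetical pool that hands out one device ID per in-flight request."""

    def __init__(self, device_ids):
        self._free = queue.Queue()
        for dev in device_ids:
            self._free.put(dev)

    @contextmanager
    def acquire(self):
        dev = self._free.get()       # blocks until some device is free
        try:
            yield dev
        finally:
            self._free.put(dev)      # hand the device back for the next request


def process(pool, payload):
    """Placeholder for real GPU work: only reports the assigned device."""
    with pool.acquire() as dev:
        return f"device={dev} bytes={len(payload)}"


def make_handler(pool):
    """Build a request handler class bound to the given device pool."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = process(pool, self.rfile.read(length)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass                     # silence per-request logging

    return Handler


def serve(device_ids=(0, 1), port=8000):
    """Start a thread-per-request HTTP server (assumed IDs and port)."""
    pool = DevicePool(device_ids)
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(pool))
    server.serve_forever()
```

Calling `serve()` would start the server locally, and a request such as `curl -X POST --data-binary @photo.jpg http://127.0.0.1:8000/` (with `photo.jpg` standing in for any file) would return the device ID and payload size. The blocking queue keeps the scheduling logic trivial; a production engine would replace `process` with real GPU kernels and likely use a more sophisticated scheduling policy.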