GTC Silicon Valley 2019, ID: S91039: Amazon Elastic Inference: Reduce Deep Learning Inference Cost (Presented by Amazon Web Services)

Rahul Sharma (Amazon)
Deploying deep learning applications at scale can be cost-prohibitive because hardware acceleration is needed to meet the latency and throughput requirements of inference. Amazon Elastic Inference helps you tackle this problem by reducing the cost of inference by up to 75% with GPU-powered acceleration that can be right-sized to your application's inference needs. In this session, learn how to deploy TensorFlow, Apache MXNet, and ONNX models with Amazon Elastic Inference on Amazon EC2 and Amazon SageMaker.
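
For illustration, a minimal sketch (not taken from the session) of attaching an Elastic Inference accelerator to a SageMaker endpoint using the SageMaker Python SDK as it existed in 2019 (v1.x API); the S3 model path, IAM role, instance sizes, and framework version are placeholder assumptions.

    import sagemaker
    from sagemaker.tensorflow.serving import Model

    # Placeholder values: substitute your own model artifact and execution role.
    role = sagemaker.get_execution_role()
    model = Model(
        model_data="s3://my-bucket/models/resnet50/model.tar.gz",  # hypothetical path
        role=role,
        framework_version="1.13",
    )

    # accelerator_type attaches a right-sized Elastic Inference accelerator
    # (here ml.eia1.medium) to a CPU host instance instead of a full GPU instance.
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
        accelerator_type="ml.eia1.medium",
    )

The same accelerator_type setting applies when deploying MXNet (and ONNX via MXNet) models; on plain Amazon EC2, the accelerator is instead attached when the instance is launched.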