GTC 2020: Advanced Optimizations of Persistent Recurrent Neural Networks

Vasily Volkov, NVIDIA | Jeremy Appleyard, NVIDIA

GTC 2020

Recurrent Neural Networks (RNNs) with small batch sizes tend to be bandwidth-bound when implemented naively. Persisting the majority of the inputs in low-level GPU memory can turn the problem back into a compute-bound one and yield order-of-magnitude speedups. We'll dive into the methods we use to achieve this performance in cuDNN's persistent RNN implementation, many of which are applicable to other persistent methods.
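
To make the bandwidth-bound claim concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the talk. It assumes a single recurrent matrix multiply per time step (hidden size H, batch size B), assumes a naive implementation re-reads the full H x H weight matrix from off-chip memory at every step, and ignores activation traffic. The function names and the hardware figures in the usage note are illustrative assumptions, not measurements of any specific GPU.

```python
def arithmetic_intensity(hidden_size, batch_size, bytes_per_element=4):
    """FLOPs performed per byte of weight traffic for one recurrent step.

    Assumes one H x H matrix multiplied by an H x B activation block,
    with the weights re-read from off-chip memory every time step.
    """
    flops = 2 * hidden_size * hidden_size * batch_size  # one multiply and one add per weight per sample
    weight_bytes = hidden_size * hidden_size * bytes_per_element
    return flops / weight_bytes


def is_bandwidth_bound(hidden_size, batch_size, peak_flops, peak_bandwidth,
                       bytes_per_element=4):
    """Roofline-style check: compare the step's intensity to the machine balance."""
    machine_balance = peak_flops / peak_bandwidth  # FLOPs the device can do per byte moved
    intensity = arithmetic_intensity(hidden_size, batch_size, bytes_per_element)
    return intensity < machine_balance


if __name__ == "__main__":
    # Hypothetical round numbers: 15 TFLOP/s of compute, 900 GB/s of memory bandwidth.
    for batch in (1, 4, 64):
        print(batch,
              arithmetic_intensity(1024, batch),
              is_bandwidth_bound(1024, batch, 15e12, 900e9))
```

Under these assumptions the intensity reduces to 2B divided by the element size in bytes, so it depends only on the batch size. With small batches it falls well below the machine balance, and the step is limited by how fast the weights can be streamed in. Keeping the weights resident on chip removes that per-step traffic, which is the motivation for the persistent approach.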
