
Advanced Optimizations of Persistent Recurrent Neural Networks

Vasily Volkov, NVIDIA | Jeremy Appleyard, NVIDIA

GTC 2020

Recurrent Neural Networks (RNNs) with small batch sizes tend to be bandwidth-bound when implemented naively. Persisting the majority of the inputs in low-level GPU memory can turn the problem back into a compute-bound one, yielding order-of-magnitude speedups. We'll dive into the methods we used to achieve high performance in cuDNN's persistent RNN implementation, many of which apply to other persistent approaches.
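A back-of-envelope roofline estimate illustrates why small-batch RNN steps are bandwidth-bound and what persistence buys. This sketch is not cuDNN code; the hardware numbers and the `step_times` helper are illustrative assumptions (roughly V100-class figures), not values from the talk.

```python
# Sketch (not cuDNN code): roofline-style estimate showing why a small-batch
# RNN timestep is bandwidth-bound when weights are re-read from DRAM each
# step, and why persisting them on-chip helps. Hardware numbers are assumed.

DRAM_BW = 900e9      # bytes/s, assumed DRAM bandwidth
PEAK_FLOPS = 15e12   # FLOP/s, assumed fp32 peak throughput

def step_times(hidden, batch, bytes_per_elem=4):
    """Per-timestep estimates for the recurrent GEMM h_t = W @ h_{t-1}."""
    weight_bytes = hidden * hidden * bytes_per_elem
    flops = 2.0 * hidden * hidden * batch   # one multiply-add per weight per sample
    t_load = weight_bytes / DRAM_BW         # time to reload W from DRAM
    t_math = flops / PEAK_FLOPS             # time to do the arithmetic
    return t_load, t_math

t_load, t_math = step_times(hidden=1024, batch=4)
# At batch=4 the weight reload dominates the math, so the naive kernel is
# bandwidth-bound. Persisting W in registers/shared memory across timesteps
# removes t_load from every step after the first, leaving t_math.
print(f"load {t_load * 1e6:.2f} us vs math {t_math * 1e6:.3f} us")
```

Under these assumed numbers the weight reload takes several times longer than the arithmetic, which matches the abstract's claim that persistence can shift the kernel from bandwidth-bound back to compute-bound.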
