GTC 2020: rlpyt: A High-Performance, Open-Source Code Base for Reinforcement Learning
Adam Stooke, University of California, Berkeley
We are pleased to share "rlpyt", an open-source, high-throughput code base for reinforcement learning in Python using PyTorch. It contains modular implementations of leading model-free RL algorithms from all three families: deep Q-learning, policy gradient, and Q-value policy gradient (e.g., DQN, PPO, and SAC, respectively). These families have developed along separate lines of research, such that few (if any) other code bases have incorporated all three. We unify their implementations over modular, optimized RL infrastructure code that includes easy-to-use options for multi-CPU and multi-GPU parallelism. rlpyt delivers fast experiments for small- to medium-scale research and development (where large-scale means, for example, Dota with hundreds of GPUs per run). We'll summarize its features, the algorithms implemented, basic usage, and its relation to prior offerings. rlpyt is available at https://github.com/astooke/rlpyt, which includes links to documentation and our white paper.
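The modular decomposition the abstract describes (agent, sampler, and algorithm components composed by a runner) can be illustrated with a minimal toy sketch. This is not the rlpyt API; all class and function names below are invented for illustration, and the "environment" is just a two-armed bandit rather than a real RL task:

```python
import random

# Illustrative sketch (NOT the rlpyt API) of a modular RL decomposition:
# an Agent selects actions, a Sampler collects experience, and an
# Algorithm updates the agent from the sampled batch.

class Agent:
    """Tracks a running value estimate per discrete action."""
    def __init__(self, n_actions):
        self.values = [0.0] * n_actions

    def step(self, obs):
        # Epsilon-greedy action selection (epsilon = 0.1).
        if random.random() < 0.1:
            return random.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

class Sampler:
    """Collects a batch of (action, reward) transitions from an env function."""
    def __init__(self, env_fn, batch_size):
        self.env_fn = env_fn
        self.batch_size = batch_size

    def collect(self, agent):
        return [(a, self.env_fn(a))
                for a in (agent.step(obs=None) for _ in range(self.batch_size))]

class Algorithm:
    """Moves each action's value estimate toward the observed rewards."""
    def __init__(self, lr=0.5):
        self.lr = lr

    def optimize(self, agent, batch):
        for action, reward in batch:
            agent.values[action] += self.lr * (reward - agent.values[action])

def run(env_fn, n_actions, iterations=50, batch_size=8, seed=0):
    # The "runner": alternates sampling and optimization, as in modular RL code bases.
    random.seed(seed)
    agent = Agent(n_actions)
    sampler = Sampler(env_fn, batch_size)
    algo = Algorithm()
    for _ in range(iterations):
        algo.optimize(agent, sampler.collect(agent))
    return agent

# Two-armed bandit: action 1 pays 1.0, action 0 pays 0.2.
agent = run(lambda a: 1.0 if a == 1 else 0.2, n_actions=2)
best = max(range(2), key=lambda a: agent.values[a])
```

Keeping sampling, optimization, and the agent behind separate interfaces is what lets a single infrastructure layer serve all three algorithm families and swap in serial, multi-CPU, or multi-GPU sampling without touching algorithm code.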