Sharath Sreenivas

Sharath Sreenivas is a Deep Learning Engineer at NVIDIA and is interested in the development and optimization of learning algorithms. He received his M.S. in Computer Science, with a focus on machine learning, from the University of California, Santa Cruz.

Pretraining BERT with Layer-wise Adaptive Learning Rates

Training with larger batches is a straightforward way to scale the training of deep neural networks across more accelerators and reduce training time.

10 MIN READ