About the author
Mostofa Patwary
Mostofa Patwary is a senior deep learning research scientist on the Applied Deep Learning Research team at NVIDIA. Mostofa's research interests span natural language processing, scalable deep learning, HPC, and algorithm engineering. Prior to joining NVIDIA, Mostofa worked on scaling large language models and the predictability of scaling deep learning applications at Baidu's Silicon Valley AI Lab. Mostofa also made significant contributions to developing large-scale code for several core machine learning kernels capable of running on supercomputers.
Posts by Mostofa Patwary
By Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Raul Puri, Pascale Fung, Anima Anandkumar, and Bryan Catanzaro
By Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro