Author: Mostofa Patwary | NVIDIA Technical Blog

Mostofa Patwary

Mostofa Patwary is a senior deep learning research scientist on the Applied Deep Learning Research team at NVIDIA. Mostofa's research interests span natural language processing, scalable deep learning, HPC, and algorithm engineering. Prior to joining NVIDIA, Mostofa worked on scaling large language models and the predictability of scaling deep learning applications at Baidu's Silicon Valley AI Lab. Mostofa also made significant contributions to developing large-scale code for several core machine learning kernels capable of running on supercomputers.

Posts by Mostofa Patwary

Conversational AI Aug 08, 2023

Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator

Scaling Language Model Training to a Trillion Parameters Using Megatron

Adding External Knowledge and Controllability to Language Models with Megatron-CNTRL

State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU