Megatron-Turing Natural Language Generation

The Megatron-Turing Natural Language Generation model (MT-NLG) is the largest and most powerful monolithic transformer English language model, with 530 billion parameters. This 105-layer, transformer-based model improves upon prior state-of-the-art models in zero-, one-, and few-shot settings. It demonstrates unmatched accuracy across a broad set of natural language tasks, such as completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation.

Training such a large model on the NVIDIA DGX SuperPOD-based Selene supercomputer was made possible by the novel parallelism techniques demonstrated in the paper. You can read more about the model and its accuracy on the NVIDIA Technical Blog.

To accelerate research on the largest English language model to date, and to enable customers to experiment with, employ, and apply such a large language model on downstream language tasks, NVIDIA is pleased to announce an Early Access program for its managed API service to the MT-NLG model.

We invite organizations around the world to apply to this Early Access program and collaborate with NVIDIA on research problems such as how to apply this technology responsibly, and how to detect, prevent, and manage issues such as toxicity, bias, and inappropriate responses that often arise with such large language models.

If you have a research goal and are open to collaborating, please apply using the link below.


Join now