Wenwen Gao

Wenwen Gao is a senior product manager for NeMo at NVIDIA, focusing on LLM training framework and microservices. Her past experience include LLM inference (NIM) and recommender systems (Merlin). She holds a B.S. in computer science from the University of Toronto and an M.B.A. from the MIT Sloan School of Management.
Wenwen Gao

Posts by Wenwen Gao

Agentic AI / Generative AI

Accelerating Large-Scale Mixture-of-Experts Training in PyTorch

Training massive mixture-of-experts (MoE) models has long been the domain of a few advanced users with deep infrastructure and distributed-systems expertise.... 7 MIN READ
Agentic AI / Generative AI

Reinforcement Learning with NVIDIA NeMo-RL: Megatron-Core Support for Optimized Training Throughput

The initial release of NVIDIA NeMo-RL included training support through PyTorch DTensor (otherwise known as FSDP2). This backend enables native integration with... 7 MIN READ
Agentic AI / Generative AI

Reinforcement Learning with NVIDIA NeMo-RL: Reproducing a DeepScaleR Recipe Using GRPO

Reinforcement learning (RL) is the backbone of interactive AI. It is fundamental for teaching agents to reason and learn from human preferences, enabling... 5 MIN READ
Agentic AI / Generative AI

Scaling to Millions of Tokens with Efficient Long-Context LLM Training

The evolution of large language models (LLMs) has been marked by significant advancements in their ability to process and generate text. Among these... 7 MIN READ
Agentic AI / Generative AI

Run Hugging Face Models Instantly with Day-0 Support from NVIDIA NeMo Framework

As organizations strive to maximize the value of their generative AI investments, accessing the latest model developments is crucial to continued success. By... 6 MIN READ
A multi-data center illustration.
Data Center / Cloud

Turbocharge LLM Training Across Long-Haul Data Center Networks with NVIDIA Nemo Framework

Multi-data center training is becoming essential for AI factories as pretraining scaling fuels the creation of even larger models, leading the demand for... 6 MIN READ