Posts by Fan Yu
Networking / Communications
Feb 02, 2026
Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel
In LLM training, Expert Parallel (EP) communication for hyperscale mixture-of-experts (MoE) models is challenging. EP communication is essentially all-to-all,...
11 MIN READ
Data Science
Aug 31, 2022
Scaling Recommendation System Inference with NVIDIA Merlin Hierarchical Parameter Server
Recommendation systems are widely used today to personalize user experiences and improve customer engagement in various settings like e-commerce, social media,...
11 MIN READ