Posts by Kaiyu Xie
Data Center / Cloud
Oct 20, 2025
Scaling Large MoE Models with Wide Expert Parallelism on NVL72 Rack Scale Systems
Modern AI workloads have moved well beyond single-GPU inference serving. Model parallelism, which efficiently splits computation across many GPUs, is now the...
10 MIN READ