Shang Zhang

Shang Zhang is a Senior AI DevTech Engineer at NVIDIA, specializing in accelerating and deploying deep learning applications on GPUs. His work focuses on optimizing large language model (LLM) systems and developing efficient GPU kernels. He earned his Ph.D. in Physics and Scientific Computing from the University of Michigan, Ann Arbor.

Posts by Shang Zhang

Data Center / Cloud Nov 21, 2024

NVIDIA TensorRT-LLM Multiblock Attention Boosts Throughput by More Than 3x for Long Sequence Lengths on NVIDIA HGX H200

Generative AI models are advancing rapidly. Every generation of models comes with a larger number of parameters and longer context windows. The Llama 2 series...

5 MIN READ