Bo Yang Hsueh is a developer technology engineer at NVIDIA. Bo Yang is the leader and a major developer of FasterTransformer. He joined the transformer acceleration effort three years ago. Recently, his focus has been on accelerating giant NLP models, including public models like T5 and GPT-J. Bo Yang received his M.S. in computer science from National Chiao Tung University.