Technical Blog
Tag: large language models
Technical Walkthrough
Aug 03, 2022
Accelerated Inference for Large Transformer Models Using FasterTransformer and Triton Inference Server
This is the first part of a two-part series discussing the NVIDIA FasterTransformer library, one of the fastest libraries for distributed inference of...
10 MIN READ
Technical Walkthrough
Jul 28, 2022
NVIDIA AI Platform Delivers Big Gains for Large Language Models
As the size and complexity of large language models (LLMs) continue to grow, NVIDIA is today announcing updates to the NeMo Megatron framework that provide...
7 MIN READ