Technical Walkthrough 3

Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server

This is the first part of a two-part series discussing the NVIDIA Triton Inference Server’s FasterTransformer (FT) library, one of the fastest libraries for... 10 MIN READ
Technical Walkthrough 3

Deploying GPT-J and T5 with NVIDIA Triton Inference Server

This is the second part of a two-part series about NVIDIA tools that allow you to run large transformer models for accelerated inference. For an introduction to... 16 MIN READ
Technical Walkthrough 4

Adapting P-Tuning to Solve Non-English Downstream Tasks

With the increasing demand for access to pretrained large language model (LLM) weights, the climate around LLM sharing is changing. Recently, Meta released Open... 15 MIN READ
Technical Walkthrough 1

Generating Synthetic Data with Transformers: A Solution for Enterprise Data Challenges

Big data, new algorithms, and fast computation are three main factors that make the modern AI revolution possible. However, data poses many challenges for... 8 MIN READ
News 2

Insider’s Guide to GTC: Computer Vision, NLP, Recommenders, and Robotics

Looking for different topic areas? Keep an eye out for our other posts! Join us at GTC, March 21-24, to explore the latest technology and research across AI,... 6 MIN READ
News 0

NVIDIA Announces Riva Speech AI and Large Language Modeling Software For Enterprise

NVIDIA recently unveiled new breakthroughs in NVIDIA Riva for speech AI, and NVIDIA NeMo for large-scale language modeling (LLM). Riva is a GPU-accelerated... 3 MIN READ