Technical Blog
Tag: Transformers
Technical Walkthrough
Feb 01, 2023
New cuBLAS 12.0 Features and Matrix Multiplication Performance on NVIDIA Hopper GPUs
The NVIDIA H100 Tensor Core GPU, based on the NVIDIA Hopper architecture with the fourth generation of NVIDIA Tensor Cores, recently debuted delivering...
10 MIN READ
Technical Walkthrough
Sep 12, 2022
Improving Japanese Language ASR by Combining Convolutions with Attention Mechanisms
Automatic speech recognition (ASR) research generally focuses on high-resource languages such as English, which is supported by hundreds of thousands of hours...
5 MIN READ
Technical Walkthrough
Aug 03, 2022
Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server
This is the first part of a two-part series discussing the NVIDIA Triton Inference Server’s FasterTransformer (FT) library, one of the fastest libraries for...
10 MIN READ
Technical Walkthrough
Aug 03, 2022
Deploying GPT-J and T5 with NVIDIA Triton Inference Server
This is the second part of a two-part series about NVIDIA tools that allow you to run large transformer models for accelerated inference. For an introduction to...
16 MIN READ
Technical Walkthrough
Jul 28, 2022
NVIDIA AI Platform Delivers Big Gains for Large Language Models
As the size and complexity of large language models (LLMs) continue to grow, NVIDIA is today announcing updates to the NeMo Megatron framework that provide...
7 MIN READ
News
Jul 27, 2022
Developing NLP Applications for Healthcare
Natural language processing (NLP) can be defined as the combination of artificial intelligence (AI), computer science, and computational linguistics to...
4 MIN READ