Transformers

May 15, 2023
Efficiently Scale LLM Training Across a Large GPU Cluster with Alpa and Ray
Recent years have seen a proliferation of large language models (LLMs) that extend beyond traditional language tasks to generative AI. This includes models like...
16 MIN READ

Feb 01, 2023
New cuBLAS 12.0 Features and Matrix Multiplication Performance on NVIDIA Hopper GPUs
The NVIDIA H100 Tensor Core GPU, based on the NVIDIA Hopper architecture with the fourth generation of NVIDIA Tensor Cores, recently debuted delivering...
10 MIN READ

Sep 14, 2022
NVIDIA, Arm, and Intel Publish FP8 Specification for Standardization as an Interchange Format for AI
AI processing requires full-stack innovation across hardware and software platforms to address the growing computational demands of neural networks. A key area...
4 MIN READ

Sep 12, 2022
Improving Japanese Language ASR by Combining Convolutions with Attention Mechanisms
Automatic speech recognition (ASR) research generally focuses on high-resource languages such as English, which is supported by hundreds of thousands of hours...
5 MIN READ

Aug 03, 2022
Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server
This is the first part of a two-part series discussing the NVIDIA Triton Inference Server’s FasterTransformer (FT) library, one of the fastest libraries for...
10 MIN READ

Aug 03, 2022
Deploying GPT-J and T5 with NVIDIA Triton Inference Server
This is the second part of a two-part series about NVIDIA tools that allow you to run large transformer models for accelerated inference. For an introduction to...
16 MIN READ

Jul 28, 2022
NVIDIA AI Platform Delivers Big Gains for Large Language Models
As the size and complexity of large language models (LLMs) continue to grow, NVIDIA is today announcing updates to the NeMo framework that provide training...
7 MIN READ

Jul 27, 2022
Developing NLP Applications for Healthcare
Natural language processing (NLP) can be defined as the combination of artificial intelligence (AI), computer science, and computational linguistics to...
4 MIN READ

Jul 12, 2022
Adapting P-Tuning to Solve Non-English Downstream Tasks
With the increasing demand for access to pretrained large language model (LLM) weights, the climate around LLM sharing is changing. Recently, Meta released Open...
15 MIN READ

Jun 28, 2022
Transformers4Rec: Building Session-Based Recommendations with an NVIDIA Merlin Library
Recommender systems help you discover new products and make informed decisions. Yet, in many recommendation-dependent domains such as e-commerce, news, and...
8 MIN READ

Jun 22, 2022
Novel Transformer Model Achieves State-of-the-Art Benchmarks in 3D Medical Image Analysis
At the Computer Vision and Pattern Recognition Conference (CVPR), NVIDIA researchers are presenting over 35 papers. This includes work on Shifted WINdows UNEt...
6 MIN READ

May 23, 2022
The Future of Computer Vision
Computer vision is a rapidly growing field in research and applications. Advances in computer vision research are now more directly and immediately applicable...
9 MIN READ

May 09, 2022
Generating Synthetic Data with Transformers: A Solution for Enterprise Data Challenges
Big data, new algorithms, and fast computation are three main factors that make the modern AI revolution possible. However, data poses many challenges for...
8 MIN READ

Nov 09, 2021
Accelerating Multiorgan Rendering for Radiology and Radiation Therapy with NVIDIA Clara Holoscan
Watch NVIDIA founder and CEO Jensen Huang’s GTC keynote address streaming on Nov. 9 and in replay. Tune in to a healthcare special address by Kimberly...
13 MIN READ

May 10, 2021
Enabling Predictive Maintenance Using Root Cause Analysis, NLP, and NVIDIA Morpheus
Predictive maintenance is used for early fault detection, diagnosis, and prediction when maintenance is needed in various industries including oil...
6 MIN READ

Jul 29, 2020
Optimizing NVIDIA AI Performance for MLPerf v0.7 Training
MLPerf is an industry-wide AI consortium that has developed a suite of performance benchmarks covering a range of leading AI workloads that are widely in use...
16 MIN READ