
Pretrained AI Models

Accelerate AI development with production-quality models from the NGC catalog.

What Are Pretrained AI Models?

AI and machine learning models are built on mathematical algorithms and are trained using data and human expertise. These models help us accurately predict outcomes based on input data such as images, text, or language. But building, training, and optimizing production-quality models is expensive, requiring numerous iterations, domain expertise, and countless hours of computation.

Pretrained models have already been trained on representative datasets, so their weights and biases are tuned before you start. You can retrain them on custom data in a fraction of the time it takes to train from scratch.

Explore NGC Models

Pretrained Models from the NGC Catalog

With production-ready pretrained AI models from the NGC™ catalog, NVIDIA’s hub of GPU-optimized AI and high-performance computing (HPC) software, data scientists and developers can quickly adapt models or simply deploy them as is for inference.

Diverse Use Cases

NGC’s state-of-the-art pretrained models and resources cover a wide set of use cases, from computer vision to natural language understanding to speech synthesis. These models use automatic mixed precision (AMP) on Tensor Cores and can scale from single-node to multi-node systems to speed up training and inference.


Domain Adaptable

The NVIDIA TAO Toolkit makes it easy to adapt and fine-tune the pretrained models with your custom data.

The TAO Toolkit abstracts away the complexity of AI and deep learning frameworks and enables you to build production-quality computer vision or conversational AI models in hours rather than months.

Transparent Model Resumes

Just like a resume provides a snapshot of a candidate's skills and experience, model credentials do the same for a model. Many pretrained models include critical parameters such as batch size, training epochs, and accuracy, providing you with the necessary transparency and confidence to pick the right model for your use case.

Deploy AI Models with Confidence with the New Model Credentials Feature


SDK Integration

The pretrained models can be integrated into industry SDKs such as NVIDIA Clara™ for healthcare, NVIDIA Isaac™ for robotics, NVIDIA Riva for conversational AI, and more, making it easier for you to use them in your end-user applications and services.

A Model for Every Use Case

Get started today with models that span diverse use cases, including computer vision, speech, and language understanding.


Computer Vision

With computer vision, devices can understand the world around us through images and videos. Common tasks include image classification, object detection and tracking, object recognition, semantic segmentation, and instance segmentation.

License Plate Detection

LPDNet models detect one or more license plate objects from a car image and return a box around each object, along with an LPD label for each object.

Pull LPDNet Model


People Detection

PeopleNet models detect one or more physical objects from three categories within an image and return a box around each object, along with a category label for each object. The three categories of objects detected are persons, bags, and faces.

Pull PeopleNet Model


ResNet-50

The residual network (ResNet) architecture introduced “skip connections.” The main advantage of these models is their use of residual layers as building blocks, which helps gradients propagate during training.

Explore All ResNet-50
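
To make the skip-connection idea concrete, here is a minimal sketch in plain Python. It is not the ResNet-50 implementation from the NGC catalog: it assumes a single fully connected layer in place of the convolutional layers, and the helper names (`dense`, `relu`, `residual_block`) are hypothetical. The point it illustrates is that the block's output is the input plus a learned correction, so the input always has a direct path to the output.

```python
def dense(x, weights, bias):
    """Fully connected layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x, weights, bias):
    """Return x + F(x), where F is a dense layer followed by ReLU.

    The "+ x" term is the skip connection: even if F contributes
    nothing, the input passes through unchanged.
    """
    fx = relu(dense(x, weights, bias))
    return [xi + fi for xi, fi in zip(x, fx)]


if __name__ == "__main__":
    x = [1.0, -2.0]
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    identity_w = [[1.0, 0.0], [0.0, 1.0]]
    bias = [0.0, 0.0]
    print(residual_block(x, zero_w, bias))      # identity path only
    print(residual_block(x, identity_w, bias))  # input plus learned term
```

With all-zero weights the block returns its input unchanged, which is why stacking many such blocks does not, by itself, block the signal or its gradient.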


SSD

The SSD model is based on the "SSD: Single Shot MultiBox Detector" paper, which describes SSD as "a method for detecting objects in images using a single deep neural network."

Explore All SSD Models

Natural Language Processing

Natural language processing (NLP) uses algorithms and techniques to enable computers to understand, interpret, manipulate, and converse in human languages. It includes sentiment analysis, speech recognition, speech synthesis, language translation, and natural language generation.


BERT

BERT is a transformer-based pretrained language representation model that provides state-of-the-art results on a wide array of NLP tasks, including intent detection and named-entity recognition.

Explore All BERT Models


BioBERT

BioBERT checkpoints and scripts help achieve state-of-the-art results in biomedical text-mining benchmark tasks.

Explore All BioBERT Models

NMT Transformer

This model is based on the Transformer “Big” architecture originally presented in the "Attention Is All You Need" paper by Google. It includes pretrained models for multiple languages.

Explore All NMT Models


Speech

Speech AI deals with transcribing audio into text and synthesizing speech from text. It includes automatic speech recognition (ASR) and text-to-speech (TTS) synthesis.


CitriNet

CitriNet is a QuartzNet variant that uses efficient mechanisms such as subword encoding for highly accurate transcription and non-autoregressive connectionist temporal classification (CTC)-based decoding for efficient inference.

Explore All CitriNet
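
The CTC-based decoding mentioned for CitriNet can be illustrated with a small greedy decoder. This is a sketch under simplifying assumptions, not CitriNet's actual decoder: it uses single characters instead of subword tokens, a hypothetical blank symbol `"_"`, and hand-written per-frame scores. It shows why the decoding is non-autoregressive: each frame's best token is picked independently, then repeats are collapsed and blanks are dropped.

```python
BLANK = "_"  # hypothetical blank symbol used only in this sketch


def ctc_greedy_decode(frame_scores, vocab):
    """Greedy CTC decoding.

    frame_scores: one list of scores per audio frame, aligned with vocab.
    vocab: list of tokens; must contain BLANK.
    """
    # Step 1: pick the highest-scoring token for each frame independently.
    best = [
        vocab[max(range(len(scores)), key=scores.__getitem__)]
        for scores in frame_scores
    ]
    # Step 2: collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for token in best:
        if token != previous and token != BLANK:
            decoded.append(token)
        previous = token
    return "".join(decoded)


if __name__ == "__main__":
    vocab = [BLANK, "a", "b"]
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank separates the two a's
        [0.1, 0.6, 0.3],    # a
        [0.1, 0.2, 0.7],    # b
        [0.2, 0.1, 0.7],    # b (repeat, collapsed)
    ]
    print(ctc_greedy_decode(frames, vocab))
```

The blank frame between the two runs of `a` is what lets the decoder emit a doubled letter instead of merging them.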


QuartzNet

The QuartzNet model is an end-to-end neural acoustic model for ASR based on the Jasper model. It uses separable convolutions and larger filters, making it smaller than Jasper while maintaining comparable accuracy.

Explore All QuartzNet
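
A quick parameter count shows why separable convolutions make a model smaller. The sketch below compares a standard 1D convolution with a depthwise separable one (a depthwise filter per input channel followed by a 1x1 pointwise convolution). The kernel size and channel counts are illustrative assumptions, not QuartzNet's published configuration, and bias terms are ignored.

```python
def standard_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 1D convolution (bias ignored)."""
    return kernel_size * in_channels * out_channels


def separable_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable 1D convolution (bias ignored).

    Depthwise step: one kernel per input channel.
    Pointwise step: a 1x1 convolution that mixes channels.
    """
    depthwise = kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative values only.
    k, c_in, c_out = 33, 256, 256
    standard = standard_conv1d_params(k, c_in, c_out)
    separable = separable_conv1d_params(k, c_in, c_out)
    print(f"standard:  {standard:,} weights")
    print(f"separable: {separable:,} weights")
    print(f"reduction: {standard / separable:.1f}x")
```

Because the kernel size multiplies only the depthwise term, a separable layer can afford larger filters at a small cost in parameters.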


Kaldi

The Kaldi Speech Recognition Toolkit project began in 2009 at Johns Hopkins University and is now the de facto speech recognition toolkit in the community, enabling speech services for millions of people every day.

View Kaldi Model

FastPitch and HiFiGAN

The FastPitch model produces a mel spectrogram from raw text, whereas HiFiGAN generates audio from a mel spectrogram. These models can be combined and trained as an end-to-end pipeline for generating audio from text.

Explore All FastPitch and HiFiGAN Models

Get Started with Sample Training Data for Speech Models

The quality of your training data sets the foundation for your AI applications. To help you customize the pretrained models for your speech application, Defined.AI, an NVIDIA partner, is offering 30 minutes of free sample data. You can access it now through the NGC catalog.


Adapt Models Faster with NVIDIA TAO

NVIDIA Train, Adapt, and Optimize (TAO) is an AI-model-adaptation platform that simplifies and accelerates the creation of enterprise AI applications and services. By fine-tuning pretrained models with custom data through a UI-based, guided workflow, enterprises can produce highly accurate computer vision, speech, and language understanding models in hours rather than months, eliminating the need for large training runs and deep AI expertise.

Learn More

NGC Catalog Resources

Technical Blogs

Learn how to use the NGC catalog with these step-by-step instructions.

Explore technical blogs


News

Read about the latest NGC catalog updates and announcements.

Read news

GTC Sessions

Watch all the top NGC sessions from GTC on demand.

Watch GTC sessions


Webinars

Walk through how to use the NGC catalog with these video tutorials.

Watch webinars

Accelerate your AI development with pretrained models from the NGC catalog.

Get Started