Transfer Learning Toolkit

Speed up AI training by over 10x and create highly accurate and efficient domain-specific AI models.


Develop like a pro with zero coding.


Get Started


Creating an AI/ML model from scratch to solve a business problem is capital-intensive and time-consuming. Transfer learning is a popular technique that transfers learned features from an existing neural network model to a new one. The NVIDIA Transfer Learning Toolkit (TLT) abstracts away AI/DL framework complexity and lets you build production-quality models on top of pre-trained ones faster, with no coding required.
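
To make the idea concrete, here is a minimal transfer-learning sketch in plain Keras (the framework TLT builds on, per the FAQ below). It is an illustration only, not the TLT workflow itself, which requires no coding; the class count and dataset path are placeholders.

```python
# Minimal transfer-learning sketch (illustrative only; TLT wraps this kind of
# workflow behind spec files and a CLI, with no coding required).
import tensorflow as tf

NUM_CLASSES = 4  # placeholder: number of classes in your own dataset

# 1. Start from a backbone pre-trained on ImageNet and drop its classifier head.
backbone = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False  # freeze the learned features

# 2. Attach a small task-specific head; only this part is trained.
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 3. Fine-tune on a (much smaller) domain-specific dataset.
#    A real pipeline would also apply the backbone's preprocess_input transform.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=(224, 224), batch_size=32)  # placeholder path
model.fit(train_ds, epochs=5)
```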

A toolkit for anyone building AI apps and services, TLT helps reduce the costs associated with large-scale data collection and labeling, and eliminates the burden of training models from the ground up.

With TLT, you can use NVIDIA’s production-quality pre-trained models and deploy them as-is, or apply minimal fine-tuning, for a variety of computer vision and conversational AI use cases.


Easier & Faster Training

Add state-of-the-art AI to your application with zero coding. No AI framework expertise needed.

Highly Accurate AI

Remove barriers and unlock higher network accuracy by using purpose-built pre-trained models.

Greater Throughput

Reduce deployment costs significantly and achieve high-throughput inference.



Optimized Pre-Trained Models For Computer Vision & Conversational AI

Avoid the time-consuming process of creating and optimizing models from scratch, or of adapting unoptimized open-source models, and focus on your solution instead. TLT speeds up engineering efforts by over 10x by using NVIDIA production-quality models to achieve high throughput and accuracy in less time. These AI models are free and readily available for download from NGC.



Pre-trained Models for Common AI Tasks

Computer Vision Pre-Trained Models

You can jumpstart your AI project by using NVIDIA pre-trained models already built for a variety of industry use cases, speeding up your path from proof of concept (PoC) to production. The AI models can be readily used for common computer vision tasks such as counting and detecting people in crowded spaces, detecting and classifying vehicles, license plate detection and recognition at a toll booth, parking management, heart rate monitoring for patients at a healthcare facility, and more.



People Detection

Detect people, bags, and faces in crowded spaces such as transport hubs, improve customer experiences, analyze pedestrian foot traffic, and more.


License Plate Detection & Recognition

Detect and identify vehicle license plates for applications including parking enforcement, automated toll booths, traffic monitoring, and more.


Vehicle Detection & Classification

Detect the type of vehicle or the make and model of cars for smart city applications.

People Detection Models

PeopleNet

Three-class object detection network to detect people, bags, and faces in an image.

View on NGC

PeopleSegNet

One-class instance segmentation network to detect and segment instances of people in an image.

View on NGC

FaceDetect

Detect faces from an image.

View on NGC

FaceDetect-IR

One-class object detection network to detect faces in an infrared (IR) image.

View on NGC

License Plate Detection & Recognition Models

LPDNet

Object detection network to detect license plates in an image of a car.

View on NGC

LPRNet

Recognize characters from an image of a car license plate.

View on NGC

Vehicle Detection & Classification Models

TrafficCamNet

Four-class object detection network to detect cars and other objects in an image.

View on NGC

DashCamNet

Four-class object detection network to detect cars and other objects in an image. This network is targeted at detecting objects from a moving camera.

View on NGC

VehicleMakeNet

Classify cars into one of 20 popular brands: Acura, Audi, BMW, Chevrolet, Chrysler, Dodge, Ford, GMC, Honda, Hyundai, Infiniti, Jeep, Kia, Lexus, Mazda, Mercedes, Nissan, Subaru, Toyota, and Volkswagen.

View on NGC

VehicleTypeNet

Classify a vehicle into one of six types: coupe, sedan, SUV, van, large vehicle, or truck.

View on NGC

Gaze Estimation

Estimate where a person is looking using a 3D line of sight.


Facial Landmark

Detect key landmarks on the face and track them for shape prediction, face localization in the image, and more.


Heart Rate Estimation

Estimate heart rate using computer vision for applications in healthcare and patient monitoring.

Gaze Estimation Models

Gaze Estimation

Detect a person's eye gaze point of regard and gaze vector.

View on NGC

Facial Landmark Models

Facial Landmarks Estimation

Detect fiducial keypoints from an image of a face.

View on NGC

Heart Rate Estimation Models

HeartRateNet

Estimate a person's heart-rate non-invasively from RGB facial videos.

View on NGC

Human Gestures and Emotion

Computer vision tasks for detecting various hand gestures and emotions.


Segmentation

Identify each instance of multiple objects in a frame at the pixel level.


Text Recognition

Recognize text from an image.

Human Gestures and Emotion Models

EmotionNet

Network to classify emotions from an image of a face.

View on NGC

GestureNet

Classify gestures from cropped images of hands.

View on NGC

Segmentation Models

Instance Segmentation - MaskRCNN

Produce bounding boxes and pixel-level segmentation masks for each detected object.

View on NGC

Semantic Segmentation - UNET

Perform image classification at the pixel level, assigning every pixel in an image to a class label. All instances of a class share the same label.

View on NGC

PeopleSegNet

One-class instance segmentation network to detect and segment instances of people in an image.

View on NGC

Text Recognition Models

Text Recognition

Recognize characters from an image of a car license plate.

View on NGC

Object Detection

Detect one or multiple objects in a frame and place bounding boxes around each object.


Image Classification

Easily classify images into designated classes based on the image features. Supported network architectures: ResNet, GoogLeNet, EfficientNet, VGG, DarkNet, MobileNet and CSPDarkNet.

Object Detection Models

DetectNet_v2

DetectNet_v2 is an NVIDIA-optimized object detection architecture designed for high performance.

View on NGC

YOLOv3, YOLOv4, FasterRCNN, SSD/DSSD, RetinaNet

Open model architectures optimized for performance on NVIDIA GPUs.

View on NGC

Image Classification Models

Easily classify images into designated classes based on the image features. Supported network architectures: ResNet, GoogLeNet, EfficientNet, VGG, DarkNet, MobileNet and CSPDarkNet.

View on NGC

Achieve State-of-the-art Accuracy Using Model Architectures

With TLT, you have full flexibility: bring your own data and fine-tune a model for a specific use case using 100+ permutations of neural network architectures such as ResNet, VGG, FasterRCNN, RetinaNet, and YOLOv3/v4, or use one of NVIDIA’s multi-purpose, production-quality models for common AI tasks instead of going through the hassle of training from scratch. The supported tasks, architectures, and backbones are summarized below.

Image Classification
Object Detection: DetectNet_V2, FasterRCNN, SSD, YOLOV3, YOLOV4, RetinaNet, DSSD
Segmentation: MaskRCNN (instance), UNET (semantic)

Backbones: ResNet 10/18/34/50/101, VGG16/19, GoogLeNet, MobileNet V1/V2, SqueezeNet, DarkNet 19/53, CSPDarkNet 19/53, EfficientNet

TLT adapts popular network architectures and backbones to your data, allowing you to train, fine-tune, prune, and export highly optimized and accurate AI models for high-throughput inference.
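
As a rough sketch of what those permutations mean in code (plain Keras again, not TLT's spec-file mechanism), the same task head can sit on any of several interchangeable backbones, with the backbone reduced to a single parameter:

```python
# Illustrative sketch: one classification head, interchangeable backbones.
import tensorflow as tf

BACKBONES = {
    "resnet50": tf.keras.applications.ResNet50,
    "mobilenet_v2": tf.keras.applications.MobileNetV2,
    "efficientnet_b0": tf.keras.applications.EfficientNetB0,
    "vgg16": tf.keras.applications.VGG16,
}

def build_classifier(backbone_name: str, num_classes: int) -> tf.keras.Model:
    """Attach the same classification head to whichever backbone is requested."""
    backbone = BACKBONES[backbone_name](
        include_top=False, weights="imagenet",
        input_shape=(224, 224, 3), pooling="avg")
    return tf.keras.Sequential([
        backbone,
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

model = build_classifier("mobilenet_v2", num_classes=6)  # e.g. six vehicle types
```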



Deploy State-of-the-Art AI Models

Faster Inference Using Model Pruning & Quantization-Aware Training





Companies building AI solutions need highly accurate AI models that can make predictions efficiently and run fast inference within tight memory constraints. In many computer vision use cases, unpruned AI models are not optimized for low-power devices. If you are solving a problem with a limited dataset, transfer learning combined with selective pruning improves channel density for high-throughput inference.

Learn More
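
TLT's built-in pruning step works at the channel level (hence the note on channel density above) and requires no code. Purely as an illustration of the general train-prune-retrain pattern, the sketch below uses TensorFlow's Model Optimization Toolkit, which prunes individual weights by magnitude rather than whole channels:

```python
# General pruning pattern (sketch): train, prune, fine-tune, then strip wrappers.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... train the dense (unpruned) model here ...

# Wrap the trained model so that 50% of its weights are gradually zeroed out.
pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.5, begin_step=0, end_step=1000)
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model, pruning_schedule=pruning_schedule)
pruned.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Fine-tune while pruning; UpdatePruningStep advances the sparsity schedule.
# (train_ds is a placeholder dataset.)
# pruned.fit(train_ds, epochs=2,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Strip the pruning wrappers before exporting the slimmer model for inference.
final_model = tfmot.sparsity.keras.strip_pruning(pruned)
```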


AI models are typically more compute-efficient when executed at lower precision. INT8 inference is significantly faster than floating-point inference, but quantizing FP32/FP16 weights to INT8 after training can reduce model accuracy due to quantization errors. With the Quantization-Aware Training (QAT) feature of TLT, quantizing weights during the training step produces accuracy comparable to FP16/FP32 models, unlike post-training quantization. With QAT in TLT, developers can achieve up to 2x inference speedup using INT8 precision while maintaining accuracy comparable to FP16.

Learn More
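
The core QAT idea (simulating quantization error during training so the weights adapt to it) can be sketched with TensorFlow's Model Optimization Toolkit; TLT applies the same principle inside its own training step and then deploys the result at INT8 through TensorRT, so treat this only as an illustration:

```python
# Quantization-aware training sketch (illustration only): insert fake-quantization
# ops so the model learns weights that survive conversion to INT8.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

base_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Wrap the model with quantization-aware layers; training now "sees" the
# rounding error it will encounter at INT8 inference time.
qat_model = tfmot.quantization.keras.quantize_model(base_model)
qat_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Fine-tune briefly with quantization simulated (train_ds is a placeholder).
# qat_model.fit(train_ds, epochs=1)
```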

Inference performance of pre-trained models across Jetson Nano, Jetson Xavier NX, Jetson AGX Xavier, T4, and A100 (FPS):

| Model Architecture | Inference Resolution | Precision | Model Accuracy | Nano GPU (FPS)* | Xavier NX GPU (FPS) | Xavier NX DLA1 (FPS) | Xavier NX DLA2 (FPS) | AGX Xavier GPU (FPS) | AGX Xavier DLA1 (FPS) | AGX Xavier DLA2 (FPS) | T4 GPU (FPS) | A100 GPU (FPS) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PeopleNet-ResNet34 | 960x544x3 | INT8 | 84% | 11 | 182 | 58 | 58 | 314 | 75 | 75 | 1043 | 6001 |
| TrafficCamNet | 960x544x3 | INT8 | 84% | 19 | 264 | 105 | 105 | 478 | 140 | 140 | 1703 | 10054 |
| License Plate Detection | 640x480x3 | INT8 | 98% | 66 | 784 | 194 | 194 | 1370 | 256 | 256 | 5921 | 21931 |
| Facial Landmark | 80x80x1 | FP16 | 6.1 pixel error | 128 | 769 | - | - | 1462 | - | - | 4795 | 23117 |
| GazeNet | 224x224x1, 224x224x1, 224x224x1, 25x25x1 | FP16 | 6.5 RMSE | 104 | 927 | - | - | 1654 | - | - | 5219 | 26534 |
| GestureNet | 160x160x3 | FP16 | 0.85 F1 Score | 96 | 993 | - | - | 1646 | - | - | 5660 | 34086 |

Unlock peak inference performance with NVIDIA pre-trained models across NVIDIA platforms: Jetson Nano, Xavier NX, AGX Xavier, T4, and Ampere A100 GPUs. For details on batch size and other models, see the detailed performance datasheet.



Powerful End-to-End Vision AI Pipeline Using DeepStream SDK







Build end-to-end services and solutions that transform pixels and sensor data into actionable insights using the DeepStream SDK and the Transfer Learning Toolkit. The production-ready AI models produced by TLT integrate easily with the NVIDIA DeepStream SDK and TensorRT for high-throughput inference, unlocking greater performance for a variety of applications including smart cities and hospitals, industrial inspection, logistics, traffic monitoring, retail analytics, and more.

Learn More

Unlock the highest stream density and deploy at scale using the DeepStream SDK

Conversational AI Pre-Trained Models

TLT for conversational AI includes support for Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) use cases. You can easily design personalized real-time call center experiences, smart kiosks, and high-quality services for intent recognition, entity recognition, sentiment analysis, and more using readily available pre-trained models from NGC.


Speech Recognition (ASR)

Automatic speech recognition (ASR) takes human voice as input and converts it into readable text.


Natural Language Processing (NLP)

Natural language understanding (NLU) takes text as input, understands context and intent, and uses it to generate an intelligent response.

Speech Recognition (ASR) Models

Jasper

An end-to-end neural automatic speech recognition (ASR) model that transcribes segments of audio to text.

View on NGC

QuartzNet

An end-to-end neural automatic speech recognition (ASR) model that transcribes segments of audio to text.

View on NGC

Natural Language Processing (NLP) Models

BERT Text Classification

This model classifies documents into predefined categories.

View on NGC

BERT NER

Takes a piece of text as input and, for each word in the text, identifies the category the word belongs to.

View on NGC

BERT Punctuation

Predicts a punctuation mark that should follow the word (if any) and predicts if the word should be capitalized or not.

View on NGC

BERT Intent and Slot

Classifies the intent of a query and detects all relevant slots (entities) for that intent.

View on NGC

Question Answering BERT Large

BERT Large uncased model for extractive question answering on any provided content.

View on NGC

Question Answering BERT Base

BERT Base uncased model for extractive question answering on any provided content.

View on NGC


Deploy State-of-the-Art Conversational AI Models

Powerful End-to-End AI Pipeline Using JARVIS




Jarvis is a fully accelerated application framework for building and deploying multimodal conversational AI services that use state-of-the-art deep learning models in end-to-end pipelines. Enterprise developers can easily fine-tune state-of-the-art models on their own data using the Transfer Learning Toolkit to achieve a deeper understanding of their specific context. Using optimized pre-trained models and transfer learning in Jarvis, you can train and deploy applications with just one-tenth of the data required by non-transfer-learning approaches.

Learn More

Train and deploy an end-to-end conversational AI pipeline using pre-trained models, TLT, and Jarvis






Testimonials


“INEX RoadView, our comprehensive automatic license plate recognition system for toll roads, uses NVIDIA’s end-to-end vision AI pipeline, production ready AI models, TLT, and DeepStream SDK. Our engineering team not only slashed the development time by 60% but they also reduced the camera hardware cost by 40% using Jetson Nano and Xavier NX. This enabled our vendors to deploy RoadView, the only out of the box ALPR solution, quickly and reliably. For us, nothing else came close.”


INEX

"KION Group is working on robust AI-based distribution autonomy solutions across its brands, to address operational needs and logistics optimization challenges and greatly reduce flow exception events. Innovation, engineering and digital transformation services are benefiting from optimized NVIDIA pre-trained models while rapidly innovating and fine-tuning models on the fly using Transfer Learning Toolkit and deploying with Nvidia Deepstream unlocking multi-stream density with Jetson platforms."


KION

"At Quantiphi, we use NVIDIA SDKs to build real-time video analytics workflows for many of our Fortune 500 customers across Retail and Media & Entertainment. Transfer Learning Toolkit provides an efficient way to customize training and model pruning for faster edge inference. DeepStream allows us to build high throughput inference pipelines on the Cloud and easily port them to the Jetson NX devices."


Quantiphi

"We are enabling developers and third-party vendors to readily build intelligent AI apps leveraging Optra’s skills marketplace. As a new entrant to the Edge AI market, being able to differentiate our offerings and time to market was crucial. Readily available MaskRCNN from TLT and easy integration into DeepStream saved 25% development effort right out of the box for our R&D team."


Lexmark Ventures

"Using NVIDIA’S TLT made training a real time car detector and license plate detector easy. It eliminated our need to build models from the ground up, resulting in faster development of models and ability to explore options."


Booz Allen Hamilton

"SmartCow is building turnkey AIoT solutions to optimize turnaround time at ports and dry docks. By using TLT, we were able to reduce the training iterations by 9x and reduce the data collection and labeling effort by 5x which significantly reduces our training cost by 2x"

SmartCow



General FAQ

Are the pre-trained models free to use commercially?
Yes, TLT models are free for commercial use. For specific licensing terms, refer to the model EULA.

Which deep learning frameworks does TLT use?
TLT uses the TensorFlow and Keras frameworks, completely abstracted away from the user. Users operate TLT through documented spec files and do not have to learn a DL framework.

How do I get started with TLT?
Pull the TLT container from NGC. The container comes pre-packaged with Jupyter notebooks and sample spec files for various network architectures. Additional technical resources can be found here.

Can I bring third-party pre-trained models into TLT?
No third-party pre-trained models are supported by TLT. Only NVIDIA pre-trained models from NGC are currently supported.

What hardware can I train and deploy on?
Training with TLT is supported only on x86 systems with an NVIDIA GPU such as a V100. Models trained with TLT can be deployed on any NVIDIA platform, including Jetson.

How do I deploy TLT models with DeepStream?
To deploy trained models on DeepStream, refer to the Deploying to DeepStream chapter of the TLT Getting Started guide.

Do I need to re-train the models before using them?
The purpose-built models can be used as-is, out of the box, and can also be re-trained with your dataset. The architecture-specific models for detection, segmentation, and classification must be re-trained with TLT.

Latest Product News


Developer Tutorial

Learn how to train state-of-the-art models for classification and object detection.

Read Blog

Developer Tutorial

Learn how to create a real-time number plate detection and recognition app.

Read Blog

Developer Webinar

Learn how to create a gesture recognition application with robot interactions.

Watch Now

Community Projects

Learn something new or build your own project. See projects built by our developer community.

Submit A Project