NVIDIA Transfer Learning Toolkit

Create accurate and efficient AI models for Intelligent Video Analytics and Computer Vision without expertise in AI frameworks. Develop like a pro with zero coding.


Get Started


Transfer Learning Toolkit (TLT) is a Python-based AI toolkit for taking purpose-built, pre-trained AI models and customizing them with your own data. Transfer learning transfers learned features from an existing neural network to a new one, and is often used when creating a large training dataset is not feasible. Developers, researchers, and software partners building intelligent vision AI apps and services can bring their own data to fine-tune pre-trained models instead of going through the hassle of training from scratch.
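For readers curious what transfer learning looks like in code (TLT itself requires none), here is a minimal, hypothetical Keras sketch of the general idea: a backbone pre-trained on ImageNet is frozen and reused as a feature extractor, and only a small new head is trained on custom data. The class count and dataset are placeholders, not part of any TLT workflow.

```python
# Illustrative sketch of transfer learning with Keras (TLT itself requires no coding).
# The class count and dataset below are placeholders.
import tensorflow as tf

# Reuse a network pre-trained on ImageNet as a feature extractor.
backbone = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                          input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False  # freeze learned features; only the new head is trained

# Attach a new classification head for the custom task (e.g., 6 vehicle types).
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(6, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(custom_train_dataset, epochs=10)  # fine-tune on your own labeled data
```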


Easier & Faster Training

Add state-of-the-art AI to your application with zero coding. No AI framework expertise needed.

Highly Accurate AI

Remove barriers and unlock higher network accuracy by using purpose-built pre-trained models

Greater Throughput

Reduce deployment costs significantly and perform high throughput inference with DeepStream SDK and TLT



Faster Time to Market

Creating and training DNNs for a specific use case, such as building occupancy analytics, traffic monitoring, parking management, license plate recognition, or anomaly detection, involves gathering and preparing enormous amounts of data and ensuring that the resulting models are highly accurate.

Skip the time-consuming process of building and optimizing models from scratch, or of adapting unoptimized open-source models, and focus on your solution instead. By starting from NVIDIA's purpose-built models, TLT can reduce engineering effort from roughly 80 weeks to about 8 weeks while delivering higher throughput and accuracy. Deploying the resulting vision AI application with DeepStream unlocks greater stream density and lets you deploy at scale.

TLT's simple command-line interface (CLI) abstracts away the AI framework complexity, enabling you to build production-quality models from pre-trained ones with a zero-coding approach.
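As a rough sketch of that workflow, the script below drives the documented tlt-train, tlt-prune, and tlt-export commands from Python. The spec file, output paths, encryption key, and some flag names are illustrative assumptions and may differ between TLT versions; the TLT Getting Started guide is the authoritative reference for exact usage.

```python
# Hypothetical sketch of the TLT train -> prune -> retrain -> export loop, driven
# from Python for convenience. Paths, the spec file, and the KEY value are
# placeholders; flag names may differ slightly between TLT versions.
import subprocess

KEY = "your_ngc_api_key"        # encryption key used with the model downloaded from NGC
SPEC = "detectnet_v2_spec.txt"  # experiment spec file shipped with the TLT container

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Fine-tune a purpose-built model on your own data.
run(["tlt-train", "detectnet_v2", "-e", SPEC, "-r", "results/unpruned", "-k", KEY])

# 2. Prune the trained model to shrink it for edge deployment.
run(["tlt-prune", "-m", "results/unpruned/model.tlt",
     "-o", "results/pruned/model_pruned.tlt", "-k", KEY])

# 3. Retrain the pruned model to recover accuracy, then export for DeepStream/TensorRT.
run(["tlt-train", "detectnet_v2", "-e", SPEC, "-r", "results/retrained", "-k", KEY])
run(["tlt-export", "detectnet_v2", "-m", "results/retrained/model.tlt",
     "-o", "results/export/model.etlt", "-k", KEY])
```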



AI Models for Classification, Detection & Segmentation

TLT adapts popular network architectures and backbones to your data, allowing you to train, fine-tune, prune, and export highly optimized and accurate AI models for edge deployment.

Image Classification (backbones): ResNet 10/18/34/50/101, VGG16/19, GoogLeNet, MobileNet V1/V2, DarkNet 19/53, SqueezeNet
Object Detection: DetectNet_V2, FasterRCNN, SSD, YOLOV3, RetinaNet, DSSD
Instance Segmentation: MaskRCNN


Production-Quality Pre-Trained Models

Pre-trained models accelerate the AI training process and reduce costs associated with large scale data collection, labeling, and training models from scratch. NVIDIA’s purpose-built pre-trained models are production quality, highly accurate and can be used for various use-cases such as counting people, detecting vehicles, optimizing traffic, parking management, warehouse operations and more.

Model          | Network Architecture  | Accuracy
---------------|-----------------------|---------
DashCamNet     | DetectNet_v2-ResNet18 | 80%
FaceDetect-IR  | DetectNet_v2-ResNet18 | 96%
PeopleNet      | DetectNet_v2-ResNet34 | 84%
TrafficCamNet  | DetectNet_v2-ResNet18 | 83.5%
VehicleMakeNet | ResNet18              | 91%
VehicleTypeNet | ResNet18              | 96%

Demos: Smart City AI Models


This sample video shows PeopleNet in action

PeopleNet

Identify foot traffic to improve operations in retail stores, malls, and mass-transit locations. PeopleNet is a 3-class object detection network trained on 960x544 RGB images to detect person, bag, and face.

Get PeopleNet from NGC



TrafficCamNet

Understand traffic flow around intersections and optimize traffic during congestion with TrafficCamNet, a 4-class object detection network trained on 960x544 RGB images to detect cars, two-wheelers, persons, and road signs.

Get TrafficCamNet from NGC

VehicleMakeNet

VehicleMakeNet is a car-classification model for common smart city applications such as traffic intersections, secure entry, mall parking management, and toll booth monitoring. The classification network is based on ResNet18 and classifies 224x224 car crops into 20 car makes, including Acura, Audi, BMW, Chevrolet, Chrysler, Dodge, Ford, GMC, Honda, Hyundai, Infiniti, Jeep, Kia, Lexus, Mazda, Mercedes, Nissan, Subaru, Toyota, and Volkswagen. VehicleMakeNet can be pipelined with DashCamNet or TrafficCamNet for smart city applications.

Get VehicleMakeNet from NGC


This sample video shows TrafficCamNet and VehicleMakeNet in action




This sample video shows DashCamNet and VehicleTypeNet in action

DashCamNet

A 4-class object detection network built on NVIDIA's detectnet_v2 architecture with ResNet18 as the backbone feature extractor. It is trained on 960x544 RGB images to detect cars, pedestrians, traffic signs, and two-wheelers.

Get DashCamNet from NGC

VehicleTypeNet

VehicleTypeNet is a classification network based on ResNet18 that classifies 224x224 car crops into 6 classes: Coupe, LargeVehicle, Sedan, SUV, Truck, and Van. It can be pipelined with DashCamNet or TrafficCamNet for smart city applications.

Get VehicleTypeNet from NGC


FaceDetect-IR

This is a single-class face detection network built on NVIDIA's detectnet_v2 architecture with ResNet18 as the backbone feature extractor. The model is trained on 384x240x3 IR images augmented with synthetic noise, targeting use cases where the person's face is close to the camera, such as a laptop camera during video conferencing or a camera placed inside a car observing the driver or passengers. With infrared illuminators, this model continues to work even when visible-light conditions are too dark for normal color cameras.

Get FaceDetect-IR from NGC


There are 25+ pre-trained models for object detection, image classification, and instance segmentation available on NVIDIA NGC.

See All Models



Greater Throughput & Highest Accuracy for Vision AI

To reduce development effort and increase throughput, developers can use highly accurate pre-trained models from TLT and deploy them with DeepStream SDK. The following table shows end-to-end inference performance on a 1080p/30fps input stream. Note that running on the DLAs of Jetson Xavier NX and Jetson AGX Xavier frees up the GPU for other tasks.


All throughput values are in FPS.

Model Architecture        | Inference Resolution | Precision | Model Accuracy | Jetson Nano GPU* | Xavier NX GPU | Xavier NX DLA1 | Xavier NX DLA2 | AGX Xavier GPU | AGX Xavier DLA1 | AGX Xavier DLA2 | T4 GPU
--------------------------|----------------------|-----------|----------------|------------------|---------------|----------------|----------------|----------------|-----------------|-----------------|-------
PeopleNet-ResNet18        | 960x544              | INT8      | 80%            | 14               | 218           | 72             | 72             | 384            | 94              | 94              | 1105
PeopleNet-ResNet34        | 960x544              | INT8      | 84%            | 10               | 157           | 51             | 51             | 272            | 67              | 67              | 807
TrafficCamNet-ResNet18    | 960x544              | INT8      | 84%            | 19               | 261           | 105            | 105            | 464            | 140             | 140             | 1300
DashCamNet-ResNet18       | 960x544              | INT8      | 80%            | 18               | 252           | 102            | 102            | 442            | 133             | 133             | 1280
FaceDetect-IR-ResNet18    | 384x240              | INT8      | 96%            | 95               | 1188          | 570            | 570            | 2006           | 750             | 750             | 2520
VehicleTypeNet-ResNet18 ⊺ | 224x224              | INT8      | 96%            | 120              | 1333          | 678            | 678            | 3047           | 906             | 906             | 11918
VehicleMakeNet-ResNet18 ⊺ | 224x224              | INT8      | 91%            | 173              | 1871          | 700            | 700            | 3855           | 945             | 945             | 15743

Greater end-to-end throughput using Transfer Learning Toolkit and DeepStream SDK
* FP16 inference on Jetson Nano
⊺ Throughput measured using trtexec and does not reflect end-to-end performance
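For the standalone classifier numbers marked ⊺, a measurement of this kind can be reproduced roughly as sketched below, by pointing trtexec at a pre-built TensorRT engine. The engine file name and batch size are placeholders, and available trtexec flags vary with the TensorRT version.

```python
# Hypothetical sketch: benchmarking a pre-built TensorRT engine with trtexec,
# roughly how standalone classifier throughput (⊺ above) can be measured.
# The engine path and batch size are placeholders.
import subprocess

subprocess.run([
    "trtexec",
    "--loadEngine=vehicletypenet_resnet18_int8.engine",  # engine built beforehand (e.g., via tlt-converter)
    "--batch=32",
    "--iterations=100",
], check=True)
```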




Why Use Transfer Learning Toolkit?


Robust AI Using Quantization-Aware Training

Representing AI models in lower precision makes them more compute-efficient: INT8 inference is significantly faster than floating-point inference. However, quantizing FP32/FP16 weights to INT8 after training can reduce model accuracy due to quantization errors. TLT's quantization-aware training feature quantizes weights during the training step, producing INT8 models whose accuracy is comparable to FP16/FP32 models rather than suffering the loss of post-training quantization. With Quantization-Aware Training (QAT) in TLT, developers can achieve up to 2X inference speedup while maintaining accuracy comparable to FP16.
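As a generic illustration of the idea (not TLT's internal mechanism, which is enabled from the training spec file), the sketch below uses TensorFlow's fake-quantization op so the forward pass sees the same INT8 rounding error during training that the deployed model will see at inference time:

```python
# Conceptual sketch of quantization-aware training: simulate INT8 rounding of
# weights during the forward pass so the network learns to tolerate it.
# This is a generic TensorFlow illustration, not TLT's actual implementation.
import tensorflow as tf

def fake_quantize(weights, num_bits=8):
    """Quantize-dequantize weights to emulate INT8 precision at train time."""
    w_min = tf.reduce_min(weights)
    w_max = tf.reduce_max(weights)
    return tf.quantization.fake_quant_with_min_max_vars(
        weights, min=w_min, max=w_max, num_bits=num_bits)

w = tf.Variable(tf.random.normal([3, 3, 64, 64]))  # e.g., a conv kernel
w_q = fake_quantize(w)  # use w_q in the forward pass; gradients flow back to w
```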

Learn More





Powerful End-to-End AI Systems

Build end-to-end services and solutions for transforming pixels and sensor data to actionable insights using DeepStream SDK and Transfer Learning Toolkit.

The production-ready AI models produced by TLT integrate easily with NVIDIA DeepStream and TensorRT for high-throughput inference, unlocking performance for a variety of applications including smart cities and hospitals, industrial inspection, logistics, traffic monitoring, retail analytics, and more.

TLT pruning reduces model size, increasing the stream (channel) density achievable for high-throughput inference.
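The sketch below illustrates the general idea behind channel pruning using a common magnitude-based criterion. It is an illustration under assumptions, not TLT's actual pruning algorithm, which is invoked through the tlt-prune command.

```python
# Conceptual sketch of channel pruning: rank a conv layer's output channels by
# weight magnitude and mark the weakest ones for removal. Generic illustration,
# not TLT's exact pruning algorithm.
import numpy as np

def weak_channels(conv_kernel, threshold=0.1):
    """conv_kernel: array of shape (kh, kw, in_ch, out_ch).
    Returns indices of output channels whose normalized L2 norm falls below threshold."""
    norms = np.sqrt((conv_kernel ** 2).sum(axis=(0, 1, 2)))  # one norm per output channel
    norms = norms / norms.max()
    return np.where(norms < threshold)[0]

kernel = np.random.randn(3, 3, 64, 128)     # hypothetical 3x3 conv with 128 output channels
print(weak_channels(kernel, threshold=0.1))  # channels a pruning pass might drop
```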





Testimonials


Using NVIDIA's TLT made training a real-time car detector and license plate detector easy. It eliminated our need to build models from the ground up, resulting in faster development of models and the ability to explore options.


Booz Allen Hamilton

SmartCow is building turnkey AIoT solutions to optimize turnaround time at ports and dry docks. By using TLT, we were able to reduce training iterations by 9x and reduce data collection and labeling effort by 5x, which significantly reduces our training cost by 2x.

SmartCow


General FAQ

Are TLT models free for commercial use?
Yes, TLT models are free for commercial use. For specific licensing terms, refer to the model EULA.

Which AI framework does TLT use?
TLT uses the TensorFlow and Keras frameworks, completely abstracted away from the user. Users operate TLT through documented spec files and do not have to learn the underlying DL framework.

How do I get started with TLT?
Pull the TLT container from NGC. The container comes pre-packaged with Jupyter notebooks and sample spec files for various network architectures. Additional technical resources can be found here.

Can I use third-party pre-trained models with TLT?
No, third-party pre-trained models are not supported by TLT. Only NVIDIA pre-trained models from NGC are currently supported.

What hardware is required for training and deployment?
Training with TLT is supported only on x86 systems with an NVIDIA GPU such as a V100. Models trained with TLT can be deployed on any NVIDIA platform, including Jetson.

How do I deploy TLT models with DeepStream?
To deploy trained models on DeepStream, refer to the Deploying to DeepStream chapter of the TLT Getting Started guide.

Can the purpose-built models be used out of the box?
Yes. The six purpose-built models (PeopleNet, TrafficCamNet, DashCamNet, FaceDetect-IR, VehicleTypeNet, and VehicleMakeNet) can be used as-is out of the box and can also be re-trained with your dataset. The architecture-specific models for detection and classification must be re-trained with TLT.

Does TLT support instance segmentation?
Yes, instance segmentation with the MaskRCNN DNN architecture is currently supported. To learn more, please read the blog post Training Instance Segmentation Models Using Mask R-CNN on the NVIDIA Transfer Learning Toolkit.

Does TLT support quantization-aware training and mixed precision?
Yes, TLT 2.0 supports QAT (Quantization-Aware Training) to improve INT8 accuracy, as well as Automatic Mixed Precision to speed up AI training with Tensor Cores on NVIDIA Volta and Turing GPUs. To learn more, please read the blog post Improving INT8 Accuracy Using Quantization Aware Training and the NVIDIA Transfer Learning Toolkit.

Latest Product News



Developer Tutorial

Learn how to train a 90-class COCO MaskRCNN model with TLT and deploy it on DeepStream using TensorRT.


TRY TODAY





Developer Tutorial

Learn how to train an AI model with Quantization Aware Training in Transfer Learning Toolkit.

 

READ NOW


NVIDIA GTC

The BMW research group showcases its use of the NVIDIA Isaac SDK and TLT for building smart transport robots.



LEARN MORE


Community Projects

Learn something new or build your own project. See projects built by our developer community.



SUBMIT A PROJECT


Simplify and speed up AI training with Transfer Learning Toolkit.


Get Started