Jetson Benchmarks
Jetson is used to deploy a wide range of popular DNN models and ML frameworks to the edge with high-performance inference, for tasks such as real-time classification, object detection, pose estimation, semantic segmentation, and natural language processing (NLP).
MLPerf Inference Benchmarks
The tables below show inferencing benchmarks from the NVIDIA Jetson submissions to the MLPerf Inference Edge category.
| Model | NVIDIA Jetson AGX Orin (TensorRT): Single Stream (Samples/s) | NVIDIA Jetson AGX Orin (TensorRT): Offline (Samples/s) | NVIDIA Jetson AGX Orin (TensorRT): Multi Stream (Samples/s) | NVIDIA Orin MaxQ (TensorRT): Offline (Samples/s) | NVIDIA Orin MaxQ (TensorRT): System Power (W) | NVIDIA Jetson Orin NX: Offline (Samples/s) |
|---|---|---|---|---|---|---|
| Image Classification (ResNet-50) | 1538 | 6438.10 | 3686 | 3525.91 | 23.06 | 2517.99 |
| Object Detection (Retinanet) | 51.57 | 92.40 | 60.00 | 34.6 | 22.4 | 36.14 |
| Medical Imaging (3D-Unet) | 0.26 | 0.51 | N/A | 3.28 | 28.64 | 0.19 |
| Speech-to-text (RNN-T) | 9.822 | 1170.23 | N/A | 14472 | 25.64 | 405.27 |
| Natural Language Processing (BERT) | 144.36 | 544.24 | N/A | 3685.36 | 25.91 | 163.57 |
- Steps to reproduce these results can be found at v3.0 Results | MLCommons
- These results were achieved with the NVIDIA Jetson AGX Orin Developer Kit running a preview of TensorRT 8.5.0 and CUDA 11.4
- Different configurations were used for the single stream, offline, and multi stream scenarios. See the MLCommons page for details
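The Orin MaxQ columns report both offline throughput and system power, so an energy-efficiency figure can be derived by dividing one by the other. The following is a minimal sketch of that arithmetic, using only the values from the table above; the names `maxq_results` and `samples_per_joule` are illustrative and not part of any MLPerf or NVIDIA tooling.

```python
# Offline throughput (samples/s) and system power (W) for the Orin MaxQ
# columns, copied from the table above.
maxq_results = {
    "ResNet-50": (3525.91, 23.06),
    "Retinanet": (34.6, 22.4),
    "3D-Unet": (3.28, 28.64),
    "RNN-T": (14472, 25.64),
    "BERT": (3685.36, 25.91),
}


def samples_per_joule(throughput, power_watts):
    """Samples/s divided by watts (joules/s) gives samples per joule."""
    return throughput / power_watts


# Hypothetical helper dictionary: model name -> derived efficiency.
efficiency = {
    model: samples_per_joule(throughput, power)
    for model, (throughput, power) in maxq_results.items()
}

for model, value in efficiency.items():
    print(f"{model:<10} {value:10.2f} samples/J")
```

Because the models differ greatly in per-sample cost, these figures are only meaningful when comparing the same model across systems, not different models against each other.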
| Model | Jetson Xavier NX (TensorRT) | Jetson AGX Xavier 32GB (TensorRT) |
|---|---|---|
| Image Classification (ResNet-50) | 1245.10 | 2039.11 |
| Object Detection (SSD-small) | 1786.91 | 2833.59 |
| Object Detection (SSD-Large) | 36.97 | 55.16 |
| Speech to Text (RNN-T) | 259.67 | 416.13 |
| Natural Language Processing (BERT-Large) | 61.34 | 96.73 |
- Full results can be found at v1.1 Results | MLCommons
- These results were achieved on the Jetson AGX Xavier Developer Kit and Jetson Xavier NX Developer Kit running JetPack 4.6, TensorRT 8.0.1, and CUDA 10.2
- ResNet-50, SSD-small, and SSD-Large were run on the GPU and both DLAs
- These MLPerf Results can be reproduced with the code in the following link: https://github.com/mlcommons/inference_results_v2.0/tree/master/closed/NVIDIA
NVIDIA Pretrained Model Benchmarks
NVIDIA pretrained models from NGC start you off with highly accurate and optimized models and model architectures for various use cases. Pretrained models are production-ready. You can further customize these models by training with your own real or synthetic data, using the NVIDIA TAO (Train-Adapt-Optimize) workflow to quickly build an accurate, ready-to-deploy model. The tables below show inferencing benchmarks for some of our pretrained models running on Jetson modules.
| Model | Jetson Orin Nano 4GB | Jetson Orin Nano 8GB | Jetson Orin NX 8GB | Jetson Orin NX 16GB | Jetson AGX Orin 32GB | Jetson AGX Orin 64GB |
|---|---|---|---|---|---|---|
| PeopleNet (v2.5 unpruned) | 57 | 117 | 192 | 240 | 409 | 685 |
| Action Recognition 2D | 220 | 372 | 440 | 483 | 1158 | 1517 |
| Action Recognition 3D | 13 | 26 | 32 | 39 | 71 | 108 |
| LPR Net | 552 | 974 | 1314 | 1427 | 2800 | 4213 |
| Dashcam Net | 200 | 400 | 689 | 877 | 1482 | 2139 |
| Bodypose Net | 69 | 137 | 169 | 203 | 360 | 563 |
| Model | Jetson Nano | Jetson TX2 NX | Jetson Xavier NX | Jetson AGX Xavier |
|---|---|---|---|---|
| PeopleNet (v2.5 unpruned) | 2 | 5 | 120 | 195 |
| Action Recognition 2D | 32 | 88 | 245 | 472 |
| Action Recognition 3D | 1 | 3 | 21 | 32 |
| LPR Net | 47 | 86 | 714 | 1236 |
| Dashcam Net | 11 | 26 | 424 | 667 |
| Bodypose Net | 3 | 7 | 104 | 172 |
- Jetson Orin and Jetson Xavier benchmarks were run using JetPack 5.1.1
- Each Jetson module was run at maximum performance: MAXN for Jetson AGX Orin 64GB (JAO64), Jetson AGX Orin 32GB (JAO32), Jetson Orin NX 16GB (ONX16), and Jetson Orin NX 8GB (ONX8); 15W mode for Jetson Orin Nano 8GB (JON8); and 10W mode for Jetson Orin Nano 4GB (JON4)
- Jetson Nano and Jetson TX2 NX benchmarks were run using JetPack 4.6.1
- Each of these modules was run at maximum performance (MAXN)
- Reproduce these results by downloading these models from our NGC catalog
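The two pretrained-model tables share the same set of models, which makes it straightforward to work out a per-model throughput ratio between modules. As a minimal sketch, the snippet below compares Jetson AGX Orin 64GB against Jetson AGX Xavier using only the figures listed above; the dictionary and function names are hypothetical, and the ratios say nothing about differences in power mode or cost.

```python
# Throughput figures copied from the two pretrained-model tables above.
agx_xavier = {
    "PeopleNet (v2.5 unpruned)": 195,
    "Action Recognition 2D": 472,
    "Action Recognition 3D": 32,
    "LPR Net": 1236,
    "Dashcam Net": 667,
    "Bodypose Net": 172,
}
agx_orin_64gb = {
    "PeopleNet (v2.5 unpruned)": 685,
    "Action Recognition 2D": 1517,
    "Action Recognition 3D": 108,
    "LPR Net": 4213,
    "Dashcam Net": 2139,
    "Bodypose Net": 563,
}


def speedup(new_throughput, old_throughput):
    """Ratio of the newer module's throughput to the older module's."""
    return new_throughput / old_throughput


# Hypothetical helper dictionary: model name -> Orin/Xavier throughput ratio.
ratios = {
    model: speedup(agx_orin_64gb[model], agx_xavier[model])
    for model in agx_xavier
}

for model, ratio in ratios.items():
    print(f"{model:<28} {ratio:.2f}x")
```

The same function works for any other pair of columns, for example Jetson Orin Nano 8GB against Jetson Nano.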
Jetson Family Benchmarks
| Model | Jetson Orin Nano 4GB | Jetson Orin Nano 8GB | Jetson Orin NX 8GB | Jetson Orin NX 16GB | Jetson AGX Orin 32GB | Jetson AGX Orin 64GB |
|---|---|---|---|---|---|---|
| Inception_V4 | 182 | 361 | 593 | 769 | 1337.8 | 1702.6 |
| VGG19 | 174 | 361 | 442 | 532 | 937 | 1471 |
| Super_resolution | 102 | 203 | 280 | 386 | 610 | 882 |
| UNET-segmentation | 76 | 148 | 183 | 217 | 387 | 584 |
| Pose Estimation | 280 | 546 | 665 | 800 | 1424 | 2048 |
| Yolov3-tiny | 371 | 731 | 1156 | 1440 | 2611 | 3179 |
| Resnet50 | 621 | 1158 | 1725 | 2183 | 3717 | 4834 |
| SSD-Mobilenet | 1094 | 2156 | 2893 | 3457 | 6415 | 7671 |
| SSD_Resnet34_1200x1200 | 18 | 34 | 52 | 72 | 120 | 163 |
| Yolov5m | 69 | 131 | 162 | 193 | 342 | 519 |
| Yolov5s | 158 | 301 | 379 | 449 | 785 | 1135 |
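- These benchmarks were run using JetPack 5.1.1
- Each Jetson module was run at maximum performance: maximum frequencies in MAXN for Jetson AGX Orin 64GB (JAO64), Jetson AGX Orin 32GB (JAO32), Jetson Orin NX 16GB (ONX16), and Jetson Orin NX 8GB (ONX8); 15W mode for Jetson Orin Nano 8GB (JON8); and 10W mode for Jetson Orin Nano 4GB (JON4)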
- Steps to reproduce these results can be found here
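One common way to read a throughput table like the one above is to estimate how many camera streams a module could serve. The sketch below does this for the Yolov5s row under two loudly stated assumptions that the page itself does not confirm: that the figures are frames per second, and that each stream runs at 30 FPS with all frames going through the model. The names `STREAM_FPS` and `max_streams` are illustrative only.

```python
# Yolov5s row copied from the table above.
# ASSUMPTION: the values are frames per second (the table states no unit).
yolov5s_throughput = {
    "Jetson Orin Nano 4GB": 158,
    "Jetson Orin Nano 8GB": 301,
    "Jetson Orin NX 8GB": 379,
    "Jetson Orin NX 16GB": 449,
    "Jetson AGX Orin 32GB": 785,
    "Jetson AGX Orin 64GB": 1135,
}

# ASSUMPTION: each camera stream delivers 30 frames per second.
STREAM_FPS = 30


def max_streams(throughput, stream_fps=STREAM_FPS):
    """Whole number of streams that fit within the measured throughput."""
    return int(throughput // stream_fps)


# Hypothetical helper dictionary: module name -> estimated stream count.
streams = {
    module: max_streams(throughput)
    for module, throughput in yolov5s_throughput.items()
}

for module, count in streams.items():
    print(f"{module:<22} {count} streams at {STREAM_FPS} FPS")
```

This is an upper bound for the model alone; decoding, pre-processing, and post-processing would reduce the number in a real pipeline.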