Getting Started With NVIDIA AI for Your Applications
NVIDIA offers a range of AI tools, software development kits (SDKs), and technologies that can be used to optimize and enhance applications for deployment on NVIDIA GPUs. For additional support, reach out with questions on the Developer Forums or the NVIDIA Developer Discord.
Profile and Optimize Your Pipeline
Profile Before Optimizing
Profiling allows you to pinpoint where optimization is critical and where small changes can have a big impact. Optimizing without first understanding where it is necessary may result in little to no performance gains.
NVIDIA Nsight™ Systems is a system-wide performance analysis tool that is simple to use and allows you to visualize CPU-GPU interaction, track GPU activity, and trace GPU workloads. Creating an application trace takes only a few minutes and provides all the insights you need to determine the optimization that promises the best return on investment.
Best Practices for Profiling and Optimizing
- Profile in a clean environment. Close other apps that utilize resources and add noise to the application trace.
- Triage, then optimize. Identify the bottleneck first: is it inference, pre- or post-processing, or PCIe transfers?
- Reprofile after every change or optimization. Fixing one bottleneck can have unforeseen side effects on performance elsewhere in the pipeline.
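As a rough first pass before capturing a full trace, the triage step can be sketched with per-stage wall-clock timers. This is a minimal illustration, not a replacement for Nsight Systems: the names `timed_stage`, `preprocess`, `infer`, and `postprocess` are hypothetical stand-ins, and the "model" is a placeholder sleep. Host-side timers can also mislead when GPU work is launched asynchronously, so treat the numbers as a hint about where to look in the trace.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per pipeline stage.
stage_totals = {}


@contextmanager
def timed_stage(name):
    """Add the wall-clock time spent inside the block to stage_totals[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        stage_totals[name] = stage_totals.get(name, 0.0) + elapsed


# Hypothetical stand-ins for real pipeline stages (assumptions for illustration).
def preprocess(frame):
    return [x / 255.0 for x in frame]


def infer(tensor):
    time.sleep(0.002)  # placeholder for a real model call
    return sum(tensor)


def postprocess(score):
    return round(score, 3)


def run_pipeline(frames):
    """Run every frame through the three stages, timing each stage separately."""
    results = []
    for frame in frames:
        with timed_stage("preprocess"):
            tensor = preprocess(frame)
        with timed_stage("inference"):
            score = infer(tensor)
        with timed_stage("postprocess"):
            results.append(postprocess(score))
    return results


def slowest_stage():
    """Return the name of the stage with the largest accumulated time."""
    return max(stage_totals, key=stage_totals.get)


if __name__ == "__main__":
    run_pipeline([[0, 128, 255]] * 20)
    for name, seconds in sorted(stage_totals.items(), key=lambda kv: -kv[1]):
        print(f"{name:12s} {seconds * 1000:8.2f} ms")
    print("bottleneck:", slowest_stage())
```

Rerunning the same script after each change gives a quick before-and-after comparison, which can then be confirmed with a fresh application trace.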
Accelerate Your AI Pipeline
Choosing a Machine Learning Framework
Several factors come into play when selecting the optimal machine learning framework for deploying an AI model. Given the effort required to switch between frameworks, it’s important to ensure that the initial selection is the most appropriate for your needs. NVIDIA fully supports and recommends TensorRT and WinML for local deployment.
TensorRT is optimized for highest-performance inferencing on NVIDIA GPUs. It runs only on NVIDIA GPUs, while WinML can work with a variety of GPU hardware.
Feature | WinML | TensorRT
Direct deployment path from most frameworks via ONNX | ✓ | ✓
OS Support | Windows | Windows & Linux
Hardware Support | Any GPU | NVIDIA GPUs
Performance | Fast | Fastest
Resources for WinML
Beginner
Blog: Using Windows ML, ONNX, and NVIDIA Tensor Cores
Blogs: End-to-End AI
Documentation: DirectML Execution Provider
Video: Workstation Inference With TensorRT, cuDNN, and WinML
Resources for TensorRT
Resources for Generative AI
Accelerate With DirectML
Use DirectML to accelerate generative AI applications. The benefit of DirectML is that the same optimization will run on any hardware.
How to Optimize Models like Stable Diffusion With Olive
Accelerate With TensorRT
Leverage optimizations in TensorRT 8.6 to accelerate generative AI models, such as Stable Diffusion, Llama2, Mistral-7B, and NVGPT-8B. The benefit of TensorRT is that it gets the best performance out of the GPU, whether on NVIDIA's data center systems or locally on native Windows with NVIDIA RTX systems.
How to Optimize Models like Stable Diffusion With TensorRT
Example TRT Pipeline for Stable Diffusion
Demo application that showcases the acceleration of a Stable Diffusion pipeline using TensorRT.
TensorRT Extension for Stable Diffusion
New Stable Diffusion Models Accelerated with NVIDIA TensorRT
Blog: Unlock Faster Image Generation in Stable Diffusion Web UI with NVIDIA TensorRT
TensorRT Toolbox for Large Language Models
RAG on Windows using TensorRT-LLM and LlamaIndex
NVIDIA AI SDKs
SDKs provided by NVIDIA enable developers to seamlessly incorporate cutting-edge AI functionalities into their innovative applications, expanding the scope of their creativity and enhancing the overall user experience.
Video and Broadcast
Audio Effects SDK
The Audio Effects SDK delivers multi-effect, low-latency audio-quality enhancement algorithms, improving end-to-end conversation quality for narrowband, wideband, and ultra-wideband audio.
Augmented Reality SDK
The Augmented Reality SDK offers AI-powered, real-time 3D face tracking and body pose estimation based on a standard webcam feed. Developers can create unique AR effects such as overlaying 3D content on a face—driving 3D characters and virtual interactions in real time.
Video Effects SDK
The Video Effects SDK enables AI-based visual effects that run with standard webcam input and can be easily integrated into video conference and broadcast pipelines. The underlying deep learning models are optimized with NVIDIA AI using TensorRT for high-performance inference, enabling developers to apply multiple effects in real-time applications.
3D and Graphic Design
Audio2Face SDK
NVIDIA Omniverse™ Audio2Face beta is a reference application that simplifies animation of a 3D character to match any voice-over track, whether a user is animating characters for a game, film, real-time digital assistants, or just for fun.
NVIDIA DLSS
NVIDIA DLSS is a neural graphics technology that multiplies performance using AI to create entirely new frames and display higher resolution through image reconstruction—all while delivering best-in-class image quality and responsiveness.
OptiX Ray Tracing Engine
NVIDIA OptiX™ Ray Tracing Engine is an application framework for achieving optimal ray-tracing performance on the GPU. It provides a simple, recursive, and flexible pipeline for accelerating ray-tracing algorithms, including an advanced AI denoiser. Bring the power of NVIDIA GPUs to ray-tracing applications with programmable intersection, ray generation, and shading.
Photography
Data Loading Library
The NVIDIA Data Loading Library (DALI) is a portable, open-source library for decoding and augmenting images, videos, and speech to accelerate deep learning applications. DALI reduces latency and training time, and mitigates bottlenecks by overlapping training and preprocessing.
StyleGAN3
StyleGAN3 is a generative adversarial network (GAN) for creating high-quality, realistic images. It can generate high-quality, diverse images with a great level of control over different aspects of the generated images, such as facial features, hair, and clothing styles.
Audio
Audio Effects SDK
The Audio Effects SDK delivers multi-effect, low-latency audio quality enhancement algorithms, improving end-to-end conversation quality for narrowband, wideband, and ultra-wideband audio.
NeMo
NVIDIA NeMo is an open-source framework for developers to build and train state-of-the-art conversational AI models. With NeMo, users can build models for real-time automated speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS) applications such as video call transcriptions and intelligent video assistants.
Riva
NVIDIA Riva is a GPU-accelerated speech AI SDK for building and deploying fully customizable, real-time AI pipelines that deliver world-class accuracy in all clouds, on premises, at the edge, and on embedded devices.
Resources: Examples of End-to-End Optimizations
Got a question? Ask on the Developer Forums or the NVIDIA Developer Discord.