New Stable Diffusion Models Accelerated with NVIDIA TensorRT

At CES, NVIDIA announced that SDXL Turbo, LCM-LoRA, and Stable Video Diffusion are all being accelerated by NVIDIA TensorRT. These enhancements enable GeForce RTX GPU owners to generate images in real time and save minutes when generating videos, vastly improving workflows.

Video 1. Accelerate Stable Diffusion with NVIDIA RTX GPUs

SDXL Turbo

SDXL Turbo achieves state-of-the-art performance with a new distillation technique that enables single-step image generation. NVIDIA hardware, accelerated by Tensor Cores and TensorRT, can produce up to four images per second, making real-time SDXL image generation possible for the first time. For more information about non-commercial and commercial use, see the Stability AI Membership page.

Download the SDXL Turbo model on Hugging Face.
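
To make "single-step image generation" concrete, here is a minimal sketch that uses the Hugging Face diffusers library. It is an assumption for illustration: it runs the plain PyTorch path, not the TensorRT-accelerated path described above, and the function and constant names are hypothetical. The model ID and the one-step, guidance-disabled settings follow the SDXL Turbo model card on Hugging Face.

```python
# Sketch only: plain PyTorch/diffusers inference, NOT the TensorRT-accelerated path.
# The names below are illustrative and do not come from the NVIDIA post.
SDXL_TURBO_MODEL_ID = "stabilityai/sdxl-turbo"

# SDXL Turbo is distilled for one sampling step and is run without
# classifier-free guidance, hence guidance_scale=0.0.
SDXL_TURBO_CALL_KWARGS = {"num_inference_steps": 1, "guidance_scale": 0.0}


def generate_turbo_image(prompt, device="cuda"):
    """Generate one image from `prompt` using a single denoising step."""
    # Heavy imports are deferred so the settings above can be inspected
    # on a machine without torch or diffusers installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        SDXL_TURBO_MODEL_ID, torch_dtype=torch.float16, variant="fp16"
    )
    pipe.to(device)
    return pipe(prompt=prompt, **SDXL_TURBO_CALL_KWARGS).images[0]
```

On a machine with a CUDA-capable GPU and the model weights available, `generate_turbo_image("a dog racing through a snowy forest").save("dog.png")` would write a single image to disk. Throughput on this path will differ from the TensorRT figures quoted above.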


LCM-LoRA

Low-Rank Adaptation (LoRA) is a training technique for fine-tuning Stable Diffusion models. Combined with the latent consistency model (LCM), a LoRA checkpoint enables you to drastically reduce the number of sampling steps needed to produce a Stable Diffusion image. This improves speed dramatically at some cost to image quality. LCM-LoRA can run ~9x faster because it uses only four steps (compared to the traditional 50) and is accelerated by TensorRT optimizations.

Download the LCM-LoRA model on Hugging Face.
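
The sketch below shows one way to attach an LCM-LoRA checkpoint to a base pipeline and sample in four steps. It uses the Hugging Face diffusers library on the plain PyTorch path, not the TensorRT path. The choice of the SDXL base model and the `latent-consistency/lcm-lora-sdxl` checkpoint is an assumption made for illustration, and the helper names are hypothetical.

```python
# Sketch only: plain diffusers inference, NOT the TensorRT-accelerated path.
BASE_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed base model
LCM_LORA_ID = "latent-consistency/lcm-lora-sdxl"  # assumed LoRA checkpoint

TRADITIONAL_STEPS = 50  # step count quoted in the text for standard sampling
LCM_STEPS = 4  # step count quoted in the text for LCM-LoRA


def step_reduction(traditional=TRADITIONAL_STEPS, lcm=LCM_STEPS):
    """Return how many times fewer sampling steps the LCM-LoRA setup uses."""
    return traditional / lcm


def generate_lcm_image(prompt, device="cuda"):
    """Generate one image with an LCM-LoRA checkpoint in LCM_STEPS steps."""
    # Heavy imports are deferred so step_reduction() works without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16, variant="fp16"
    )
    pipe.to(device)
    # Few-step sampling needs the LCM scheduler in place of the default one.
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    # The LoRA weights adapt the base model for few-step sampling.
    pipe.load_lora_weights(LCM_LORA_ID)
    return pipe(
        prompt=prompt, num_inference_steps=LCM_STEPS, guidance_scale=1.0
    ).images[0]
```

Here `step_reduction()` returns 12.5, so the step count falls by more than the ~9x end-to-end speedup quoted above. That gap would be consistent with per-image costs that do not shrink with the step count, although the source does not break the figure down.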

Stable Video Diffusion

Stable Video Diffusion is Stability AI's first foundation model for generative video, based on the image model Stable Diffusion. Stable Video Diffusion runs up to 40% faster with TensorRT, potentially saving minutes per generation. For more information about non-commercial and commercial use, see the Stability AI Membership page.

The Stable Video Diffusion model will be available for download soon.

Get started with Stable Diffusion

To download the Stable Diffusion Web UI TensorRT extension, see the NVIDIA/Stable-Diffusion-WebUI-TensorRT GitHub repo. The newly released update to this extension includes TensorRT acceleration for SDXL, SDXL Turbo, and LCM-LoRA. 

For a demo showcasing the acceleration of a Stable Diffusion pipeline, see the NVIDIA/TensorRT GitHub repo. For more information about the Automatic1111 TensorRT extension, see TensorRT Extension for Stable Diffusion Web UI.

Have an idea for a generative AI-powered Windows app or plugin? Enter the NVIDIA Generative AI on RTX PCs Developer Contest and you could win a GeForce RTX 4090 GPU, a full GTC in-person conference pass, and more.

