TensorRT

Apr 22, 2024

Turbocharging Meta Llama 3 Performance with NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server

We're excited to announce support for the Meta Llama 3 family of models in NVIDIA TensorRT-LLM, accelerating and optimizing your LLM inference performance. You...

9 MIN READ

Apr 02, 2024

Tune and Deploy LoRA LLMs with NVIDIA TensorRT-LLM

Large language models (LLMs) have revolutionized natural language processing (NLP) with their ability to learn from massive amounts of text and generate fluent...

15 MIN READ

Mar 18, 2024

Translate Your Enterprise Data into Actionable Insights with NVIDIA NeMo Retriever

Across every industry, and every job function, generative AI is activating the potential within organizations—turning data into knowledge and empowering...

9 MIN READ

Four images compared against three modes for quality.

Mar 07, 2024

NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization

In the dynamic realm of generative AI, diffusion models stand out as the most powerful architecture for generating high-quality images with text prompts. Models...

7 MIN READ

Four images of products against enhanced backgrounds.

Mar 07, 2024

Generate Stunning Images with Stable Diffusion XL on the NVIDIA AI Inference Platform

Diffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by...

14 MIN READ

Feb 26, 2024

Detecting Real-Time Waste Contamination Using Edge Computing and Video Analytics

The past few decades have witnessed a surge in rates of waste generation, closely linked to economic development and urbanization. This escalation in waste...

8 MIN READ

Feb 13, 2024

Top Inference for Large Language Models Sessions at NVIDIA GTC 2024

Learn how inference for LLMs is driving breakthrough performance for AI-enabled applications and services.

1 MIN READ

Retrieval-Augmented Generation Conference Sessions at GTC

Feb 06, 2024

Top Retrieval-Augmented Generation (RAG) Sessions at NVIDIA GTC 2024 Sessions

Join us in-person or virtually and learn about the power of RAG with insights and best practices from experts at NVIDIA, visionary CEOs, data scientists, and...

1 MIN READ

Feb 01, 2024

Deploy an AI Coding Assistant with NVIDIA TensorRT-LLM and NVIDIA Triton

Large language models (LLMs) have revolutionized the field of AI, creating entirely new ways of interacting with the digital world. While they provide a good...

12 MIN READ

Jan 29, 2024

Emulating the Attention Mechanism in Transformer Models with a Fully Convolutional Network

The past decade has seen a remarkable surge in the adoption of deep learning techniques for computer vision (CV) tasks. Convolutional neural networks (CNNs)...

13 MIN READ

Decorative image of a workflow and the text "Part 3".

Jan 16, 2024

Robust Scene Text Detection and Recognition: Inference Optimization

In this post, we delve deeper into the inference optimization process to improve the performance and efficiency of our machine learning models during the...

9 MIN READ

Decorative image of a workflow and the text "Part 2".

Jan 16, 2024

Robust Scene Text Detection and Recognition: Implementation

To make scene text detection and recognition work on irregular text or for specific use cases, you must have full control of your model so that you can do...

6 MIN READ

Decorative image of a workflow and the text "Part 1".

Jan 16, 2024

Robust Scene Text Detection and Recognition: Introduction

Identification and recognition of text from natural scenes and images become important for use cases like video caption text recognition, detecting signboards...

8 MIN READ

Jan 11, 2024

Free Digital Webinar Series: How to Get Started with AI Inference

Learn how to improve your AI model performance with this series of expert-led talks on the NVIDIA AI inference platform.

1 MIN READ

Photo of a dog racing through a snowy forest.

Jan 08, 2024

New Stable Diffusion Models Accelerated with NVIDIA TensorRT

At CES, NVIDIA shared that SDXL Turbo, LCM-LoRA, and Stable Video Diffusion are all being accelerated by NVIDIA TensorRT. These enhancements allow GeForce RTX...

2 MIN READ

Jan 08, 2024

Contest: Build Generative AI on NVIDIA RTX PCs

NVIDIA is announcing the Generative AI on RTX PCs Developer Contest - designed to inspire innovation within the developer community. Build and submit your next...

1 MIN READ