Explore NVIDIA RAG Tools and Technologies

Click to Enlarge

How Retrieval-Augmented Generation Works

RAG enhances large language models (LLMs) by retrieving the most relevant and current information from external knowledge sources. Before a user can retrieve responses from a RAG pipeline, data must be ingested into the knowledge base.

Data Extraction: Multimodal, structured, and unstructured data is extracted from various formats and converted to text so it can be filtered, chunked, and fed into the retrieval pipeline.
Data Retrieval: Extracted data is passed to an embedding model to create knowledge embeddings that go into a vector database. When a user submits a query, the system embeds the query, retrieves relevant data from the vector database, reranks the results, and sends them to the LLM to return the most accurate and context-aware responses.

Evaluating RAG pipelines is crucial because these systems involve multiple interacting components, and mistakes or biases in the individual components can propagate through the system, leading to compounded errors in the generated output.

Explore RAG Technology

NVIDIA NeMo Retriever

NVIDIA NeMo™ Retriever is a collection of generative AI microservices for extraction, embedding, and reranking that enable developers to build pipelines that generate business insights in real time with high accuracy and maximum data privacy.

Get Started With NeMo Retriever

NVIDIA NeMo Agent Toolkit

NVIDIA NeMo Agent toolkit is an open-source library for framework-agnostic profiling, evaluation, and optimization of AI agent systems. By exposing hidden bottlenecks and costs, it helps enterprises scale agentic systems efficiently while maintaining reliability.

Get Started With NeMo Agent Toolkit

NVIDIA cuVS

NVIDIA cuVS is an open-source library for GPU-accelerated vector search and data clustering. It enables higher throughput, lower latency, and faster index build times, and improves the efficiency of semantic search within pipelines and applications such as information retrieval or RAG.

Get Started With cuVS

NVIDIA NeMo Curator

NVIDIA NeMo Curator provides prebuilt pipelines for generating synthetic data to customize and evaluate embedding models to improve the performance of RAG systems.

Get Started With NeMo Curator

NVIDIA NeMo Customizer

NVIDIA NeMo Customizer is a high-performance, scalable microservice that simplifies fine-tuning and alignment of generative AI models, including embedding models for domain-specific use cases, making it easier to adopt generative AI across industries.

Get Started With NeMo Customizer

NVIDIA NeMo Evaluator

NVIDIA NeMo Evaluator is a microservice for assessing generative AI models and RAG pipelines across academic and custom benchmarks on any platform.

Get Started With NeMo Evaluator

NVIDIA NIM

NVIDIA NIM™ is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across any cloud or data center.

Get Started With NIM

Explore NVIDIA AI Blueprints That Use RAG

NVIDIA Blueprints are reference workflows for AI use cases built with NVIDIA NIM and NeMo microservices. With these blueprints, developers can build production-ready agentic AI applications that empower employees with real-time insights, connecting them to AI query engines to enable transformational efficiency and productivity gains.

Enterprise RAG

Connect secure, scalable, reliable AI applications to your company’s internal enterprise data using industry-leading embedding and reranking models for information retrieval at scale.

Try Now

AI-Q Research Agent

AI-Q is an NVIDIA AI Blueprint for building AI agents that can access, query, and act on business knowledge using tools like advanced RAG and reasoning models, they transform enterprise data into an accessible, actionable resource.

Get Started With AI-Q

AI Assistants for Customer Service

Develop secure, context-aware virtual assistants that meet the unique needs of your business and enhance customer service operations.

Try Now

AI Agent for Video Search and Summarization

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A.

Try Now

Biomedical AI-Q Research Agent

Improve the efficiency and accuracy of various clinical development processes, including R&D, literature review, protocol generation, clinical trial screening, and pharmacovigilance.

Try Now

Retrieval-Augmented Generation

How Retrieval-Augmented Generation Works

Explore RAG Technology

NVIDIA NeMo Retriever

NVIDIA NeMo Agent Toolkit

NVIDIA cuVS

NVIDIA NeMo Curator

NVIDIA NeMo Customizer

NVIDIA NeMo Evaluator

NVIDIA NIM

Explore NVIDIA AI Blueprints That Use RAG

Enterprise RAG

AI-Q Research Agent

AI Assistants for Customer Service

AI Agent for Video Search and Summarization

Biomedical AI-Q Research Agent

Retrieval-Augmented Generation Learning Resources