Generative AI


Generative AI is a type of artificial intelligence that uses neural networks to learn patterns from existing data and generate new, original text, image, audio, and video content.

[Figure: A stack diagram of generative AI hardware and software solutions]

How Generative AI Works

Generative AI models learn by recognizing patterns and structures within massive datasets of text, code, images, audio, video, and other data. These models use neural networks, often transformers, to process the information. Developers can then use the models to generate new content, enhance existing content, or build new AI-powered applications, for tasks like creating realistic images from text descriptions, generating musical compositions, or building AI chatbots that can engage in human-like conversations. Retrieval-augmented generation (RAG) takes this further by integrating external knowledge sources, so a model can retrieve and synthesize up-to-date, contextually relevant information, which improves the accuracy of its responses.
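
The retrieval-augmented generation flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product implements RAG: the `DOCUMENTS` list, the bag-of-words cosine similarity, and the `retrieve` and `build_prompt` helpers are all hypothetical names chosen for this example. A production system would typically use learned embeddings and a vector database, and would send the final prompt to a generative model.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge source (assumption for illustration only).
DOCUMENTS = [
    "NVIDIA NIM is a set of microservices for deploying generative AI models.",
    "NVIDIA Riva is a speech and translation AI SDK.",
    "NVIDIA TensorRT is an ecosystem of APIs for deep learning inference.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine retrieved context with the question into a single prompt.

    The generation step itself is left out: the returned string is what
    would be passed to a generative model.
    """
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Which SDK handles speech and translation?", DOCUMENTS))
```

The retrieval step is kept separate from prompt construction so that the toy similarity function could be swapped for an embedding-based search without touching the rest of the flow.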

Explore Generative AI Tools and Technologies

NVIDIA Nemotron

NVIDIA Nemotron™ is a family of open and efficient multimodal models, with open datasets and recipes for building agentic AI.

NVIDIA Cosmos

NVIDIA Cosmos™ is a platform of state-of-the-art generative world foundation models and data processing pipelines that accelerate the development of highly performant physical AI systems, such as robots and self-driving cars.

NVIDIA NIM

NVIDIA NIM™ is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across any cloud or data center.

NVIDIA Dynamo

NVIDIA Dynamo is an open-source, low-latency inference framework for serving generative AI models in distributed environments. It scales inference workloads across large GPU fleets with optimized resource scheduling, memory management, and data transfer, and it supports all major AI inference backends.

NVIDIA TensorRT

NVIDIA TensorRT™ is an ecosystem of APIs for high-performance deep learning inference. TensorRT includes an inference runtime and model optimizations that deliver low latency and high throughput for production applications.

AI-Q NVIDIA Blueprint

AI-Q is an NVIDIA AI Blueprint for building AI agents that can access, query, and act on business knowledge using tools like advanced RAG and reasoning models. These agents transform enterprise data into an accessible, actionable resource.

NVIDIA AI Blueprints

NVIDIA AI Blueprints are comprehensive reference workflows that accelerate AI application development and deployment. They feature NVIDIA acceleration libraries, SDKs, and microservices for AI agents, digital twins, and more.

NVIDIA Riva

NVIDIA Riva is a GPU-accelerated multilingual speech and translation AI SDK for building and deploying fully customizable, real-time conversational AI pipelines.

Manage the AI Agent Lifecycle With NVIDIA NeMo

NVIDIA NeMo Data Designer

NVIDIA NeMo™ Data Designer generates high-quality, domain-specific synthetic data from scratch or seed examples—accelerating model development while eliminating privacy risks and data collection bottlenecks.

NVIDIA NeMo Curator

NVIDIA NeMo Curator provides pre-built accelerated pipelines to process multimodal data at scale, improving the performance of agentic systems.

NVIDIA NeMo Customizer

NVIDIA NeMo Customizer is a high-performance, scalable microservice that simplifies fine-tuning and alignment of AI models for domain-specific use cases, making it easier to adopt generative AI across industries.

NVIDIA NeMo Evaluator

NVIDIA NeMo Evaluator is an SDK and microservice for evaluating generative AI models, RAG pipelines, and agents, with over 100 benchmarks and custom metrics across any environment.

NVIDIA NeMo Framework

NVIDIA NeMo Framework supports pretraining, post-training, and reinforcement learning of LLMs and multimodal generative AI models with state-of-the-art data processing, optimized large-scale training techniques, and flexible deployment options.

NVIDIA NeMo Retriever

NVIDIA NeMo Retriever is a collection of generative AI microservices that enable organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses.

NVIDIA NeMo Guardrails

NVIDIA NeMo Guardrails orchestrates dialog management to help keep LLM-powered applications accurate, appropriate, and secure. It provides safeguards for organizations that oversee generative AI systems.

NVIDIA NeMo Agent Toolkit

NVIDIA® NeMo Agent Toolkit is an open-source AI framework that is interoperable with other frameworks and supports end-to-end optimization of complex agentic systems. By exposing hidden bottlenecks and costs, it helps enterprises scale agentic systems efficiently while maintaining reliability.

Generative AI Learning Resources