Generative AI for Developers

Generative AI has introduced a new wave of developer tools, frameworks, and applications. This rapidly expanding ecosystem helps developers train massive multimodal models, fine-tune them for specific use cases, and quantize and deploy them everywhere from data centers to the smallest embedded devices. Developers building generative AI applications need an accelerated computing platform with full-stack optimizations, from chips and system software to acceleration libraries and application development frameworks. With NVIDIA’s leading model APIs, it’s easy to get started.

Get Started

NVIDIA Full-Stack Generative AI Software Ecosystem

NVIDIA offers a full-stack accelerated computing platform purpose-built for generative AI workloads. The platform is both deep and wide, offering a combination of hardware, software, and services—all built by NVIDIA and its broad ecosystem of partners—so developers can deliver cutting-edge solutions.



Diagram showing NVIDIA Full-Stack Generative AI Software Ecosystem

Building applications for specific use cases and domains requires user-friendly APIs, efficient fine-tuning techniques, and, in the context of large language model (LLM) applications, integration with robust third-party applications, vector databases, and guardrail systems. NVIDIA offers AI Foundation models and endpoints, including popular open-source community models such as Llama 2, Stable Diffusion, and ESM2, enabling developers to quickly build custom generative AI applications.

Our software stack powers partners like OpenAI, Cohere, Google Vertex AI, and Azure ML, allowing developers to use generative AI API endpoints. For domain-specific customization or augmenting applications with databases, NVIDIA’s ecosystem includes Hugging Face, LangChain, LlamaIndex, and Milvus, in addition to NVIDIA NeMo™.


To deploy safe, trustworthy models, NeMo provides simple tools for evaluating trained and fine-tuned models, including GPT and its variants. Developers can also add programmable guardrails with NeMo Guardrails to control the output of LLM applications, such as implementing controls to avoid discussing politics and tailoring responses based on user requests.
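The idea behind a topical guardrail can be sketched in a few lines of plain Python. This is only an illustrative approximation, not the NeMo Guardrails API; the blocked-topic keywords and refusal message are invented for the example:

```python
# Illustrative sketch of a topical guardrail: intercept prompts that match
# disallowed topics before they reach the LLM. NeMo Guardrails expresses
# rules like this declaratively; here the idea is approximated with a
# simple keyword check. Topics and refusal text are invented examples.
BLOCKED_TOPICS = {"politics": ["election", "political party", "politician"]}
REFUSAL = "Sorry, I can't discuss {topic}. Is there anything else I can help with?"

def apply_guardrail(user_message: str):
    """Return a canned refusal if the message hits a blocked topic, else None."""
    text = user_message.lower()
    for topic, keywords in BLOCKED_TOPICS.items():
        if any(kw in text for kw in keywords):
            return REFUSAL.format(topic=topic)
    return None  # safe to forward to the LLM

blocked = apply_guardrail("Which political party should I vote for?")
allowed = apply_guardrail("How do I bake sourdough bread?")
```

In a production system the rail would classify intent with a model rather than keywords, but the structure, checking the request before and after the LLM call, stays the same.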

MLOps and LLMOps tools further assist in evaluating LLMs. NVIDIA NeMo can be integrated with LLMOps tools such as Weights & Biases and MLflow. Developers can also use NVIDIA Triton™ Inference Server to analyze model performance and standardize AI model deployment.


Accelerating generative AI computations requires libraries and compilers designed specifically for the needs of LLMs. Some of the most popular libraries include XLA, Megatron-LM, CUTLASS, CUDA®, NVIDIA® TensorRT™-LLM, RAFT, and cuDNN.


Building large-scale models often requires thousands of GPUs, and inference runs on multi-node, multi-GPU configurations to work around memory-bandwidth limitations. This requires software that can carefully orchestrate the different generative AI workloads on accelerated infrastructure. Popular management and orchestration tools include Kubernetes, Slurm, Nephele, and NVIDIA Base Command™.

NVIDIA-accelerated computing platforms provide the infrastructure to power these applications in the most cost-optimized way, whether they’re run in a data center, in the cloud, or on local desktops and laptops. Powerful platforms and technologies include the NVIDIA DGX™ platform, NVIDIA HGX™ systems, NVIDIA RTX™ systems, and NVIDIA Jetson™.


Build With Generative AI

Developers can choose to engage with the NVIDIA AI platform at any layer of the stack, from infrastructure, software, and models to applications, either directly through NVIDIA products or through a vast ecosystem of offerings.

Start With State-of-the-Art Foundation Models

Try the latest models, including Llama 2, Stable Diffusion, and NVIDIA’s Nemotron-3 8B family.


Experience AI Foundation Models

Optimize and Deploy Models Across Platforms

Optimize and quantize models with TensorRT-LLM, and deploy them using NeMo in the data center, in the cloud, and on PCs with NVIDIA RTX GPUs.

Apply Optimization Techniques
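Post-training int8 quantization, one of the optimization techniques mentioned above, is easy to illustrate conceptually. The sketch below is plain Python, not the TensorRT-LLM API, and the sample weights are invented:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    # Pick the scale so the largest-magnitude weight maps to +/-127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the int8 values.
    return [v * scale for v in q]

w = [0.9, -1.27, 0.003, 0.5]        # invented sample weights
q, scale = quantize_int8(w)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(w, restored))
```

Real toolchains quantize per-channel and calibrate scales on sample data, but the core idea, mapping float weights onto an int8 grid via a scale factor, is the same.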

Connect Generative AI Models to Knowledge Bases

Use retrieval-augmented generation (RAG) to connect LLMs to the latest information.


Try a RAG Example on GitHub
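Stripped of any framework, the RAG pattern is: embed documents, retrieve the ones most similar to the query, and prepend them to the prompt. A minimal sketch using bag-of-words counts in place of a real embedding model (the sample documents are invented):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list, k: int = 2) -> list:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "NVIDIA TensorRT-LLM optimizes llm inference on GPUs.",
    "RAG grounds model answers in retrieved documents.",
    "Bananas are rich in potassium.",
]
context = retrieve("How do I speed up llm inference", docs)
prompt = "Answer using this context:\n" + "\n".join(context) + "\n\nQuestion: ..."
```

A production pipeline would swap the bag-of-words stand-in for a GPU-accelerated embedding model and a vector database such as Milvus, but the retrieve-then-prompt flow is unchanged.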

Train and Optimize Generative AI for Every Industry

Develop, train, and deploy generative AI models for industries, including gaming, healthcare, automotive, and industrial.


Implement Generative AI for Industries

Best Practices for LLM Application Development

Tune in to hands-on sessions with NVIDIA experts to learn about state-of-the-art models, customization and optimization techniques, and how to run your own LLM apps.

Watch Sessions on Demand

Benefits


End-to-End Accelerated Stack

Accelerates every layer of the stack, from infrastructure to the app layer, with offerings from DGX Cloud to NeMo.


High Performance

Delivers real-time performance with GPU optimizations, including quantization-aware training, layer and tensor fusion, and kernel tuning.


Ecosystem Integrations

Tightly integrates with leading generative AI frameworks. For example, NVIDIA NeMo's connectors enable the use of NVIDIA AI Foundation models and TensorRT-LLM optimizations within the LangChain framework for RAG agents.

Access Exclusive NVIDIA Resources

The NVIDIA Developer Program gives you access to training, documentation, how-to guides, expert forums, support from peers and domain experts, and information on the right hardware to tackle the biggest challenges.


Join the NVIDIA Developer Program


Get Generative AI Training and Certification

Elevate your technical skills in generative AI and LLMs with NVIDIA Training’s comprehensive learning paths, covering fundamental to advanced topics, featuring hands-on training, and delivered by NVIDIA experts. Showcase your skills and advance your career by getting certified by NVIDIA.

Explore Training

Connect With NVIDIA Experts

Have questions as you’re getting started? Visit the NVIDIA Developer Forum for AI to get your questions answered or browse insights from other developers.

Visit Forums

Build Your Custom Generative AI With NVIDIA Partners

For generative AI startups, NVIDIA Inception provides access to the latest developer resources, preferred pricing on NVIDIA software and hardware, and exposure to the venture capital community. The program is free and available to tech startups of all stages.

Learn More About NVIDIA Inception

Latest News

Explore what’s new and learn about our latest breakthroughs.

Shining Brighter Together: Google’s Gemma Optimized to Run on NVIDIA GPUs


Gemma, Google's new family of state-of-the-art, lightweight open language models in 2-billion and 7-billion-parameter sizes, is optimized with NVIDIA TensorRT-LLM and can run anywhere, reducing costs and speeding up innovative work for domain-specific use cases.

Learn More
NVIDIA Reveals Gaming, Creating, Generative AI, Robotics Innovations at CES


At CES, NVIDIA released the TensorRT-LLM library for Windows, announced NVIDIA Avatar Cloud Engine (ACE) microservices with generative AI models for digital avatars, and unveiled Generative AI by iStock, a Getty Images service powered by NVIDIA Picasso.

Learn More
Amgen to Build Generative AI Models for Novel Human Data Insights and Drug Discovery


Amgen, an early adopter of NVIDIA BioNeMo™, uses it to accelerate drug discovery and development with generative AI models. The company plans to use NVIDIA DGX SuperPOD™ to train state-of-the-art models in days rather than months.

Learn More

Get Started With Generative AI

Scale Your Business Applications With Generative AI

Experience, prototype, and deploy AI with production-ready APIs that run anywhere.

Get Started

Enterprise-Ready Generative AI With NVIDIA AI Enterprise

The NVIDIA AI Enterprise subscription includes production-grade software, accelerating enterprises to the leading edge of AI with easy-to-deploy microservices, enterprise support, security, and API stability.

Learn More About NVIDIA AI Enterprise
Talk to an Expert