AI Foundation

Feb 11, 2025
NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance
In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...
7 MIN READ

Jan 06, 2025
Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency
Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...
8 MIN READ

Dec 18, 2024
NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference
Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...
6 MIN READ

Dec 17, 2024
Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner
Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...
5 MIN READ

Nov 21, 2024
Deploying Fine-Tuned AI Models with NVIDIA NIM
For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...
6 MIN READ

Nov 21, 2024
Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor
Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...
6 MIN READ

Oct 09, 2024
Develop Academic and Industrial Applications with a New Specialized Math Model
Mathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.
1 MIN READ

Oct 08, 2024
Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy
This post was originally published August 21, 2024, but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
7 MIN READ

Oct 04, 2024
Just Released: NVIDIA TensorRT-LLM 0.13.0
Updates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.
1 MIN READ

Oct 03, 2024
New Reward Model Helps Improve LLM Alignment with Human Preferences
Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...
4 MIN READ

Sep 30, 2024
Improve Reinforcement Learning from Human Feedback with Leaderboard-Topping Reward Model
Llama 3.1 Nemotron 70B Reward model helps generate high-quality training data that aligns with human preferences for finance, retail, healthcare, scientific...
1 MIN READ

Sep 16, 2024
Generate Code with Abacus AI’s Dracarys Large Language Model
Dracarys, fine-tuned from Llama 3.1 70B and available as an NVIDIA NIM microservice, supports a variety of applications, including data analysis, text...
1 MIN READ

Sep 05, 2024
Low Latency Inference Chapter 1: Up to 1.9x Higher Llama 3.1 Performance with Medusa on NVIDIA HGX H200 with NVLink Switch
As large language models (LLMs) continue to grow in size and complexity, multi-GPU compute is a must-have to deliver the low latency and high throughput that...
5 MIN READ

Aug 13, 2024
New NIM Available: Mistral Large 2 Instruct LLM
The new model by Mistral excels at a variety of complex tasks including text summarization, multilingual translation and reasoning, programming, question and...
1 MIN READ

Jul 26, 2024
Power Text-Generation Applications with Mistral NeMo 12B Running on a Single GPU
NVIDIA collaborated with Mistral to co-build the next-generation language model that achieves leading performance across benchmarks in its class. With a growing...
6 MIN READ

Jul 25, 2024
Revolutionizing Code Completion with Codestral Mamba, the Next-Gen Coding LLM
In the rapidly evolving field of generative AI, coding models have become indispensable tools for developers, enhancing productivity and precision in software...
5 MIN READ