Posts by Vinh Nguyen
Agentic AI / Generative AI
Jun 18, 2025
LLM Inference Benchmarking: How Much Does Your LLM Inference Cost?
This is the fourth post in the large language model latency-throughput benchmarking series, which aims to instruct developers on how to determine the cost of...
10 MIN READ
Agentic AI / Generative AI
May 14, 2025
Build Custom Reasoning Models with Advanced, Open Post-Training Datasets
Synthetic data has become a standard part of large language model (LLM) post-training procedures. Using a large number of synthetically generated examples from...
5 MIN READ
Agentic AI / Generative AI
May 06, 2025
LLM Inference Benchmarking Guide: NVIDIA GenAI-Perf and NIM
This is the second post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. ...
11 MIN READ
Agentic AI / Generative AI
Apr 02, 2025
LLM Inference Benchmarking: Fundamental Concepts
This is the first post in the large language model latency-throughput benchmarking series, which aims to instruct developers on common metrics used for LLM...
15 MIN READ
Agentic AI / Generative AI
Feb 12, 2025
LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework
Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. ...
10 MIN READ
Agentic AI / Generative AI
Oct 08, 2024
Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy
This post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
7 MIN READ