NeMo
Sep 10, 2024
Streamlining Data Processing for Domain Adaptive Pretraining with NVIDIA NeMo Curator
Domain-adaptive pretraining (DAPT) of large language models (LLMs) is an important step towards building domain-specific models. These models demonstrate...
16 MIN READ
Sep 10, 2024
Post-Training Quantization of LLMs with NVIDIA NeMo and NVIDIA TensorRT Model Optimizer
As large language models (LLMs) are becoming even bigger, it is increasingly important to provide easy-to-use and efficient deployment paths because the cost of...
10 MIN READ
Sep 05, 2024
Achieving State-of-the-Art Zero-Shot Waveform Audio Generation across Audio Types
Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously...
6 MIN READ
Sep 05, 2024
Low Latency Inference Chapter 1: Up to 1.9x Higher Llama 3.1 Performance with Medusa on NVIDIA HGX H200 with NVLink Switch
As large language models (LLMs) continue to grow in size and complexity, multi-GPU compute is a must-have to deliver the low latency and high throughput that...
5 MIN READ
Aug 28, 2024
Build an Enterprise-Scale Multimodal PDF Data Extraction Pipeline with an NVIDIA NIM Agent Blueprint
Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images,...
8 MIN READ
Aug 21, 2024
Practical Strategies for Optimizing LLM Inference Sizing and Performance
As the use of large language models (LLMs) grows across many applications, such as chatbots and content creation, it's important to understand the process of...
2 MIN READ
Aug 21, 2024
Mistral-NeMo-Minitron 8B Foundation Model Delivers Unparalleled Accuracy
Last month, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a state-of-the-art large language model (LLM). Mistral NeMo 12B consistently outperforms...
5 MIN READ
Aug 16, 2024
Leverage the Latest Open Models for Synthetic Data Generation with NVIDIA Nemotron-4 340B
Since the introduction and subsequent wide adoption of large language models (LLMs), data has been the lifeblood of businesses building accurate and safe AI...
9 MIN READ
Aug 15, 2024
NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support
NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art toolkit of model optimization techniques, including quantization...
5 MIN READ
Aug 14, 2024
How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model
Large language models (LLMs) are now a dominant force in natural language processing and understanding, thanks to their effectiveness and versatility. LLMs such...
12 MIN READ
Aug 07, 2024
Building AI Agents with NVIDIA NIM Microservices and LangChain
NVIDIA NIM, part of NVIDIA AI Enterprise, now supports tool-calling for models like Llama 3.1. It also integrates with LangChain to provide you with a...
3 MIN READ
Aug 06, 2024
Accelerating Hebrew LLM Performance with NVIDIA TensorRT-LLM
Developing a high-performing Hebrew large language model (LLM) presents distinct challenges stemming from the rich and complex nature of the Hebrew language...
8 MIN READ
Aug 05, 2024
Securing Generative AI Deployments with NVIDIA NIM and NVIDIA NeMo Guardrails
As enterprises adopt generative AI applications powered by large language models (LLMs), there is an increasing need to implement guardrails to ensure safety...
6 MIN READ
Aug 05, 2024
Developing Robust Georgian Automatic Speech Recognition with FastConformer Hybrid Transducer CTC BPE
Building an effective automatic speech recognition (ASR) model for underrepresented languages presents unique challenges due to limited data resources. In...
9 MIN READ
Aug 01, 2024
Deliver Personalized Retail Experiences with an AI-Powered Shopping Advisor
Imagine being able to put your best sales associate in front of every customer for every interaction. Your best sales associate offers product recommendations...
4 MIN READ
Jul 31, 2024
Curating Custom Datasets for LLM Parameter-Efficient Fine-Tuning with NVIDIA NeMo Curator
In a recent post, we discussed how to use NVIDIA NeMo Curator to curate custom datasets for pretraining or continuous training use cases of large language...
11 MIN READ