Megatron

Jun 23, 2026

Maximize AI Factory Energy Efficiency Through Full-Stack Inference and Training Optimizations

Power can account for 40% of the operating expenses (OpEx) to run an AI factory. Each watt can be spent on overhead, data ingestion, training, or generating...

10 MIN READ

Jun 15, 2026

Fine-Tuning Biological Foundation Models with LoRA Using NVIDIA BioNeMo Recipes

Foundation models are reshaping computational biology. Pretrained on massive corpora of protein or genomic sequences, models such as ESM2 (a protein language...

12 MIN READ

May 20, 2026

Mastering Agentic Techniques: AI Agent Customization

Autonomous AI agents are taking on all types of work for businesses: routing logistics fleets, triaging support tickets, generating code, and orchestrating...

16 MIN READ

Apr 22, 2026

Advancing Emerging Optimizers for Accelerated LLM Training with NVIDIA Megatron

Higher-order optimization algorithms such as Shampoo have been effectively applied in neural network training for at least a decade. These methods have...

9 MIN READ

Mar 09, 2026

Implementing Falcon-H1 Hybrid Architecture in NVIDIA Megatron Core

In the rapidly evolving landscape of large language model (LLM) development, NVIDIA Megatron Core has emerged as the foundational framework for training...

9 MIN READ

Jan 28, 2026

Speeding Up Variable-Length Training with Dynamic Context Parallelism and NVIDIA Megatron Core

This post introduces Dynamic Context Parallelism (Dynamic-CP), a scheduling approach in NVIDIA Megatron Core used for LLM post-training or DiT pre-training. It...

12 MIN READ

Aug 20, 2025

Reinforcement Learning with NVIDIA NeMo-RL: Megatron-Core Support for Optimized Training Throughput

The initial release of NVIDIA NeMo-RL included training support through PyTorch DTensor (otherwise known as FSDP2). This backend enables native integration...

7 MIN READ

Jul 12, 2024

Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities

First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of...

11 MIN READ