FourCastNet 3 Enables Fast and Accurate Large Ensemble Weather Forecasting with Scalable Geometric ML

Jul 29, 2025
By Boris Bonev, Thorsten Kurth, Marius Koch, Dallas Foster, Niall Robinson, Andrea Paris, Mike Pritchard and Alexander Keller

FourCastNet3 (FCN3) is the latest AI global weather forecasting system from NVIDIA Earth-2. FCN3 offers an unprecedented combination of probabilistic skill, computational efficiency, spectral fidelity, ensemble calibration, and stability at subseasonal timescales. Its medium-range forecasting accuracy matches that of leading machine learning models, such as GenCast, and exceeds that of traditional numerical weather prediction systems, such as IFS-ENS. 

A single 60-day FCN3 rollout at 0.25° spatial resolution and 6-hourly temporal resolution is computed in under four minutes on a single NVIDIA H100 Tensor Core GPU, an 8x speedup over GenCast and a 60x speedup over IFS-ENS.
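
The rollout figures above imply a per-step cost that is easy to check by hand. The short sketch below does the arithmetic; it treats "under four minutes" as an upper bound of exactly 240 seconds, and the helper name `steps_in_rollout` is made up for this illustration.

```python
# Back-of-the-envelope check of the rollout figures quoted in the text.
HOURS_PER_STEP = 6            # 6-hourly temporal resolution
ROLLOUT_DAYS = 60             # length of the quoted rollout
WALL_CLOCK_SECONDS = 4 * 60   # "under four minutes", used as an upper bound


def steps_in_rollout(days: int, hours_per_step: int = HOURS_PER_STEP) -> int:
    """Number of autoregressive steps needed to cover `days` of forecast."""
    return days * 24 // hours_per_step


n_steps = steps_in_rollout(ROLLOUT_DAYS)
seconds_per_step = WALL_CLOCK_SECONDS / n_steps  # upper bound per step

print(f"{n_steps} steps, at most {seconds_per_step:.2f} s per step")
```

By the same arithmetic, a 15-day forecast needs a quarter of the steps and therefore roughly a quarter of the wall-clock time.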

It also has remarkable calibration and spectral fidelity, with ensemble members retaining realistic spectral properties even at extended lead times of 60 days. FCN3 demonstrates a significant leap towards data-driven weather prediction with large ensembles from medium-range to subseasonal timescales.

5 FourCastNet3 ensemble members.
Figure 1. 2-week rollout of 15 FourCastNet3 ensemble members, displaying surface wind speeds during this period

FCN3 architecture 

FCN3 architecture diagram.
Figure 2. FCN3 is a neural operator for spherical signals that maps atmospheric and surface variables at the current time step to the next. Stochasticity is introduced through a hidden Markov model approach, which takes a spherical noise variable as conditioning input

FourCastNet3 employs a fully convolutional, spherical neural operator architecture, based on spherical signal processing primitives (see Figure 2). Unlike FourCastNet2, which is based on the Spherical Fourier Neural Operator, FCN3 uses local spherical convolutions alongside spectral convolutions.

These convolutions are parameterized using Morlet wavelets and formulated in the framework of discrete-continuous group convolutions. This approach enables anisotropic, localized filters well-suited to localized atmospheric phenomena, while also guaranteeing computational efficiency through a custom implementation in NVIDIA CUDA.
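
To give a feel for what an anisotropic, localized filter looks like, here is a minimal planar sketch of a real-valued Morlet-style kernel: a Gaussian envelope multiplied by an oriented cosine carrier. This is an illustration only. FCN3's filters are defined on the sphere through discrete-continuous convolutions, and the function name and the parameters `sigma`, `wavelength`, and `theta` are assumptions made for this example rather than the model's actual parameterization.

```python
import numpy as np


def morlet_filter(size=9, sigma=2.0, wavelength=4.0, theta=0.0):
    """Planar Morlet-style kernel: Gaussian envelope times an oriented cosine.

    size       -- kernel width/height in grid points (odd)
    sigma      -- width of the Gaussian envelope (controls localization)
    wavelength -- wavelength of the oscillating carrier
    theta      -- orientation of the carrier in radians (controls anisotropy)
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction in which the carrier oscillates.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    kernel = envelope * carrier
    # Remove the mean so the filter does not respond to constant fields.
    return kernel - kernel.mean()


k_east = morlet_filter(theta=0.0)          # oscillates along x
k_north = morlet_filter(theta=np.pi / 2)   # oscillates along y
```

The two kernels share the same envelope but respond to structures with different orientations, which is the sense in which such filters are anisotropic.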

FourCastNet3 probabilistic scores.
Figure 3. Probabilistic scores of FCN3 computed on 12-hourly initial conditions throughout the out-of-sample validation year 2020. From top to bottom: continuous ranked probability score (CRPS), ensemble-mean root mean square error (RMSE), spread-skill ratio (SSR), and rank histograms are displayed

FCN3 introduces stochasticity at every predictive step through a latent noise variable whose evolution is governed by a diffusion process on the sphere. This hidden-Markov formulation enables efficient one-step generation of ensemble members, a key advantage over diffusion model-based approaches. FCN3 is trained jointly as an ensemble, minimizing a composite loss function that combines the continuous ranked probability score (CRPS) in space and in the spectral domain. This approach ensures that FCN3 learns the correct spatial correlations in the underlying stochastic atmospheric processes.
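
For readers unfamiliar with the CRPS, the sketch below shows the standard ensemble estimator, evaluated independently at each grid point. It covers only a pointwise spatial term. The spectral-domain term and the weighting between the two are not reproduced here, and whether FCN3 uses the "fair" or the biased normalization is not stated in this post, so the `fair` flag is an assumption of this example.

```python
import numpy as np


def ensemble_crps(members: np.ndarray, target: np.ndarray, fair: bool = True) -> np.ndarray:
    """Ensemble CRPS estimate per grid point.

    members -- array of shape (M, ...) holding M ensemble members
    target  -- array of shape (...) holding the verifying field
    fair    -- use the M*(M-1) normalization (unbiased for finite M)
    """
    m = members.shape[0]
    # Mean absolute error of the members against the target.
    skill = np.abs(members - target).mean(axis=0)
    # Sum of absolute differences over all ordered member pairs.
    pair_sum = np.abs(members[:, None] - members[None, :]).sum(axis=(0, 1))
    denom = m * (m - 1) if fair else m * m
    return skill - 0.5 * pair_sum / denom


# Two members that bracket the target symmetrically.
members = np.array([[0.0], [2.0]])
target = np.array([1.0])
crps_fair = ensemble_crps(members, target, fair=True)
crps_biased = ensemble_crps(members, target, fair=False)
```

The pairwise term rewards ensemble spread, so a collapsed ensemble is penalized even if its mean is accurate. Note that this naive version allocates an array of size M × M per grid point, which is fine for a sketch but not for large ensembles.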

Scaling ML models is often crucial to achieving competitive skill, but the effects of scale haven’t been investigated in data-driven weather models. FCN3 is unusual in its computational ambition. To scale it, we introduce a novel paradigm for model-parallelism inspired by domain decomposition in traditional numerical weather modeling. 

This approach enables us to fit larger models into VRAM during training by splitting the model across multiple devices, while lowering the disk I/O per device. To enable this, spatial operations such as convolutions are implemented in a distributed fashion using the NVIDIA Collective Communications Library (NCCL). Using this technology, FCN3 is trained on up to 1,024 GPUs, using simultaneous domain, batch, and ensemble parallelism. Check out our training code.
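
The idea behind domain decomposition can be illustrated without any GPUs. In the single-process sketch below, a periodic stencil along longitude is applied once to the whole field and once to separate longitude chunks, each padded with a halo of neighboring values. In a real distributed setting, that halo is what would be exchanged between devices. This is not the FCN3 or makani implementation and does not use NCCL; the function names and the one-dimensional decomposition are assumptions made for illustration.

```python
import numpy as np


def global_stencil(field, weights):
    """Apply a periodic stencil along the last (longitude) axis."""
    radius = len(weights) // 2  # assumes an odd number of weights
    out = np.zeros(field.shape, dtype=float)
    for k, w in enumerate(weights):
        # np.roll(field, s)[j] == field[j - s], so this reads field[j + k - radius].
        out += w * np.roll(field, radius - k, axis=-1)
    return out


def decomposed_stencil(field, weights, n_ranks):
    """Same stencil, computed chunk by chunk with halo points."""
    radius = len(weights) // 2
    n_lon = field.shape[-1]
    chunk = n_lon // n_ranks  # assumes n_lon is divisible by n_ranks
    pieces = []
    for rank in range(n_ranks):
        start = rank * chunk
        # Local indices plus `radius` halo points on each side (periodic wrap).
        idx = np.arange(start - radius, start + chunk + radius) % n_lon
        local = field[..., idx]
        out = np.zeros(field.shape[:-1] + (chunk,), dtype=float)
        for k, w in enumerate(weights):
            out += w * local[..., k:k + chunk]
        pieces.append(out)
    # Gathering the chunks stands in for collecting results across devices.
    return np.concatenate(pieces, axis=-1)


rng = np.random.default_rng(0)
field = rng.standard_normal((4, 16))   # (latitude, longitude)
weights = [0.25, 0.5, 0.25]
reference = global_stencil(field, weights)
distributed = decomposed_stencil(field, weights, n_ranks=4)
```

Each chunk only needs its own data plus a thin halo, which is why the per-device memory footprint shrinks as more devices are added.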

FourCastNet3 outperforms the best physics-based ensemble model, IFS-ENS, and matches GenCast in terms of predictive skill (see Figure 3). On a single NVIDIA H100, FCN3 produces a single 15-day forecast at 6-hourly temporal resolution and 0.25° spatial resolution in a minute, an 8x speedup over GenCast and a 60x speedup over IFS-ENS.

Its probabilistic ensembles exhibit spread-skill ratios consistently near one, indicating well-calibrated forecasts where the predicted uncertainty aligns closely with observed atmospheric variability. Rank histograms and additional diagnostics confirm that ensemble members remain interchangeable with real-world observations, affirming the reliability and trustworthiness of FCN3’s predictions. 
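
The two calibration diagnostics mentioned above can be sketched in a few lines. The version below uses a common definition of the spread-skill ratio, including the sqrt((M+1)/M) finite-ensemble correction. The exact definition and any area weighting used for Figure 3 are not given in this post, so treat these details as assumptions. The synthetic data simply draws members and truth from the same distribution, which is the perfectly calibrated case.

```python
import numpy as np


def spread_skill_ratio(members, truth):
    """Ratio of ensemble spread to ensemble-mean RMSE.

    members -- array of shape (M, N): M members at N verification points
    truth   -- array of shape (N,)
    """
    m = members.shape[0]
    ens_mean = members.mean(axis=0)
    rmse = np.sqrt(np.mean((ens_mean - truth) ** 2))
    spread = np.sqrt(np.mean(members.var(axis=0, ddof=1)))
    # Finite-ensemble correction so a calibrated ensemble scores 1.
    return np.sqrt((m + 1) / m) * spread / rmse


def rank_histogram(members, truth):
    """Count how often the truth falls at each rank among the members."""
    ranks = (members < truth).sum(axis=0)  # integer in 0..M for each point
    return np.bincount(ranks, minlength=members.shape[0] + 1)


# Synthetic, perfectly calibrated case: truth is exchangeable with members.
rng = np.random.default_rng(0)
n_members, n_points = 10, 200_000
truth = rng.standard_normal(n_points)
members = rng.standard_normal((n_members, n_points))

ssr = spread_skill_ratio(members, truth)
hist = rank_histogram(members, truth)
```

In this setting the ratio is close to one and the histogram is approximately flat. An underdispersive ensemble would instead give a ratio below one and a U-shaped histogram.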

Critically, FCN3 preserves atmospheric spectral signatures across all scales, faithfully reproducing the energy cascade and sharpness of real-world weather patterns even at extended lead times of up to 60 days. Unlike many ML models that blur high-frequency features or devolve into noisy artifacts over time, FCN3 maintains stable, physically realistic spectra—enabling accurate, sharp, and physically consistent forecasts well into the subseasonal range. 

This is shown in Figure 4, which depicts FCN3 predictions of 500 hPa wind intensities initialized on February 11, 2020, shortly before Storm Dennis made landfall over Europe. FCN3 accurately captures the magnitude of wind intensities and their variability across different length scales, illustrated by the faithful angular power spectral density of the respective predictions. This fidelity persists even at extended rollouts of 30 days (720 hours) or longer.
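
A simple way to check for blurring in a forecast field is to look at how its power is distributed across wavenumbers. The sketch below computes a zonal (along-longitude) power spectrum with a plain FFT, averaged over latitude without area weighting. It is a simplified stand-in for the angular power spectral density discussed above, which would require a spherical harmonic transform. The function name is an assumption of this example.

```python
import numpy as np


def zonal_power_spectrum(field):
    """Mean power per zonal wavenumber for a (n_lat, n_lon) field."""
    n_lon = field.shape[-1]
    # Normalize so a unit-amplitude cosine yields coefficient magnitude 0.5.
    coeffs = np.fft.rfft(field, axis=-1) / n_lon
    power = np.abs(coeffs) ** 2
    # Unweighted average over latitude (no cos(lat) weighting in this sketch).
    return power.mean(axis=0)


# Test signal: a pure wavenumber-3 wave repeated at every latitude.
lon = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
field = np.tile(np.cos(3.0 * lon), (8, 1))
spectrum = zonal_power_spectrum(field)
peak_wavenumber = int(np.argmax(spectrum))
```

Comparing such spectra between a forecast and a reference analysis at increasing lead times shows whether high-wavenumber power is being lost (blurring) or spuriously gained (noise).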

Case study depicting FCN3 predictions of Storm Dennis. 
Figure 4. FourCastNet3 prediction of Storm Dennis initialized on 2020-02-11 at 00:00:00 UTC. The plot depicts wind speeds at a pressure level of 850 hPa and isohypses (height contours) of the 500 hPa geopotential height.

Getting started with FourCastNet3

The fully trained FourCastNet3 checkpoint is available on NVIDIA NGC.

An easy way to run FCN3 inference is using Earth2Studio. To run a single 4-member ensemble inference, you can execute the following code:

from earth2studio.models.px import FCN3
from earth2studio.data import NCAR_ERA5
from earth2studio.io import NetCDF4Backend
from earth2studio.perturbation import Zero
from earth2studio.run import ensemble as run
import numpy as np

# load default package
model = FCN3.load_model(FCN3.load_default_package())

# determine output variables
out_vars = ["u10m", "v10m", "t2m", "msl", "tcwv"]

# data source initial condition
ds = NCAR_ERA5()
io = NetCDF4Backend("fcn3_ensemble.nc", backend_kwargs={"mode": "w"})

# no perturbation required due to hidden Markov formulation of FCN3
perturbation = Zero()

# invoke inference with 4 ensemble members
run(time=["2024-09-24"],
    nsteps=16,
    nensemble=4,
    prognostic=model,
    data=ds,
    io=io,
    perturbation=perturbation,
    batch_size=1,
    output_coords={"variable": np.array(out_vars)},
)

Results from this inference are depicted in Figure 5. For optimal FCN3 performance, we recommend installing torch-harmonics with custom CUDA extensions enabled and using automatic mixed precision in bf16 format during inference (which is the default in Earth2Studio). If you want to run custom FCN3 inference or train it yourself, you can find the code in makani.

FCN3 predictions depicting total column water vapour and 10-meter zonal wind velocity alongside their respective ensemble standard deviations.
Figure 5. FourCastNet3 predictions at 96h lead time generated with the Earth2Studio script. The run was initialized on 2024-09-24 at 00:00:00 UTC. The top row depicts the tcwv (total column water vapor) field and the u10m (10-meter zonal wind velocity) field of ensemble member 2, respectively. The bottom row shows the standard deviation of both fields taken over all four ensemble members
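
The quantities in the caption above follow directly from the script settings. The sketch below relates `nsteps=16` to the 96h lead time and shows how a per-grid-point ensemble standard deviation could be computed. It works on a synthetic NumPy array rather than on the script's NetCDF output, and the helper name `ensemble_std`, the array layout, and the use of the population standard deviation are assumptions of this example.

```python
import numpy as np

HOURS_PER_STEP = 6   # FCN3 time step, per the text
NSTEPS = 16          # value passed as nsteps in the script
lead_time_hours = NSTEPS * HOURS_PER_STEP


def ensemble_std(fields):
    """Standard deviation over the ensemble axis.

    fields -- array of shape (n_ensemble, n_lat, n_lon)
    Returns an array of shape (n_lat, n_lon).
    """
    # Population standard deviation (ddof=0); an assumption for this sketch.
    return fields.std(axis=0)


# Synthetic stand-in for one variable from a 4-member ensemble.
rng = np.random.default_rng(0)
fields = rng.standard_normal((4, 6, 12))
spread_map = ensemble_std(fields)
```

With only four members the spread estimate is noisy, so larger ensembles are preferable when the standard deviation itself is the quantity of interest.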

Learn more about FCN3

Learn more about FourCastNet3 with these resources:

Full author list

Boris Bonev (NVIDIA), Thorsten Kurth (NVIDIA), Ankur Mahesh (LBNL), Mauro Bisson (NVIDIA), Karthik Kashinath (NVIDIA), Anima Anandkumar (Caltech), William D. Collins (LBNL), Mike Pritchard (NVIDIA), Alex Keller (NVIDIA)

About the Authors

About Boris Bonev
Boris Bonev received his Ph.D. in applied mathematics with a focus on numerical methods for partial differential equations. At NVIDIA, he works on accelerated scientific computing involving novel algorithms and machine learning. He is excited about creating scalable algorithms from mathematical principles and mapping them to HPC systems.
About Thorsten Kurth
Thorsten Kurth works at NVIDIA on optimizing scientific codes for GPU-based supercomputers. His focus is on providing optimized deep learning applications for HPC systems, including MLPerf HPC benchmark applications. These include end-to-end optimizations such as input pipeline including I/O tuning and distributed training. In 2018, he was awarded the Gordon Bell Prize for the first deep learning application that achieved more than 1 exaop peak performance on the OLCF Summit HPC system. In 2020, he was awarded the Gordon Bell Special Prize for HPC-based COVID-19 research for efficiently generating large ensembles of scientifically relevant spike trimer conformations using the AI-driven MD simulations workflow.
About Marius Koch
Marius Koch is a solution architect for climate and sustainability at NVIDIA, specializing in AI-driven physics modeling for climate change prediction, mitigation, and adaptation. His work focuses on risk assessment of extreme weather events and on CO2 storage in subsurface reservoirs. He holds a PhD in Aeronautics from Imperial College London and an MSc in Aerospace Engineering from the University of Stuttgart.
About Dallas Foster
Dallas Foster is a senior deep learning software engineer for HPC and AI at NVIDIA. He received his PhD in mathematics at Oregon State University and has worked at Los Alamos National Laboratory, the National Center for Atmospheric Research, and MIT. As a member of the PhysicsNeMo team at NVIDIA, he has a particular focus on the application and deployment of deep learning for weather forecasting and molecular dynamics.
About Niall Robinson
Niall Robinson is the developer relationship manager for Earth-2, NVIDIA’s platform for weather and climate. He collaborates with partners to develop innovative new solutions using Earth-2. Before NVIDIA, Niall worked at the U.K. Met Office, applying emerging technologies to solve problems. He started his career as a climate scientist working in decadal forecasting and air-quality science. He has a Ph.D. in atmospheric science from Manchester University.
About Andrea Paris
Andrea Paris is an intern in the DevTech HPC Visualization group at NVIDIA. Prior to this, he was involved in climate research at NASA’s Jet Propulsion Laboratory and MIT, where he focused on high-performance computing and numerical methods. He holds a Master’s degree in Mechanical Engineering from ETH Zurich.
About Mike Pritchard
Mike Pritchard is the director of climate simulation research for NVIDIA Research, where he works on Earth-2. He leads a team of researchers who are exploring the potential of AI to advance weather and climate simulation. Mike previously served as a professor of Earth system sciences at the University of California, Irvine, where he led a research lab studying climate dynamics and next-generation atmospheric simulation algorithms starting in 2013.
About Alexander Keller
Alexander Keller is a senior director of research at NVIDIA working on the foundations of graphics, communications, and machine learning. Before his current role, he was the chief scientist of mental images. Prior to moving to industry, he worked as a full professor of computer graphics and scientific computing at Ulm University. Taking advantage of the unique synergy of machine learning and ray tracing, his research group released the first differentiable link-level simulator for 6G research.

