Building truly photorealistic 3D environments for simulation is challenging. Even with advanced neural reconstruction methods such as 3D Gaussian Splatting (3DGS) and 3D Gaussian Unscented Transform (3DGUT), rendered views can still contain artifacts such as blurriness, holes, or spurious geometry, especially from novel viewpoints. These artifacts significantly reduce visual quality and can impede downstream tasks.
NVIDIA Omniverse NuRec brings real-world sensor data into simulation and includes a generative model, known as Fixer, to tackle this problem. Fixer is a diffusion-based model built on the NVIDIA Cosmos Predict world foundation model (WFM) that removes rendering artifacts and restores detail in under-constrained regions of a scene. This post walks you through how to use Fixer to transform a noisy 3D scene into a crisp, artifact-free environment ready for autonomous vehicle (AV) simulation. It covers using Fixer both offline during scene reconstruction and online during rendering, using a sample scene from the NVIDIA Physical AI open datasets on Hugging Face.
Step 1: Download a reconstructed scene
To get started, find a reconstructed 3D scene that exhibits some artifacts. The PhysicalAI-Autonomous-Vehicles-NuRec dataset on Hugging Face provides over 900 reconstructed scenes captured from real-world drives. First, log in to Hugging Face and agree to the dataset license. Each scene is provided as a USDZ file containing the 3D environment, along with a preview video rendered from it. For this walkthrough, download a sample scene's preview video with the Hugging Face CLI:
pip install "huggingface_hub[cli]"   # install the Hugging Face CLI if needed
hf auth login
# After logging in and accepting the dataset license, download the preview video
hf download nvidia/PhysicalAI-Autonomous-Vehicles-NuRec \
--repo-type dataset \
--include "sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/camera_front_wide_120fov.mp4" \
--local-dir ./nurec-sample
This command downloads the scene’s preview video (camera_front_wide_120fov.mp4) to your local machine. Fixer operates on images, not USD or USDZ files directly, so using the video frames provides a convenient set of images to work with.
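The download preserves the repository's folder structure under --local-dir, so a quick listing (shown below as a sanity check) confirms the video landed where the next step expects it:
# Confirm the preview video was downloaded to the expected location
ls -lh nurec-sample/sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/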
Next, extract frames with FFmpeg and use those images as input for Fixer:
# Create an input folder for Fixer
mkdir -p nurec-sample/frames-to-fix
# Extract frames
ffmpeg -i "sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/camera_front_wide_120fov.mp4" \
-vf "fps=30" \
-qscale:v 2 \
"nurec-sample/frames-to-fix/frame_%06d.jpeg"
Video 1 is the preview video showcasing the reconstructed scene and its artifacts. In this case, some surfaces have holes or blurred textures due to limited camera coverage. These artifacts are exactly what Fixer is designed to address.
Step 2: Set up the Fixer environment
Next, set up the environment to run Fixer.
Before proceeding, make sure you have Docker installed and GPU access enabled. Then complete the following steps to prepare the environment.
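If you're unsure whether your Docker installation can see the GPU, a quick check like the following should print your GPU details (nvidia/cuda:12.4.1-base-ubuntu22.04 is just one example of a CUDA base image; any recent tag works):
# Verify Docker is installed and the NVIDIA container runtime exposes the GPU
docker --version
docker run --rm --gpus=all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi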
Clone the Fixer repository
This obtains the necessary scripts for subsequent steps:
git clone https://github.com/nv-tlabs/Fixer.git
cd Fixer
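The files used later in this walkthrough, the inference script and the Cosmos Dockerfile, should now be present in the repository:
# These paths are referenced by the build and inference commands in Step 3
ls src/inference_pretrained_model.py Dockerfile.cosmos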
Download the pretrained Fixer checkpoint
The pretrained Fixer model is hosted on Hugging Face. To fetch this, use the Hugging Face CLI:
# Create directory for the model
mkdir -p models/
# Download the pretrained Fixer model files to models/
hf download nvidia/Fixer --local-dir models
This saves the files needed for inference in Step 3 to the models/ folder.
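In particular, the checkpoint path passed to the inference script in Step 3 should now exist:
# The inference command in Step 3 expects this file under models/
ls -lh models/pretrained/pretrained_fixer.pkl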
Step 3: Use online mode for real-time inference with Fixer
Online mode refers to using Fixer as a neural enhancer during rendering, cleaning up each frame as it is produced during simulation. Use the pretrained Fixer model for inference, which runs inside the Cosmos Predict Docker container.
Note that Fixer enhances rendered images from your scene. Make sure your frames are exported to a folder (in this walkthrough, the frames-to-fix/ directory from Step 1) and pass that folder to --input.
To run Fixer on all images in a directory, run the following commands:
# Build the container
docker build -t fixer-cosmos-env -f Dockerfile.cosmos .
# Run inference with the container
docker run -it --gpus=all --ipc=host \
-v $(pwd):/work \
-v /path/to/nurec-sample/frames-to-fix:/input \
--entrypoint python \
fixer-cosmos-env \
/work/src/inference_pretrained_model.py \
--model /work/models/pretrained/pretrained_fixer.pkl \
--input /input \
--output /work/output \
--timestep 250
Details about this command include the following:
- The current directory is mounted into the container at /work, allowing the container to access the files
- The frames extracted from the sample video with FFmpeg are mounted into the container at /input
- The script inference_pretrained_model.py (from the cloned Fixer repo src/ folder) loads the pre-trained Fixer model from the given path
- --input is the folder of input images (here, the mounted /input folder containing rendered frames with artifacts)
- --output is the folder where the enhanced images will be saved
- --timestep 250 sets the noise level the model uses for the denoising process
After running this command, the output/ directory will contain the fixed images. Note that the first few images may process more slowly as the model initializes, but inference will speed up for subsequent frames once the model is running.
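For a quick visual before/after comparison, you can re-encode the fixed frames into a video with FFmpeg. This assumes the script keeps the input filenames; adjust the pattern if your output names differ:
# Re-encode the enhanced frames into an H.264 video for side-by-side review
ffmpeg -framerate 30 -i output/frame_%06d.jpeg \
  -c:v libx264 -pix_fmt yuv420p fixed_preview.mp4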
Step 4: Evaluate the output
After applying Fixer to your images, you can evaluate how much it improved your reconstruction quality. This post reports Peak Signal-to-Noise Ratio (PSNR), a common metric for measuring pixel-level accuracy. Table 1 provides an example before/after comparison of the sample scene.
| Metric | Without Fixer | With Fixer |
| --- | --- | --- |
| PSNR ↑ (accuracy) | 16.5809 | 16.6147 |
Table 1. PSNR for the sample scene before and after applying Fixer
If you try other NuRec scenes from the Physical AI open datasets, or your own neural reconstructions, you can measure the quality improvement from Fixer with the same metrics. Refer to the metrics documentation for instructions on how to compute these values.
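The official metrics pipeline is the right tool for reproducing the numbers above, but as a rough sanity check you can estimate PSNR with FFmpeg's built-in psnr filter, assuming you have ground-truth frames (for example, the original camera captures) that line up one-to-one with the rendered frames. The ground-truth path below is a placeholder:
# Compare the fixed frame sequence against matching reference frames; average PSNR is printed at the end
ffmpeg -framerate 30 -i output/frame_%06d.jpeg \
       -framerate 30 -i ground-truth/frame_%06d.jpeg \
       -lavfi psnr -f null -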
In qualitative terms, scenes processed with Fixer look significantly more realistic. Surfaces that were previously smeared are now reconstructed with plausible details, fine textures such as road markings become sharper, and the improvements remain consistent across frames without introducing noticeable flicker.
Additionally, Fixer is effective at correcting artifacts when novel view synthesis is introduced. Video 3 shows the application of Fixer to a NuRec scene rendered from a novel viewpoint obtained by shifting the camera 3 meters to the left. When run on top of the novel view synthesis output, Fixer reduces view-dependent artifacts and improves the perceptual quality of the reconstructed scene.
Summary
This post walked you through downloading a reconstructed scene, setting up Fixer, and running inference to clean rendered frames. The outcome is a sharper scene with fewer reconstruction artifacts, enabling more reliable AV development.
To use Fixer with Robotics NuRec scenes, download a reconstructed scene from the PhysicalAI-Robotics-NuRec dataset on Hugging Face and follow the steps presented in this post.
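The download flow is the same as in Step 1. For example (the repo ID nvidia/PhysicalAI-Robotics-NuRec and file layout are assumptions here; check the dataset card for the exact paths to include):
# Adjust --include to a specific scene once you've browsed the dataset layout
hf download nvidia/PhysicalAI-Robotics-NuRec \
  --repo-type dataset \
  --local-dir ./nurec-robotics-sample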
Ready for more? Learn how Fixer can be post-trained to match specific ODDs and sensor configurations. For information about how Fixer can be used during reconstruction (Offline mode), see Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models.
Learn more about NVIDIA Research at NeurIPS.
At the forefront of AI innovation, NVIDIA Research continues to push the boundaries of technology in machine learning, self-driving cars, robotics, graphics, simulation, and more. Explore the cutting-edge breakthroughs now.
Stay up to date by subscribing to NVIDIA news and following NVIDIA Omniverse on Discord and YouTube.
- Visit our Omniverse developer page to get all the essentials you need to get started
- Access a collection of OpenUSD resources, including the self-paced Learn OpenUSD, Digital Twins, and Robotics training curriculums
- Tune into upcoming OpenUSD Insiders livestreams and connect with the NVIDIA Developer Community
Get started with developer starter kits to quickly develop and enhance your own applications and services.