AI is reshaping scientific research and innovation. Scientists can leverage AI to generate, summarize, combine, and analyze scientific data. AI models can find patterns in data that human scientists have overlooked, find connections between seemingly unrelated fields and phenomena, and even propose new hypotheses to be tested.
An AI co-scientist is a collaborative, multi-agent AI system designed to assist human researchers in generating, reviewing, and refining novel scientific hypotheses, research proposals, and experimental plans. It is a virtual scientific partner that leverages advanced reasoning, interdisciplinary knowledge synthesis, and iterative feedback to accelerate scientific discovery. Working in partnership with human experts, it can design experiments, analyze data, and test results, enabling rigorous, reproducible research.
Large language models (LLMs) are customized using knowledge and data from both text and non-text sources. The co-scientist uses this knowledge to generate new hypotheses and run simulations to test and validate these ideas. Human-in-the-loop collaboration is essential.
This post explores how NVIDIA is powering these AI co-scientists. It showcases two agents being developed by Los Alamos National Laboratory (LANL) to address two of the toughest challenges in science today: inertial confinement fusion (ICF) hypothesis generation and cancer treatment.
AI co-scientist for ICF hypothesis generation
LANL and NVIDIA are collaborating on a multiphase process to develop a co-scientist for ICF hypothesis generation.
Fusion is the process that powers the stars. Achieving energy generation through fusion on Earth is one of the greatest scientific challenges. ICF achieves nuclear fusion by rapidly compressing and heating a tiny pellet of fuel with intense energy sources like lasers, causing the nuclei to fuse and release energy. ICF is also used to understand the exotic properties of matter, such as those found in the interior of Jupiter, and for national security purposes.
Because ICF is a highly coupled multiphysics, non-linear problem, predictability of large-scale codes remains a major scientific challenge. This complexity arises because ICF requires simulation of several physical phenomena that can interact in unpredictable ways and operate across vastly different spatial and temporal scales. Results from experiments at large laser facilities can deviate from predictions due to changes in initial conditions or choice of target-design parameters. To accelerate understanding and progress, it is essential to leverage all available tools—including AI.

In the first phase of the process, LANL is leveraging open source NVIDIA NeMo framework libraries, including:
- NeMo Curator for data curation
- NeMo 2.0 for continual pretraining and fine-tuning
- NeMo RL for reinforcement learning on the Llama Nemotron Super 1.5 model, making it a more domain-aware reasoning model that can serve as the basis for building a trusted AI co-scientist
Figure 2 shows the steps involved in transforming Llama Nemotron Super 1.5 into a reasoning model for ICF physics. The steps encompass preparing datasets for domain-adapted pretraining (DAPT), supervised fine-tuning (SFT), and reasoning traces, using open-access documents covering physics and ICF from public sources including CORE, arXiv, and OSTI.gov.
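To make the data-preparation step concrete, the sketch below packages documents into JSONL training records for the three stages named above. The record schemas (`text`, `input`, `trace`, `output`) are illustrative assumptions; the actual NeMo Curator and NeMo 2.0 dataset formats may differ.

```python
import json

# Hedged sketch of packaging documents into per-stage training records.
# Field names are illustrative, not the official NeMo dataset schema.

def dapt_record(text: str) -> dict:
    # Domain-adapted pretraining consumes raw domain text.
    return {"text": text}

def sft_record(question: str, answer: str) -> dict:
    # Supervised fine-tuning pairs an instruction with a reference answer.
    return {"input": question, "output": answer}

def reasoning_record(question: str, trace: str, answer: str) -> dict:
    # Reasoning-trace data keeps the intermediate chain of thought.
    return {"input": question, "trace": trace, "output": answer}

records = [
    dapt_record("ICF implosions compress a fuel capsule with laser drive."),
    sft_record("What drives an ICF implosion?", "Ablation pressure from lasers."),
    reasoning_record(
        "Why do ICF yields deviate from 1D predictions?",
        "Consider asymmetries and mix at the fuel-ablator interface.",
        "Hydrodynamic instabilities degrade compression.",
    ),
]

# One JSON object per line: the usual layout for LLM training corpora.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

Keeping each stage's records in a separate schema makes it easy to route the same source documents into whichever stage of the pipeline needs them.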
To verify that the model is becoming knowledgeable on ICF, academic and custom benchmarks are used, including questions generated by subject matter experts.

The ultimate goal of this work is to solve some of the most challenging problems in fusion research, including improving the performance of ongoing ICF implosion experiments at the National Ignition Facility and the OMEGA laser. This involves developing and benchmarking scientific concepts against computational simulations and physical experiments.
By refining designs and integrating feedback from experimental results, the AI co-scientist will provide insights that will inform new experiments at current and next-generation ICF facilities. This enables progress toward more efficient and reliable fusion energy solutions, while also addressing key questions relating to fundamental properties of matter and national security.
AI co-scientist for cancer treatment
Targeted alpha therapy (TAT) can be a highly effective treatment against cancer when precisely delivered. Radioactive atoms delivered to a tumor emit energetic alpha particles that destroy nearby cancer cells. However, imprecise targeting can cause these powerful emissions to damage healthy tissues, leading to unintended side effects.
To minimize such collateral damage, TAT relies on specialized chelator molecules that bind and transport the radioactive atoms to tumor sites. Designing effective chelators that remain stable and selective within complex biological environments remains a major research challenge.
Because the metals used in TAT have large radii, very few molecules are known to reliably bind to them. This limits researchers’ ability to apply data-driven approaches in designing new and improved therapeutic agents.
LANL is building an agentic AI discovery platform that combines generative AI and simulation in a single workflow to identify new and improved chelator molecules. By helping to rapidly search vast chemical spaces, this research is paving the way for safer, more effective, and more targeted therapies.
AI plays a central role in answering the fundamental design questions involved, such as "What makes a good molecule?" and "Which molecules fit that behavior?" To facilitate the process, LANL has adapted the NVIDIA Llama Nemotron Super 1.5 and GenMol models to focus on molecular discovery and optimization.

Workflow overview
In this workflow, the agent leverages Llama Nemotron Super 1.5 for hypothesis generation. Hypothesis generation works by prompting the LLM with a description of the problem and a list of the hypotheses it has tested before. The LLM then identifies the most promising hypothesis for the next iteration of the discovery loop, given its base knowledge and the assessment of prior hypotheses.
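The prompting pattern described above can be sketched as a small function that assembles the problem statement and prior results into a single prompt. The function name, record fields, and example hypotheses are hypothetical, not LANL's actual code; the point is only the structure: problem description, scored history, and a request for the next hypothesis.

```python
# Hedged sketch: how a hypothesis-selection prompt might be assembled.
# All names and fields here are illustrative assumptions.

def build_hypothesis_prompt(problem: str, tested: list) -> str:
    """Compose an LLM prompt from the problem and prior scored hypotheses."""
    lines = [
        "You are assisting with chelator design for targeted alpha therapy.",
        f"Problem: {problem}",
        "Previously tested hypotheses (with simulation scores):",
    ]
    for h in tested:
        lines.append(f"- {h['hypothesis']} -> binding score {h['score']:.2f}")
    lines.append(
        "Given these results, propose the single most promising "
        "hypothesis to test in the next design iteration."
    )
    return "\n".join(lines)

prompt = build_hypothesis_prompt(
    "Find chelators that bind large radioactive metal atoms selectively",
    [
        {"hypothesis": "Add more donor oxygen atoms", "score": 0.41},
        {"hypothesis": "Enlarge the macrocycle cavity", "score": 0.63},
    ],
)
print(prompt)
```

Feeding the scored history back into the prompt is what lets the model improve its proposals from one iteration of the discovery loop to the next.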
GenMol is then used to generate a set of molecules to test the hypotheses. GenMol produces molecules that resemble known drugs and can be tuned to satisfy scientific criteria, such as the traits listed in the LLM hypothesis, based on prompts or scientists’ design requirements.
The workflow then uses Architector to construct chemical complexes between the chelator and the radioactive atom.
Next, the workflow shifts to computational modeling on Venado, the LANL NVIDIA-powered supercomputer. 3D molecular structures are modeled with high-performance quantum simulations to predict key chemical properties.
Finally, this simulation data is used to assess the validity of the hypothesis proposed by the LLM, informing its next decisions. Both models are packaged as NVIDIA NIM microservices that help to automatically select the best performance settings. With accelerated computing, scientists close the loop between hypotheses and generated data to rapidly adapt and cycle through further rounds of design and simulation.
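The closed loop described above can be sketched end to end with stand-in components. Everything here is a toy stand-in: `propose_hypothesis` would call the LLM, `generate_molecules` would call GenMol, and `score_molecule` would run the quantum-chemistry simulation; none of these are real APIs. The sketch only shows how feedback from each round steers the next.

```python
import random

# Toy sketch of the hypothesis -> generation -> simulation -> assessment loop.
# Hypotheses and molecules are reduced to a single numeric "design knob".
random.seed(0)

def propose_hypothesis(history):
    # Stand-in for the LLM: nudge the search from the best prior round.
    best = max(history, key=lambda h: h["score"], default={"bias": 0.0})
    return {"bias": best["bias"] + 0.1}

def generate_molecules(hypothesis, n=4):
    # Stand-in for GenMol: emit candidates near the hypothesized design.
    return [hypothesis["bias"] + random.uniform(-0.05, 0.05) for _ in range(n)]

def score_molecule(mol):
    # Stand-in for the binding-energy simulation: best designs sit near 0.5.
    return 1.0 - abs(mol - 0.5)

history = []
for _ in range(5):
    hyp = propose_hypothesis(history)            # pick next hypothesis
    candidates = generate_molecules(hyp)         # generate molecules
    best_score = max(score_molecule(m) for m in candidates)  # simulate
    history.append({"bias": hyp["bias"], "score": best_score})  # assess

print(max(h["score"] for h in history))
```

Because each round's scores feed the next proposal, the loop converges toward the high-scoring region of the toy design space, mirroring how simulation feedback steers the real discovery cycle.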
Using this workflow, the LANL and NVIDIA joint team has already discovered molecules with improved binding energetics for actinium atoms. The hypothesis-driven design accelerates the identification of the best molecules and highlights the properties that make them especially useful. This approach enables researchers to adapt the design process for more effective collaboration with AI, supporting further refinement of molecular candidates.
This work marks the beginning of a transformative effort to design new molecules with real-world applications. The impact has the potential to be far-reaching: beyond cancer therapies, chelators are also valuable for rapid treatment of poisoning, efficient purification of metals, and other chemical applications.
Moving forward, the focus will be on evaluating the approach's feasibility, its integration with delivery systems, and potential safety implications.
“With NVIDIA, Los Alamos National Laboratory is pioneering the design and deployment of AI co-scientists in research,” said Mark Chadwick, Associate Laboratory Director for Simulation, Computing, and Theory. “These co-scientists enable rapid hypothesis generation and validation across complex disciplines. We are combining domain knowledge with a combination of AI capabilities from NVIDIA to build co-scientists that are purpose-built for our mission to tackle some of humanity’s grandest challenges.”
This research used the Perlmutter supercomputer resources at the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility.
Get started building AI co-scientists
Leveraging AI for scientific discovery helps to accelerate critical assessments, shorten development cycles, and unlock deeper scientific insights faster than ever before. To learn more about this work, join NVIDIA at SC25 for the LANL Reasoning Model for Fusion and Agentic AI for Molecular Discovery talks at the NVIDIA booth. To get started building your own AI co-scientist, explore NVIDIA NeMo and Nemotron.
Acknowledgments
Thanks to Ping Yang, Danny Perez, Logan Augustine, Pascal Grosset, Jiyoung Lee, Thomas Summers, Michael Taylor, Radha Bahukutumbi, and David D. Meyerhofer for their contributions.