NVIDIA cuEST

NVIDIA CUDA® Electronic Structure Theory (cuEST, currently in beta) is a CUDA-X™ library designed to accelerate first-principles quantum chemistry applications on NVIDIA GPUs. Purpose‑built for industrial‑scale quantum chemistry workloads, cuEST offers a wide set of functionalities and flexible component-level APIs for accelerating the computationally intensive building blocks of molecular quantum chemistry.

Documentation
Forum


How cuEST Works

NVIDIA cuEST brings quantum chemistry acceleration to GPUs, enabling fast and accurate electronic structure predictions of molecular and material properties. With state‑of‑the‑art GPU algorithms that deliver significant speedups over traditional CPU and GPU solutions, cuEST removes long‑standing performance barriers that have prevented engineers and computational chemists from using high‑accuracy quantum chemistry in industrial workflows. Developers can leverage cuEST by integrating it within their own quantum chemistry codes, such as within density functional theory (DFT) self-consistent field methods.


Key Features of cuEST

Breakthrough Acceleration

Leveraging highly optimized algorithms and a variety of techniques, cuEST delivers breakthrough acceleration, with 50X speedups over traditional CPU-based quantum chemistry methods. This enables high-accuracy, GPU-based quantum chemistry codes at industrial scale.

Modular and Componentized APIs

With a component-level design and modular structure, cuEST APIs are fully composable, giving independent software vendors (ISVs), open source projects, and communities the flexibility to integrate NVIDIA GPU acceleration while preserving their existing end‑to‑end features and workflows.

Functionality for DFT 

cuEST provides building blocks for modern Gaussian-basis DFT, including construction of the overlap, kinetic, potential, Coulomb, exchange, and exchange-correlation potential matrices and their derivatives, with support for a broad spectrum of generalized gradient approximation (GGA), meta‑GGA, and hybrid functionals.
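
As a rough illustration of how such component matrices fit together in a host code, the sketch below combines them into a restricted Kohn-Sham matrix with NumPy. It does not call cuEST: the function name, argument names, and the density convention (J and K built from the total density D = 2 C_occ C_occ^T) are assumptions made for this example, and the input matrices stand in for whatever the integral routines return.

```python
import numpy as np


def assemble_kohn_sham_matrix(T, V, J, K, Vxc, exact_exchange_fraction):
    """Combine component matrices into a restricted Kohn-Sham matrix.

    Hypothetical helper, not part of any library API.
    Assumed convention: J and K are built from the total density
    D = 2 * C_occ @ C_occ.T, so the closed-shell matrix is
        F = T + V + J - 0.5 * a * K + Vxc
    where a is the fraction of exact exchange (0 for a pure GGA,
    1 with Vxc = 0 for Hartree-Fock).
    """
    core = T + V  # one-electron part: kinetic plus nuclear attraction
    return core + J - 0.5 * exact_exchange_fraction * K + Vxc


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def random_symmetric(n):
        a = rng.standard_normal((n, n))
        return a + a.T

    # Random symmetric placeholders standing in for real integrals.
    T, V, J, K, Vxc = (random_symmetric(4) for _ in range(5))
    F = assemble_kohn_sham_matrix(T, V, J, K, Vxc, exact_exchange_fraction=0.25)
    print("symmetric:", np.allclose(F, F.T))
```

Keeping the assembly step in the host code is one way to preserve an existing SCF driver while delegating only the expensive matrix builds to the GPU.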


Performance

End-to-End cuEST Speedup Relative to State-of-the-Art Tensor-Compressed CPU Code

Note that the end-to-end timings above are measured using cuEST library calls driven by a lightweight example SCF procedure provided as a sample.

DF-K Speedup Relative to State-of-the-Art Tensor-Compressed CPU Code 

cuEST DF-K Performance on 100-Series Platforms With Emulation and Mixed-Precision Techniques

cuEST DF-K Performance on NVIDIA RTX PRO 6000 Blackwell Server Edition With Emulation and Mixed-Precision Techniques

  • PSI4 is run on 56 cores of an Intel Xeon Platinum 8570, which is the optimal CPU configuration for PSI4 time to solution. PSI4 v1.9.1 is used; tests of PSI4 v1.10.0 and v1.11tip all show the same performance.
  • cuEST is run using a lightweight example SCF procedure provided as a sample.
  • cuEST is run on one GPU of the type indicated (H200, B200, A100, RTX PRO 6000 Server Edition).
  • Both codes run 20 RHF iterations for consistency, and both codes use the same thresholds (e.g., pq threshold).
  • Molecules are systematic globular cutouts of benzene crystal.
  • DF-K performance in effective TFLOPS is computed as TFLOPs of dense rectangular DGEMM-based DF-K divided by K walltime in seconds.
  • Emulated results use cuEST v0.1 with a dynamic Ozaki threshold scheme that yields total energies in agreement with FP64 at the end of the SCF procedure.
  • PSI4 speedups use emulated cuEST results.
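
The effective-TFLOPS metric in the notes above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the benchmark code: the function names are hypothetical, and the FLOP count of 4 × naux × nocc × nbf² assumes the dense DF-K build is done as two GEMM-like steps (a half-transformation of the fitted three-index tensor by the occupied orbitals, then a contraction into K), each costing 2 × naux × nocc × nbf² operations. The FLOP model used for the published figures may differ.

```python
import numpy as np


def df_exchange(B, C_occ):
    """Dense density-fitted exchange build (reference sketch, CPU only).

    B     : (naux, nbf, nbf) fitted three-index integrals
    C_occ : (nbf, nocc) occupied orbital coefficients
    Returns K[p, q] = sum_{A,i} X[A, i, p] * X[A, i, q].
    """
    naux, nbf, _ = B.shape
    nocc = C_occ.shape[1]
    # Step 1: half-transform, X[A, i, q] = sum_p C[p, i] * B[A, p, q]
    X = np.einsum("pi,Apq->Aiq", C_occ, B, optimize=True)
    # Step 2: one rectangular GEMM over the combined (A, i) index
    X2 = X.reshape(naux * nocc, nbf)
    return X2.T @ X2


def dense_df_k_flops(nbf, nocc, naux):
    """Assumed FLOP count: two GEMM steps at 2 * naux * nocc * nbf**2 each."""
    return 4 * naux * nocc * nbf * nbf


def effective_tflops(nbf, nocc, naux, wall_seconds):
    """Dense DF-K FLOPs divided by the measured K wall time, in TFLOPS."""
    return dense_df_k_flops(nbf, nocc, naux) / wall_seconds / 1e12


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    naux, nbf, nocc = 30, 12, 4  # tiny illustrative sizes
    B = rng.standard_normal((naux, nbf, nbf))
    B = B + B.transpose(0, 2, 1)  # symmetric in the basis-function pair
    C_occ = rng.standard_normal((nbf, nocc))
    K = df_exchange(B, C_occ)
    print("K is symmetric:", np.allclose(K, K.T))
    print("effective TFLOPS at 1 ms:", effective_tflops(nbf, nocc, naux, 1e-3))
```

Because the numerator is the dense FLOP count, an implementation that skips work through screening or runs in emulated or mixed precision can report an effective rate above what a plain FP64 DGEMM would reach on the same hardware.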

Get Started With cuEST

Quickly access cuEST resources, including environment setup guides, reference documentation, and GitHub sample code, to configure your stack and begin running GPU-accelerated workloads.

Atomic Precision at Production Scale

Learn how the NVIDIA cuEST library brings atomic‑level quantum chemistry simulations to production scale, empowering semiconductor innovators like Applied Materials, Samsung, Synopsys, and TSMC to accelerate material modeling from lab discovery to fab deployment.

Read Blog

Initialize cuEST in Your Environment

Set up your environment, configure dependencies, and run your first cuEST workloads with reproducible, GPU-accelerated test executions.

Explore Documentation

Get Started With Sample Code

Explore cuEST examples on GitHub that show how to initialize the runtime, define tests, and execute GPU-accelerated workloads end to end.

Explore GitHub

More Resources

Sign Up for the Developer Newsletter


Get Training and Certification

Join the NVIDIA Developer Program