The past decade has seen quantum computing leap out of academic labs into the mainstream. Efforts to build better quantum computers proliferate at both startups and large companies. And while it is still unclear how far away we are from achieving quantum advantage on common problems, it is clear that now is the time to build the tools needed to deliver valuable quantum applications.

To start, we need to make progress in our understanding of quantum algorithms. Last year, NVIDIA announced cuQuantum, a software development kit (SDK) for accelerating simulations of quantum computing. Simulating quantum circuits using cuQuantum on GPUs enables algorithms research with performance and scale far beyond what can be achieved on quantum processing units (QPUs) today. This is paving the way for breakthroughs in understanding how to make the most of quantum computers.

In addition to improving quantum algorithms, we also need to use QPUs to their fullest potential alongside classical computing resources: CPUs and GPUs. Today, NVIDIA is announcing the launch of NVIDIA CUDA-Q, a platform for hybrid quantum-classical computing with the mission of enabling this utility.

As quantum computing progresses, all valuable quantum applications will be hybrid, with the quantum computer working alongside high-performance classical computing. GPUs, which were created purely for graphics, transformed into essential hardware for high-performance computing (HPC). This required new software to enable powerful and straightforward programming. The transformation of quantum computers from science experiments to useful accelerators also requires new software.

This new era of quantum software will enable performant hybrid computation and make quantum computers accessible to a broader group of scientists and innovators.

## Quantum programming landscape

The last five years have seen the development of quantum programming approaches targeting small-scale, noisy quantum computing architectures. This development has been great for algorithm developers and has enabled early prototyping of both standard quantum algorithms and hybrid variational approaches.

Due to the scarcity of quantum resources and practicalities of hardware implementations, most of these programming approaches have been at the pure Python level supporting a remote, cloud-based execution model.

As quantum architectures improve and algorithm developers consider true quantum acceleration of existing classical heterogeneous computing, the question arises: How should we support quantum coprocessing in the traditional HPC context?

NVIDIA has been a true pioneer in the development of HPC programming models, heterogeneous compiler platforms, and high-level application libraries that accelerate traditional scientific computing workflows with one or many NVIDIA GPUs.

We see quantum computing as another element of a heterogeneous HPC system architecture and envision a programming model that seamlessly incorporates quantum coprocessing into our existing CUDA ecosystem. Current approaches that start at the Python language level are not sufficient in this regard and will ultimately limit performant integration of classical and quantum compute resources.

## CUDA-Q for HPC

NVIDIA is developing an open specification for programming hybrid quantum-classical compute architectures in an HPC context. We are announcing the CUDA-Q programming model specification and corresponding NVQ++ compiler platform enabling a backend-agnostic (physical, simulated), single-source, modern C++ approach to quantum-accelerated high-performance computing.

CUDA-Q is inherently interoperable with existing classical parallel programming models such as CUDA, OpenMP, and OpenACC. This compiler implementation also lowers quantum-classical C++ source code representations to binary executables that natively target cuQuantum-enabled simulation backends.

This programming and compilation workflow enables a performant programming environment for accelerating hybrid algorithm research and development activities through standard interoperability with GPU processing and circuit simulation that scales from laptops to distributed multi-node, multi-GPU architectures.

```cpp
auto ghz = [](const int N) __qpu__ {
  cudaq::qreg q(N);
  h(q[0]);
  for (auto i : cudaq::irange(N-1)) {
    cnot(q[i], q[i+1]);
  }
  mz(q);
};

// Sample a GHZ state on 30 qubits
auto counts = cudaq::sample(ghz, 30);
counts.dump();
```

As shown in the code example, CUDA-Q provides a CUDA-like kernel-based programming approach, with a modern C++ focus. You can define quantum device code as standalone function objects or lambdas annotated with `__qpu__` to indicate that this is to be compiled to and executed on the quantum device.

By relying on function objects rather than free functions (the approach that CUDA kernels take), you can efficiently build up generic standard quantum library functions that take any quantum kernel expression as input.

One simple example of this is the standard sampling CUDA-Q function (`cudaq::sample(...)`), which takes a quantum kernel instance and any concrete arguments for which the kernel is to be evaluated as the input, and returns the familiar mapping of observed qubit measurement bit strings to the corresponding number of times observed.
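The idea of a generic library function that accepts any kernel-like callable can be sketched in plain standard C++, with no quantum SDK involved. Everything here is an assumption made for illustration: `GhzShot` is a hypothetical classical stand-in for a kernel (it mimics an ideal GHZ measurement, which yields all zeros or all ones), and `sample_counts` is a hypothetical helper, not the CUDA-Q implementation of `cudaq::sample`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <string>

// Hypothetical stand-in for a quantum kernel: a classical function object that
// returns one measured bit string per call. An ideal N-qubit GHZ state yields
// all zeros or all ones with equal probability.
struct GhzShot {
  std::mt19937 rng{42}; // fixed seed keeps the sketch reproducible

  std::string operator()(int n_qubits) {
    assert(n_qubits > 0);
    std::bernoulli_distribution coin(0.5);
    return std::string(static_cast<std::size_t>(n_qubits),
                       coin(rng) ? '1' : '0');
  }
};

// Generic sampler (illustrative only): accepts any callable plus its concrete
// arguments and tallies how often each bit string is observed.
template <typename Kernel, typename... Args>
std::map<std::string, std::size_t> sample_counts(Kernel &&kernel,
                                                 std::size_t shots,
                                                 Args &&...args) {
  std::map<std::string, std::size_t> counts;
  for (std::size_t shot = 0; shot < shots; ++shot) {
    ++counts[kernel(args...)];
  }
  return counts;
}
```

Calling `sample_counts(GhzShot{}, 1000, 5)` returns a map whose only possible keys are `"00000"` and `"11111"`. Because the sampler is a template over the callable type, the same function works for any other function object or lambda with a compatible call signature.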

CUDA-Q kernel programmers have access to certain built-in types pertinent for quantum computing (`cudaq::qubit`, `cudaq::qreg`, `cudaq::spin_op`, and so on), quantum gate operations, and all traditional classical control flow inherited from C++.

An interesting aspect of the language compilation approach detailed earlier is the ability to compile CUDA-Q codes that contain CUDA kernels, OpenMP and OpenACC pragmas, and higher-level CUDA library API calls. This feature will enable hybrid quantum-classical application developers to truly take advantage of multi-GPU processing in tandem with quantum computing.

Future quantum computing use cases will require classical parallel processing for things like data preprocessing and postprocessing, standard quantum compilation tasks, and syndrome decoding for quantum error correction.

## An early look at quantum-classical applications

A prototypical hybrid quantum-classical algorithm targeting noisy, near-term quantum computing architectures is the variational quantum eigensolver (VQE). The goal for VQE is to compute the minimum eigenvalue for a given quantum mechanical operator, such as a Hamiltonian, with respect to a parameterized state preparation circuit by relying on the variational principle from quantum mechanics.

You execute the state preparation circuit for a given set of gate rotational parameters and perform a set of measurements dictated by the structure of the quantum mechanical operator to compute the expectation value at those concrete parameters. A user-specified classical optimizer is then used to iteratively search for the minimal expectation value by varying these parameters.
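To make the loop concrete, here is a minimal sketch in plain standard C++ that does not use CUDA-Q at all. It assumes a toy single-qubit problem chosen for illustration: the operator H = Z + 0.5 X, the ansatz Ry(theta)|0>, an expectation value computed exactly rather than estimated from measurements, and plain gradient descent with the parameter-shift rule as the classical optimizer. All function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>

// Toy operator (an assumption for this sketch): H = Z + 0.5 X,
// written as a real symmetric 2x2 matrix.
constexpr std::array<std::array<double, 2>, 2> kHamiltonian{
    {{1.0, 0.5}, {0.5, -1.0}}};

// State preparation: Ry(theta)|0> = (cos(theta/2), sin(theta/2)).
std::array<double, 2> prepare_state(double theta) {
  return {std::cos(theta / 2.0), std::sin(theta / 2.0)};
}

// Expectation value <psi|H|psi>, computed exactly instead of by sampling.
double expectation(double theta) {
  const auto psi = prepare_state(theta);
  double value = 0.0;
  for (std::size_t r = 0; r < 2; ++r) {
    for (std::size_t c = 0; c < 2; ++c) {
      value += psi[r] * kHamiltonian[r][c] * psi[c];
    }
  }
  return value;
}

// Classical outer loop: gradient descent using the parameter-shift rule,
// dE/dtheta = (E(theta + pi/2) - E(theta - pi/2)) / 2 for an Ry rotation.
// Returns {minimal expectation value found, corresponding theta}.
std::pair<double, double> minimize_energy(double theta,
                                          double learning_rate = 0.1,
                                          int iterations = 500) {
  assert(learning_rate > 0.0 && iterations > 0);
  const double half_pi = std::acos(-1.0) / 2.0;
  for (int i = 0; i < iterations; ++i) {
    const double gradient =
        0.5 * (expectation(theta + half_pi) - expectation(theta - half_pi));
    theta -= learning_rate * gradient;
  }
  return {expectation(theta), theta};
}
```

For this operator the expectation value is cos(theta) + 0.5 sin(theta), so the exact minimum is -sqrt(1.25), the smallest eigenvalue of H. Starting from `minimize_energy(2.0)`, the loop should converge to that value. On real hardware, each call to `expectation` would instead run the circuit many times and average the measurement outcomes.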

You can see what a general VQE-like algorithm looks like with the CUDA-Q programming model:

```cpp
// Define your state prep ansatz…
auto ansatz = [](std::vector<double> thetas) __qpu__ {
  … Use C++ control flow and quantum intrinsic ops …
};

// Define the Hamiltonian
cudaq::spin_op H = … use x, y, z to build up Hamiltonian … ;

// Create a specific function optimization strategy
int n_params = …;
cudaq::nlopt::lbfgs optimizer;
optimizer.initial_parameters = cudaq::random_vector(-1, 1, n_params);

// Run the VQE algorithm with CUDA Quantum
auto [opt_val, opt_params] = cudaq::vqe(ansatz, H, optimizer, n_params);
printf("Optimal <H> = %lf\n", opt_val);
```

The first component required is the parameterized ansatz CUDA-Q kernel expression, shown in the code example as a lambda taking a `std::vector<double>`.

The actual body of this lambda is dependent on the problem at hand, but you are free to build up this function with standard C++ control flow, in-scope quantum kernel invocations, and the logical set of quantum intrinsic operations.

The next component required is the operator whose expectation value you need to calculate. CUDA-Q represents these as the built-in `spin_op` type, and you can build these up programmatically with Pauli `x(int)`, `y(int)`, and `z(int)` function calls.
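As a rough mental model of building an operator programmatically from Pauli terms, the following plain C++ sketch stores a sum of weighted Pauli words. It is not the `spin_op` implementation: `PauliSum`, `pauli`, and the two operators are hypothetical names introduced here, and the sketch supports only sums and scalar multiples, not products of Pauli operators.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical container: maps a Pauli word such as "ZI" (one letter per
// qubit, 'I' for identity) to its real coefficient.
struct PauliSum {
  std::map<std::string, double> terms;
};

// Single Pauli operator ('X', 'Y', or 'Z') acting on one qubit of a register.
PauliSum pauli(char label, std::size_t qubit, std::size_t n_qubits) {
  assert(label == 'X' || label == 'Y' || label == 'Z');
  assert(qubit < n_qubits);
  std::string word(n_qubits, 'I');
  word[qubit] = label;
  PauliSum sum;
  sum.terms[word] = 1.0;
  return sum;
}

// Sum of two operators: coefficients of identical Pauli words are merged.
PauliSum operator+(PauliSum lhs, const PauliSum &rhs) {
  for (const auto &term : rhs.terms) {
    lhs.terms[term.first] += term.second;
  }
  return lhs;
}

// Scalar multiple of an operator.
PauliSum operator*(double scale, PauliSum sum) {
  for (auto &term : sum.terms) {
    term.second *= scale;
  }
  return sum;
}
```

With these helpers, `pauli('Z', 0, 2) + 0.5 * pauli('X', 1, 2)` describes a two-qubit operator with the terms `"ZI"` (coefficient 1.0) and `"IX"` (coefficient 0.5). Keeping the operator as a list of terms is convenient because each term dictates which measurements are needed to estimate its expectation value.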

Next, you need a classical function optimizer, which is a general concept within the CUDA-Q language specification meant for subclassing to specific optimization strategies, either gradient-based or gradient-free.
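One way to picture an optimizer concept that is subclassed into specific strategies is the plain C++ sketch below. The `Optimizer` interface, its method names, and the `CoordinateSearch` strategy are all assumptions made for illustration and are not taken from the CUDA-Q specification. The concrete strategy shown is gradient-free: it probes each parameter in both directions and halves the step size when no probe improves the objective.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <utility>
#include <vector>

using Objective = std::function<double(const std::vector<double> &)>;

// Hypothetical base concept: every strategy reports whether it needs
// gradients and returns {best value, best parameters}.
class Optimizer {
public:
  virtual ~Optimizer() = default;
  virtual bool requires_gradients() const = 0;
  virtual std::pair<double, std::vector<double>>
  optimize(std::vector<double> params, const Objective &objective) = 0;
};

// Gradient-free strategy: coordinate search with a shrinking step size.
class CoordinateSearch : public Optimizer {
public:
  double initial_step = 0.5;
  double tolerance = 1e-8;
  int max_sweeps = 10000; // guard so the loop always terminates

  bool requires_gradients() const override { return false; }

  std::pair<double, std::vector<double>>
  optimize(std::vector<double> params, const Objective &objective) override {
    assert(!params.empty());
    double best = objective(params);
    double step = initial_step;
    for (int sweep = 0; sweep < max_sweeps && step > tolerance; ++sweep) {
      bool improved = false;
      for (std::size_t i = 0; i < params.size(); ++i) {
        for (double direction : {+1.0, -1.0}) {
          std::vector<double> trial = params;
          trial[i] += direction * step;
          const double value = objective(trial);
          if (value < best) { // keep the move only if it lowers the objective
            best = value;
            params = trial;
            improved = true;
          }
        }
      }
      if (!improved) {
        step *= 0.5; // no probe helped, so search on a finer scale
      }
    }
    return {best, params};
  }
};
```

A gradient-based strategy would derive from the same base class, return `true` from `requires_gradients()`, and use a gradient estimate in its `optimize` loop. Code that drives the optimization only needs a reference to `Optimizer`, so strategies can be swapped without changing the calling code.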

Finally, the language exposes a standard library function for invoking the entire VQE workflow. It is parameterized on the following:

- The CUDA-Q kernel instance modeling the state preparation ansatz
- The operator for which you need the minimal eigenvalue
- The classical optimization instance
- The total number of variational parameters

The function returns a result that you can unpack with a structured binding into the optimal eigenvalue and the corresponding optimal parameters for the state preparation circuit.

The preceding workflow is extremely general and lends itself to the development of variational algorithms that are ultimately generic with respect to quantum kernel expressions, spin operators of interest, and classical optimization routines.

But it also demonstrates the underlying philosophy of the CUDA-Q programming model: To provide core concepts to describe quantum code expressions, and then promote the utility of a standard library of generic functions enabling hybrid quantum-classical algorithmic composability.

## CUDA-Q Early Interest program

Quantum computers hold great promise to help us solve some of our most important problems. We’re opening up quantum computing to scientists and experts in domains where HPC and AI already play a critical role, as well as enabling easy integration of today’s best existing software with quantum software. This will dramatically accelerate quantum computers realizing their potential.

CUDA-Q provides an open platform to do just that, and NVIDIA is excited to work with the entire quantum community to make useful quantum computing a reality. Apply to the CUDA-Q Early Interest program to stay up-to-date on NVIDIA quantum computing developments.

For more information, see NVIDIA quantum computing solutions, with posts, videos, and more.