SDKs Accelerating Industry 5.0, Data Pipelines, Computational Science, and More Featured at NVIDIA GTC 2023

At NVIDIA GTC 2023, NVIDIA unveiled notable updates to its suite of NVIDIA AI software for developers to accelerate computing. The updates reduce costs in several areas, such as data science workloads with NVIDIA RAPIDS, model analysis with NVIDIA Triton, AI imaging and computer vision with CV-CUDA, and many more.

To keep up with the newest SDK advancements from NVIDIA, watch the GTC keynote from CEO Jensen Huang.

NVIDIA RAPIDS Accelerator for Apache Spark

NVIDIA RAPIDS Accelerator for Apache Spark is now available in the NVIDIA AI Enterprise 3.1 software suite. Speed up data processing and analytics or model training with Apache Spark 3 without code changes, while lowering infrastructure costs.

Highlights:

Integration with major platforms: Google Cloud Platform (GCP) Dataproc, Amazon EMR, Databricks on Azure and AWS, and Cloudera
The Accelerated Spark Analysis Tool makes cost-saving predictions and recommends optimized GPU parameters to maximize the speedup of your workload
With NVIDIA AI Enterprise, take advantage of guaranteed response times, priority security notifications, and data science experts from NVIDIA
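
The accelerator is enabled through Spark configuration rather than through edits to the job itself, which is what makes the "without code changes" claim possible. As a minimal sketch, the helper below assembles a spark-submit command line. The jar path and application name are placeholders, and the GPU resource amounts are illustrative assumptions rather than tuned recommendations.

```python
def rapids_submit_command(app, jar="rapids-4-spark.jar",
                          gpus_per_executor=1, tasks_per_gpu=4):
    """Build a spark-submit argument list that turns on the RAPIDS Accelerator.

    `app` and `jar` are placeholder paths; the resource amounts are
    illustrative assumptions, not tuned values.
    """
    conf = {
        # Load the accelerator as a Spark plugin; the job code is unchanged.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU each task claims, so several tasks share one GPU.
        "spark.task.resource.gpu.amount": str(gpus_per_executor / tasks_per_gpu),
    }
    cmd = ["spark-submit", "--jars", jar]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    return cmd + [app]


print(" ".join(rapids_submit_command("my_job.py")))
```

Because everything lives in configuration, the same job can be run with and without the plugin to compare runtimes.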

Apply today for a free consultation to evaluate your Spark workloads for GPU acceleration and learn how to configure your cluster for speedups that average 4x.

Add this GTC session to your calendar:

Accelerate Spark with RAPIDS for Cost Savings

NVIDIA RAPIDS

Vector search is becoming an increasingly important step in use cases such as large language models, recommender systems, and computer vision. At GTC 2023, NVIDIA announced that RAPIDS RAFT, the toolkit providing accelerated, composable ML building blocks, can now power vector search.

By integrating RAPIDS RAFT, vector databases and search engines can now deliver significantly faster performance for tasks such as building indexes, loading data, and executing many different query types.

Highlights:

RAFT accelerates vector search use cases by offering accelerated Exact and Approximate Nearest Neighbor primitives on GPUs
RAFT-powered index-building time is up to 95x faster and queries per second are up to 3x faster than CPU implementations
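
To make the distinction concrete: an exact nearest neighbor search compares the query against every stored vector, whereas approximate methods trade some accuracy for speed by examining only part of the index. The sketch below is a plain-Python, CPU-only illustration of the exact (brute-force) case. It is not the RAFT API, and the function name and sample vectors are made up for the example.

```python
import heapq
import math


def exact_knn(index, query, k):
    """Return the k (distance, position) pairs closest to `query`.

    Brute force: every vector in `index` is compared with the query,
    so the result is exact but the cost grows linearly with index size.
    """
    distances = ((math.dist(vec, query), pos) for pos, vec in enumerate(index))
    return heapq.nsmallest(k, distances)


vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
for distance, position in exact_knn(vectors, (1.0, 1.0), k=2):
    print(position, round(distance, 3))
```

The linear scan over the whole index is the part that GPU primitives parallelize, and it is also the cost that approximate indexes are designed to avoid.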

NVIDIA is already working with FAISS, Milvus, and Redis to bring improved vector search performance to their users by building on RAFT. Milvus’ GPU-powered backend optimized with RAFT will be available soon.

For more information about RAPIDS RAFT vector search capabilities and everything else it can provide, see the RAPIDS RAFT User’s Guide and /rapidsai/raft GitHub repo.


CV-CUDA

With an open beta coming in April 2023, CV-CUDA is a new open-source library for building GPU-accelerated pre- and post-processing pipelines for AI computer vision at cloud scale.

Highlights:

30+ computer vision operators with C/C++ and Python APIs to accelerate object detection, segmentation, and classification workflows
Support for batching of variable-shape images
Zero-copy integration with TensorFlow and PyTorch using DLPack and CUDA array interfaces
Single-line pip installation and PyPI availability
NVIDIA Triton Inference Server example using CV-CUDA, TensorRT, and VPF for video encoding and decoding

For more information, see the /CV-CUDA GitHub repo.


NVIDIA cuLitho

cuLitho, a software library for computational lithography, speeds up the largest workload in semiconductor manufacturing by 40x on NVIDIA Hopper GPUs.

As the semiconductor industry continues to push the state of the art for fabrication technology, it is increasingly facing challenges due to the limits of physics. Optical proximity correction (OPC) and other computational lithography methods are required to create masks that compensate for these challenges. The application of these complex methods has become the industry’s largest compute workload.

NVIDIA cuLitho is a library with optimized tools and algorithms for GPU-accelerating computational lithography and the manufacturing process of semiconductors by orders of magnitude over current CPU-based methods.

Highlights:

Reducing the time to produce a mask from 2 weeks to an overnight 8-hour run
Streamlining the data center: 1/8 the space, 1/4 the cost, and 1/9 the power
Enabling new lithography solutions, such as curvilinear OPC and High-NA EUV
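
As a quick consistency check on the figures above, going from 2 weeks to 8 hours works out to roughly the 40x speedup quoted earlier. The calculation assumes that the 2-week figure means 14 days of continuous, round-the-clock compute, which the source does not state explicitly.

```python
HOURS_PER_DAY = 24

before_hours = 14 * HOURS_PER_DAY  # 2 weeks, assumed to run around the clock
after_hours = 8                    # overnight run

speedup = before_hours / after_hours
print(f"{before_hours} h -> {after_hours} h is a {speedup:.0f}x reduction")
```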

For more information and partner quotes, see NVIDIA cuLitho.


NVIDIA Triton

Key updates to NVIDIA Triton Inference Server, open-source inference-serving software, bring fast and scalable AI to every application in production. Over 66 features were added in the last year.

Software updates:

PyTriton as a simple interface to run NVIDIA Triton in native Python code, enabling rapid prototyping and easy migration of Python-based models
Support for model ensembles and concurrent model analysis in Model Analyzer
PaddlePaddle support and integration with PaddlePaddle FastDeploy
FasterTransformer backend with support for BERT, Hugging Face BLOOM, and FP8 in GPT
NVIDIA Triton management service (early access) for the automated and resource-efficient orchestration of models for inference at scale

Kick-start your inference journey with short-term access in NVIDIA LaunchPad without setting up your own environment.

Get started with NVIDIA Triton and get enterprise-grade support.


NVIDIA TensorRT

NVIDIA TensorRT, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that deliver low latency and high throughput for inference applications.

New features:

Performance optimizations for generative AI diffusion and transformer models
Enhanced hardware compatibility to build and run on different GPU architectures (NVIDIA Ampere architecture and later)
Version compatibility so that you can build and run on different TensorRT versions (TensorRT 8.6 and later)
Multi-GPU, multi-node inference for GPT-3 models in early access

Kick-start your inference journey with short-term access in NVIDIA LaunchPad without setting up your own environment.

Get started with TensorRT and get enterprise-grade support.


NVIDIA TAO Toolkit

With key updates to TAO Toolkit, you can use the power and efficiency of transfer learning to achieve state-of-the-art accuracy and production-class throughput on any platform. This low-code AI toolkit accelerates vision AI model development for all skill levels, from beginners to expert data scientists.

Highlights:

New state-of-the-art vision transformers for image classification, object detection, and segmentation tasks
AI-assisted annotation tool for auto-generated segmentation masks
ONNX model export that enables TAO models to be deployed on any device, such as GPUs, CPUs, and MCUs
Increased AI transparency and explainability by offering TAO as open source

For more information, see Access the Latest in Vision AI Model Development Workflows with NVIDIA TAO Toolkit 5.0. Begin customizing your AI models with TAO Toolkit and try it on LaunchPad.


NVIDIA DeepStream

NVIDIA released the latest version of DeepStream, which adds a new runtime. It enables new capabilities and unlocks new use cases that require tight scheduling solutions. Existing DeepStream developers continue to benefit from hardware-accelerated plug-ins while unlocking smart automation and Industry 5.0 use cases.

Updates:

New accelerated extensions
New runtime with advanced scheduling options
Updated accelerated plug-ins

For more information, see Get Started with the NVIDIA DeepStream SDK. Try it on LaunchPad.


NVIDIA Quantum

NVIDIA announced the latest version of the NVIDIA Quantum platform for accelerating quantum computing simulation, hybrid quantum-classical algorithm development, and hybrid system deployment.

cuQuantum enables the quantum computing ecosystem to solve problems at the scale of future quantum advantage, enabling the development of algorithms and the design and validation of quantum hardware.

cuQuantum highlights:

Multi-node, multi-GPU support in the DGX cuQuantum appliance
Support for approximate tensor network methods
Adoption of cuQuantum continues to gain momentum, including among CSPs and industrial quantum groups

NVIDIA also unveiled the general availability of NVIDIA CUDA-Q, an open, QPU-agnostic platform for hybrid quantum-classical computing. This hybrid, quantum-classical programming model is interoperable with today’s most important scientific computing applications, enabling a massive new class of domain scientists and researchers to program quantum computers.

CUDA-Q highlights:

Single-source C++ and Python implementations as well as a compiler toolchain for hybrid systems and a standard library of quantum algorithmic primitives
QPU-agnostic, partnering with quantum hardware companies across a broad range of qubit modalities
Delivering up to a 300x speedup over a leading Pythonic framework also running on an NVIDIA A100 GPU
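
The hybrid quantum-classical pattern mentioned above typically alternates between a parameterized quantum circuit and a classical optimizer that updates the parameters. The sketch below illustrates that loop in plain Python with a hand-written one-qubit simulation. It does not use the CUDA-Q API, and the function names, learning rate, and step count are illustrative assumptions.

```python
import math


def expectation_z(theta):
    """Simulate RY(theta) applied to |0> and return the expectation of Z."""
    amp0 = math.cos(theta / 2)  # amplitude of |0>
    amp1 = math.sin(theta / 2)  # amplitude of |1>
    return amp0 ** 2 - amp1 ** 2


def parameter_shift_gradient(theta):
    """Estimate d<Z>/d(theta) from two extra circuit evaluations."""
    shift = math.pi / 2
    return (expectation_z(theta + shift) - expectation_z(theta - shift)) / 2


def minimize(theta=0.1, learning_rate=0.4, steps=100):
    """Classical gradient descent driving the (simulated) quantum circuit."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta, expectation_z(theta)


theta, energy = minimize()
print(f"theta = {theta:.4f}, <Z> = {energy:.4f}")
```

In a real hybrid system, only the work done in expectation_z would run on a QPU or GPU simulator; the optimizer loop stays on the classical side.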

At GTC 2023, NVIDIA and Quantum Machines announced DGX Quantum, a partnership that brings together the world’s most powerful accelerated-computing platform with the world’s most advanced quantum controllers. Quantum Machines and NVIDIA will advance the field with a first-of-its-kind architecture for high-performance and low-latency, quantum-classical computing.

DGX Quantum highlights:

A reference architecture featuring a PCIe-connected OPX+ NVIDIA Grace Hopper system, scalable with the size of the QPU
A CUDA-Q integration with QUA and the Quantum Machines stack, featuring a POC benchmark or benchmarks
Announcement of the QCC as the first QGX deployment in Q4 2023

For more information, see the NVIDIA CUDA-Q page.


NVIDIA Modulus

NVIDIA Modulus, the platform for developing physics-informed machine learning (physics-ML) models, now includes the data-driven neural operator family of architectures for training models for global-scale weather prediction, such as FourCastNet. It uses the NVIDIA AI software stack to give the best performance and scaling to serve both AI research and production deployment at an industrial scale.

NVIDIA Modulus is available with expanded capabilities to cover different domains and both data-driven and physics-driven approaches. It can solve problems in a broad range of applications from computational fluid dynamics (CFD) and structural analysis to electromagnetics.
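
To illustrate what "physics-driven" means here: instead of fitting labeled data, a physics-informed model is trained to minimize the residual of the governing equation at sample points. The sketch below does this for the toy ODE du/dx = -u with u(0) = 1, using a one-parameter trial solution and a grid search in place of a neural network and optimizer. It does not use the Modulus API, and every name in it is hypothetical.

```python
import math


def physics_loss(a, points):
    """Mean squared residual of du/dx + u = 0 for the trial u(x) = exp(-a*x).

    The boundary condition u(0) = 1 holds for every `a`, so only the
    equation residual contributes to the loss.
    """
    total = 0.0
    for x in points:
        u = math.exp(-a * x)
        du_dx = -a * u
        total += (du_dx + u) ** 2
    return total / len(points)


collocation = [i / 10 for i in range(11)]   # sample points in [0, 1]
candidates = [i / 100 for i in range(201)]  # trial values of a in [0, 2]
best_a = min(candidates, key=lambda a: physics_loss(a, collocation))
print(best_a)
```

No solution data is used anywhere in the loss; the equation alone selects the trial parameter, which is the defining trait of the physics-driven approach.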

It’s available as open source under the simple Apache 2.0 license.

Along with recipes for developing physics-ML models for reference applications, Modulus is now free to use, develop, and contribute to, no matter which field you are in. It includes open-source repositories that suit different workflows, from native PyTorch developers using modulus-launch to engineers who think in terms of symbolic PDEs using modulus-sym.

Download Modulus source code from the /NVIDIA/modulus GitHub repo.

For more information about Modulus Open Source, see Physics-Informed Machine Learning Platform NVIDIA Modulus Is Now Open Source.
