NVIDIA revealed major updates to its suite of AI software for developers, including JAX, NVIDIA CV-CUDA, and NVIDIA RAPIDS.
To learn about the latest SDK advancements from NVIDIA, watch the keynote from CEO Jensen Huang.
JAX on NVIDIA AI
At GTC 2022, NVIDIA introduced JAX on NVIDIA AI, the newest addition to its GPU-accelerated deep learning frameworks. JAX is a rapidly growing library for high-performance numerical computing and machine learning research.
JAX can automatically differentiate native Python functions and implements a NumPy-like API.
In just a few lines of code, JAX enables distributed training across multi-node and multi-GPU systems, with accelerated performance through XLA-optimized kernels on NVIDIA GPUs.
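As a rough illustration of that workflow, the sketch below differentiates and JIT-compiles a plain Python loss function. The function and array shapes are invented for this example, but `jax.grad` and `jax.jit` are the standard JAX transformations.

```python
import jax
import jax.numpy as jnp

# A plain Python loss function written against the NumPy-like API.
def loss(w, x, y):
    pred = jnp.dot(x, w)
    return jnp.mean((pred - y) ** 2)

# jax.grad differentiates the function automatically; jax.jit compiles it
# through XLA, so it runs as optimized kernels on an NVIDIA GPU when present.
grad_fn = jax.jit(jax.grad(loss))

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (128, 16))  # toy data; shapes chosen arbitrarily
y = jnp.ones(128)
w = jnp.zeros(16)

g = grad_fn(w, x, y)  # gradient of the loss with respect to w
```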
Some of the research areas implemented using JAX include transformers, reinforcement learning, fluid dynamics, geophysical modeling, drug discovery, computer vision, and more. Early adopters of JAX include DeepMind, Google Research, eBay, and InstaDeep.
NVIDIA is collaborating with the JAX team to ensure the best performance and improved experience for JAX users on GPUs. Highlights of these optimizations include the following:
- Efficient scaling across multiple GPUs and multiple nodes
- Easy workflow to train LLMs with optimized training scripts for T5X- and GPT-based models
- Built for all major cloud platforms
For more information, see the following resources:
- Apply now for updates on the JAX NGC container, available for private early access later this year.
- Check out the new forum for JAX-related software product updates, releases, and critical bug fixes.
- Join our new Discord server to chat with the community about using JAX on NVIDIA GPUs.
PyTorch Geometric and DGL on NVIDIA AI
PyTorch Geometric (PyG) and Deep Graph Library (DGL) are the most popular GNN frameworks. NVIDIA introduced containers for GPU-optimized GNN frameworks, PyG and DGL, which are designed to help developers, researchers, and data scientists accelerate graph learning, including large heterogeneous graphs with billions of edges on NVIDIA GPUs.
With NVIDIA AI-accelerated GNN frameworks, you can achieve end-to-end performance optimization, making it the fastest solution to preprocess and train GNNs.
Highlights from this announcement:
- Ready-to-use containers for GPU-optimized DGL and PyTorch Geometric
- Up to 90% lower end-to-end execution time compared to CPUs for ETL, sampling, and training
- End-to-end reference examples for GraphSage, R-GCN, and SE3-Transformer
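For a sense of what these containers accelerate, here is a minimal GraphSAGE model in DGL's PyTorch API. The graph and feature sizes are synthetic stand-ins; real workloads would load far larger graphs and add neighbor sampling and a training loop.

```python
import torch
import torch.nn as nn
import dgl
from dgl.nn import SAGEConv

# A two-layer GraphSAGE model, in the spirit of the reference examples.
class GraphSAGE(nn.Module):
    def __init__(self, in_feats, hidden_feats, num_classes):
        super().__init__()
        self.conv1 = SAGEConv(in_feats, hidden_feats, aggregator_type="mean")
        self.conv2 = SAGEConv(hidden_feats, num_classes, aggregator_type="mean")

    def forward(self, g, features):
        h = torch.relu(self.conv1(g, features))
        return self.conv2(g, h)

# A random toy graph for illustration; production graphs can reach
# billions of edges.
g = dgl.rand_graph(1000, 5000).to("cuda")
feats = torch.randn(1000, 16, device="cuda")
model = GraphSAGE(16, 32, 4).to("cuda")
logits = model(g, feats)  # per-node class logits
```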
Amazon Search, American Express, Entos, Meituan, and Pinterest have already taken advantage of early versions of this technology and are seeing great results.
The American Express research team is excited about using DGL to improve cardholder experiences through better fraud detection.
“OrbNet, with the help of DGL and NVIDIA GPUs, has enabled the accurate and data-efficient prediction of drug-molecule properties, reducing by years the amount of time needed to advance new drug candidates through lead identification and lead optimization,” said Tom Miller, CEO of Entos.
“Meituan’s GNN platform, with optimizations to DGL and GPU performance, serves many services at Meituan, including search, recommendation, advertising, and so on,” said Mengdi Zhang, Head of Graph Learning and Senior Algorithm Expert at Meituan.
Developers, researchers, and data scientists can use the new containers to accelerate development, enabling faster adoption of GNNs.
To quickly take advantage of GNNs through containerized solutions, apply for early access to the GPU-optimized, performance-tuned, and tested containers for PyG and DGL.
Add these GTC sessions to your calendar:
- Accelerating GNNs with PyTorch Geometric and GPUs
- Accelerating GNNs with Deep Graph Library and GPUs
- Accelerate and Scale GNNs with Deep Graph Library and GPUs
- Introduction to Graph Neural Networks
NVIDIA CV-CUDA
NVIDIA introduced CV-CUDA, a new open source project that enables developers to build highly efficient, GPU-accelerated pre- and post-processing pipelines for cloud-scale artificial intelligence (AI) imaging and computer vision (CV) workloads.
Highlights:
- Specialized set of 50+ highly performant CUDA kernels as standalone operators
- Batching support with variable shape images in one batch
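As a sketch of how such a pipeline might look with the CV-CUDA Python bindings (the exact API may differ in the early-access release), the example below wraps a GPU-resident batch from PyTorch and resizes it without leaving the device:

```python
import torch
import cvcuda

# A batch of four NHWC uint8 images already resident on the GPU.
images = torch.zeros(4, 480, 640, 3, dtype=torch.uint8, device="cuda")

# Zero-copy wrap of the PyTorch buffer as a CV-CUDA tensor, then a
# GPU-side resize as a preprocessing step; shapes here are illustrative.
src = cvcuda.as_tensor(images, "NHWC")
resized = cvcuda.resize(src, (4, 224, 224, 3), cvcuda.Interp.LINEAR)
```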
For more CV-CUDA updates, see the CV-CUDA early access interest page.
NVIDIA Triton
NVIDIA announced key updates to NVIDIA Triton, open-source inference-serving software that brings fast and scalable AI to every application in production. More than 50 features have been added in the last 12 months.
Notable feature additions:
- Model orchestration using NVIDIA Triton Management Service, which automates the deployment and management of multiple models on Triton Inference Server instances in Kubernetes. Apply for early access.
- Large language model inference with multi-GPU, multi-node execution with the FasterTransformer backend.
- Model pipelines (ensembles) with advanced logic using business logic scripting.
- Auto-generation of the minimal required model configuration, now on by default, for fast deployment.
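To give a flavor of the client side, the sketch below sends one inference request with Triton's Python HTTP client. The model name and tensor names are placeholders for whatever the server is actually serving.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a running Triton server on its default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# "my_model", "INPUT__0", and "OUTPUT__0" are placeholders; use the names
# from your model's (possibly auto-generated) configuration.
inp = httpclient.InferInput("INPUT__0", [1, 16], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 16).astype(np.float32))

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT__0"))
```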
Kick-start your NVIDIA Triton journey with immediate, short-term access in NVIDIA LaunchPad without setting up your own environment.
You can also download NVIDIA Triton from the NGC catalog, access code and documentation on the /triton-inference-server GitHub repo, and get enterprise-grade support.
Add these GTC sessions to your calendar:
- Take Your AI Inference to the Next Level
- Simplifying Inference for Every Model with Triton and TensorRT
- Accelerating and Scaling Inference with NVIDIA GPUs
- Efficient Cloud-based Deployment of Deep Learning Models using Triton Inference Server and TensorRT
NVIDIA RAPIDS
At GTC 2022, NVIDIA announced that RAPIDS, the data science acceleration solution chosen by 25% of Fortune 100 companies, is now further breaking down adoption and usability barriers. It is making accelerated analytics accessible to nearly every organization, whether they’re using low-level C++ libraries, Windows (WSL), or cloud-based data analytics platforms. New capabilities will be available mid-October.
Highlights:
- Support for WSL and Arm SBSA now generally available
  - Supporting Windows brings the convenience and power of RAPIDS to nine million new Python developers who use Windows.
- Easily launch multi-node workflows on Kubernetes and Kubeflow
  - Estimating cluster resources in advance for interactive work is often prohibitively challenging. You can now conveniently launch Dask RAPIDS clusters from within your interactive Jupyter sessions and burst beyond the resources of your container for combined ETL and ML workloads.
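A minimal sketch of that pattern with dask-cuda, assuming a hypothetical Parquet dataset with key and value columns:

```python
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
import dask_cudf

# One Dask worker per visible GPU; the same pattern scales out to
# multi-node clusters launched on Kubernetes or Kubeflow.
cluster = LocalCUDACluster()
client = Client(cluster)

# GPU-accelerated ETL with the pandas-like cuDF API, distributed by Dask.
# The path and column names are hypothetical.
df = dask_cudf.read_parquet("data/*.parquet")
print(df.groupby("key")["value"].mean().compute())
```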
For more information about the latest release, download and try NVIDIA RAPIDS.
Add these GTC sessions to your calendar:
- A Deep Dive into RAPIDS for Accelerated Data Science and Data Engineering
- Advances in Accelerated Data Science
NVIDIA RAPIDS Accelerator for Apache Spark
New capabilities of the NVIDIA RAPIDS Accelerator for Apache Spark 3.x were announced at GTC 2022, furthering the mission of accelerating your existing Apache Spark workloads no matter where you run them. They bring an unprecedented level of transparency to help you speed up Apache Spark DataFrame and SQL operations on NVIDIA GPUs, with no code changes and without leaving the Apache Spark environment. Version 22.10 will be available mid-October.
Highlights:
- The new workload acceleration tool analyzes Apache Spark workloads and recommends optimized GPU parameters for cost savings and performance.
- Integration with Google Cloud DataProc.
- Integration with Delta Lake and Apache Iceberg.
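Because the accelerator is enabled through configuration rather than code, an existing PySpark job can run on GPUs unchanged. A minimal sketch, assuming the RAPIDS Accelerator jar is already on the cluster's classpath and using a hypothetical dataset:

```python
from pyspark.sql import SparkSession

# Enabling the RAPIDS Accelerator is a configuration change, not a code
# change; the DataFrame code below is ordinary PySpark.
spark = (
    SparkSession.builder
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    .getOrCreate()
)

df = spark.read.parquet("data/transactions.parquet")  # hypothetical dataset
df.groupBy("category").count().show()  # runs on GPU operators where supported
```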
For more information about the latest release, download and try NVIDIA RAPIDS Accelerator for Apache Spark.
NVIDIA cuQuantum and NVIDIA CUDA-Q
At GTC 2022, NVIDIA announced the latest version of the NVIDIA cuQuantum SDK for accelerating quantum circuit simulation. cuQuantum enables the quantum computing ecosystem to solve problems at the scale of future quantum advantage, enabling the development of algorithms and the design and validation of quantum hardware.
NVIDIA also announced ecosystem updates for NVIDIA CUDA-Q, an open, QPU-agnostic platform for hybrid quantum-classical computing. This hybrid quantum-classical programming model is interoperable with today’s most important scientific computing applications, opening up the programming of quantum computers to a massive new class of domain scientists and researchers.
cuQuantum highlights:
- Multi-node, multi-GPU support in the DGX cuQuantum appliance
- Support for approximate tensor network methods
- Adoption of cuQuantum continues to gain momentum, including among cloud service providers (CSPs) and industrial quantum groups
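At the heart of the tensor network methods is fast GPU contraction. A minimal sketch with the cuQuantum Python API; the tensor shapes are arbitrary here:

```python
import cupy as cp
from cuquantum import contract

# Contract a tiny tensor network on the GPU using cuTensorNet's
# einsum-style interface; shapes are arbitrary for illustration.
a = cp.random.rand(2, 4)
b = cp.random.rand(4, 8)
c = contract("ij,jk->ik", a, b)
print(c.shape)  # (2, 8)
```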
CUDA-Q private beta highlights:
- Single-source C++ and Python implementations as well as a compiler toolchain for hybrid systems and a standard library of quantum algorithmic primitives
- QPU-agnostic, partnering with quantum hardware companies across a broad range of qubit modalities
- Delivering up to a 300X speedup over a leading Pythonic framework also running on an A100 GPU
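As an illustration of the single-source Python model, the sketch below builds and samples a Bell state with the cudaq package; the kernel syntax reflects the current public API and may differ from the private beta.

```python
import cudaq

@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)     # allocate two qubits
    h(qubits[0])                  # superposition on the first qubit
    x.ctrl(qubits[0], qubits[1])  # entangle via controlled-X
    mz(qubits)                    # measure both qubits

# Sampling executes on a simulator by default and, because the platform is
# QPU-agnostic, can target partner hardware backends instead.
counts = cudaq.sample(bell)
counts.dump()  # expect roughly equal counts of '00' and '11'
```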