Developer Resources for Consumer Internet

A hub of news and technical resources for developers working in the consumer internet industry.

Consumer Internet Resources

Recommender System

NVIDIA Merlin

NVIDIA Merlin is an end-to-end recommender-on-GPU framework that provides fast feature engineering and high training throughput to accelerate experimentation and production retraining of deep learning recommender models. Merlin also enables low-latency, high-throughput production inference.

Merlin for Recommender Systems

Conversational AI and Natural Language Processing

NVIDIA Riva

The NVIDIA Riva framework includes pretrained conversational AI models, tools, and optimized end-to-end services for speech, vision, and natural language understanding (NLU) tasks. In addition to AI services, Riva enables you to fuse vision, audio, and other sensor inputs simultaneously to deliver capabilities such as multi-user, multi-context conversations in applications such as virtual assistants, multi-user diarization, and call center assistants.

Riva for Conversational AI and NLU
NVIDIA NeMo

With NVIDIA NeMo™, researchers and developers can build state-of-the-art conversational AI models through easy-to-use application programming interfaces (APIs).

NeMo for Conversational AI

Image and Video Understanding

NVIDIA Maxine

NVIDIA Maxine is a fully accelerated platform SDK for developers of video conferencing services to build and deploy AI-powered features that use state-of-the-art models in their cloud. Maxine includes APIs for the latest innovations from NVIDIA research, such as face alignment, gaze correction, face relighting, and real-time translation, in addition to capabilities such as super-resolution, noise removal, closed captioning, and virtual assistants.

Maxine for Video Conferencing
NVIDIA DeepStream

The NVIDIA DeepStream SDK lets you build and deploy AI-powered intelligent video analytics (IVA) applications and services. DeepStream offers a multi-platform scalable framework with Transport Layer Security (TLS) for deploying on the edge and connecting to any cloud.

DeepStream SDK for Intelligent Video Analytics
TAO Toolkit

The NVIDIA TAO Toolkit makes it possible to create accurate and efficient AI models for intelligent video analytics (IVA) and computer vision applications without expertise in AI frameworks. Developers, researchers, and software partners building intelligent vision AI apps and services can bring their own data to fine-tune pre-trained models instead of going through the hassle of training from scratch.

TAO Toolkit for Intelligent Video Analytics

Deep Learning SDKs

Training

NVIDIA® CUDA-X AI™ is a complete deep learning software stack for researchers and software developers to build high-performance, GPU-accelerated applications for conversational AI, recommendation systems, and computer vision. CUDA-X AI libraries deliver world-leading performance for both training and inference across industry benchmarks such as MLPerf.

Accelerate AI Training
Inference

NVIDIA TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications.

Boost Inference Capabilities

Data Science

RAPIDS

NVIDIA RAPIDS™ is an open-source suite of data processing and machine learning libraries, developed by NVIDIA, that enables GPU acceleration for data science workflows. RAPIDS is built on NVIDIA CUDA®, allowing users to leverage GPU processing and high-bandwidth GPU memory through user-friendly Python interfaces.

Speed Up Data Science
Apache Spark 3.0

GPU-accelerated Apache Spark 3.0 speeds up data processing and model training in data science pipelines without code changes, while substantially lowering infrastructure costs.

Speed Up Data Processing

Profiling Tools

Deep Learning Profiler

The Deep Learning Profiler (DLProf) is a tool for profiling deep learning models. It helps you understand and improve model performance, either visually through TensorBoard or by analyzing text reports.

Improve Performance with DLProf
NVIDIA Nsight Systems

NVIDIA Nsight™ Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you identify the largest opportunities for optimization, and tune your application to scale efficiently across any quantity or size of CPUs and GPUs, from large servers to the smallest system on a chip (SoC).

Scale Your AI System with Nsight

Pre-Qualified Containers

NGC provides a range of options that meet the needs of data scientists, developers, and researchers with various levels of AI expertise. Quickly deploy AI frameworks with containers, get a head start with pre-trained models or model training scripts, and use domain-specific workflows and Helm charts to shorten your time to solution.

Pre-Qualified Containers for AI

Kubernetes on NVIDIA GPUs

Kubernetes on NVIDIA GPUs enables enterprises to scale up training and inference deployment to multi-cloud GPU clusters seamlessly. It lets you automate the deployment, maintenance, scheduling, and operation of multiple GPU-accelerated application containers across clusters of nodes.

Scale Enterprise AI with Kubernetes

NVIDIA Data Center GPU Manager

The NVIDIA Data Center GPU Manager (DCGM) is a set of tools for managing and monitoring NVIDIA GPUs in cluster environments. It's a low-overhead tool suite that performs a variety of functions on each host system, including active health monitoring, diagnostics, system validation, policies, power and clock management, group configuration, and accounting.

NVIDIA Data Center GPU Manager for Clusters

Making Data Science Teams Productive with Kubernetes and RAPIDS

Data collected on a vast scale has fundamentally changed the way organizations do business, driving demand for teams to provide meaningful data science, machine learning, and deep learning-based business insights quickly. Learn how data science leaders can use RAPIDS to boost their teams’ productivity while optimizing their costs and minimizing deployment time.

Increase Data Science Productivity

Framework for GPU-Accelerated Conversational AI Applications

Real-time conversational AI is a complex and challenging task. Explore NVIDIA Riva and how to access its high-performance conversational AI services easily and quickly with just a few commands.

Framework for Conversational AI

Accelerating Wide-and-Deep Recommender Inference on GPUs

This blog describes a highly optimized, GPU-accelerated inference implementation of a wide-and-deep model based on TensorFlow’s DNNLinearCombinedClassifier API. The proposed solution allows for easy conversion from a trained TensorFlow wide-and-deep model to a mixed-precision inference deployment.

Mixed Precision Inference Deployment

Accelerating Apache Spark 3.0 with GPUs and RAPIDS

NVIDIA has worked with the Apache Spark community to implement GPU acceleration with the release of Spark 3.0 and the open-source RAPIDS Accelerator for Spark. In this blog, learn how the RAPIDS Accelerator for Apache Spark uses GPUs to speed up Spark SQL and DataFrame operations, and to run end-to-end data preparation and model training on the same Spark cluster, without requiring any code changes.

RAPIDS Accelerator for Spark
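
Because the accelerator needs no application code changes, it is typically switched on through Spark configuration rather than in the job itself. The sketch below is a hypothetical helper (build_spark_submit is not part of Spark or RAPIDS) that assembles a spark-submit command line using the plugin class and enable flag from the RAPIDS Accelerator configuration. The JAR path and application name are placeholders, and a real deployment will likely need additional resource settings for its cluster.

```python
# Minimal sketch: enable the RAPIDS Accelerator via configuration only.
# build_spark_submit is a hypothetical helper; paths are placeholders.

RAPIDS_CONF = {
    # Loads the RAPIDS Accelerator SQL plugin into Spark.
    "spark.plugin": "com.nvidia.spark.SQLPlugin",
    # Allows supported SQL and DataFrame operations to run on the GPU.
    "spark.rapids.sql.enabled": "true",
}


def build_spark_submit(app_path, plugin_jar, extra_conf=None):
    """Return a spark-submit argument list; the application is left unchanged."""
    conf = dict(RAPIDS_CONF)
    conf.update(extra_conf or {})
    cmd = ["spark-submit", "--jars", plugin_jar]
    for key in sorted(conf):
        cmd += ["--conf", f"{key}={conf[key]}"]
    cmd.append(app_path)
    return cmd


if __name__ == "__main__":
    # Placeholder JAR path and job name, for illustration only.
    print(" ".join(build_spark_submit("etl_job.py", "/path/to/rapids-accelerator.jar")))
```

Keeping the settings in one dictionary makes it easy to run the same job with and without acceleration, for example by overriding spark.rapids.sql.enabled with "false" through extra_conf.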

Training and Fine-Tuning BERT Using NVIDIA NGC

BERT (Bidirectional Encoder Representations from Transformers) provides a game-changing twist to the field of natural language processing (NLP). Its huge neural networks are trained on supercomputers powered by NVIDIA GPUs and achieve unprecedented NLP accuracy, pushing into territory previously reserved for human language understanding. AI like this has been anticipated for many decades. With BERT, it has finally arrived.

Fine-Tune NLP

Training Framework for Recommender Systems

Click-through rate (CTR) estimation is one of the most critical components of modern recommender systems. In this blog, get an introduction to HugeCTR, a GPU-accelerated training framework for CTR estimation and a pillar of NVIDIA Merlin. On a single NVIDIA V100 Tensor Core GPU, HugeCTR achieves a speedup of up to 114X over TensorFlow on a 40-core CPU node and up to 8.3X over TensorFlow on the same V100 GPU.

Framework for Recommenders
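
For readers new to the term, CTR estimation means predicting the probability that a user clicks on an item. The sketch below illustrates the idea with a plain logistic model in pure Python. It is not HugeCTR's API or model architecture, and the function names, weights, and feature values are invented for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function, mapping a score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_ctr(weights, bias, features):
    """Estimated click probability for one (user, item) feature vector."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def log_loss(clicked, p, eps=1e-12):
    """Binary cross-entropy, a loss commonly minimized when training CTR models."""
    p = min(max(p, eps), 1.0 - eps)
    return -(clicked * math.log(p) + (1 - clicked) * math.log(1.0 - p))


if __name__ == "__main__":
    # Illustrative values only: two made-up features and hand-picked weights.
    p = predict_ctr(weights=[0.8, -0.3], bias=-1.0, features=[1.0, 2.0])
    print("predicted CTR:", p)
    print("loss if the user clicked:", log_loss(1, p))
```

Production CTR models replace the single linear score with embedding tables and deep networks over very large sparse feature sets, which is the part of the workload a GPU training framework targets.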

Read More Technical Blogs

Programs for You

Developer Resources

The NVIDIA Developer Program provides the advanced tools and training needed to successfully build applications on all NVIDIA technology platforms. This includes access to hundreds of SDKs, a network of like-minded developers through our community forums, and more.

Join Today

Technical Training

NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI, accelerated computing, and accelerated data science to solve real-world problems. Powered by GPUs in the cloud, training is available as self-paced, online courses or live, instructor-led workshops.

View Courses

Accelerate Your Startup

NVIDIA Inception—an acceleration platform for AI, data science, and HPC startups—supports over 7,000 startups worldwide with go-to-market support, expertise, and technology. Startups get access to training through the DLI, preferred pricing on hardware, and invitations to exclusive networking events.

Learn More