NVIDIA Merlin HugeCTR
NVIDIA Merlin™ accelerates the entire pipeline, from ingesting and training to deploying GPU-accelerated recommender systems. Merlin HugeCTR (Huge Click-Through-Rate) is a deep neural network (DNN) training and inference framework designed for recommender systems. It provides distributed training with model-parallel embedding tables, an embeddings cache, and data-parallel neural networks across multiple GPUs and nodes for maximum performance. HugeCTR covers common and recent architectures such as Deep Learning Recommendation Model (DLRM), Wide and Deep, Deep Cross Network (DCN), and DeepFM.
Download and Try It Today
Merlin HugeCTR Core Features
Training Embeddings at Scale
Data scientists and machine learning engineers building deep learning recommenders work with large embedding tables that often exceed available memory. Merlin HugeCTR's model parallelism and embedding cache are designed for recommender workflows: they make it practical to train embedding tables that do not fit on a single GPU by spreading them across the memory of multiple GPUs and nodes. HugeCTR also uses the NVIDIA Collective Communication Library (NCCL) for high-speed multi-GPU and multi-node communication at scale.
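As a rough illustration of what a model-parallel embedding table means, the sketch below assigns each categorical key to one owning GPU and groups a batch of keys by owner. The modulo placement rule, the `NUM_GPUS` value, and the function names are assumptions made for this example; this is not HugeCTR's actual placement logic or API.

```python
from collections import defaultdict

NUM_GPUS = 4  # assumed number of devices for this illustration


def owner_gpu(key: int, num_gpus: int = NUM_GPUS) -> int:
    """Map a categorical key to the GPU assumed to store its embedding row."""
    return key % num_gpus


def shard_keys(keys, num_gpus: int = NUM_GPUS):
    """Group a batch of keys by owning GPU, preserving batch order per GPU."""
    shards = defaultdict(list)
    for key in keys:
        shards[owner_gpu(key, num_gpus)].append(key)
    return dict(shards)


batch_keys = [3, 10, 17, 4, 8, 21]
shards = shard_keys(batch_keys)
for gpu in sorted(shards):
    print(f"GPU {gpu} looks up keys {shards[gpu]}")
```

Because each GPU holds only its own slice of the table, the total table size is bounded by the combined memory of all devices rather than by any single one; the lookups for a batch then have to be exchanged between devices, which is where a collective communication library comes in.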
Inherently Asynchronous, Multi-Threaded Pipeline
Effective data loading is challenging for machine learning engineers and data scientists who are continuously experimenting with, training, and fine-tuning recommender models. HugeCTR's data reader is inherently asynchronous and multi-threaded. It reads batched data records whose features are high-dimensional, sparse, or categorical. Dense numerical features in a record can be fed directly to the fully connected layers, while HugeCTR's embedding layer compresses the sparse input features into dense embedding vectors. HugeCTR's model parallelism enables embedding training in a homogeneous cluster across multiple nodes and GPUs.
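The idea of an asynchronous reader feeding an embedding layer can be sketched in plain Python. In this minimal example, a background thread enqueues batches while the main thread turns each record's sparse IDs into a dense vector by sum-pooling embedding rows. The table contents, the single reader thread, the sum-pooling choice, and all names are assumptions for illustration; HugeCTR's own reader and embedding layers run on GPUs and are far more elaborate.

```python
import queue
import threading

EMBEDDING_TABLE = {  # hypothetical 2-dimensional embeddings, keyed by category ID
    0: [0.1, 0.2],
    1: [0.3, 0.4],
    2: [0.5, 0.6],
}


def reader(records, out_queue, batch_size):
    """Producer thread: group records into batches and enqueue them."""
    for start in range(0, len(records), batch_size):
        out_queue.put(records[start:start + batch_size])
    out_queue.put(None)  # sentinel: no more batches


def embed(sparse_ids):
    """Sum-pool the embedding rows for one record's sparse IDs."""
    pooled = [0.0, 0.0]
    for key in sparse_ids:
        row = EMBEDDING_TABLE[key]
        pooled = [p + r for p, r in zip(pooled, row)]
    return pooled


def run_pipeline(records, batch_size=2):
    """Consume batches while the reader thread keeps loading the next ones."""
    batches = queue.Queue(maxsize=4)
    worker = threading.Thread(target=reader, args=(records, batches, batch_size))
    worker.start()
    dense_inputs = []
    while (batch := batches.get()) is not None:
        for dense, sparse_ids in batch:
            # Dense features pass through; sparse IDs become a dense vector.
            dense_inputs.append(dense + embed(sparse_ids))
    worker.join()
    return dense_inputs


# Each record is (dense features, sparse category IDs).
records = [([1.0], [0, 1]), ([2.0], [2]), ([3.0], [])]
features = run_pipeline(records)
```

Each entry in `features` is the concatenation of a record's dense features with its pooled embedding, which is the kind of input a fully connected layer would consume.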
Inference, Hierarchical Deployment on Multiple GPUs
HugeCTR provides concurrent model inference execution across multiple GPUs through the use of a parameter server and embedding cache that are shared between multiple model instances. HugeCTR also leverages NVIDIA Triton™ Inference Server to ease workflows for data scientists and machine learning engineers when deploying models to production.
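The hierarchy described above, a small fast cache in front of a parameter server that holds the full table, can be sketched as follows. The least-recently-used eviction policy, the class names, and the tiny capacities are assumptions for this example and do not reflect HugeCTR's actual cache design or API.

```python
from collections import OrderedDict


class ParameterServer:
    """Stand-in for the full embedding table held outside GPU memory."""

    def __init__(self, table):
        self.table = table
        self.lookups = 0  # counts how often the slower tier is reached

    def fetch(self, key):
        self.lookups += 1
        return self.table[key]


class EmbeddingCache:
    """Small LRU cache standing in for a GPU-resident embedding cache."""

    def __init__(self, server, capacity):
        self.server = server
        self.capacity = capacity
        self.rows = OrderedDict()

    def lookup(self, key):
        if key in self.rows:
            self.rows.move_to_end(key)      # mark as most recently used
            return self.rows[key]
        row = self.server.fetch(key)        # cache miss: go to the server
        self.rows[key] = row
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)   # evict the least recently used row
        return row


server = ParameterServer({k: [float(k)] * 2 for k in range(10)})
cache = EmbeddingCache(server, capacity=2)  # shared by both "model instances"
instance_a = [cache.lookup(k) for k in (1, 2, 1)]
instance_b = [cache.lookup(k) for k in (2, 3, 1)]
print(f"{server.lookups} of 6 lookups reached the parameter server")
```

Sharing one cache between model instances means a row fetched for one instance can serve later requests from another, so frequently requested embeddings rarely need the slower tier.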
Interoperability with Open Source
Machine learning engineers and data scientists use a mix of methods, libraries, tools, and frameworks that often include open-source components. HugeCTR is an open-source component of NVIDIA Merlin and is designed to optimize embedding training within recommender workflows. It interoperates with open-source frameworks: its Sparse Operation Kit (SOK) is compatible with TensorFlow Distribute Strategy and Horovod.
Embedding optimization enables more experimentation, fine-tuning, and better prediction at scale. HugeCTR's optimized embedding implementation is up to 8X more performant than other frameworks’ embedding layers. This optimized implementation is also available as a TensorFlow plug-in that works seamlessly with TensorFlow and acts as a convenient drop-in replacement for the TensorFlow-native embedding layers.
Get Started with Merlin HugeCTR
All NVIDIA Merlin components are available as open-source projects on GitHub. However, a more convenient way to use these components is through the Merlin HugeCTR containers from the NVIDIA NGC catalog. Containers package the software application, libraries, dependencies, and runtime compilers in a self-contained environment. This way, the application environment is portable, consistent, reproducible, and agnostic to the underlying host system's software configuration.
The NGC container allows users to do preprocessing, feature engineering, and training of a deep learning-based recommender system model with HugeCTR.
HugeCTR supports Triton Inference Server to provide GPU-accelerated inference. The NGC container enables users to deploy Merlin NVTabular workflows and HugeCTR models to Triton Inference Server for production.
HugeCTR on GitHub
The GitHub repo helps users get started with HugeCTR and quickly train a model using a Python interface. Available resources include documentation, tutorials, examples, and notebooks.
Built on NVIDIA AI
NVIDIA AI empowers millions of hands-on practitioners and thousands of companies to use the NVIDIA AI Platform to accelerate their workloads. NVIDIA Merlin is part of the NVIDIA AI Platform. It is built on and leverages additional NVIDIA AI software within the platform.
RAPIDS is a suite of open source software libraries and APIs that enables end-to-end data science and analytics pipelines entirely on GPUs.
Try it Today: GitHub
cuDF is a Python GPU DataFrame library for loading, joining, aggregating, filtering, and manipulating data.
Try it Today: GitHub
NVIDIA Triton Inference Server
Take advantage of NVIDIA Triton™ Inference Server to run inference efficiently on GPUs by maximizing throughput with the right combination of latency and GPU utilization.
Try it Today: GitHub
Merlin HugeCTR Resources
Tencent and Merlin HugeCTR
Learn how Tencent used Merlin for its real-world advertising recommendation training and achieved a speedup of more than 7X over its original TensorFlow solution on the same GPU platform.
Watch the On-Demand
GPU Accelerated Recommender Systems Training and Inference
In this submission accepted at ACM RecSys 2022, learn about NVIDIA Merlin HugeCTR, a framework for click-through rate estimation that is optimized for training and inference. It also enables training at scale with model-parallel embeddings and data-parallel neural networks.
Best Practices from Tencent
Discover insights, advice, and best practices about leading the design and development of Tencent's deep learning recommendations system.