NVIDIA Merlin NVTabular

NVIDIA Merlin™ accelerates the entire pipeline for GPU-accelerated recommender systems, from data ingestion and training to deployment. Merlin NVTabular is a feature engineering and preprocessing library designed to manipulate terabyte-scale recommender system datasets efficiently and to significantly reduce data preparation time. It provides efficient feature transformations, preprocessing, and a high-level abstraction that accelerates computation on GPUs using the RAPIDS™ cuDF library.

Download and Try It Today

GitHub Repo | Merlin NVTabular on NGC

Merlin NVTabular Core Features

Feature Engineering

NVTabular's fast feature transforms reduce data prep time and ease the deployment of recommender models to production. With NVTabular's recommender-focused APIs, data scientists and machine learning engineers can quickly process datasets of all sizes, run more experiments, and are not bound by CPU or GPU memory. NVTabular also includes support for multi-hot categoricals and for passing vector continuous features, which eases feature engineering.
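To make the idea of encoding single-hot and multi-hot categoricals concrete, here is a minimal plain-Python sketch. It is a conceptual illustration only and does not use or reproduce NVTabular's API; the function names `fit_vocab` and `encode`, the sample rows, and the choice to reserve id 0 for unseen values are all assumptions made for this example.

```python
from collections import Counter


def fit_vocab(rows, column, min_count=1):
    """Build a value -> integer id mapping for one categorical column.

    Id 0 is reserved for unseen or infrequent values (an assumption of this
    sketch). A cell may hold a single value or a list of values (multi-hot).
    """
    counts = Counter()
    for row in rows:
        cell = row[column]
        values = cell if isinstance(cell, list) else [cell]
        counts.update(values)
    # Sort by descending frequency, then by value, so ids are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    frequent = [value for value, count in ordered if count >= min_count]
    return {value: idx for idx, value in enumerate(frequent, start=1)}


def encode(rows, column, vocab):
    """Encode a column as integer ids; multi-hot cells stay lists."""
    encoded = []
    for row in rows:
        cell = row[column]
        if isinstance(cell, list):
            encoded.append([vocab.get(v, 0) for v in cell])
        else:
            encoded.append(vocab.get(cell, 0))
    return encoded


# Hypothetical sample data: one single-hot column and one multi-hot column.
rows = [
    {"user": "u1", "genres": ["drama", "comedy"]},
    {"user": "u2", "genres": ["drama"]},
    {"user": "u1", "genres": ["horror", "drama", "comedy"]},
]

user_vocab = fit_vocab(rows, "user")
genre_vocab = fit_vocab(rows, "genres")
user_ids = encode(rows, "user", user_vocab)
genre_ids = encode(rows, "genres", genre_vocab)
```

The vocabulary is fitted once and then applied, so the same mapping can be reused on new data; values not seen during fitting fall back to id 0 in this sketch.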

Run examples

Interoperability with Open Source

Data scientists and machine learning engineers use a hybrid of methods, tools, libraries, and frameworks, including open source. NVTabular's native tabular data support includes comma-separated values (CSV) files, Apache Parquet, Apache ORC, and Apache Avro. NVTabular data loaders are also optimized for TensorFlow (TF), PyTorch, and Merlin HugeCTR. All Merlin components, including NVTabular, are interoperable with open source.

Learn more

Accelerated on GPUs

NVTabular provides a high-level abstraction that accelerates computation on GPUs using the RAPIDS cuDF library. NVTabular's support for multi-GPU and multi-node scaling with Dask-CUDA and dask.distributed also speeds up distributed, parallel workloads.
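The general pattern behind this kind of scaling can be sketched in plain Python: each worker computes partial statistics over its own partition, and the partial results are merged into global statistics. The example below is a conceptual stand-in only. It uses a standard-library thread pool in place of Dask workers and GPUs, and the function names `partial_stats` and `merge_stats` and the sample partitions are assumptions made for illustration, not NVTabular or Dask APIs.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Per-partition count, sum, and sum of squares for one continuous column."""
    n = len(partition)
    s = sum(partition)
    ss = sum(x * x for x in partition)
    return n, s, ss


def merge_stats(stats):
    """Combine partial statistics into a global mean and population std."""
    n = sum(p[0] for p in stats)
    s = sum(p[1] for p in stats)
    ss = sum(p[2] for p in stats)
    mean = s / n
    variance = max(ss / n - mean * mean, 0.0)  # guard against rounding below zero
    return mean, math.sqrt(variance)


# Hypothetical partitions of one column; no worker needs the whole dataset.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]

# The thread pool stands in for a cluster of workers in this sketch.
with ThreadPoolExecutor(max_workers=3) as pool:
    stats = list(pool.map(partial_stats, partitions))

mean, std = merge_stats(stats)

# Each partition can then be normalized independently with the global statistics.
normalized = [[(x - mean) / std for x in part] for part in partitions]
```

Because only small partial results cross partition boundaries, the same approach works when the full dataset does not fit in the memory of any single worker.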

Explore more

Merlin NVTabular Performance

NVTabular's multi-GPU support using RAPIDS cuDF, Dask, and Dask-cuDF enables a high-performance recommender-specific pipeline. Multi-GPU NVTabular on the NVIDIA DGX™ A100 provides a 95x speedup compared to Spark on a four-node, 96 vCPU core CPU cluster when processing 1.3 TB of data from the Criteo Terabyte dataset. Scaling from one to eight NVIDIA A100 GPUs provides a 5.3x speedup, from 10 minutes on 1xA100 to 1.9 minutes on 8xA100.
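The 5.3x figure follows directly from the two run times quoted above. The short calculation below reproduces it and also derives the implied scaling efficiency, which is not stated in the text and is shown here only as arithmetic on those two numbers.

```python
single_gpu_minutes = 10.0  # 1xA100 run time quoted above
eight_gpu_minutes = 1.9    # 8xA100 run time quoted above
num_gpus = 8

# Speedup is the ratio of the single-GPU time to the multi-GPU time.
speedup = single_gpu_minutes / eight_gpu_minutes

# Scaling efficiency compares the measured speedup with ideal linear scaling.
efficiency = speedup / num_gpus

print(f"speedup: {speedup:.1f}x")              # speedup: 5.3x
print(f"scaling efficiency: {efficiency:.0%}")  # scaling efficiency: 66%
```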

Explore more benchmark details

Speedup Using NVTabular on Multi-GPU

Get Started with Merlin NVTabular

All NVIDIA Merlin components are available as open-source projects on GitHub. However, a more convenient way to use these components is through the Merlin NVTabular containers from the NVIDIA NGC catalog. Containers package the software application, libraries, dependencies, and runtime compilers in a self-contained environment. This way, the application environment is portable, consistent, reproducible, and agnostic to the underlying host system software configuration.

Merlin NVTabular on NGC

Merlin Training

Enables users to do preprocessing and feature engineering with NVTabular and then train a deep learning-based recommender system model with HugeCTR.

Pull Container from NGC

Merlin TensorFlow Training

Utilize preprocessing and feature engineering with NVTabular and then train a deep learning-based recommender system model with TensorFlow.

Pull Container from NGC

Merlin PyTorch Training

Leverage preprocessing and feature engineering with NVTabular and then train a deep learning-based recommender system model with PyTorch.

Pull Container from NGC

Merlin Inference

This container allows users to deploy NVTabular workflows and HugeCTR or TensorFlow models to the NVIDIA Triton™ Inference Server for production.

Pull Container from NGC

Merlin NVTabular on GitHub

The GitHub repository helps users get started with NVTabular with documentation, tutorials, examples, and notebooks.

View GitHub Repo

Diagram illustrating components of NVIDIA Merlin

Merlin NVTabular Resources

Explore all Merlin resources.

Announcing NVTabular Open Beta

Discover how multi-GPU support and data loaders accelerate recommender workflows.

READ NOW

NVIDIA Merlin

NVIDIA Merlin consists of Merlin Feature Engineering: NVTabular, Merlin Training: HugeCTR, Merlin Inference: NVIDIA® TensorRT™ and Triton, and Merlin Reference Applications.

LEARN MORE

Industry Best Practices

Learn the latest trends and insights about building, deploying, and optimizing recommender systems that effectively engage users and impact business value. Includes best practices from Tencent, Meituan, The New York Times, Magazine Luiza, and more.

DOWNLOAD WHITEPAPER

Session-Based Recommenders

The NVIDIA Merlin team designed Transformers4Rec to help machine learning engineers and data scientists explore and apply transformers to building sequential and session-based recommenders.

LEARN MORE

NVTabular is available to download from the NVIDIA NGC catalog, the GitHub repository, or the Anaconda repository.

Download Now