Speedy Model Training With RAPIDS + Determined AI

Model developers no longer face a steep learning curve to accelerate model training. By utilizing two open-source software projects, Determined AI’s Deep Learning Training Platform and the RAPIDS accelerated data science toolkit, they can easily achieve up to 10x speedups in data preprocessing and train models at scale.  Making GPUs accessible As the field of … Continued

Using RAPIDS with PyTorch

In this post we take a look at how to use cuDF, the RAPIDS dataframe library, to do some of the preprocessing steps required to get the mortgage data in a format that PyTorch can process so that we can explore the performance of deep learning on tabular data and compare it to the xgboost … Continued

Fast, Flexible Allocation for NVIDIA CUDA with RAPIDS Memory Manager

When I joined the RAPIDS team in 2018, NVIDIA CUDA device memory allocation was a performance problem. RAPIDS cuDF allocates and deallocates memory at high frequency, because its APIs generally create new Series and DataFrames rather than modifying them in place. The overhead of cudaMalloc and synchronization of cudaFree was holding RAPIDS back. My first … Continued

Building an Accelerated Data Science Ecosystem: RAPIDS Hits Two Years

GTC Fall 2020 marked the second anniversary of the initial release of RAPIDS. Created out of the GPU Open Analytics Initiative (GoAi) aimed at making accelerated, end-to-end analytics on GPUs easy, RAPIDS has proven GPUs are performant, easy to use, and transformative to the future of data analytics. By thinking about the relationship between software … Continued

Accelerating Single Cell Genomic Analysis using RAPIDS

The human body is made up of nearly 40 trillion cells, of many different types. Recent advances in experimental biology have made it possible to explore the genetic material of single cells. With the birth of this new field of single-cell genomics, scientists can now probe the DNA and RNA of individual cells in the … Continued

An Interactive 2010 Census Plotly-dash Visualization Accelerated By RAPIDS

The COVID-19 pandemic brings the efforts of the data science community to the forefront. Real-time, interactive visualizations of the novel coronavirus’ spread across populations help researchers, scientists, health officials and governments understand, validate, and communicate important insights hidden among hundreds of millions of rows of records.

Running Python UDFs in Native NVIDIA CUDA Kernels with the RAPIDS cuDF

In this post, I introduce a design and implementation of a framework within RAPIDS cuDF that enables compiling Python user-defined functions (UDF) and inlining them into native CUDA kernels. This framework uses the Numba Python compiler and Jitify CUDA just-in-time (JIT) compilation library to provide cuDF users the flexibility of Python with the performance of … Continued

Accelerating Apache Spark 3.0 with GPUs and RAPIDS

Given the parallel nature of many data processing tasks, it’s only natural that the massively parallel architecture of a GPU should be able to parallelize and accelerate Apache Spark data processing queries, in the same way that a GPU accelerates deep learning (DL) in artificial intelligence (AI). NVIDIA has worked with the Apache Spark community … Continued