Accelerating Single Cell Genomic Analysis using RAPIDS

The human body is made up of nearly 40 trillion cells, of many different types. Recent advances in experimental biology have made it possible to explore the genetic material of single cells. With the birth of this new field of single-cell genomics, scientists can now probe the DNA and RNA of individual cells in the … Continued

Achieving 100x Faster Single-Cell Modality Prediction with NVIDIA RAPIDS cuML

Single-cell measurement technologies have advanced rapidly, revolutionizing the life sciences. We have scaled from measuring dozens to millions of cells and from one modality to multiple high dimensional modalities. The vast amounts of information at the level of individual cells present a great opportunity to train machine learning models to help us better understand the … Continued

Accelerating Vector Search: Using GPU-Powered Indexes with RAPIDS RAFT

In the AI landscape of 2023, vector search is one of the hottest topics due to its applications in large language models (LLM) and generative AI. Semantic vector search enables a broad range of important tasks like detecting fraudulent transactions, recommending products to users, using contextual information to augment full-text searches, and finding actors that … Continued

Building an Accelerated Data Science Ecosystem: RAPIDS Hits Two Years

GTC Fall 2020 marked the second anniversary of the initial release of RAPIDS. Created out of the GPU Open Analytics Initiative (GoAi) aimed at making accelerated, end-to-end analytics on GPUs easy, RAPIDS has proven GPUs are performant, easy to use, and transformative to the future of data analytics. By thinking about the relationship between software … Continued

Accelerating Sequential Python User-Defined Functions with RAPIDS on GPUs for 100X Speedups

Motivation Custom “row-by-row” processing logic (sometimes called sequential User-Defined Functions) is prevalent in ETL workflows. The sequential nature of UDFs makes parallelization on GPUs tricky. This blog post covers how to implement the same UDF logic using RAPIDS to parallelize computation on GPUs and unlock 100x speedups. Introduction Typically, sequential UDFs revolve around records with … Continued

RAPIDS Accelerates Data Science End-to-End

Today’s data science problems demand a dramatic increase in the scale of data as well as the computational power required to process it. Unfortunately, the end of Moore’s law means that handling large data sizes in today’s data science ecosystem requires scaling out to many CPU nodes, which brings its own problems of communication bottlenecks, energy, and … Continued

Running Python UDFs in Native NVIDIA CUDA Kernels with the RAPIDS cuDF

In this post, I introduce a design and implementation of a framework within RAPIDS cuDF that enables compiling Python user-defined functions (UDF) and inlining them into native CUDA kernels. This framework uses the Numba Python compiler and Jitify CUDA just-in-time (JIT) compilation library to provide cuDF users the flexibility of Python with the performance of … Continued

Accelerated Vector Search: Approximating with RAPIDS RAFT IVF-Flat

Performing an exhaustive exact k-nearest neighbor (kNN) search, also known as brute-force search, is expensive, and it doesn’t scale particularly well to larger datasets. During vector search, brute-force search requires the distance to be calculated between every query vector and database vector. For the frequently used Euclidean and cosine distances, the computation task becomes equivalent … Continued