RAPIDS Accelerates Data Science End-to-End

Today’s data science problems demand a dramatic increase in the scale of data as well as the computational power required to process it. Unfortunately, the end of Moore’s law means that handling large data sizes in today’s data science ecosystem requires scaling out to many CPU nodes, which brings its own problems of communication bottlenecks, energy, and … Continued

Running Python UDFs in Native NVIDIA CUDA Kernels with the RAPIDS cuDF

In this post, I introduce a design and implementation of a framework within RAPIDS cuDF that enables compiling Python user-defined functions (UDFs) and inlining them into native CUDA kernels. This framework uses the Numba Python compiler and Jitify CUDA just-in-time (JIT) compilation library to provide cuDF users the flexibility of Python with the performance of … Continued

Zero to Data Science: Making Data Science Teams Productive with Kubernetes and RAPIDS

Data collected on a vast scale has fundamentally changed the way organizations do business, driving demand for teams to provide meaningful data science, machine learning, and deep learning-based business insights quickly. Data science leaders, plus the DevOps and IT teams supporting them, constantly look for ways to make their teams productive while optimizing their costs … Continued

Accelerating Single Cell Genomic Analysis using RAPIDS

The human body is made up of nearly 40 trillion cells, of many different types. Recent advances in experimental biology have made it possible to explore the genetic material of single cells. With the birth of this new field of single-cell genomics, scientists can now probe the DNA and RNA of individual cells in the … Continued

Accelerating Neurosymbolic AI with RAPIDS and Prometheux Vadalog Parallel

As the scale of available data continues to grow, so does the need for scalable and intelligent data processing systems to swiftly harness useful knowledge. Especially in high-stakes domains such as life sciences and finance, alongside scalability, transparency of data-driven processes becomes paramount to ensure the utmost trustworthiness. Started by scientists coming from the Knowledge … Continued

Accelerated Vector Search: Approximating with RAPIDS RAFT IVF-Flat

Performing an exhaustive exact k-nearest neighbor (kNN) search, also known as brute-force search, is expensive, and it doesn’t scale particularly well to larger datasets. During vector search, brute-force search requires the distance to be calculated between every query vector and database vector. For the frequently used Euclidean and cosine distances, the computation task becomes equivalent … Continued
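
To make the cost of the brute-force search described in this excerpt concrete, here is a minimal pure-Python sketch. It is an illustration only and not the RAPIDS RAFT API: the function name `brute_force_knn` and the toy vectors are hypothetical, and Euclidean distance is assumed.

```python
import math


def brute_force_knn(queries, database, k):
    """Return, for each query, the indices of its k nearest database vectors.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    results = []
    for q in queries:
        # One distance per (query, database) pair, so the total work grows
        # with n_queries * n_database * dimension.
        distances = [(math.dist(q, v), i) for i, v in enumerate(database)]
        distances.sort()
        results.append([i for _, i in distances[:k]])
    return results


# Toy data, made up for this example.
database = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
queries = [(0.9, 0.1), (4.0, 4.0)]
neighbors = brute_force_knn(queries, database, k=2)
```

Every query is compared against every database vector, which is why the exhaustive approach becomes expensive as the database grows.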

RAPIDS Accelerates Data Science End-to-End

At GTC Europe in Munich, Germany, NVIDIA announced RAPIDS, a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs. RAPIDS aims to accelerate the entire data science pipeline including data loading, ETL, model training, and inference. This will enable more productive, interactive, and exploratory workflows. The RAPIDS libraries … Continued

Accelerating Apache Spark 3.0 with GPUs and RAPIDS

Given the parallel nature of many data processing tasks, it’s only natural that the massively parallel architecture of a GPU should be able to parallelize and accelerate Apache Spark data processing queries, in the same way that a GPU accelerates deep learning (DL) in artificial intelligence (AI). NVIDIA has worked with the Apache Spark community … Continued