DEVELOPER BLOG

Power Your Big Data Analytics with the Latest NVIDIA GPUs in the Cloud

Dask is an accessible and powerful solution for natively scaling Python analytics. Because it mirrors the interfaces of PyData tools, data scientists who already know those tools can scale big data workloads easily. Dask is such a powerful tool that we have adopted it across a variety of projects at NVIDIA. When Dask is paired with RAPIDS, data practitioners can distribute big data workloads across massive NVIDIA GPU clusters.

To make it easier to leverage NVIDIA accelerated compute, we’ve added support for launching RAPIDS + Dask on the latest NVIDIA A100 GPUs in the cloud, allowing users and enterprises to get the most out of their data.

Spin Up NVIDIA GPU Clusters Quickly with Dask-CloudProvider

While Dask makes scaling analytics workloads easy, distributing workloads in cloud environments can be tricky. Dask-CloudProvider is a package that provides native cloud integration, making it simple to get started on Amazon Web Services, Google Cloud Platform, or Microsoft Azure. Using native cloud tools, data scientists, machine learning engineers, and DevOps engineers can stand up infrastructure and start running workloads in no time.

RAPIDS builds on Dask-CloudProvider to make it easy to spin up the most powerful NVIDIA GPU instances on raw virtual machines. While AWS, GCP, and Azure have great managed services for data scientists, these services can take time to adopt new GPU architectures. With Dask-CloudProvider and RAPIDS, users and enterprises can leverage the latest NVIDIA A100 GPUs, which provide up to 20x more performance than the previous generation. With 40 GB of GPU memory each and 600 GB/s NVLink connections, NVIDIA A100 GPUs are supercharged workhorses for enterprise-scale data science workloads. Dask-CloudProvider and RAPIDS provide an easy way to get started with A100s without having to configure raw VMs from scratch.
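To make this concrete, here is a minimal sketch of what launching such a cluster could look like on AWS with Dask-CloudProvider's EC2Cluster. It is an illustration under stated assumptions, not an official recipe. It assumes the p4d.24xlarge instance type (eight A100 40 GB GPUs per VM) and a default region of us-east-1. The helper names are hypothetical, and the Docker image is left as a parameter because the right RAPIDS image tag depends on your setup.

```python
"""Sketch: a RAPIDS + Dask cluster on A100 instances via Dask-CloudProvider."""

# Assumption: AWS p4d.24xlarge, which carries 8 NVIDIA A100 40 GB GPUs.
A100_INSTANCE_TYPE = "p4d.24xlarge"
GPUS_PER_INSTANCE = 8
GPU_MEMORY_GB = 40  # per-GPU memory quoted in this post


def a100_cluster_kwargs(n_workers, docker_image, region="us-east-1"):
    """Build keyword arguments for dask_cloudprovider.aws.EC2Cluster.

    Hypothetical helper: n_workers is the number of worker VMs, and
    docker_image should be a RAPIDS image of your choosing.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return {
        "region": region,  # assumed default; pick a region that offers A100s
        "instance_type": A100_INSTANCE_TYPE,
        "docker_image": docker_image,
        # dask-cuda's worker class starts one Dask worker per GPU on each VM.
        "worker_class": "dask_cuda.CUDAWorker",
        "n_workers": n_workers,
    }


def total_gpu_memory_gb(n_workers):
    """Aggregate GPU memory across all worker VMs, in GB."""
    return n_workers * GPUS_PER_INSTANCE * GPU_MEMORY_GB


def launch_a100_cluster(n_workers, docker_image, region="us-east-1"):
    """Start the cluster and connect a client (needs AWS credentials)."""
    # Imported lazily so the helpers above work without cloud dependencies.
    from dask.distributed import Client
    from dask_cloudprovider.aws import EC2Cluster

    cluster = EC2Cluster(**a100_cluster_kwargs(n_workers, docker_image, region))
    client = Client(cluster)
    return cluster, client
```

With AWS credentials configured, calling launch_a100_cluster(2, "<your RAPIDS image>") would request two worker VMs, for which total_gpu_memory_gb(2) reports 640 GB of aggregate GPU memory. Call cluster.close() when finished so the instances are shut down. GCP and Azure have analogous cluster classes in Dask-CloudProvider, with different parameter names.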

RAPIDS strives to make NVIDIA accelerated data science accessible to a broader data-driven audience. With Dask, RAPIDS allows data scientists to solve enterprise-scale problems in less time and with less pain. For a deeper understanding of the latest RAPIDS features and integrations, read more here.