If you work with pandas, you’ve probably hit the wall. It’s that moment when your trusty workflow, so elegant on smaller datasets, grinds to a halt on a large one. A script that once took seconds now crawls for minutes.
Your next steps are predictable and frustrating. You might downsample your data and lose fidelity, rewrite your logic to process data in chunks, or face the daunting task of migrating your entire workflow to a distributed framework like Spark.
But what if you could break through that wall with a simple flag? Today, we’re showcasing three common pandas workflows that are dramatically accelerated by NVIDIA cuDF, a GPU-accelerated DataFrame library. It lets you run your existing workflows on the GPU without rewriting your code.
Workflow #1: Analyzing stock prices with time-based windows
A common financial analysis task is to explore large time-series datasets to find trends. This often involves a sequence of pandas operations, such as groupby().agg() aggregations and creating new date features.
The real bottleneck often appears when calculating metrics over a rolling time period. Using groupby().rolling() to calculate simple moving averages (SMAs) over 50-day or 200-day windows on a CPU can be incredibly slow.
With GPU acceleration, these operations are up to 20x faster. A cumulative workflow that takes minutes on a CPU can finish in seconds on a GPU.
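A minimal sketch of this pattern is below. The ticker names, column names, and data are illustrative, not the notebook’s actual dataset; the code runs on plain pandas on a CPU, and with cudf.pandas enabled the same code runs on the GPU unchanged.

```python
import numpy as np
import pandas as pd

# Illustrative price data: two tickers, 300 days each.
rng = np.random.default_rng(42)
dates = pd.date_range("2020-01-01", periods=300, freq="D")
df = pd.DataFrame({
    "ticker": np.repeat(["AAA", "BBB"], 300),
    "date": np.tile(dates, 2),
    "close": 100 + rng.normal(0, 1, 600).cumsum(),
})

# Date features and a per-ticker aggregate.
df["year"] = df["date"].dt.year
summary = df.groupby("ticker").agg(avg_close=("close", "mean"))

# 50-day simple moving average per ticker -- the usual CPU bottleneck.
df = df.sort_values(["ticker", "date"])
df["sma_50"] = (
    df.groupby("ticker")["close"]
      .rolling(window=50, min_periods=1)
      .mean()
      .reset_index(level=0, drop=True)  # drop the ticker level to align back
)
```

With min_periods=1, the SMA is defined from the first row of each group rather than starting with 49 NaNs per ticker.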
See the difference for yourself:
Explore the code on Colab or GitHub.
Workflow #2: Analyzing job postings with large string fields
Business intelligence often requires analyzing text-heavy data, which presents a major challenge in pandas. Large string columns consume huge amounts of memory—the notebook for this workflow loads an 8GB file—and make standard operations incredibly slow.
Tasks like reading files (read_csv), calculating string lengths (.str.len()), and merging DataFrames (pd.merge) become serious performance drags. Yet these operations are essential for answering business questions like, “Which companies have the longest job summaries?”
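The core operations look like this in pandas. The tiny DataFrames and column names here are stand-ins for the 8GB job-postings file, chosen only to show the pattern; with cudf.pandas enabled, the identical calls run on the GPU.

```python
import pandas as pd

# Stand-in for the job-postings data (column names are hypothetical).
jobs = pd.DataFrame({
    "company": ["Acme", "Globex", "Acme"],
    "summary": [
        "Build data pipelines",
        "Maintain ML models in production",
        "Write SQL",
    ],
})
companies = pd.DataFrame({
    "company": ["Acme", "Globex"],
    "industry": ["Tech", "Energy"],
})

# String length per posting, then the merge that often dominates runtime.
jobs["summary_len"] = jobs["summary"].str.len()
merged = jobs.merge(companies, on="company", how="left")

# "Which companies have the longest job summaries?"
longest = (
    merged.groupby("company")["summary_len"]
          .max()
          .sort_values(ascending=False)
)
```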
GPU acceleration provides a massive end-to-end speedup. Watch the side-by-side comparison:
Explore the code on Colab or GitHub.
Workflow #3: Building an interactive dashboard with 7.3M data points
A primary goal for data analysts is to build interactive dashboards that allow stakeholders to explore data. The core of any dashboard is its ability to filter data quickly based on user input.
With pandas on a CPU, filtering millions of rows in real time is often impossible. Changing a date slider or selecting a value from a dropdown can lead to a laggy, unusable experience. This workflow shows a Panel dashboard built on 7.3 million cell tower locations, where pandas operations like .between() and .isin() are triggered by user clicks.
With GPU acceleration, these filtering operations are near-instantaneous. The result is a smooth, fluid dashboard experience, even when interactively querying millions of geospatial data points.
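The filters a dashboard widget triggers on each interaction boil down to a few lines of pandas. This sketch uses synthetic data with hypothetical column names in place of the real cell-tower dataset:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the cell-tower data (column names are hypothetical).
rng = np.random.default_rng(0)
towers = pd.DataFrame({
    "radio": rng.choice(["GSM", "LTE", "UMTS"], 1000),
    "created": pd.to_datetime("2015-01-01")
               + pd.to_timedelta(rng.integers(0, 3650, 1000), unit="D"),
})

# What a date slider and a radio-type dropdown trigger on each click.
date_mask = towers["created"].between("2018-01-01", "2020-12-31")
radio_mask = towers["radio"].isin(["LTE", "UMTS"])
filtered = towers[date_mask & radio_mask]
```

On 7.3 million rows, these boolean masks are exactly the operations that go from laggy on a CPU to near-instant on a GPU.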
See the dashboard in action:
Explore the code on Colab or GitHub.
What if your pandas DataFrame is bigger than GPU memory?
A common question we get is, “This is great, but what if my dataset won’t fit into my GPU’s memory?”
Historically, this was a major limitation. Today, thanks to Unified Virtual Memory (UVM), you can process datasets that are larger than your GPU’s VRAM (the dedicated memory of the GPU). UVM intelligently pages data between your system’s RAM and the GPU’s memory, allowing you to work on massive pandas DataFrames without having to worry about memory management.
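In code, opting into managed (unified) memory is a one-time allocator configuration via RMM. This is a sketch, not the notebooks’ exact setup, and it requires an NVIDIA GPU with cuDF and RMM installed, so it won’t run on a CPU-only machine:

```python
import rmm

# Back GPU allocations with CUDA managed memory so DataFrames larger
# than VRAM can page transparently between system RAM and the GPU.
rmm.reinitialize(managed_memory=True)

import cudf.pandas
cudf.pandas.install()  # must run before pandas is imported
import pandas as pd
```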
Check out this blog for more information or watch the video below.
Try it yourself: Same code. More speed.
As these workflows show, performance bottlenecks in pandas don’t have to force you into complex workarounds. Many of the most common performance issues can be solved by simply activating the GPU you may already have.
The best part? With NVIDIA cuDF, your existing pandas knowledge is all you need. Here’s a quick guide on how to turn it on.
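For reference, cudf.pandas can be switched on without changing any pandas code. All three entry points below require an NVIDIA GPU with cuDF installed:

```python
# In a Jupyter notebook: load the extension before importing pandas.
#   %load_ext cudf.pandas

# For an existing script: no code changes, just a different launch command.
#   python -m cudf.pandas my_script.py

# Inside a script: install the accelerator before pandas is imported.
import cudf.pandas
cudf.pandas.install()
import pandas as pd  # now GPU-accelerated, falling back to CPU as needed
```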
Ready to get started? Explore these examples and more in our GitHub repository.
Are you a Polars user?
Polars also comes with a built-in GPU engine powered by NVIDIA cuDF. Read the Polars GPU Engine Powered by RAPIDS cuDF Now Available in Open Beta blog to learn more.