Data Science

Sep 23, 2025
Faster Training Throughput in FP8 Precision with NVIDIA NeMo
In previous posts on FP8 training, we explored the fundamentals of FP8 precision and took a deep dive into the various scaling recipes for practical large-scale...
12 MIN READ

Sep 23, 2025
How to Accelerate Community Detection in Python Using GPU-Powered Leiden
Community detection algorithms play an important role in understanding data by identifying hidden groups of related entities in networks. Social network...
9 MIN READ

Sep 18, 2025
The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data
Over hundreds of Kaggle competitions, we’ve refined a playbook that consistently lands us near the top of the leaderboard, whether we’re working with...
13 MIN READ

Sep 17, 2025
NVIDIA RAPIDS 25.08 Adds New Profiler for cuML, Updates to the Polars GPU Engine, Additional Algorithm Support, and More
The 25.08 release of RAPIDS continues to push the boundaries toward making accelerated data science more accessible and scalable with the addition of several...
9 MIN READ

Sep 10, 2025
Accelerate Protein Structure Inference Over 100x with NVIDIA RTX PRO 6000 Blackwell Server Edition
The race to understand protein structures has never been more critical. From accelerating drug discovery to preparing for future pandemics, the ability to...
6 MIN READ

Aug 22, 2025
How to Spot (and Fix) 5 Common Performance Bottlenecks in pandas Workflows
Slow data loads, memory-intensive joins, and long-running operations—these are problems every Python practitioner has faced. They waste valuable time and make...
7 MIN READ

Aug 14, 2025
Upcoming Livestream: Building Cross-Framework Agent Ecosystems
Join us on Aug. 21 to see how NVIDIA NeMo Agent toolkit boosts multi-agent workflows with deep MCP integration.
1 MIN READ

Aug 13, 2025
Scaling LLM Reinforcement Learning with Prolonged Training Using ProRL v2
Currently, one of the most compelling questions in AI is whether large language models (LLMs) can continue to improve through sustained reinforcement learning...
8 MIN READ

Aug 07, 2025
Efficient Transforms in cuDF Using JIT Compilation
RAPIDS cuDF offers a broad set of ETL algorithms for processing data with GPUs. For pandas users, cuDF accelerated algorithms are available with the zero code...
9 MIN READ

Aug 07, 2025
Train with Terabyte-Scale Datasets on a Single NVIDIA Grace Hopper Superchip Using XGBoost 3.0
Gradient-boosted decision trees (GBDTs) power everything from real-time fraud filters to petabyte-scale demand forecasts. The XGBoost open source library has long...
7 MIN READ

Aug 06, 2025
What’s New and Important in CUDA Toolkit 13.0
The newest update to the CUDA Toolkit, version 13.0, features advancements to accelerate computing on the latest NVIDIA CPUs and GPUs. As a major release, it...
19 MIN READ

Aug 01, 2025
7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows
You've been there. You wrote the perfect Python script, tested it on a sample CSV, and everything worked flawlessly. But when you unleashed it on the full 10...
8 MIN READ

Jul 24, 2025
Optimizing Vector Search for Indexing and Real-Time Retrieval with NVIDIA cuVS
AI-powered search demands high-performance indexing, low-latency retrieval, and seamless scalability. NVIDIA cuVS brings GPU-accelerated vector search and...
7 MIN READ

Jul 23, 2025
Serverless Distributed Data Processing with Apache Spark and NVIDIA AI on Azure
The process of converting vast libraries of text into numerical representations known as embeddings is essential for generative AI. Various technologies—from...
9 MIN READ

Jul 18, 2025
3 pandas Workflows That Slowed to a Crawl on Large Datasets—Until We Turned on GPUs
If you work with pandas, you’ve probably hit the wall. It’s that moment when your trusty workflow, so elegant on smaller datasets, grinds to a halt on a...
4 MIN READ

Jul 17, 2025
Feature Engineering at Scale: Optimizing ML Models in Semiconductor Manufacturing with NVIDIA CUDA‑X Data Science
In our previous post, we introduced the setup of predictive modeling in chip manufacturing and operations, highlighting common challenges such as imbalanced...
6 MIN READ