Apache Spark
Aug 20, 2024
NVIDIA GH200 Superchip Delivers Breakthrough Energy Efficiency and Node Consolidation for Apache Spark
With the rapid growth of generative AI, CIOs and IT leaders are looking for ways to reclaim data center resources to accommodate new AI use cases that promise...
8 MIN READ
Jun 14, 2024
Level Up Your Skills with Five New NVIDIA Technical Courses
With AI introducing an unprecedented pace of technological innovation, staying ahead means keeping your skills up to date. The NVIDIA Developer Program gives...
4 MIN READ
Nov 09, 2023
Accelerating Neurosymbolic AI with RAPIDS and Prometheux Vadalog Parallel
As the scale of available data continues to grow, so does the need for scalable and intelligent data processing systems to swiftly harness useful knowledge....
11 MIN READ
Oct 24, 2023
Reduce Apache Spark ML Compute Costs with New Algorithms in Spark RAPIDS ML Library
Spark RAPIDS ML is an open-source Python package enabling NVIDIA GPU acceleration of PySpark MLlib. It offers PySpark MLlib DataFrame API compatibility and...
8 MIN READ
Oct 18, 2023
New Self-Paced Course: RAPIDS Accelerator for Apache Spark
Dive into the RAPIDS Accelerator for Apache Spark toolset, including the workload qualification tool for estimating speedup on GPU and the profiling tool for...
1 MIN READ
Sep 14, 2023
ICYMI: Run RAPIDS-Accelerated Apache Spark on Amazon EMR
Streamline and accelerate deployment by integrating ETL and ML training into a single Apache Spark script on Amazon EMR.
1 MIN READ
Sep 06, 2023
GPUs for ETL? Optimizing ETL Architecture for Apache Spark SQL Operations
Extract-transform-load (ETL) operations with GPUs using the NVIDIA RAPIDS Accelerator for Apache Spark running on large-scale data can produce both cost savings...
8 MIN READ
Jul 17, 2023
GPUs for ETL? Run Faster, Less Costly Workloads with NVIDIA RAPIDS Accelerator for Apache Spark and Databricks
We were stuck. Really stuck. With a hard delivery deadline looming, our team needed to figure out how to process a complex extract-transform-load (ETL) job on...
7 MIN READ
Jun 12, 2023
Distributed Deep Learning Made Easy with Spark 3.4
Apache Spark is an industry-leading platform for distributed extract, transform, and load (ETL) workloads on large-scale data. However, with the advent of deep...
7 MIN READ
Jun 02, 2023
GPU Integration Propels Data Center Efficiency and Cost Savings for Taboola
When you see a context-relevant advertisement on a web page, it's most likely content served by a Taboola data pipeline. As the leading content recommendation...
13 MIN READ
Apr 18, 2023
New GPU Library Lowers Compute Costs for Apache Spark ML
Spark MLlib is a key component of Apache Spark for large-scale machine learning and provides built-in implementations of many popular machine learning...
6 MIN READ
Apr 04, 2023
Topic Modeling and Image Classification with Dataiku and NVIDIA Data Science
The Dataiku platform for everyday AI simplifies deep learning. Use cases are far-reaching, from image classification to object detection and natural language...
11 MIN READ
Mar 21, 2023
Catapulting Enterprises to the Leading Edge of AI with NVIDIA AI Enterprise 3.1
Generative AI has marked an important milestone in the AI revolution journey. We are at a fundamental breaking point where enterprises are not only getting...
4 MIN READ
Mar 15, 2023
Smarter Retail Data Analytics with GPU Accelerated Apache Spark Workloads on Google Cloud Dataproc
A retailer's supply chain includes the sourcing of raw materials or finished goods from suppliers; storing them in warehouses or distribution centers; and...
13 MIN READ
Feb 24, 2023
Top Data Science Sessions at NVIDIA GTC 2023
Learn about the latest AI and data science breakthroughs from leading data science teams at NVIDIA GTC 2023.
1 MIN READ
Dec 14, 2022
Saving Apache Spark Big Data Processing Costs on Google Cloud Dataproc
According to IDC, the volume of data generated each year is growing exponentially. IDC’s Global DataSphere projects that the world will generate 221 ZB...
8 MIN READ