Apache Spark

Sep 14, 2023
Adobe Scales ML Pipelines for Optimized Delivery of Brand Messages
Streamline and accelerate deployment by integrating ETL and ML training into a single Apache Spark script on Amazon EMR.
1 MIN READ

Sep 06, 2023
GPUs for ETL? Optimizing ETL Architecture for Apache Spark SQL Operations
Extract-transform-load (ETL) operations with GPUs using the NVIDIA RAPIDS Accelerator for Apache Spark running on large-scale data can produce both cost savings...
8 MIN READ

Jul 17, 2023
GPUs for ETL? Run Faster, Less Costly Workloads with NVIDIA RAPIDS Accelerator for Apache Spark and Databricks
We were stuck. Really stuck. With a hard delivery deadline looming, our team needed to figure out how to process a complex extract-transform-load (ETL) job on...
7 MIN READ

Jun 12, 2023
Distributed Deep Learning Made Easy with Spark 3.4
Apache Spark is an industry-leading platform for distributed extract, transform, and load (ETL) workloads on large-scale data. However, with the advent of deep...
7 MIN READ

Jun 02, 2023
GPU Integration Propels Data Center Efficiency and Cost Savings for Taboola
When you see a context-relevant advertisement on a web page, it's most likely content served by a Taboola data pipeline. As the leading content recommendation...
13 MIN READ

Apr 18, 2023
New GPU Library Lowers Compute Costs for Apache Spark ML
Spark MLlib is a key component of Apache Spark for large-scale machine learning and provides built-in implementations of many popular machine learning...
6 MIN READ

Apr 04, 2023
Topic Modeling and Image Classification with Dataiku and NVIDIA Data Science
The Dataiku platform for everyday AI simplifies deep learning. Use cases are far-reaching, from image classification to object detection and natural language...
11 MIN READ

Mar 21, 2023
Catapulting Enterprises to the Leading Edge of AI with NVIDIA AI Enterprise 3.1
Generative AI has marked an important milestone in the AI revolution journey. We are at a fundamental breaking point where enterprises are not only getting...
4 MIN READ

Mar 15, 2023
Smarter Retail Data Analytics with GPU Accelerated Apache Spark Workloads on Google Cloud Dataproc
A retailer's supply chain includes the sourcing of raw materials or finished goods from suppliers; storing them in warehouses or distribution centers; and...
13 MIN READ

Feb 24, 2023
Top Data Science Sessions at NVIDIA GTC 2023
Learn about the latest AI and data science breakthroughs from leading data science teams at NVIDIA GTC 2023.
1 MIN READ

Dec 14, 2022
Saving Apache Spark Big Data Processing Costs on Google Cloud Dataproc
According to IDC, the volume of data generated each year is growing exponentially. IDC’s Global DataSphere projects that the world will generate 221 ZB...
8 MIN READ

Sep 21, 2022
New SDKs Accelerating AI Research, Computer Vision, Data Science, and More
NVIDIA revealed major updates to its suite of AI software for developers including JAX, NVIDIA CV-CUDA, and NVIDIA RAPIDS. To learn about the latest SDK...
7 MIN READ

Aug 30, 2022
Upcoming Event: Data Science Sessions at GTC 2022
Learn about the latest AI and data science breakthroughs from the world's leading data science teams at GTC 2022.
1 MIN READ

Jan 06, 2022
RAPIDS Accelerator for Apache Spark Release v21.10
RAPIDS Accelerator for Apache Spark v21.10 is now available! As an open source project, we value our community, their voice, and requests. This release...
5 MIN READ

Nov 01, 2021
NVIDIA GTC: Top Data Science Sessions
NVIDIA GTC is the must-attend AI conference for developers. It’s a place where practitioners, leaders, and innovators share their ideas about the latest...
4 MIN READ

Sep 02, 2021
RAPIDS Accelerator for Apache Spark Release v21.08
The August release (21.08) of RAPIDS Accelerator for Apache Spark is now available. It has been a little over a year since the first release at...
5 MIN READ