NVIDIA Deep Learning Institute Releases New Data Science Teaching Kit for Educators

As data grows in volume, velocity, and complexity, the data science field is booming.

There’s an ever-increasing demand for talent and skill sets to help design the best data science solutions. However, the expertise that can help drive these breakthroughs requires students to have a foundation in various tools, programming languages, computing frameworks, and libraries.

That’s why the NVIDIA Deep Learning Institute (DLI) has released the Accelerated Data Science Teaching Kit for qualified educators. The free kit was co-developed with Polo Chau, from Georgia Institute of Technology, and Xishuang Dong, from Prairie View A&M University, two highly regarded researchers and educators in the fields of data science and accelerated data analytics with GPUs.

“Data science unlocks the immense potential of data in solving societal challenges and large-scale complex problems across virtually every domain, from business, technology, science and engineering to healthcare, government and many more,” Chau said.

The teaching materials cover data collection, preprocessing, machine learning, scalable and distributed computing, data visualization, and graph analytics. They highlight how to take advantage of accelerated computing through lecture material and examples.

DLI Teaching Kits also come bundled with free GPU resources in the form of Google Colab credits for educators, as well as free DLI online, self-paced courses and certificate opportunities for students.

The latest release of the Accelerated Data Science Teaching Kit includes new lecture material and labs that show how to run the Python libraries pandas, Polars, and NetworkX on NVIDIA GPUs with zero code changes. These libraries can run 10x to 500x faster on NVIDIA GPUs than on CPUs, with no changes to API calls.
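
To give a sense of what "zero code change" can look like for pandas, here is a minimal sketch. It is not taken from the Teaching Kit labs: the `trips` data and the `mean_fare_by_city` helper are hypothetical, and the sketch assumes the RAPIDS `cudf.pandas` accelerator is the mechanism in use. When `cudf` is not installed, the same pandas code simply runs on the CPU.

```python
# Illustrative sketch only; the data and helper name are made up for this example.
# Assumption: GPU acceleration comes from RAPIDS' cudf.pandas accelerator.
try:
    import cudf.pandas
    cudf.pandas.install()  # must be called before pandas is imported
    accelerated = True
except Exception:
    # cudf is missing or no usable GPU was found: fall back to CPU pandas.
    accelerated = False

import pandas as pd


def mean_fare_by_city(df):
    """Average fare per city, written as ordinary pandas code."""
    return df.groupby("city")["fare"].mean().sort_index()


# Small hypothetical dataset standing in for a real lab dataset.
trips = pd.DataFrame(
    {
        "city": ["Atlanta", "Houston", "Atlanta", "Houston", "Houston"],
        "fare": [10.0, 8.0, 20.0, 12.0, 16.0],
    }
)

result = mean_fare_by_city(trips)
print("GPU accelerated:", accelerated)
print(result)
```

The analysis function itself never mentions the GPU, which is the point of the approach. An existing script can also be left completely untouched and launched with `python -m cudf.pandas script.py`, or a notebook can load the accelerator with `%load_ext cudf.pandas` before importing pandas.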

Content also covers culturally responsive topics such as fairness and data bias, as well as challenges and important individuals from underrepresented groups.

The Accelerated Data Science Teaching Kit includes focused modules covering:

Introduction to Data Science and RAPIDS
Data Collection and Preprocessing (ETL)
Data Ethics and Bias in Data Sets
Data Integration and Analytics
Data Visualization
Scalable Computing with Hadoop, Hive, Spark, HBase and RAPIDS
Scalable Computing with Dask and UCX
Machine Learning (Classification)
Machine Learning (Clustering and Dimensionality Reduction)
Neural Networks
Graph Analytics
Streaming Data
Genomics
Text Analytics
CPU vs GPU-accelerated Data Science
Data Science Teams, Code Back-up, and Version Control
Team Project (Fake News Detection)

All modules include lecture slides, lecture notes and quiz/exam problem sets, and most modules include hands-on labs with included datasets and sample solutions in Python and interactive Jupyter notebook formats. Lecture videos are included for some modules and more are planned for future releases.

“Data science is such an important field of study, not just because it touches every domain and vertical, but also because data science addresses important societal issues relating to gender, race, age and other ethical elements of humanity,” said Dong, whose school is a Historically Black College/University.

This is the fourth Teaching Kit released by the DLI, as part of its program that has reached over 10,000 qualified educators so far. Learn more about NVIDIA Teaching Kits.

This post was originally published on 9/2/21, but has been updated with the new Accelerated Data Science Teaching Kit.