This page is the syllabus for the NVIDIA Deep Learning Institute (DLI) Accelerated Data Science Teaching Kit. It outlines how each module is organized in the downloaded Teaching Kit .zip file, lists the content of every module, and, where applicable, links to the suggested online DLI course for that module. It also lists modules planned for future releases of the Teaching Kit; links to stream the lecture videos will be added as the videos become available.


Module 1: Introduction to Data Science

Lecture Slides

  • 1.1 - Teaching Kit Modules Overview
  • 1.2 - What is Data Science?
  • 1.3 - Why is Data Science Important?
  • 1.4 - Learning Goals and Expectations
  • 1.5 - Analytics Building Blocks
  • 1.6 - Example Data Science Project 1: Apolo Graph Exploration
  • 1.7 - Example Data Science Project 2: NetProbe Auction Fraud Detection
  • 1.8 - Data Science Buzzwords, Hype Cycle, General vs Narrow AI
  • 1.9 - Hidden Figures in Data Science From Underrepresented Groups
  • 1.10 - Career Paths and Challenges

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Introduction to RAPIDS and cuDF

Quiz

  • Module 1 Quiz

Module 2: Data Collection

Lecture Slides

  • 2.1 - Collecting Data
  • 2.2 - Scraping Data
  • 2.3 - Popular Scraping Libraries
  • 2.4 - Data Annotation and Data Quality
  • 2.5 - SQLite as Simple, Effective Storage
  • 2.6 - SQL Refresher
  • 2.7 - Beware of Missing Indexes

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Data Collection via API
  • Data Annotation in Active Learning
  • GPU-accelerated SQL with BlazingSQL

DLI Online Course Section

Quiz

  • Module 2 Quiz

Module 3: Data Pre-processing (ETL)

Lecture Slides

  • 3.1 - Introduction to Data Pre-processing
  • 3.2 - Data Cleaning and Statistical Preprocessing
  • 3.3 - Data Cleaners: OpenRefine and Wrangler
  • 3.4 - Feature Selection: Introduction to Filter Methods
  • 3.5 - Feature Selection: Introduction to Model-based Methods
  • 3.6 - Feature Reduction: PCA

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Data Wrangling with OpenRefine
  • Outlier Detection with IQR
  • Feature Reduction with PCA

Quiz

  • Module 3 Quiz

Module 4: Data Ethics and Reducing Bias in Data Sets

Lecture Slides

  • 4.1 - Sources of Bias and Fairness Measures
  • 4.2 - Tools for Discovering and Interpreting Bias in Models
  • 4.3 - Challenges Faced by Underrepresented Groups

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Classifier Audit with FairVis

Quiz

  • Module 4 Quiz

Module 5: Data Integration

Lecture Slides

  • 5.1 - Knowledge Graph
  • 5.2 - Data De-duplication

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Module 5 Quiz

Module 6: Data Analytics, Concepts and Tasks

Lecture Slides

  • 6.1 - Break Complex Problems into Simpler Ones: Part 1
  • 6.2 - Break Complex Problems into Simpler Ones: Part 2

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Module 6 Quiz

Module 7: Visualization 101

Lecture Slides

  • 7.1 - What is Info Vis and Why it is Important
  • 7.2 - Human Perception
  • 7.3 - Gestalt Psychology
  • 7.4 - Chart Basics
  • 7.5 - Colors
  • 7.6 - Visual Exploratory Data Analytics with cuxfilter

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Creating Visualizations

Quiz

  • Module 7 Quiz

Module 8: Fixing Common Visualization Issues

Lecture Slides

  • 8.1 - Fixing Bar Charts, Line Charts, Tables and More
  • 8.2 - Applying What You’ve Learned
  • 8.3 - Crown Jewel, Self-contained Figures and More Tips

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Module 8 Quiz

Module 9: Data Visualization for Web (D3)

Lecture Slides

  • 9.1 - Why Learn D3?
  • 9.2 - Prerequisites: JavaScript and SVG
  • 9.3 - D3 Overview
  • 9.4 - Enter-Update-Exit
  • 9.5 - Attributes, Styles, Classes and Text
  • 9.6 - Scales and Axes
  • 9.7 - Dynamic Data and Interaction

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Developing Interactive Web-based Visualizations (D3, Plotly)
  • Using Visualizations at Various Stages of Your Workflow

Quiz

  • Module 9 Quiz

Module 10: Distributed Computing: Hadoop, Hive

Lecture Slides

  • 10.1 - Big Data is Common. How to Store It?
  • 10.2 - Why Hadoop?
  • 10.3 - MapReduce Overview
  • 10.4 - Example MapReduce Program
  • 10.5 - How to Try Hadoop
  • 10.6 - Hive Overview and Comparing Pig

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Hadoop on AWS/Azure/Google Cloud

Quiz

  • Module 10 Quiz

Module 11: Distributed Computing: Spark

Lecture Slides

  • 11.1 - Overview
  • 11.2 - Example Spark Programs
  • 11.3 - Spark SQL and Other Spark Libraries
  • 11.4 - RAPIDS and Spark

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs

  • Spark on AWS/Azure/Google Cloud

Quiz

  • Module 11 Quiz

Module 12: Distributed Computing: HBase

Lecture Slides available in a future release of the Teaching Kit

  • 12.1 - Overview
  • 12.2 - How HBase Scales Up Storage
  • 12.3 - How to Use HBase
  • 12.4 - Learn More About HBase

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Available in a future release of the Teaching Kit

Module 13: Distributed Computing: Dask and UCX

Lecture Slides available in a future release of the Teaching Kit

  • 13.1 - Using Dask and UCX with RAPIDS and BlazingSQL

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Available in a future release of the Teaching Kit

Module 14: Introduction to Machine Learning: Classification

Lecture Slides available in a future release of the Teaching Kit

  • 14.1 - Overview
  • 14.2 - Introduction to Supervised Learning
  • 14.3 - Linear Regression
  • 14.4 - RAPIDS Acceleration: Linear Regression
  • 14.5 - Linear Classification
  • 14.6 - Overfitting and Cross Validation
  • 14.7 - Introduction to Tree-based Methods
  • 14.8 - Decision Tree
  • 14.9 - Visualizing Classification: ROC, AUC, Confusion Matrix
  • 14.10 - Bagging
  • 14.11 - Random Forests
  • 14.12 - RAPIDS Acceleration: Random Forest
  • 14.13 - Boosting
  • 14.14 - XGBoost
  • 14.15 - RAPIDS Acceleration: K-NN, XGBoost

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • Classification
  • TBD

DLI Online Course Section

Quiz

  • Available in a future release of the Teaching Kit

Module 15: Introduction to Machine Learning: Clustering and Dimensionality Reduction

Lecture Slides available in a future release of the Teaching Kit

  • 15.1 - Introduction to Unsupervised Learning
  • 15.2 - K-means, Affinity Propagation, Hierarchical Clustering
  • 15.3 - RAPIDS Acceleration: K-means
  • 15.4 - DBSCAN
  • 15.5 - Principal Component Analysis
  • 15.6 - t-SNE
  • 15.7 - UMAP
  • 15.8 - Visualizing Clusters
  • 15.9 - RAPIDS Acceleration: DBSCAN, PCA, UMAP

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • Discovering and Visualizing Clusters (cuML)
  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 16: Neural Networks

Lecture Slides available in a future release of the Teaching Kit

  • 16.1 - Introduction to Artificial Neural Networks
  • 16.2 - Artificial Neurons, Layers, Perceptron
  • 16.3 - Multilayer Perceptron
  • 16.4 - Advanced Deep Neural Networks
  • 16.5 - Going From DS to DL and Back with RAPIDS

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

DLI Online Courses with Student Certificate Opportunity

Other Shorter DLI Online Courses

Quiz

  • Available in a future release of the Teaching Kit

Module 17: Graph Analytics

Lecture Slides available in a future release of the Teaching Kit

  • 17.1 - How to Represent and Store Graphs
  • 17.2 - Graph Power Laws
  • 17.3 - Centralities: Degree, Betweenness, Clustering Coefficient
  • 17.4 - PageRank and Personalized PageRank
  • 17.5 - Interactive Graph Exploration
  • 17.6 - RAPIDS Acceleration: Graphistry and cuxfilter

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • Graph Analytics with cuGraph

Quiz

  • Available in a future release of the Teaching Kit

Module 18: Spatial Analytics

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 19: Signal Analytics

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 20: Cyber Log Accelerator

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 21: Streaming Data

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 22: Genomics

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 23: Quantitative Analysis

Lecture Slides

  • Available in a future release of the Teaching Kit

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 24: Text Analytics

Lecture Slides available in a future release of the Teaching Kit

  • 24.1 - Basics: Preprocessing, Representation, Word Importance
  • 24.2 - Latent Semantic Indexing (Singular Value Decomposition)
  • 24.3 - SVD: Dimensionality Reduction, and Other Uses
  • 24.4 - Text Visualization

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • Latent Semantic Indexing for Text via Singular Value Decomposition (cuML)

Quiz

  • Available in a future release of the Teaching Kit

Module 25: CPU vs. GPU-accelerated Data Science

Lecture Slides available in a future release of the Teaching Kit

  • 25.1 - RAPIDS Benefits
  • 25.2 - Speed Comparison
  • 25.3 - Refactoring Workloads

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • Accelerating Workloads Using RAPIDS

DLI Online Courses with Student Certificate Opportunity

Quiz

  • Available in a future release of the Teaching Kit

Module 26: Bias and Fairness in AI and Data Science

Lecture Slides available in a future release of the Teaching Kit

  • 26.1 - Sources of Bias and Fairness Measures
  • 26.2 - Tools for Discovering and Interpreting Bias in Models

Lecture Videos

  • Available in a future release of the Teaching Kit

Labs available in a future release of the Teaching Kit

  • TBD

Quiz

  • Available in a future release of the Teaching Kit

Module 27: Working in Data Science Teams

Lecture Slides available in a future release of the Teaching Kit

  • 27.1 - Forming Great Teams
  • 27.2 - Project Idea Checklist: Heilmeier Questions
  • 27.3 - Pay Attention to Software Licenses Early On

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Available in a future release of the Teaching Kit

Module 28: Code Back-up and Version Control

Lecture Slides available in a future release of the Teaching Kit

  • 28.1 - Git: Overview and Benefits
  • 28.2 - Warning! Keep Your Repository Private Initially
  • 28.3 - GitHub and Bitbucket

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Available in a future release of the Teaching Kit

Module 29: Data Science Projects

Lecture Slides available in a future release of the Teaching Kit

  • 29.1 - Introduction to Project Tasks
  • 29.2 - Evaluation of Project Tasks

Lecture Videos

  • Available in a future release of the Teaching Kit

Quiz

  • Available in a future release of the Teaching Kit