GTC-DC 2019: You Don’t Have as Much Data as You Think You Do (Presented by CACI)
Jonathan Von Stroh, CACI
We’ll show that, although data is everywhere, bigger datasets aren’t always better: the most convenient sources of labeled data are often subject to extreme sampling bias. We’ll then introduce an approach to active learning and dataset quality analysis based on unsupervised learning.