Rethinking Impact Factor: an NLP-Driven Metric and Pipeline Using Generalized Autoregressive Pretraining on Medical Journals for Granular Knowledge
Leo Tam, NVIDIA | Yuan-Ting Hsieh, NVIDIA
GTC 2020
Dramatic advances in NLP have redefined performance on public leaderboards such as the Stanford Question Answering Dataset (SQuAD), the Natural Language Decathlon, and SuperGLUE. Typically, approaches follow transformer-based architectures in a pretrain-finetune paradigm, with the bulk of the compute spent in the pretraining phase. Where previous studies have elucidated fine-tuning paradigms for recurrent neural network architectures, ours examines a multi-phase NLP paradigm for realizing expert-level, domain-specific performance, specifically on the clinical task of predicting unplanned 30-day hospital readmission. Through exhaustive GPU studies and Bayesian optimization, in part with the NVIDIA Clara Train platform, we'll show that systems are only as good as what they read. From 20 top medical-practice journals spanning the past 90 years, we derived a novel AI impact factor for the clinical task that guides training to a state-of-the-art AUC of 0.74. We'll review best practices for training modern transformer-based architectures for medicine.
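To make the pretrain-finetune paradigm concrete, the sketch below fine-tunes a generalized autoregressive model (XLNet, via the Hugging Face transformers library) as a binary classifier for 30-day readmission and scores it by AUC, the metric reported above. The model checkpoint, toy discharge notes, and labels are illustrative assumptions, not the speakers' actual pipeline or data.

```python
# Minimal sketch: fine-tune a generalized autoregressive transformer
# (XLNet) for binary readmission prediction, then evaluate AUC.
# All inputs here are placeholders, not real clinical data.
import torch
from sklearn.metrics import roc_auc_score
from transformers import XLNetForSequenceClassification, XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=2
)

# Toy discharge notes and readmission labels (stand-ins for real,
# de-identified clinical text).
notes = [
    "Patient discharged on oral antibiotics, follow-up in two weeks.",
    "Prior CHF exacerbation admission; discharged against medical advice.",
]
labels = torch.tensor([0, 1])

batch = tokenizer(notes, padding=True, truncation=True, return_tensors="pt")

# One fine-tuning step: cross-entropy loss over the classification head.
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()

# Evaluation: predicted probability of readmission -> AUC.
model.eval()
with torch.no_grad():
    probs = model(**batch).logits.softmax(dim=-1)[:, 1]
print("AUC:", roc_auc_score(labels.numpy(), probs.numpy()))
```

In practice, the pretraining phase (domain-adaptive pretraining on journal text, weighted by an impact-style metric) would precede this fine-tuning step; the snippet only illustrates the final, comparatively cheap phase.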