Automating End-to-End PyTorch Profiling
Aditya Agrawal, Google Brain | Marek Kolodziej, Uber ATG
GTC 2020
Video-to-video synthesis (vid2vid) aims to convert an input semantic video, such as human poses or segmentation masks, into an output photorealistic video. However, existing approaches have limited generalization capability. For example, generalizing a trained human-synthesis model to a new subject unseen during training requires collecting a dataset of that subject and retraining the model. To address these limitations, we propose an adaptive vid2vid framework that synthesizes previously unseen subjects or scenes by leveraging a few example images of the target at test time. Our model achieves this few-shot generalization capability via a novel network weight-generation module built on an attention mechanism. We conduct extensive experimental validation, comparing against strong baselines on multiple datasets.
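To make the weight-generation idea concrete, below is a minimal, hypothetical PyTorch sketch of an attention-based weight generator: a few example images of the target are encoded, an attention mechanism aggregates their features, and the result is mapped to the weights of a convolution layer. All module names, shapes, and hyperparameters here are assumptions for illustration, not the authors' actual architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionWeightGenerator(nn.Module):
    """Hypothetical sketch: generate conv weights from a few example images."""

    def __init__(self, feat_dim=128, out_ch=64, in_ch=64, ksize=3):
        super().__init__()
        # Shared encoder producing one feature vector per example image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=4, stride=4),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        # Learned query used to attend over the example features.
        self.query = nn.Parameter(torch.randn(feat_dim))
        # Maps the attended feature vector to flattened conv weights.
        self.to_weights = nn.Linear(feat_dim, out_ch * in_ch * ksize * ksize)
        self.shape = (out_ch, in_ch, ksize, ksize)

    def forward(self, examples):
        # examples: (K, 3, H, W) -- K few-shot images of the target subject.
        feats = self.encoder(examples).flatten(1)        # (K, feat_dim)
        attn = F.softmax(feats @ self.query, dim=0)      # (K,) attention weights
        pooled = (attn.unsqueeze(1) * feats).sum(dim=0)  # (feat_dim,)
        return self.to_weights(pooled).view(self.shape)  # conv kernel


# Usage: generate weights from 4 example images, then apply them.
gen = AttentionWeightGenerator()
examples = torch.randn(4, 3, 64, 64)
w = gen(examples)                      # (64, 64, 3, 3) generated kernel
x = torch.randn(1, 64, 32, 32)
y = F.conv2d(x, w, padding=1)          # run a conv with the generated weights
print(y.shape)                         # torch.Size([1, 64, 32, 32])

The key design point the abstract describes is that the synthesis network is adapted to a new subject by generating weights from example images at test time, rather than by retraining; the attention step lets the generator weigh the example images differently when producing those weights.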