U.S. Library of Congress Processes over 16 Million Historic Newspaper Pages Using AI

The U.S. Library of Congress developed a GPU-accelerated deep learning model to automatically extract, categorize, and caption content from over 16 million pages of historic American newspapers. The resulting dataset is the largest of its kind ever produced.