Accelerating Billion Vector Similarity Searches with GPUs

Relying on the capabilities of GPUs, a team from Facebook AI Research has developed a faster, more efficient way for AI to run similarity searches. The study, published in IEEE Transactions on Big Data, presents a deep learning algorithm for handling and comparing high-dimensional data from media that is notably faster than previous techniques while remaining just as accurate.

In a world with an ever-growing supply of data, the work promises to reduce both the compute power and the time needed to process large libraries.

“The most straightforward technique for searching and indexing [high-dimensional data] is by brute-force comparison, whereby you need to check [each image] against every other image in the database. This is impractical for collections containing billions of vectors,” Jeff Johnson, study co-lead and a research engineer at Facebook, said in a press release.
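
To make the cost of the brute-force comparison Johnson describes concrete, here is a minimal sketch in plain Python. It is not code from the study: the function names and the toy two-dimensional vectors are assumptions chosen for illustration, and squared Euclidean distance is assumed as the similarity measure.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(database, query, k):
    """Return the indices of the k database vectors closest to the query.

    Every query is compared against every stored vector, so the work
    grows linearly with the size of the database.
    """
    ranked = sorted(
        range(len(database)),
        key=lambda i: squared_l2(database[i], query),
    )
    return ranked[:k]


# Toy database of four 2-dimensional vectors (illustrative values only).
database = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = brute_force_search(database, [1.0, 1.0], k=2)
print(nearest)
```

Checking each of N items against every other item takes N(N-1)/2 comparisons. For a collection of one billion vectors that is roughly 5 x 10^17 distance computations, which is why the approach does not scale.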

Because every image and video contains millions of pixels and data points, media libraries produce billions of vectors. This volume of data is valuable for analysis, detection, indexing, and comparison. It is also a problem when calculating similarities across large libraries with traditional CPU algorithms, which rely on several supercomputer components and slow down overall computing time.

Using only four GPUs with CUDA, the researchers designed an algorithm that lets the GPUs both host and analyze the library's image data points. The method also compresses the data, making it easier and thus faster to analyze.
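
The article does not say which compression scheme the researchers use. As a rough illustration of why compressed vectors are cheaper to store and scan, the sketch below uses simple 8-bit scalar quantization as a stand-in. It is an assumption for illustration, not the method from the study, and all function names are hypothetical.

```python
def fit_range(vectors):
    """Find the smallest and largest value across all vectors."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    return lo, hi


def encode(vector, lo, hi):
    """Map each float to an integer code in 0..255 (one byte per dimension)."""
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return bytes(round((x - lo) * scale) for x in vector)


def decode(code, lo, hi):
    """Reconstruct approximate floats from the one-byte codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in code]


# Illustrative values only.
vectors = [[0.0, 0.5, 1.0], [0.25, 0.75, 0.1]]
lo, hi = fit_range(vectors)
codes = [encode(v, lo, hi) for v in vectors]
restored = [decode(c, lo, hi) for c in codes]
print(codes, restored)
```

Each dimension is stored in one byte instead of the four bytes of a 32-bit float, so the encoded vectors take a quarter of the memory, at the price of a reconstruction error of at most half a quantization step per dimension.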

An example of how the algorithm computes the smoothest path between images where only the first and the last image are given. Credit: Facebook/Johnson et al.

The new algorithm processed over 95 million high-dimensional images in 35 minutes. A graph of a billion vectors took less than 12 hours to compute. According to a comparison test in the study, handling the same database with a cluster of 128 CPU servers took 108.7 hours, about 8.5x longer.

“By keeping computations purely on a GPU, we can take advantage of the much faster memory available on the accelerator, instead of dealing with the slower memories of CPU servers and even slower machine-to-machine network interconnects within a traditional supercomputer cluster,” said Johnson.

The researchers state that the methods are already being applied to a wide variety of tasks, including a language processing search for translations. Known as the Facebook AI Similarity Search library, the approach is available as open source for implementation, testing, and comparison.

Read the full article in IEEE Transactions on Big Data >>>