Bringing Confidentiality to Vector Search with Cyborg and NVIDIA cuVS

Aug 15, 2024

By Nicolas Dupont, Corey Nolet, Rob Nertney and Nathan Stephens

In the era of generative AI, vector databases have become indispensable for storing and querying high-dimensional data efficiently. However, like all databases, vector databases are vulnerable to a range of attacks, including cyber threats, phishing attempts, and unauthorized access. This vulnerability is particularly concerning considering that these databases often contain sensitive and confidential information.

To address this critical issue, Cyborg has teamed up with NVIDIA to enhance the security of vector databases using the NVIDIA cuVS library, an open-source toolkit that accelerates vector search with state-of-the-art algorithms. This collaboration aims to bring GPU acceleration to Cyborg’s encrypted vector search engine, ensuring robust security without compromising performance.

Vector database vulnerabilities

Vector databases are a cornerstone of modern data-intensive applications, powering everything from retrieval-augmented generation (RAG) pipelines to recommendation systems.

The high-performance index-building and search capabilities of these databases make them essential for such applications, but the value of the data they store makes them attractive targets for malicious attacks and breaches. This risk of exposure is of particular concern for sectors where confidentiality is a business requirement:

Regulated industries: For example, healthcare, financial services, and the public sector, where stringent privacy and security requirements can outright preclude the use of vector search and its downstream applications.
IP-driven sectors: For example, pharmaceutical, manufacturing, and defense, where intellectual property forms a considerable value-driver and competitive advantage.

These concerns can be ignored when prototyping AI-driven workloads but are likely to become roadblocks when it comes to production.

Solution: confidential vector search

Cyborg, a New York-based startup, has developed an end-to-end encrypted vector search engine to solve this problem. By using forward privacy and cryptographic hashing, Cyborg Vector Search enables the secure indexing and retrieval of confidential data. End-to-end encryption means that no unencrypted vectors are ever stored in a database, considerably reducing the attack surface and addressing the confidentiality concerns mentioned earlier.

Cyborg Vector Search was designed to balance the following key performance characteristics:

End-to-end encryption: Guarantee the highest level of security and confidentiality through cryptographically secure architecture for stringent privacy requirements.
High performance: Minimize the incremental cost of end-to-end encryption, keeping the cryptographic overhead of encrypted indexing and retrieval at <5% and <30%, respectively.
Compatibility: Maintain compatibility with existing vector search pipelines and workloads to provide a simple transition from prototype to production.
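
The overhead targets above can be expressed as a simple check on measured timings. The following is a minimal sketch, not part of Cyborg Vector Search: the function names, the `BUDGETS` table, and the timings in the usage note are hypothetical, and overhead is assumed to mean the relative slowdown of the encrypted run against an otherwise identical unencrypted run.

```python
# Hypothetical budgets taken from the stated targets: <5% for indexing, <30% for retrieval.
BUDGETS = {"indexing": 0.05, "retrieval": 0.30}


def overhead(encrypted_s, plain_s):
    """Relative slowdown of the encrypted run versus the unencrypted run."""
    if plain_s <= 0:
        raise ValueError("plain_s must be positive")
    return (encrypted_s - plain_s) / plain_s


def within_budget(stage, encrypted_s, plain_s):
    """True if the measured overhead for this stage is under its budget."""
    return overhead(encrypted_s, plain_s) < BUDGETS[stage]
```

For example, with made-up timings, `within_budget("indexing", 102.0, 100.0)` evaluates a 2% overhead against the 5% indexing budget.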

NVIDIA hardware

To make encrypted indexing possible on GPUs, the solution uses NVIDIA Confidential Computing. Confidential Computing ensures that the data remains secured both cryptographically and through strong access controls, using trusted execution environments (TEEs) to provide a secure enclave for sensitive operations. This technology is crucial for maintaining the confidentiality of data during GPU-accelerated computations.

The hardware at the core of this solution is the NVIDIA H100 Tensor Core GPU (80 GB) with Confidential Computing enabled. Confidential Computing is publicly available on all NVIDIA Hopper Tensor Core GPUs today and will continue to be supported in the next generation of NVIDIA Blackwell Tensor Core GPUs.

When configured in Confidential Computing (CC) mode, NVIDIA GPUs activate hardware-based cryptographic engines, firewalls, and remote attestation flows that protect the integrity of the TEE, so end users can verify that their confidential workloads are protected while in use on the GPU.

NVIDIA Hopper Confidential Computing encrypts and signs all user data on the PCIe bus with 256-bit AES-GCM and blocks infrastructure and out-of-band access with firewalls configured by signed and attestable firmware. NVIDIA also provides a public remote attestation service so that end users or relying parties can receive up-to-date confirmation that their drivers and firmware have not been revoked due to bugs or exploits.

Cyborg developed its design on NVIDIA LaunchPad, which provides NVIDIA customers, partners, and ISVs with hands-on access to prebuilt labs in a browser-based sandbox environment. The Develop Confidential VM Applications lab came preconfigured with all the steps needed to build and configure the system correctly for confidential workloads. Cyborg spent no time worrying about infrastructure and focused instead on developing its solution.

Accelerating confidential vector search

Confidential vector search, much like conventional vector search, is a computationally expensive process that can prove difficult to scale. This makes it a perfect candidate for GPU acceleration. NVIDIA cuVS contains highly optimized primitives to do just that.

To evaluate the effectiveness of this integration, Cyborg and NVIDIA conducted a joint proof-of-concept (PoC). This involved integrating cuVS with Cyborg Vector Search to bring GPU-accelerated encrypted vector search to reality.

Figure 1. Encrypted indexing and retrieval pipelines from the joint Cyborg-NVIDIA PoC, with GPU-accelerated sections highlighted

This PoC compared encrypted indexing and retrieval performance on CPUs and GPUs. Specifically, we replaced scikit-learn KMeans and hashlib on a CPU with cuVS and a custom SHA-3 CUDA kernel on a GPU, respectively. The results speak for themselves:

Index build time was sped up by an average of 47x, reducing the time required to index vector embeddings from hours to minutes. The steps accelerated with cuVS saw an even better improvement of 52.2x for clustering model training and inference.
Retrieval also saw significant improvements: the cuVS-accelerated portion of the pipeline yielded a 9.8x performance boost with minimal code changes.
Enabling the NVIDIA Hopper Confidential Computing modes for end-to-end encryption on indexing and retrieval came at a marginal cost of 1-2% and 15-25%, respectively, compared to their unencrypted counterparts. This was a small overhead more than offset by GPU acceleration.
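
To make the CPU baseline more concrete, here is a minimal sketch of the general pattern of clustering vectors and then hashing the cluster assignment with SHA-3 from hashlib. It is an illustration under stated assumptions, not Cyborg's actual scheme: a tiny pure-Python Lloyd's k-means stands in for scikit-learn KMeans so the example needs only the standard library, the keyed HMAC construction and every function name are hypothetical, and the index stores only item IDs, so encryption of stored payloads is omitted entirely.

```python
import hashlib
import hmac
import random


def nearest(v, centroids):
    """Index of the centroid closest to v (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centroids[i])))


def kmeans(vectors, k, iters=10, seed=0):
    """Tiny Lloyd's k-means (stand-in for scikit-learn KMeans); returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            buckets[nearest(v, centroids)].append(v)
        for i, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def cluster_token(key, cluster_id):
    """Keyed SHA-3 token that replaces the plaintext cluster ID (hypothetical design)."""
    return hmac.new(key, str(cluster_id).encode(), hashlib.sha3_256).hexdigest()


def build_index(key, vectors, k):
    """Train the clustering model, then file each item ID under its hashed cluster."""
    centroids = kmeans(vectors, k)
    index = {}
    for item_id, v in enumerate(vectors):
        token = cluster_token(key, nearest(v, centroids))
        index.setdefault(token, []).append(item_id)
    return centroids, index


def lookup(key, index, centroids, query):
    """Return the item IDs filed under the query's hashed cluster."""
    return index.get(cluster_token(key, nearest(query, centroids)), [])
```

In this sketch, the two loops over `vectors` (centroid training and assignment) and the per-item hashing are the kinds of steps that the PoC moved to cuVS and a custom SHA-3 CUDA kernel, respectively.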

Starting with the index build process, Figure 2 shows the overall build time on CPU compared to GPU.

Clustering model training typically dominates index build time. If you exclude training and focus solely on quantization and encrypted indexing, the GPU still provides significant acceleration (Figure 3).
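
One way to see why the end-to-end speedup (47x) sits below the speedup of the clustering steps alone (52.2x) is to combine per-stage speedups weighted by each stage's share of the baseline time, as in Amdahl's law. The sketch below is illustrative only: the function name is hypothetical, and the stage fractions and the 30x figure in the usage note are made-up inputs, not measurements from the PoC.

```python
def overall_speedup(stages):
    """Combine per-stage speedups into an end-to-end speedup.

    stages: list of (fraction_of_baseline_time, speedup) pairs whose
    fractions sum to 1. The accelerated time is the sum of each stage's
    baseline share divided by its speedup.
    """
    total = sum(fraction for fraction, _ in stages)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    return 1.0 / sum(fraction / speedup for fraction, speedup in stages)
```

For instance, `overall_speedup([(0.8, 52.2), (0.2, 30.0)])` models a hypothetical build where clustering takes 80% of the CPU time and the remaining stages speed up by 30x. The result always falls between the slowest and fastest stage speedups, and it moves toward the clustering figure as clustering's share of the baseline grows.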

Finally, the shift to GPU yields a significant improvement across the entire retrieval pipeline (Figure 4).

All times are from the same index configuration (recall level > 0.95).

The IVF-PQ (inverted file with product quantization) index type was used for its balance of efficiency and accuracy.

Conclusion

In a world where data breaches are increasingly common, security is not just a luxury but a necessity for many organizations. The integration of Cyborg Vector Search with NVIDIA cuVS and NVIDIA Confidential Computing offers a strong approach to enhancing the security of vector databases, aiming to protect sensitive data while maintaining performance.

NVIDIA is the first and currently only GPU vendor that provides publicly available, general-access hardware-based Confidential Computing solutions with the NVIDIA Hopper family. The NVIDIA Blackwell generation continues to improve upon the technology, partnering with other industry leaders to increase performance, security, and ease of use.

Cyborg Vector Search is currently in closed testing with early commercial partners. As development continues, Cyborg would love to hear from you if data security is important to your AI workloads.

To try NVIDIA Hopper Confidential Computing today, register for the LaunchPad lab. You can also browse the other LaunchPad labs that are available for trial.

About the Authors

About Nicolas Dupont
Nicolas Dupont is the CEO of Cyborg, a startup working on confidential AI technology. Under his leadership, Cyborg has developed and launched Stealth, a confidential data platform, and the first confidential vector search engine.

About Corey Nolet
Corey is a data scientist and principal engineer on the RAPIDS ML team at NVIDIA, where he focuses on building and scaling machine learning algorithms to support extreme data loads at light speed. Prior to working at NVIDIA, Corey spent over a decade building massive-scale exploratory data science and real-time analytics platforms for big-data and HPC environments in the defense industry. Corey holds B.S. and M.S. degrees in Computer Science. He is also working towards his Ph.D. in the same discipline, focused on the acceleration of algorithms at the intersection of graph and machine learning. Corey has a passion for using data to make better sense of the world.

About Rob Nertney
Rob Nertney is a senior software architect for confidential computing. He has spent nearly 15 years architecting the features and deployment of accelerator hardware into hyperscale environments for both internal and external use by developers. He has several patents in processor design relating to secure solutions that are in production today. In his spare time, he loves golfing when the weather is nice, and gaming (on RTX hardware of course!) when the weather isn’t.

About Nathan Stephens
Nathan Stephens is a senior manager for Developer Relations at NVIDIA. His background is in analytic solutions and consulting. He has experience building data science teams, architecting analytic infrastructure, and delivering innovative data products. He is a longtime advocate for operationalizing data science using open-source software.
