Bringing Confidentiality to Vector Search with Cyborg and NVIDIA cuVS

In the era of generative AI, vector databases have become indispensable for storing and querying high-dimensional data efficiently. However, like all databases, vector databases are exposed to a range of threats, including cyberattacks, phishing, and unauthorized access. This exposure is particularly concerning because these databases often contain sensitive and confidential information.

To address this critical issue, Cyborg has teamed up with NVIDIA to enhance the security of vector databases using the NVIDIA cuVS library, an open-source toolkit that accelerates vector search with state-of-the-art algorithms. This collaboration aims to bring GPU acceleration to Cyborg’s encrypted vector search engine, ensuring robust security without compromising performance.

Vector database vulnerabilities

Vector databases are a cornerstone of modern data-intensive applications, powering everything from retrieval-augmented generation (RAG) pipelines to recommendation systems. 

The high-performance index-building and search capabilities of these databases make them essential for such applications, but the value of the data they store makes them attractive targets for malicious attacks and breaches. This risk of exposure is of particular concern for sectors where confidentiality is a business requirement:

  • Regulated industries: For example, healthcare, financial services, and the public sector, where stringent privacy and security requirements can outright preclude the use of vector search and its downstream applications.
  • IP-driven sectors: For example, pharmaceutical, manufacturing, and defense, where intellectual property forms a considerable value-driver and competitive advantage.

These concerns are easy to set aside when prototyping AI-driven workloads, but they are likely to become roadblocks in production.

Cyborg, a NY-based startup, has developed an end-to-end encrypted vector search engine to solve this problem. By using forward privacy and cryptographic hashing, Cyborg Vector Search enables the secure indexing and retrieval of confidential data. End-to-end encryption means that no unencrypted vectors are ever stored in a database, considerably reducing the attack surface and addressing the confidentiality concerns mentioned earlier.
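
To make the role of cryptographic hashing more concrete, here is a minimal sketch of one way a client could hide index bucket identifiers from the database. It is not Cyborg's actual scheme, which is not described here. The function name `index_token`, the use of HMAC with SHA3-256, and the per-index key are all assumptions made for illustration, and the sketch covers neither forward privacy nor the encryption of the vectors themselves.

```python
import hashlib
import hmac
import secrets


def index_token(key: bytes, cluster_id: int) -> str:
    """Derive an opaque bucket token from a cluster ID with HMAC-SHA3-256.

    Hypothetical helper: without the key, the token reveals nothing
    useful about which cluster it stands for.
    """
    message = cluster_id.to_bytes(8, "big")
    return hmac.new(key, message, hashlib.sha3_256).hexdigest()


# Hypothetical secrets held only by the client, one per index.
client_key = secrets.token_bytes(32)
other_key = secrets.token_bytes(32)

# The database would store and look up this token, never the raw ID.
token = index_token(client_key, 7)
print(token)
```

The same cluster ID maps to the same token under one key, so lookups still work, while a different key yields an unrelated token.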

Cyborg Vector Search was designed to balance the following key performance characteristics:

  • End-to-end encryption: Guarantee the highest level of security and confidentiality through a cryptographically secure architecture for stringent privacy requirements.
  • High performance: Minimize the incremental cost of end-to-end encryption, keeping the cryptographic overhead of encrypted indexing and retrieval at <5% and <30%, respectively.
  • Compatibility: Maintain compatibility with existing vector search pipelines and workloads to provide a simple transition from prototype to production.

NVIDIA hardware

To make encrypted indexing possible on GPUs, the solution uses NVIDIA Confidential Computing. Confidential Computing ensures that the data remains secured both cryptographically and through strong access controls, using trusted execution environments (TEEs) to provide a secure enclave for sensitive operations. This technology is crucial for maintaining the confidentiality of data during GPU-accelerated computations.

The hardware at the core of this solution is the NVIDIA H100 Tensor Core GPU (80 GB) with Confidential Computing enabled. Confidential Computing is publicly available on all NVIDIA Hopper Tensor Core GPUs today and will continue to be supported in the next generation of NVIDIA Blackwell Tensor Core GPUs. 

NVIDIA GPUs configured in Confidential Computing (CC) mode activate hardware-based cryptographic engines, firewalls, and remote attestation flows to protect the integrity of the TEE, so that end users can validate that their confidential workloads are protected while in use on the GPU.

NVIDIA Hopper Confidential Computing encrypts and signs all user data on the PCIe bus with AES-GCM-256 and blocks infrastructure and out-of-band access with firewalls configured by signed and attestable firmware. NVIDIA also provides a public remote attestation service so that end users or relying parties can get up-to-date assurance that their drivers and firmware have not been revoked due to bugs or exploits.

Cyborg got quick access to the hardware and developed their design using NVIDIA LaunchPad. LaunchPad provides NVIDIA customers, partners, and ISVs with hands-on access to prebuilt labs in a browser-based sandbox environment. The Develop Confidential VM Applications lab came preconfigured with all the steps needed to build and configure the system correctly for confidential workloads, so Cyborg spent no time worrying about infrastructure and focused instead on developing their solution.

Confidential vector search, much like conventional vector search, is a computationally expensive process that can prove difficult to scale. This makes it a perfect candidate for GPU acceleration. NVIDIA cuVS contains highly optimized primitives to do just that.

To evaluate the effectiveness of this integration, Cyborg and NVIDIA conducted a joint proof of concept (PoC). This involved integrating cuVS with Cyborg Vector Search to make GPU-accelerated encrypted vector search a reality.

Diagram shows an encrypted indexing pipeline and encrypted retrieval pipeline with GPU-accelerated sections highlighted.
Figure 1. Encrypted indexing and retrieval pipelines from the joint Cyborg-NVIDIA PoC 

This PoC compared encrypted indexing and retrieval performance on CPUs and GPUs. Specifically, we replaced scikit-learn KMeans and hashlib on a CPU with cuVS and a custom SHA-3 CUDA kernel on a GPU, respectively. The results speak for themselves:
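
For readers who want a feel for the CPU-side steps that were swapped out, the following sketch clusters a toy dataset and hashes the resulting cluster labels with SHA-3. To keep it dependent only on the standard library, a plain Lloyd's k-means stands in for scikit-learn KMeans. The helper names, the toy data, and the choice to hash cluster labels are assumptions for illustration, not the PoC code.

```python
import hashlib
import random


def nearest(vec, centroids):
    """Return the index of the centroid closest to vec (squared L2)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(vectors, k, iterations=10, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        buckets = [[] for _ in range(k)]
        for vec in vectors:
            buckets[nearest(vec, centroids)].append(vec)
        for i, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bucket_digest(cluster_id):
    """Hash a cluster label with SHA-3 via hashlib (illustrative input)."""
    return hashlib.sha3_256(str(cluster_id).encode()).hexdigest()


# Toy data: two well-separated groups of 2-D points.
rng = random.Random(42)
vectors = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)]
vectors += [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(50)]

centroids = kmeans(vectors, k=2)                    # clustering model training
labels = [nearest(v, centroids) for v in vectors]   # clustering model inference
digests = [bucket_digest(label) for label in labels]
```

Both loops are embarrassingly parallel over vectors, which is why the training, inference, and hashing steps are natural candidates for GPU kernels.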

  • Index build time was sped up by an average of 47x, reducing the time required to index vector embeddings from hours to minutes. The steps accelerated with cuVS saw an even better improvement of 52.2x for clustering model training and inference.
  • Retrieval also saw significant improvements: the cuVS-accelerated portion of the pipeline yielded a 9.8x performance boost with minimal code changes.
  • Enabling the NVIDIA Hopper Confidential Computing modes for end-to-end encryption on indexing and retrieval came at a marginal cost of 1-2% and 15-25%, respectively, compared to their unencrypted counterparts. This was a small overhead more than offset by GPU acceleration.
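
As a back-of-the-envelope illustration of how these figures relate, the sketch below combines the reported 47x index build speedup with the upper end of the reported 1-2% indexing overhead. It assumes the overhead simply multiplies run time on top of the speedup, which is a simplification the results above do not confirm, and the 3-hour CPU baseline is invented for the example.

```python
def net_speedup(gpu_speedup: float, overhead_fraction: float) -> float:
    """Speedup left after a multiplicative run-time overhead is applied."""
    return gpu_speedup / (1.0 + overhead_fraction)


def accelerated_minutes(cpu_hours: float, speedup: float) -> float:
    """Convert a CPU time in hours into the accelerated time in minutes."""
    return cpu_hours * 60.0 / speedup


# 47x is the reported average index build speedup; 0.02 is the upper end
# of the reported 1-2% overhead. The 3-hour baseline is hypothetical.
index_net = net_speedup(47.0, 0.02)
build_minutes = accelerated_minutes(3.0, 47.0)

print(f"net indexing speedup: {index_net:.1f}x")          # 46.1x
print(f"3 h CPU build -> {build_minutes:.1f} min on GPU")  # 3.8 min
```

Under these assumptions, the encryption overhead costs less than one multiple of speedup, which matches the claim that it is more than offset by GPU acceleration.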

Starting with the index build process, Figure 2 shows the overall build time on CPU compared to GPU.

Bar chart comparing the overall index build time on CPU and GPU. The GPU significantly accelerates the process, reducing the time from several hours to minutes.
Figure 2. Overall index build time on CPU and GPU

Clustering model training typically dominates index build time. If you exclude training and focus solely on quantization and encrypted indexing, the GPU still provides significant acceleration (Figure 3).

Bar chart comparing index build time on CPU and GPU without clustering model training. The GPU markedly reduces the time required compared to CPU.
Figure 3. Index build time without clustering model training on CPU and GPU

Finally, the shift to GPU yields a significant improvement across the entire retrieval pipeline (Figure 4).

Bar chart comparing retrieval time on CPU and GPU. The GPU substantially speeds up retrieval compared to CPU.
Figure 4. Retrieval time on CPU and GPU

All times are from the same index configuration (recall level > 0.95).
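
Recall here measures how many of the true nearest neighbors the approximate index returns. The exact definition used in the PoC is not given, so the sketch below assumes the common recall@k form, with made-up IDs for a single query.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors found in the retrieved top-k."""
    hits = len(set(retrieved[:k]) & set(ground_truth[:k]))
    return hits / k


# Hypothetical IDs for one query: exact search vs. the approximate index.
exact_top5 = [12, 7, 33, 41, 2]
approx_top5 = [12, 33, 7, 2, 98]

print(recall_at_k(approx_top5, exact_top5, k=5))  # 0.8
```

In practice this value would be averaged over many queries before being compared against a threshold such as 0.95.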

The IVFPQ (inverted file with product quantization) index type was used for its balance of efficiency and accuracy.
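
The product quantization step of such an index can be sketched as follows: each vector is split into subvectors, and each subvector is replaced by the index of its nearest codeword. The codebooks below are written by hand purely for illustration (in practice they would be learned, typically with k-means), and the function names are hypothetical rather than part of cuVS.

```python
def pq_encode(vector, codebooks):
    """Encode a vector as one codeword index per subspace."""
    m = len(codebooks)
    sub_dim = len(vector) // m
    codes = []
    for i, codebook in enumerate(codebooks):
        sub = vector[i * sub_dim:(i + 1) * sub_dim]
        dists = [sum((a - b) ** 2 for a, b in zip(sub, word)) for word in codebook]
        codes.append(dists.index(min(dists)))
    return codes


def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    out = []
    for code, codebook in zip(codes, codebooks):
        out.extend(codebook[code])
    return out


# Hypothetical codebooks: 2 subspaces of dimension 2, 3 codewords each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]],
]
vector = [0.9, 1.2, 4.6, 5.3]

codes = pq_encode(vector, codebooks)
approx = pq_decode(codes, codebooks)
print(codes)   # [1, 2]
print(approx)  # [1.0, 1.0, 5.0, 5.0]
```

Storing two small integers instead of four floats is the source of the efficiency, and the gap between `vector` and `approx` is the accuracy cost that the recall figure captures.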

Conclusion

In a world where data breaches are increasingly common, security is not just a luxury but a necessity for many organizations. The integration of Cyborg Vector Search with NVIDIA cuVS and NVIDIA Confidential Computing offers a strong approach to enhancing the security of vector databases, aiming to protect sensitive data while maintaining performance. 

NVIDIA is the first and currently only GPU vendor that provides publicly available, general-access hardware-based Confidential Computing solutions with the NVIDIA Hopper family. The NVIDIA Blackwell generation continues to improve upon the technology, partnering with other industry leaders to increase performance, security, and ease of use.

Cyborg Vector Search is currently in closed testing with early commercial partners. As development continues, Cyborg would love to hear from you if data security is important to your AI workloads.

To try NVIDIA Hopper Confidential Computing today, register for the LaunchPad lab. You can also browse the other LaunchPad labs available for trial.
