Data Sovereignty in Vector Stores: A Deep Dive into Qdrant’s Hybrid Cloud

M Quamer Nasim
May 30, 2024

With the rise of RAG applications in enterprises, data compliance has become a key concern. Much of the data that powers enterprise RAG applications is sensitive and subject to various compliance requirements. When enterprises build AI solutions, they therefore need a stack that gives them enough control over the infrastructure to comply with the regulations of each geography they operate in. This is especially true when the application uses a vector store, because the data must be protected both in transit and at rest.

To address these concerns, Qdrant has recently launched a Hybrid Cloud offering. This solution provides an alternative path where organizations do not need to share any data or API keys with Qdrant, yet can efficiently manage their vector database.

What’s interesting is that Qdrant Hybrid Cloud has capabilities similar to Qdrant’s own cloud platform. It uses Kubernetes clusters to unify cloud, on-premises, and edge environments into a single, enterprise-grade managed service.

To understand how Qdrant Hybrid Cloud works, I decided to try it out. In this article, I will walk you through my experience of using it end to end. Let’s dive in.

How It Works

When an enterprise onboards a Kubernetes cluster as a Hybrid Cloud Environment, they can deploy the Qdrant Kubernetes Operator and Cloud Agent into this cluster. These components manage Qdrant databases within the organization’s Kubernetes cluster and establish an outgoing connection to Qdrant Cloud at cloud.qdrant.io on port 443.

This connection lets the environment use the same cloud management features as any managed Qdrant Cloud cluster, and it carries the same telemetry.
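Because the only connection described here is an outgoing one to cloud.qdrant.io on port 443, it can be worth confirming that your network allows that egress before onboarding a cluster. The following is a minimal sketch, not part of Qdrant’s tooling: the helper name `can_reach` is hypothetical, and it only checks that a TCP connection can be opened, not that the agent can authenticate.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it does not perform a TLS
    handshake or talk to any Qdrant API.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call, run from a machine or pod inside the cluster's network:
# can_reach("cloud.qdrant.io", 443)
```

If the call returns False from inside the cluster’s network, a firewall or egress policy is the first thing to check.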

Platform Deployment Options

Qdrant Hybrid Cloud supports deployment on various managed Kubernetes platforms, including but not limited to:

  • Akamai (Linode)
  • Amazon Web Services (AWS)
  • Civo
  • DigitalOcean
  • Google Cloud Platform
  • Microsoft Azure
  • Oracle Cloud Infrastructure
  • OVHcloud
  • Red Hat OpenShift
  • Scaleway
  • STACKIT
  • Vultr

Each platform has specific prerequisites and installation steps, which are detailed in the Qdrant Hybrid Cloud Setup Guide.

Setting Up the Hybrid Cloud

In this post, I will show you how to set up a Qdrant Hybrid Cloud environment on a DigitalOcean Kubernetes cluster.

Set Up the Kubernetes Cluster in DigitalOcean

To start with the Qdrant Hybrid Cloud setup, you’ll need a Kubernetes cluster. This can be deployed on any cloud platform, on-premises, or in an edge environment. In this example, we’re using DigitalOcean to create and manage our Kubernetes cluster.

Start by creating a Kubernetes cluster on DigitalOcean. For this walkthrough, I already have a cluster running in my DigitalOcean account.

After deploying the Kubernetes cluster, the next step is to verify all its components to ensure everything is set up correctly.
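One way to script part of this verification is to check that every node reports a Ready condition. The sketch below assumes `kubectl` is installed and already pointed at the DigitalOcean cluster. The function names are my own, and the parsing relies on the standard shape of `kubectl get nodes -o json` output.

```python
import json
import subprocess


def not_ready_nodes(node_list: dict) -> list[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    pending = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            pending.append(node["metadata"]["name"])
    return pending


def fetch_nodes() -> dict:
    """Run kubectl against the current context and parse its JSON output."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Example call (requires a configured kubectl):
# print(not_ready_nodes(fetch_nodes()))
```

An empty list means every node is Ready. Anything else names the nodes to inspect before continuing.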

We can see that our cluster is now up and running. Note that this cluster is not tied to any Qdrant infrastructure yet. We will now integrate this DigitalOcean Kubernetes cluster with the Qdrant Hybrid Cloud infrastructure.

Set Up a Hybrid Cloud Environment on Qdrant

To integrate our Kubernetes cluster with the Qdrant Hybrid Cloud infrastructure, we’ll navigate to Qdrant’s Dashboard and access the Hybrid Cloud section.

We will then create a Hybrid Cloud Environment. We need to enter a name for the environment as well as the Kubernetes namespace for the Qdrant components. Once set, the Qdrant components will appear under that namespace in our DigitalOcean cluster. For now, we will leave the remaining settings at their defaults. You can experiment with different configurations. To learn more about the advanced setup, see the Qdrant Hybrid Cloud documentation.

Once we’ve created the environment, we are given a one-time installation command to execute in our DigitalOcean cluster. To preserve data sovereignty, Qdrant never receives any API keys for our cluster, which is why we have to run this command ourselves.

Below is the one-time command that is generated for our DigitalOcean Cluster.

Configure Your DigitalOcean Cluster with the Hybrid Cloud Environment of Qdrant

Now, we will run this one-time installation command in the terminal.

We can see that our cluster, which is deployed on DigitalOcean, is now integrated with the Hybrid Cloud Environment of Qdrant, and we can verify this by looking at the last four namespaces of the cluster.

The dashboard waits for you to run the above command in the cluster. Once that is done, you can go ahead and click on the Continue button.
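The namespace check mentioned above can also be scripted. Since the post does not list the names of the added namespaces, this sketch does not assume them. It simply compares the namespace list captured before the installation with the one captured afterwards. The inputs are assumed to be the parsed output of `kubectl get namespaces -o json`. The function names and the example namespace `qdrant-hybrid` are hypothetical.

```python
def namespace_names(namespace_list: dict) -> set[str]:
    """Extract namespace names from parsed `kubectl get namespaces -o json`."""
    return {
        item["metadata"]["name"]
        for item in namespace_list.get("items", [])
    }


def added_namespaces(before: dict, after: dict) -> list[str]:
    """Return namespaces present after the install but not before, sorted."""
    return sorted(namespace_names(after) - namespace_names(before))


# Illustrative data only; "qdrant-hybrid" is a made-up namespace name.
before = {"items": [{"metadata": {"name": "default"}},
                    {"metadata": {"name": "kube-system"}}]}
after = {"items": [{"metadata": {"name": "default"}},
                   {"metadata": {"name": "kube-system"}},
                   {"metadata": {"name": "qdrant-hybrid"}}]}
print(added_namespaces(before, after))
```

The result should include the namespace you entered when creating the Hybrid Cloud Environment.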

Create New Clusters on DigitalOcean from Hybrid Cloud

Now, let’s create Qdrant clusters on DigitalOcean from the Hybrid Cloud dashboard itself. First, we’ll check that all components of the Hybrid Cloud Environment show a ready state. If they do, we’ll proceed to create the new cluster.

Next, from the Qdrant dashboard, we choose the hardware specs for the Qdrant cluster that will be created in our DigitalOcean Kubernetes cluster.

Once that is done, we can see that our cluster is starting up. All of this happens inside our DigitalOcean cluster. We are only managing it from Qdrant’s Hybrid Cloud Environment.

We can also verify this from the terminal.
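As a rough sketch of what that terminal check could look like when automated, the snippet below polls the pods in the Qdrant namespace until all of them reach the Running phase. It assumes `kubectl` is configured for the cluster. The function names and the namespace `qdrant-hybrid` are placeholders, and a Running phase alone does not guarantee that the containers have passed their readiness checks.

```python
import json
import subprocess
import time
from typing import Callable


def all_pods_running(pod_list: dict) -> bool:
    """True if there is at least one pod and every pod's phase is Running."""
    pods = pod_list.get("items", [])
    return bool(pods) and all(
        p.get("status", {}).get("phase") == "Running" for p in pods
    )


def fetch_pods(namespace: str) -> dict:
    """Parse `kubectl get pods -n <namespace> -o json` for the current context."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def wait_for_pods(
    fetch: Callable[[], dict],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch() until all pods are Running or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if all_pods_running(fetch()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Example call; the namespace is whatever you entered in the dashboard:
# wait_for_pods(lambda: fetch_pods("qdrant-hybrid"))
```

Passing the fetch function as a parameter keeps the polling loop independent of `kubectl`, so the same loop can be exercised with canned data.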

Finally, we can see from the Qdrant dashboard that the cluster is up and running.

Now, using the endpoint shown in the dashboard, we can connect to our Qdrant vector database and build our RAG application as usual.
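To illustrate a first connection, here is a small sketch that lists collections through Qdrant’s REST API using only the Python standard library. The endpoint URL and API key are placeholders to be replaced with the values from your own environment, and the helper names are mine. The sketch assumes the endpoint is reachable from where the code runs and that, if authentication is enabled, the key is sent in the `api-key` header.

```python
import json
import urllib.request
from typing import Optional


def build_collections_request(
    endpoint: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request for the /collections route of a Qdrant endpoint."""
    url = endpoint.rstrip("/") + "/collections"
    headers = {"api-key": api_key} if api_key else {}
    return urllib.request.Request(url, headers=headers, method="GET")


def collection_names(payload: dict) -> list[str]:
    """Pull collection names out of a parsed /collections response body."""
    return [c["name"] for c in payload["result"]["collections"]]


def list_collections(
    endpoint: str, api_key: Optional[str] = None, timeout: float = 10.0
) -> list[str]:
    """Call the endpoint and return the names of its collections."""
    request = build_collections_request(endpoint, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return collection_names(json.load(response))


# Example call with placeholder values:
# list_collections("https://<your-qdrant-endpoint>:6333", api_key="<your-key>")
```

A freshly created cluster should return an empty list. In a real application you would more likely use the official `qdrant-client` package, which wraps these same routes.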

If you’re interested in further enhancing the data security and privacy of your vector database, see my earlier post on role-based access control (RBAC) in the Qdrant vector database.

If you’re also interested in learning how to build a RAG application, see my other post, where I built a chatbot using the RAG Stack. You could also try integrating Qdrant’s Hybrid Cloud and RBAC into that RAG-powered chatbot.

Future Notes

In this post, I explored how organizations can efficiently manage their data for RAG-based applications without that data ever leaving their infrastructure. If you are looking to build a data-sovereign architecture for your AI application, do give it a spin!

References

  1. https://qdrant.tech/documentation/hybrid-cloud/
  2. https://cloud.qdrant.io/overview
  3. https://www.youtube.com/@qdrant
