RAG Pipeline Quickstart with Qdrant

Approximate time to complete: 5-10 minutes, excluding prerequisites

This quickstart walks you through creating and scheduling a pipeline that uses a web crawler to ingest the Vectorize documentation, creates vector embeddings with an OpenAI embedding model, and writes the vectors to a Qdrant vector database.

Before you begin

Before starting, ensure you have access to the credentials, connection parameters, and API keys as appropriate for the following:

A Vectorize account (Create one free here ↗)
An OpenAI API Key (How to article)
A Qdrant account (Create one on Qdrant ↗)

Step 1: Create a Qdrant Cluster

Create Your Database

This quickstart shows how to create a free cluster. The steps are the same if you'd like to create a production cluster instead.

Log in to Qdrant, navigate to Clusters, and click Create Free Cluster.
While your cluster is being created, generate your API key by clicking Generate API Key.
Save and securely store your API key.
Scroll down and click on Manage your Cluster.
Copy and securely store your cluster's endpoint.
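
Before entering these values in Vectorize, it can help to confirm that the endpoint and API key work together. The following is a minimal sketch, not part of the quickstart itself. It assumes the cluster exposes Qdrant's REST API at the endpoint you copied and that the key is passed in the `api-key` header. The helper names `build_collections_request` and `list_collections` are hypothetical, and the endpoint and key in the usage note are placeholders.

```python
import json
import urllib.request


def build_collections_request(endpoint: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the cluster's collection list (nothing is sent yet)."""
    url = endpoint.rstrip("/") + "/collections"
    # Assumption: the cluster accepts the key in the `api-key` header.
    return urllib.request.Request(url, headers={"api-key": api_key}, method="GET")


def list_collections(endpoint: str, api_key: str) -> list[str]:
    """Send the request and return the collection names reported by the cluster."""
    request = build_collections_request(endpoint, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    # Assumption: the response wraps the list as {"result": {"collections": [...]}}.
    return [collection["name"] for collection in body["result"]["collections"]]
```

Calling `list_collections("https://YOUR-CLUSTER-ENDPOINT", "YOUR_QDRANT_API_KEY")` with your own values should return a list of collection names, which is expected to be empty for a new cluster. An HTTP error at this point usually means the endpoint or key was copied incorrectly.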

Step 2: Create a RAG Pipeline on Vectorize

Create a New RAG Pipeline

Open the Vectorize Application Console ↗
From the dashboard, click on + New RAG Pipeline under the "RAG Pipelines" section.
Enter a name for your pipeline. For example, you can name it quickstart-pipeline.
Click on New Vector DB to create a new vector database integration.
Select Qdrant from the list of vector databases.
Enter the parameters in the form using the Qdrant Parameters table below as a guide, then click Create Qdrant Integration.

Qdrant Parameters

Field	Description	Required
Name	A descriptive name to identify the integration within Vectorize.	Yes
Host	The Qdrant cluster's endpoint.	Yes
API key	The Qdrant cluster's API key.	Yes

Configure the Qdrant integration in your RAG Pipeline

You can think of the Qdrant integration as having two parts. The first is authorization with your Qdrant cluster. This part is reusable across pipelines, so you can connect to the same cluster from different pipelines without providing the credentials every time.

The second part is the configuration that's specific to your RAG Pipeline. This is where you specify the name of the collection in your Qdrant database. If the collection does not already exist, Vectorize will create it for you.

Enter your collection name to complete configuration of your Qdrant integration for your RAG pipeline.

Configure AI Platform

Click on New AI Platform.
Select OpenAI from the AI platform options.
In the OpenAI configuration screen:
- Enter a descriptive name for your OpenAI integration.
- Enter your OpenAI API Key.
Leave the default values for embedding model, chunk size, and chunk overlap for the quickstart.
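
To make the chunk size and chunk overlap settings more concrete, here is a small illustrative sketch of overlapping chunking. It is not Vectorize's implementation: it splits on characters for simplicity, whereas a real pipeline may count tokens, and the function name `chunk_text` and the sizes used in the usage note are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters sharing chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)` yields `["abcd", "cdef", "efgh", "ghij"]`: each chunk repeats the last two characters of the previous one, so text that straddles a boundary still appears whole in at least one chunk.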

Add Source Connectors

Click on Add Source Connector.

Web Crawler Source

Choose the type of source connector you'd like to use. In this example, select Web Crawler.

Configure Web Crawler Integration

Name your web crawler source connector, e.g., vectorize-docs.
Set Seed URL(s) to https://docs.vectorize.io.

Click Create Web Crawler Integration to proceed.

Configure Web Crawler Pipeline

Accept all the default values for the web crawler pipeline configuration:
- Throttle Wait Between Requests: 500 ms
- Maximum Error Count: 5
- Maximum URLs: 1000
- Maximum Depth: 50
- Reindex Interval: 3600 seconds
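
As a rough illustration of how these defaults bound a crawl, the sketch below estimates the minimum time a full crawl would take. It assumes the crawler waits the full throttle interval between consecutive requests and ignores page download time, which may not match how Vectorize actually schedules requests. `CrawlerDefaults` and `min_crawl_seconds` are hypothetical names, not part of Vectorize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlerDefaults:
    """Default values listed above, gathered in one place for the estimate."""
    throttle_wait_ms: int = 500
    max_error_count: int = 5
    max_urls: int = 1000
    max_depth: int = 50
    reindex_interval_s: int = 3600


def min_crawl_seconds(cfg: CrawlerDefaults) -> float:
    """Lower bound on crawl time if all max_urls pages are fetched one after another."""
    # Assumption: one throttle wait between consecutive requests, so max_urls - 1 waits.
    waits = max(cfg.max_urls - 1, 0)
    return waits * cfg.throttle_wait_ms / 1000
```

Under these assumptions, `min_crawl_seconds(CrawlerDefaults())` gives 499.5 seconds, so a crawl that reaches the 1000-URL limit would spend at least about eight minutes on throttling alone, well within the 3600-second reindex interval.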

Click Save Configuration.

Verify Source Connector and Schedule Pipeline

Verify that your web crawler connector is visible under Source Connectors.
Click Next: Schedule RAG Pipeline to continue.

Schedule RAG Pipeline

Accept the default schedule configuration.
Click Create RAG Pipeline.

Step 3: Monitor and Test Your Pipeline

Monitor Pipeline Creation and Backfilling

The system will now create, deploy, and backfill the pipeline.
You can monitor the status changes from Creating Pipeline to Deploying Pipeline and Starting Backfilling Process.

Once the initial population is complete, the RAG pipeline will begin crawling the Vectorize docs and writing vectors to your Qdrant collection.

View RAG Pipeline Status

Once the website crawling is complete, your RAG pipeline will switch to the Listening state, where it will stay until more updates are available.

Test Your Pipeline in the RAG Sandbox

In the RAG Sandbox, you can ask questions about the data ingested by the web crawler. Click on RAG Sandbox.

Type a question into the input field (e.g., "What are the key features of Vectorize?"), and click Submit.

The system will return the most relevant chunks of information from your indexed data, along with an LLM response.
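
Conceptually, returning the most relevant chunks means ranking stored vectors by their similarity to the embedded question. The sketch below shows that idea with an exhaustive cosine-similarity search in plain Python. It is an illustration only: a vector database such as Qdrant uses its own indexes for this, the choice of cosine similarity as the metric is an assumption, and the names `cosine_similarity` and `top_k_chunks` and the toy vectors in the usage note are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: list[float],
    indexed: list[tuple[str, list[float]]],
    k: int = 3,
) -> list[tuple[float, str]]:
    """Score every (chunk_text, vector) pair against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return scored[:k]
```

With toy data such as `[("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]` and the query `[1.0, 0.0]`, asking for `k=2` ranks chunk "a" first and chunk "c" second, because "c" points partly in the query's direction while "b" is orthogonal to it.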

This completes the RAG pipeline quickstart. Your RAG pipeline is now set up and ready for use with Qdrant and Vectorize.
