OneDrive

The OneDrive Source Connector allows you to integrate OneDrive as a data source for your pipelines. This guide explains the configuration options available when setting up a OneDrive connector.

Before you begin

Before starting, you'll need:

  • A Microsoft Entra ID application.

  • One or more Microsoft 365 users.

  • The following information for your app:

    • Client Id

    • Tenant Id

    • Client Secret

    • User email address(es)

If you don't have an application created yet, check out our guide How to Create a Microsoft Entra ID Application.

Configure the connector

To configure a OneDrive connector for your Microsoft Entra ID application:

  1. Click Source Connectors from the main menu.

  2. Click New Source Connector from the Source Connectors page.

  3. Select the OneDrive card.

  4. Enter the name, client id, tenant id, client secret, and user email address(es) in the form using the OneDrive Parameters table below as a guide, then click Create OneDrive Integration.

OneDrive Parameters

| Field | Description | Required |
|-------|-------------|----------|
| Name | A descriptive name to identify the connector within Vectorize. | Yes |
| Client Id | The Microsoft Entra ID application's client ID. | Yes |
| Tenant Id | The Microsoft Entra ID application's tenant ID. | Yes |
| Client Secret | The Microsoft Entra ID application's client secret value. | Yes |
| Users | The email addresses for the OneDrive users whose files will be ingested. Enter one address per line, and select + Add to add each additional user. | Yes |
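
Before entering the client ID, tenant ID, and client secret from the table above, it can help to confirm that they belong together. The sketch below builds an OAuth 2.0 client-credentials token request for the Microsoft identity platform token endpoint. It is an independent sanity check and makes no claim about how Vectorize authenticates internally; the helper name `build_token_request` is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Microsoft identity platform v2.0 token endpoint (client-credentials flow).
TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    Hypothetical helper for illustration only; it builds the request
    but does not send it.
    """
    for name, value in (("tenant_id", tenant_id),
                        ("client_id", client_id),
                        ("client_secret", client_secret)):
        if not value or not value.strip():
            raise ValueError(f"{name} is required")

    url = TOKEN_URL.format(tenant_id=tenant_id.strip())
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id.strip(),
        "client_secret": client_secret.strip(),
        # ".default" requests the Microsoft Graph permissions already granted to the app.
        "scope": "https://graph.microsoft.com/.default",
    })
    return url, body
```

To run the check, POST the body to the URL with the `Content-Type: application/x-www-form-urlencoded` header using any HTTP client. A JSON response containing an `access_token` field suggests the three values are consistent; an error response usually names the value that was rejected.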

Configuring the OneDrive Connector in a RAG Pipeline

The OneDrive connector has two parts. The first is authorization with your Microsoft Entra ID application. This part is reusable: once it is set up, you can connect to the same application from different pipelines without re-entering the credentials.

The second part is the configuration that's specific to your RAG Pipeline, such as which files and directories should be processed.
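
The two-part split described above can be pictured as two records: one holding the reusable credentials and one holding per-pipeline settings that point at it. This is a minimal sketch for illustration only; the class and attribute names are assumptions, not Vectorize's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneDriveConnection:
    """Reusable part: credentials for the Microsoft Entra ID application."""
    name: str
    client_id: str
    tenant_id: str
    client_secret: str
    users: tuple[str, ...]


@dataclass
class OneDrivePipelineSource:
    """Pipeline-specific part: what to read through a shared connection."""
    connection: OneDriveConnection
    file_extensions: list[str]
    start_folder: str = ""  # empty means no starting folder was specified


# One connection (placeholder values), reused by two pipelines.
shared = OneDriveConnection(
    name="team-onedrive",
    client_id="<client-id>",
    tenant_id="<tenant-id>",
    client_secret="<client-secret>",
    users=("user@example.com",),
)
reports = OneDrivePipelineSource(shared, ["pdf", "docx"], "/Reports")
notes = OneDrivePipelineSource(shared, ["md", "txt"])
```

Both pipeline sources reference the same `shared` object, which mirrors how the credentials are entered once and then selected in each pipeline.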

The following table outlines the fields available when configuring a OneDrive source for use within a Retrieval-Augmented Generation (RAG) pipeline.

| Field | Description | Required |
|-------|-------------|----------|
| File Extensions | Specifies the types of files to be included (e.g., PDF, HTML, Markdown, Text, DOCX). | Yes |
| Read starting from this folder | Specifies the OneDrive root folder to pull content from. Content will be ingested from any subfolders under the root folder. | No |
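
To make the effect of these two fields concrete, the sketch below decides whether a given file path would be selected. It assumes the file types map to plain extensions such as `pdf` or `md`, and that an empty starting folder means the whole drive is read; both are assumptions for illustration, and `should_ingest` is a hypothetical helper, not part of Vectorize.

```python
from pathlib import PurePosixPath


def should_ingest(path: str, extensions: set[str], start_folder: str = "") -> bool:
    """Return True if `path` has an allowed extension and sits under `start_folder`."""
    file_path = PurePosixPath("/" + path.strip("/"))

    # Compare extensions case-insensitively and without the leading dot.
    allowed = {ext.lower().lstrip(".") for ext in extensions}
    if file_path.suffix.lower().lstrip(".") not in allowed:
        return False

    # No starting folder: every location on the drive qualifies (assumed behavior).
    if not start_folder.strip("/"):
        return True

    # The file qualifies if the starting folder is one of its ancestors,
    # which covers files in any nested subfolder.
    root = PurePosixPath("/" + start_folder.strip("/"))
    return root in file_path.parents
```

With a starting folder of `/Reports`, a file at `/Reports/2024/q1.pdf` is selected when `pdf` is allowed, while `/ReportsOld/q1.pdf` is not, because the comparison works on whole path components rather than string prefixes.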

What's next?

  • If you haven't yet built a connector to your vector database, go to Configuring Vector Database Connectors and select the platform you prefer to use for storing output vectors.

    OR

  • If you're ready to start producing vector embeddings from your input data, head to Pipeline Basics. Select your new connector as the data source to use it in your pipeline.
