Azure Blob Storage

Each pipeline requires at least one data source to supply unstructured data. Unless you're using a local file as input, you must first configure your preferred platform and supply its access credentials and connectivity parameters to the connector configuration.

Follow the instructions below to set up a reusable Azure Blob Storage connector.

What you'll need

  1. Configure your Azure Blob Storage instance for remote access.

  2. In the Azure UI, locate your:

    • Container

    • Storage Account Name

    • Storage Account Key
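Before entering these three values in the form, it can help to check that they are well formed. The sketch below is a minimal, standard-library-only check, not part of Vectorize. The function names are hypothetical. The rules encoded are Azure's documented naming constraints (container names: 3 to 63 characters, lowercase letters, digits, and single hyphens; account names: 3 to 24 lowercase letters and digits) and the fact that account keys are Base64 text.

```python
import base64
import binascii
import re

# Hypothetical helpers for illustration; they only check the shape of each value
# and never contact Azure. Special containers such as $root are not covered.
CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")


def check_container(name: str) -> bool:
    """Lowercase letters, digits, single hyphens; 3-63 chars; alphanumeric at both ends."""
    return 3 <= len(name) <= 63 and CONTAINER_RE.fullmatch(name) is not None


def check_account_name(name: str) -> bool:
    """Lowercase letters and digits only, 3-24 characters."""
    return ACCOUNT_RE.fullmatch(name) is not None


def check_account_key(key: str) -> bool:
    """Account keys are Base64 text; reject empty or undecodable values."""
    if not key:
        return False
    try:
        base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A value that fails one of these checks was most likely mistyped or copied with stray whitespace. Passing them does not prove the credentials are valid, only that they have the expected shape.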

To configure a connector to your Azure Blob Storage instance

  1. Click Source Connectors from the main menu.

  2. Click New Source Connector from the Source Connectors page.

  3. Select the Azure Blob Storage card.

  4. Enter connection parameters in the form using the Azure Blob Parameters table below as a guide.

  5. Click Create Azure Blob Integration to test connector connectivity and save your configuration.

Azure Blob Parameters

| Field                | Blob parameter | Notes                                              |
|----------------------|----------------|----------------------------------------------------|
| Name                 | n/a            | An alphanumeric name of your choosing.             |
| Container            | Container name | Your source data files must be inside a container. |
| Storage Account Name |                | Your Azure Blob Storage account name.              |
| Storage Account Key  | Secret key     | Your Azure Blob Storage instance key.              |
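To show how these fields relate to standard Azure settings, here is a small sketch that holds the four form values and derives an Azure Storage connection string and a container URL from them. The class and method names are hypothetical, and this is not how Vectorize stores the connector internally. The default endpoint suffix assumes the public Azure cloud.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AzureBlobConnectorParams:
    """Hypothetical container for the form fields in the table above."""

    name: str                  # Connector name of your choosing
    container: str             # Container that holds the source files
    storage_account_name: str  # Azure Blob Storage account name
    # repr=False keeps the secret key out of logs and debug output.
    storage_account_key: str = field(repr=False)

    def connection_string(self, endpoint_suffix: str = "core.windows.net") -> str:
        """Build a standard Azure Storage connection string (public cloud assumed)."""
        return (
            "DefaultEndpointsProtocol=https;"
            f"AccountName={self.storage_account_name};"
            f"AccountKey={self.storage_account_key};"
            f"EndpointSuffix={endpoint_suffix}"
        )

    def container_url(self, endpoint_suffix: str = "core.windows.net") -> str:
        """URL of the container on the account's blob endpoint."""
        return (
            f"https://{self.storage_account_name}.blob.{endpoint_suffix}"
            f"/{self.container}"
        )
```

Keeping the key out of the object's printed representation is a deliberate choice here: the Storage Account Key grants full access to the account, so it should not end up in logs.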

When you select your configured Azure Blob Storage connector as the source in your pipeline configuration, Vectorize ingests all compatible files in the specified container.

What's next?

  • If you haven't yet built a connector to your vector database, go to Configuring Vector Database Connectors and select the platform you prefer to use for storing output vectors.

    OR

  • If you're ready to start producing vector embeddings from your input data, head to Pipeline Basics. Select your new connector as the data source to use it in your pipeline.
