Azure Blob Storage
Each pipeline requires at least one data source to supply unstructured data. Unless you're using a local file as input, you must first configure your preferred platform and supply its access credentials and connectivity parameters to the connector configuration.
Follow the instructions below to set up a reusable Azure Blob Storage connector.
What you'll need
Configure your Azure Blob Storage instance for remote access.
In the Azure portal, locate your:
Container
Storage Account Name
Storage Account Key
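Before entering these three values in the form, it can help to sanity-check them locally. The sketch below is not part of Vectorize. It is a simplified check based on Azure's published naming rules (storage account names use 3 to 24 lowercase letters and digits; container names use 3 to 63 lowercase letters, digits, and single hyphens) and on the fact that account keys are base64-encoded. The function name `check_connector_values` is hypothetical, and special containers such as `$root` are not covered.

```python
import base64
import binascii
import re

# Simplified versions of Azure's naming rules (assumption: standard containers only).
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
CONTAINER_RE = re.compile(r"(?=.{3,63}$)[a-z0-9]+(-[a-z0-9]+)*")


def _is_base64(value):
    """Return True if value is a non-empty, strictly valid base64 string."""
    if not value:
        return False
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def check_connector_values(container, account_name, account_key):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    if not CONTAINER_RE.fullmatch(container):
        problems.append("container name does not follow Azure naming rules")
    if not ACCOUNT_RE.fullmatch(account_name):
        problems.append("storage account name does not follow Azure naming rules")
    if not _is_base64(account_key):
        problems.append("storage account key is empty or not valid base64")
    return problems
```

A passing check only means the values are well formed. It does not confirm that the account exists or that the key grants access.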
To configure a connector to your Azure Blob Storage instance
Click Source Connectors from the main menu.
Click New Source Connector from the Source Connectors page.
Select the Azure Blob Storage card.
Enter connection parameters in the form using the Azure Blob Parameters table below as a guide.
Click Create Azure Blob Integration to test connector connectivity and save your configuration.
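If the connectivity test fails, one way to rule out a credentials problem is to list a few blobs yourself with the same container, account name, and key. The sketch below assumes the `azure-storage-blob` Python package is installed and that the account uses the default public endpoint `https://<account>.blob.core.windows.net`. The helper names `make_container_client` and `preview_blobs` are hypothetical and unrelated to how Vectorize performs its own test.

```python
def make_container_client(account_name, account_key, container):
    """Build a client for one container using the account's shared key."""
    # Imported lazily so the rest of this sketch works without the SDK installed.
    from azure.storage.blob import ContainerClient  # pip install azure-storage-blob

    return ContainerClient(
        account_url=f"https://{account_name}.blob.core.windows.net",
        container_name=container,
        credential=account_key,
    )


def preview_blobs(container_client, limit=5):
    """Return up to `limit` blob names from the container."""
    names = []
    for blob in container_client.list_blobs():
        if len(names) >= limit:
            break
        names.append(blob.name)
    return names


# Example usage (placeholder values, replace with your own):
# client = make_container_client("mystorageacct", "<storage-account-key>", "my-docs")
# print(preview_blobs(client))
```

If this call raises an authentication or not-found error, the same values are unlikely to work in the connector form.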
Azure Blob Parameters
Field | Blob parameter | Notes |
---|---|---|
Name | n/a | An alphanumeric name of your choosing. |
Container | Container name | Your source data files must be inside a container. |
Storage Account Name | Account name | Your Azure Blob Storage account name. |
Storage Account Key | Secret key | Your Azure Blob Storage instance key. |
When you select your configured Azure Blob Storage source in a pipeline configuration, Vectorize ingests all compatible files in the specified container.
What's next?
If you haven't yet built a connector to your vector database, go to Configuring Vector Database Connectors and select the platform you prefer to use for storing output vectors.
OR
If you're ready to start producing vector embeddings from your input data, head to Pipeline Basics. Select your new connector as the data source to use it in your pipeline.