Using the Retrieval Endpoint
This guide shows how to access and use the retrieval endpoint for a RAG Pipeline in Vectorize.
Navigate to the RAG Pipelines section in the Vectorize dashboard.
Click on the name of your desired pipeline (e.g., "friends-scripts").
In the pipeline details view, click the Connect tab.
In the Connect tab, you'll find:
Use the "Manage access tokens" button to create or manage authentication tokens.
The unique URL for your pipeline's retrieval endpoint is displayed here.
Click "Copy to Clipboard" to copy the URL.
The endpoint calculates a vector for the question text and performs a similarity search on the vector DB.
Results are passed to the Cohere reranking model. Relevance scores are included in each response.
Set rerank to true to order results by the rerank model's relevance score. Otherwise, results are ordered by similarity score.
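The ordering rule can be illustrated with a small client-side Python sketch. It only mimics the documented behavior on sample data; the key names "relevancy" and "similarity" are assumptions made for illustration, so check a real response from your pipeline for the actual field names.

```python
from typing import Any


def order_documents(documents: list[dict[str, Any]], rerank: bool) -> list[dict[str, Any]]:
    """Order results by rerank relevance when rerank is on, else by similarity."""
    # "relevancy" and "similarity" are assumed key names, not confirmed by this guide.
    key = "relevancy" if rerank else "similarity"
    return sorted(documents, key=lambda doc: doc[key], reverse=True)


# Made-up sample scores, chosen so the two orderings differ.
docs = [
    {"text": "a", "similarity": 0.91, "relevancy": 0.20},
    {"text": "b", "similarity": 0.85, "relevancy": 0.75},
]
by_rerank = [d["text"] for d in order_documents(docs, rerank=True)]
by_similarity = [d["text"] for d in order_documents(docs, rerank=False)]
```

With these sample scores, the two settings produce opposite orderings, which is why the choice of rerank can change which passages reach the LLM first.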
The Connect tab also shows an example cURL request for the endpoint.
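The same kind of request can be sketched in Python. This is a minimal sketch under stated assumptions: the URL and token below are placeholders to be replaced with the values from the Connect tab, the body field "numResults" is an assumed name, and sending the token directly in the Authorization header is also an assumption. Only the question text and the rerank flag are described in this guide.

```python
import json
import urllib.request

# Placeholders: copy the real values from the Connect tab.
RETRIEVAL_URL = "https://example.invalid/your-pipeline/retrieve"
ACCESS_TOKEN = "your-access-token"


def build_retrieval_request(
    question: str, num_results: int = 5, rerank: bool = True
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the retrieval endpoint."""
    # "numResults" is an assumed field name; "question" and "rerank" follow the guide's wording.
    body = {"question": question, "numResults": num_results, "rerank": rerank}
    return urllib.request.Request(
        RETRIEVAL_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the access token is passed in the Authorization header.
            "Authorization": ACCESS_TOKEN,
        },
        method="POST",
    )


request = build_retrieval_request("What jobs has Chandler had?")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Building the request separately from sending it makes it easy to inspect the body and headers before any network call is made.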
Metadata filtering in a retrieval endpoint lets you narrow the data retrieved from your vector database based on specific metadata attributes. You can filter not just by similarity to a query but also by tags, categories, or other metadata properties, which improves the precision of the context provided to the large language model (LLM).
Metadata filters must be specified when you create a RAG pipeline. Each metadata field must exist in your documents for filtering to work.
When specifying your vector database index name, click Add Metadata Filters.
Enter a single metadata field in the text box that appears. To add an additional field, click on Add filter and a new text box will appear.
Pinecone supports multiple value matches per key, so the values you're filtering on must be formatted as a list.
Example using cURL:
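As a sketch of the list-valued format in Python rather than cURL, the helper below wraps each filter in its own entry and rejects values that are not lists. The body key "metadata-filters", the list-of-objects layout, and the field name "metadata.show" are assumptions for illustration, not confirmed by this guide.

```python
def pinecone_filters(filters: dict[str, list[str]]) -> list[dict[str, list[str]]]:
    """Build list-valued filters: one entry per key, each value a list."""
    for key, values in filters.items():
        if not isinstance(values, list):
            raise TypeError(f"filter {key!r} must map to a list of values")
    return [{key: values} for key, values in filters.items()]


# Hypothetical field name and values; use a metadata field configured on your pipeline.
body = {
    "question": "What jobs has Chandler had?",
    "metadata-filters": pinecone_filters({"metadata.show": ["friends", "seinfeld"]}),
}
```

Validating the value type before sending catches the most likely mistake, passing a bare string where a list is expected.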
All other databases support a single value per key.
Example using cURL:
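For databases that accept one value per key, a comparable Python sketch keeps each value as a plain string and rejects collections. As before, the body key "metadata-filters", the list-of-objects layout, and the field name "metadata.show" are assumptions for illustration.

```python
def single_value_filters(filters: dict[str, str]) -> list[dict[str, str]]:
    """Build single-valued filters: one entry per key, each value a scalar."""
    for key, value in filters.items():
        if isinstance(value, (list, tuple, set)):
            raise TypeError(f"filter {key!r} must map to a single value, not a collection")
    return [{key: value} for key, value in filters.items()]


# Hypothetical field name and value; use a metadata field configured on your pipeline.
body = {
    "question": "What jobs has Chandler had?",
    "metadata-filters": single_value_filters({"metadata.show": "friends"}),
}
```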