Updates

2025-04-30

Automatic Metadata Extraction

We've added automatic metadata extraction capabilities that allow you to define schemas for extracting structured information from your documents. This powerful feature uses our Iris model to analyze documents and extract both document-level and section-level metadata, enhancing your retrieval capabilities with additional context and filtering options. Learn more about Automatic Metadata Extraction.

Visual Schema Editor

The new visual schema editor makes it easy to define and manage metadata schemas without needing to write JSON directly. You can start from scratch, use pre-defined templates, or have the system automatically generate a schema by analyzing your documents. Learn more about creating metadata schemas.

Enhanced Retrieval with Metadata

Improve retrieval quality by adding extracted metadata to your text chunks. This makes important information available during semantic search, which is especially useful for documents that span many chunks or contain specific value strings. Learn more about metadata settings.
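As a rough illustration of the idea, the sketch below prepends extracted metadata to a chunk's text so that both are embedded together. The function name `enrich_chunk`, the header format, and the example field names are assumptions made for this example; they do not describe how Vectorize formats chunks internally.

```python
def enrich_chunk(text: str, metadata: dict[str, str]) -> str:
    """Prepend metadata as 'key: value' lines so it is embedded with the chunk text.

    Hypothetical helper: the header layout is an assumption, not Vectorize's format.
    """
    if not metadata:
        return text
    header = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return f"{header}\n\n{text}"


# Example field names below are made up for illustration.
chunk = enrich_chunk(
    "Payment is due within 30 days of the invoice date.",
    {"document_type": "contract", "counterparty": "Acme Corp"},
)
print(chunk)
```

With the metadata in the embedded text, a query that mentions the counterparty can match this chunk even though the chunk body never names it.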

2025-04-09

Google Drive OAuth

We've added OAuth support for Google Drive, making it easier to connect and ingest data from your Google Drive documents. Learn more about Google Drive OAuth.

Chunk Metadata Improvements

Enhanced chunk metadata now includes page start/end information, making it easier to trace chunks back to their original document pages. Learn more.

Overages Spend Limit Management

We've improved overage spend limit management, with better visibility of thresholds and clearer messages when limits are reached. Learn more about spending limits.

2025-03-14

Pipeline Editing Improvements

We've enhanced our pipeline editing capabilities, allowing you to modify existing pipelines without having to recreate them. You can now edit pipeline names directly from the pipeline view, add new data sources, change extraction techniques, and make other modifications to your RAG pipelines, making it simpler to organize and manage your data processing workflows.

2025-03-04

Deep Research

We've added the ability to run deep research on your private data. Using the data in your pipeline, the deep research feature generates a report on any topic. You can combine your private data with internet data using web search, and you can save the finished report as a PDF. You can specify exactly how the report should be structured, or let deep research decide.

2025-02-24

Built-in Vectorize Database & Embedder

We've added a built-in vector database and embedder, allowing you to get your RAG pipeline up and running without needing to set up a separate vector database. You can still use your own database if preferred.

Interactive Pipeline Tour

The pipeline tour now not only walks you through the steps of creating a pipeline and retrieving data, but also creates a pipeline for you in real-time using the built-in vector database and embedder.

Visual RAG Pipeline Editor

We've released a completely new version of the pipeline editor. The new Visual RAG Pipeline Editor provides a cleaner, more intuitive experience, allowing you to build and deploy pipelines more efficiently. Check out our walkthrough.

Vectorize API Beta Enhancements

The API Beta continues to expand, adding more functionality for managing integrations and RAG pipelines. View the API docs.

Retrieval Performance

A new Retrieval Performance dashboard provides real-time monitoring and analysis of retrieval effectiveness. Learn more.

Support for Groq in our Chatbot Starter

Support for Groq has been added to the ready-to-use Next.js chatbot. The chatbot is pre-configured with your pipeline’s retrieval endpoint and can be downloaded from the Connect tab.

2025-02-03

Features

Extraction Tester

The Extraction Tester lets you test how different extraction methods process your documents before creating a RAG pipeline. This helps you choose the best extraction method for your specific documents and use case.

Documentation

Vectorize Iris

Vectorize Iris is a model-based extraction solution that combines extraction and chunking into one streamlined process, making it easier than ever to get clean, usable text from complex documents.

Documentation

2025-02-02

Features

Vectorize API (Beta)

Manage your connectors, AI platforms, vector databases, and pipelines using the Vectorize API. The API is Beta and may change.

Documentation

2025-01-23

Features

Query rewriting

Query rewriting uses conversation history to improve retrieval relevance. Before retrieving relevant documents, the system reformulates the user query based on the context of the conversation. This can help you provide more accurate answers to user queries.
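The general pattern can be sketched as follows: recent conversation turns and the latest question are combined into a prompt that asks a language model for a standalone query. This is a minimal sketch under stated assumptions; `build_rewrite_prompt`, the prompt wording, and the `max_turns` parameter are hypothetical and are not taken from the Vectorize implementation.

```python
def build_rewrite_prompt(
    history: list[tuple[str, str]], query: str, max_turns: int = 3
) -> str:
    """Build a prompt asking a model to rewrite `query` as a standalone question.

    `history` is a list of (role, content) pairs, oldest first.
    Hypothetical helper: the prompt text is an assumption for illustration.
    """
    # Keep only the most recent turns so the prompt stays short.
    recent = history[-max_turns:] if max_turns > 0 else []
    transcript = "\n".join(f"{role}: {content}" for role, content in recent)
    return (
        "Rewrite the final question so it can be understood without the "
        "conversation.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Final question: {query}\n"
        "Standalone question:"
    )


history = [
    ("user", "What is Vectorize Iris?"),
    ("assistant", "A model-based extraction solution."),
]
prompt = build_rewrite_prompt(history, "Does it also handle chunking?")
print(prompt)
```

The model's answer to this prompt, rather than the raw question "Does it also handle chunking?", would then be used as the retrieval query, so the pronoun "it" no longer depends on earlier turns.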

Documentation

2025-01-16

Features

Performance metrics for RAG pipelines are now available in the Vectorize UI to users on the Pro plan. The dashboard displays key metrics to help you evaluate retrieval performance, including:

  • Non-Rewritten Relevance
  • Rewritten Question Relevance
  • Overall Retrieval Health

New pipeline status: HIBERNATING

If a pipeline has been inactive for 14 days (no data processed, no use of the retrieval endpoint), it will hibernate. To use a hibernating pipeline, you must restart it manually.

If the pipeline remains inactive for another 14 days after being restarted, it will hibernate again.

To prevent a pipeline from hibernating:

  • Perform a single retrieval on the pipeline's endpoint.
  • Write a new document or update an existing one; either also prevents hibernation.
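The hibernation rule above can be sketched as a simple inactivity check. This is an illustrative model only: the function and constant names are hypothetical, and treating exactly 14 days of inactivity as the cutoff is an assumption, since the note does not say how the boundary is handled.

```python
from datetime import datetime, timedelta, timezone

# Inactivity window from the release note.
HIBERNATION_THRESHOLD = timedelta(days=14)


def should_hibernate(last_activity: datetime, now: datetime) -> bool:
    """Return True if the pipeline has been inactive for at least 14 days.

    `last_activity` is the most recent retrieval or document write/update.
    Assumption: exactly 14 days of inactivity counts as hibernating.
    """
    return now - last_activity >= HIBERNATION_THRESHOLD


now = datetime(2025, 1, 30, tzinfo=timezone.utc)
idle_pipeline = datetime(2025, 1, 10, tzinfo=timezone.utc)    # 20 days idle
active_pipeline = datetime(2025, 1, 25, tzinfo=timezone.utc)  # 5 days idle

print(should_hibernate(idle_pipeline, now))    # True
print(should_hibernate(active_pipeline, now))  # False
```

Under this model, a single retrieval or document write resets `last_activity`, which is why either action keeps the pipeline awake.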

Documentation

Integrations

New Vector Database: Weaviate

Documentation

2025-01-10 🎊

New Integrations

  • New Vector Databases
    • Qdrant

2024-12-26

New Integrations

  • New Vector Databases
    • PostgreSQL

2024-12-06

Features

  • Added a context recall metric to RAG evaluations, which measures whether the retrieved context contains the necessary information to answer the provided question. The higher the value, the better the system is at retrieving relevant context.
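To make the metric concrete, here is a deliberately simplified sketch: context recall is computed as the fraction of required facts that appear in the retrieved context. Matching facts by case-insensitive substring is an assumption made to keep the example self-contained; production evaluations typically use a model to judge whether each fact is supported, and this is not a description of how Vectorize computes the score.

```python
def context_recall(required_facts: list[str], retrieved_chunks: list[str]) -> float:
    """Fraction of required facts found in the retrieved context (0.0 to 1.0).

    Simplification: a fact counts as recalled if it appears verbatim
    (case-insensitive) in the concatenated chunks.
    """
    if not required_facts:
        return 0.0
    context = " ".join(retrieved_chunks).lower()
    found = sum(1 for fact in required_facts if fact.lower() in context)
    return found / len(required_facts)


facts = ["14 days", "manually restarted"]
chunks = ["A pipeline hibernates after 14 days of inactivity."]
print(context_recall(facts, chunks))  # 0.5
```

Here only one of the two facts needed to answer the question is present in the retrieved chunk, so the score is 0.5; retrieving a second chunk that mentions the manual restart would raise it to 1.0.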

New Integrations

  • New Vector Databases
    • Milvus / Zilliz Cloud

2024-11-29

  • Added the ability to select multiple Dropbox folders to read data from.

New Integrations

  • Connectors
    • SharePoint
  • New AI Platforms
    • Vertex AI

2024-11-21

Features

  • Added support for Firecrawl's /scrape endpoint.
  • Integrations not in use by a RAG pipeline can now be edited.

New Integrations

  • Connectors
    • Firecrawl
    • Google Drive
    • Dropbox
    • OneDrive
  • New Vector Databases
    • SingleStore
  • New AI Platforms
    • Amazon Bedrock

2024-10-31 🎃

Features

  • Connectors and AI Platforms not being used by a pipeline can now be edited.

  • Added a tab which displays configuration for the source connector(s), AI platform, and vector database for the selected RAG pipeline.

  • Added an AI Assistant that answers context-specific questions as you use Vectorize. You can optionally rate each answer with a thumbs-up or thumbs-down.

New Embedding Models

  • voyage-3
  • voyage-3-lite
  • voyage-finance-2
  • voyage-multilingual-2
  • voyage-law-2
  • voyage-code-2