Real-Time Pipelines

Real-time pipelines provide continuous data synchronization and processing, enabling your AI applications to work with the most current information without waiting for scheduled sync intervals.

Overview

Real-time pipelines continuously monitor your data sources for changes and process new or updated content as soon as it is detected. Unlike scheduled pipelines, which run at predefined intervals, real-time pipelines keep your vector database current to within minutes of a source change.

Key Benefits

  • Continuous Data Sync: Process changes as soon as they're detected
  • Always-On Processing: Continuous monitoring without manual intervention
  • Reduced Latency: Minimize wait times between data updates and availability
  • Event-Driven: Respond immediately to data changes in your sources

Real-Time vs Scheduled Pipelines

| Feature          | Scheduled Pipeline                         | Real-Time Pipeline                     |
|------------------|--------------------------------------------|----------------------------------------|
| Data Freshness   | Updated on schedule (hourly, daily, etc.)  | Updated within minutes                 |
| Processing Model | Batch processing at intervals              | Continuous stream processing           |
| Resource Usage   | Bursts during scheduled runs               | Steady, continuous processing          |
| Best For         | Static datasets, periodic reports          | Live data, time-sensitive applications |
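To make the Data Freshness row concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that does not come from the Vectorize documentation: in the worst case a change lands just after a scheduled run starts, so it waits one full interval plus the run time, while a real-time pipeline adds only a detection-and-processing delay. The function names, the 10-minute run time, and the 5-minute real-time delay are illustrative assumptions, not measured values.

```python
from datetime import timedelta


def worst_case_staleness_scheduled(interval: timedelta, run_time: timedelta) -> timedelta:
    """Worst case: a change lands just after a run starts, so it waits a full
    interval for the next run and then for that run to finish."""
    return interval + run_time


def worst_case_staleness_realtime(detect_and_process: timedelta) -> timedelta:
    """Simplified model: staleness equals the detection-plus-processing delay."""
    return detect_and_process


# Assumed figures for illustration only.
hourly = worst_case_staleness_scheduled(timedelta(hours=1), timedelta(minutes=10))
realtime = worst_case_staleness_realtime(timedelta(minutes=5))

print(hourly)    # 1:10:00
print(realtime)  # 0:05:00
```

Plugging in your own schedule interval and typical run time gives a quick estimate of whether the freshness gain justifies a real-time slot.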

Creating a Real-Time Pipeline

Prerequisites

  • Organization on a premium plan
  • Available real-time pipeline quota (purchased separately)

Step 1: Purchase Real-Time Pipeline Quota

Real-time pipelines are available as an add-on to your subscription:

  1. Navigate to Organization Settings > Billing
  2. Find the Real-Time Pipelines section
  3. Click Manage Real-Time Pipelines
  4. Select the number of real-time pipelines you need
  5. Complete the purchase

For current pricing, see Vectorize Pricing.

Step 2: Create Pipeline and Enable Real-Time Mode

  1. Create a new pipeline with your desired source and destination connectors
  2. After the pipeline is created, navigate to the pipeline details page
  3. Go to the Schedule tab
  4. Enable real-time mode:
    • If you have purchased real-time pipelines: Select Real-time instead of Scheduled
    • If you haven't purchased real-time pipelines: You'll see an "Unlock Real-time Processing" message with a "Purchase Real-time Pipelines" button
  5. Save the changes

The pipeline will begin processing changes continuously once real-time mode is enabled.

Step 3: Monitor Real-Time Processing

Once real-time mode is enabled, you can track the pipeline's activity in several places:

  • ⚡ Real-time badge in the pipeline list shows the pipeline is processing continuously
  • Pipeline metrics update in real time to show processing activity
  • Event logs capture all change detection and processing events

Converting Between Pipeline Modes

Scheduled to Real-Time

Convert an existing scheduled pipeline to real-time mode:

  1. Open the pipeline details page
  2. Navigate to the Schedule tab
  3. Toggle from Scheduled to Real-time
  4. Confirm the conversion

Caution: Ensure you have available real-time pipeline quota before converting. The system will prevent conversion if you've reached your limit.

Real-Time to Scheduled

Convert a real-time pipeline back to scheduled:

  1. Open the pipeline details page
  2. Navigate to the Schedule tab
  3. Toggle from Real-time to Scheduled
  4. Configure your desired schedule (daily, hourly, etc.)
  5. Save changes

This frees up real-time pipeline quota for use with other pipelines.
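The conversion rules in this section can be sketched as a small Python model: converting to real-time consumes one quota slot and is refused when none remain, and converting back to scheduled frees the slot. This is a hypothetical local model for reasoning about the rules; `RealTimeQuota`, `QuotaExceededError`, and the pipeline names are invented for illustration and are not part of any Vectorize API.

```python
from dataclasses import dataclass, field
from typing import Set


class QuotaExceededError(Exception):
    """Raised when every purchased real-time slot is already in use."""


@dataclass
class RealTimeQuota:
    """Hypothetical local model of the quota rules; not a Vectorize API object."""

    purchased: int
    realtime_ids: Set[str] = field(default_factory=set)

    @property
    def available(self) -> int:
        return self.purchased - len(self.realtime_ids)

    def to_realtime(self, pipeline_id: str) -> None:
        if pipeline_id in self.realtime_ids:
            return  # already real-time; no extra slot is consumed
        if self.available <= 0:
            raise QuotaExceededError(f"no real-time quota left for {pipeline_id!r}")
        self.realtime_ids.add(pipeline_id)

    def to_scheduled(self, pipeline_id: str) -> None:
        # Converting back frees the slot for another pipeline.
        self.realtime_ids.discard(pipeline_id)


quota = RealTimeQuota(purchased=1)
quota.to_realtime("support-kb")

try:
    quota.to_realtime("news-feed")  # the only slot is taken
    blocked = False
except QuotaExceededError:
    blocked = True

quota.to_scheduled("support-kb")    # frees the slot
quota.to_realtime("news-feed")      # now succeeds
print(blocked, sorted(quota.realtime_ids))  # True ['news-feed']
```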

Data Source Support

All data sources are supported with real-time pipelines. The pipeline continuously monitors for changes and processes them as they're detected.

Processing time depends on several factors including the source connector's change detection method, document size, and content complexity.

Managing Real-Time Pipeline Quota

Viewing Usage

Check your current real-time pipeline usage:

  1. Go to Organization Settings > Billing
  2. Find the Real-Time Pipelines section
  3. Read the usage line, shown as "X of Y pipelines in use"

Adjusting Quota

To increase your real-time pipeline quota:

  1. Click Manage Real-Time Pipelines
  2. Increase the count to the desired number
  3. Confirm billing changes

To decrease quota:

  1. First convert any active real-time pipelines to scheduled mode
  2. Then reduce your quota in billing settings
  3. Billing adjusts on the next cycle

Warning: You cannot reduce your real-time pipeline quota below the number of active real-time pipelines. Convert pipelines to scheduled mode first.
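As a quick arithmetic check before lowering quota, the rule above amounts to: the new quota must be at least the number of active real-time pipelines. The Python sketch below computes how many pipelines would need to be converted to scheduled mode first. The function name `conversions_needed` is a hypothetical helper, and the numbers are example inputs you would read from the billing page yourself.

```python
def conversions_needed(new_quota: int, active_realtime: int) -> int:
    """Return how many active real-time pipelines must be converted to
    scheduled mode before the quota can be lowered to new_quota.
    A result of 0 means the reduction is allowed as-is."""
    if new_quota < 0 or active_realtime < 0:
        raise ValueError("quota and pipeline counts must be non-negative")
    return max(0, active_realtime - new_quota)


# Example: 5 pipelines are real-time and you want to pay for only 2 slots.
print(conversions_needed(new_quota=2, active_realtime=5))  # 3
# Example: 2 pipelines are real-time and the quota drops from 4 to 3.
print(conversions_needed(new_quota=3, active_realtime=2))  # 0
```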

Best Practices

When to Use Real-Time Pipelines

Ideal Use Cases:

  • Customer support knowledge bases that need instant updates
  • Financial data feeds requiring immediate processing
  • Collaborative documents with frequent changes
  • News and content aggregation systems
  • Compliance and regulatory document tracking

Consider Scheduled Pipelines When:

  • Data changes infrequently (weekly/monthly)
  • Batch processing is more efficient
  • Resource optimization is a priority

Resource Management

  • Start with scheduled pipelines and upgrade to real-time for critical data
  • Group related data sources into single pipelines when possible
  • Monitor usage patterns to identify pipelines that don't need real-time
  • Use scheduled pipelines for historical data imports
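One way to act on the "monitor usage patterns" advice is a simple threshold on how often each source actually changes. The Python sketch below flags real-time pipelines whose average change rate falls below a cutoff, making them candidates for scheduled mode. Everything here is an assumption for illustration: the `PipelineActivity` record, the change counts, and the one-change-per-day threshold are hypothetical, and you would gather the real counts from your own metrics or event logs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineActivity:
    """Hypothetical summary of one real-time pipeline's recent activity."""

    name: str
    changes_in_window: int  # number of detected changes during the window


def downgrade_candidates(
    pipelines: List[PipelineActivity],
    window_days: int = 30,
    min_changes_per_day: float = 1.0,
) -> List[str]:
    """Return names of pipelines whose average daily change rate is below
    the threshold, suggesting scheduled mode may be sufficient."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return [
        p.name
        for p in pipelines
        if p.changes_in_window / window_days < min_changes_per_day
    ]


# Made-up activity figures for a 30-day window.
activity = [
    PipelineActivity("support-kb", 900),    # about 30 changes per day
    PipelineActivity("policy-archive", 4),  # roughly one change per week
    PipelineActivity("news-feed", 30),      # exactly 1 per day, not below threshold
]

candidates = downgrade_candidates(activity)
print(candidates)  # ['policy-archive']
```

The threshold is a judgment call: a pipeline with few changes can still warrant real-time mode if each change is time-sensitive.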
