Scheduling RAG Pipelines

The pipeline scheduler is the key to keeping your vector indexes up to date with fresh data. When your pipeline is running, Vectorize will immediately process any changes it finds. Depending on the source connector(s) you are using, Vectorize will either poll the source system, or listen for change notifications from the source system.

There are two top-level execution modes you can use for the scheduler.

Scheduler Execution Mode

Pipelines must be configured using either a "Scheduled" or "Real-time" execution mode. This is not a permanent decision: you can start with a scheduled pipeline and later switch it to real-time mode, or vice versa.

Real-Time Pipelines

A pipeline that is configured in real-time mode is referred to as a "Real-Time Pipeline". These pipelines run continuously until you explicitly stop them. As soon as changes are detected, your vector indexes are updated in near real time (usually within a few seconds).

If you are on the free plan, your real-time pipelines will run until you stop them or until you run out of free hours for the month.

Scheduled Pipelines

A pipeline that is configured in scheduled mode is referred to as a "Scheduled Pipeline". These pipelines will start running automatically based on the configuration that you select.

Regardless of the schedule settings, Vectorize runs your pipeline for one hour immediately to backfill your vector indexes, provided that you have pipeline hours available on the free plan or that you are on a paid plan. After that, your pipeline runs as scheduled.

Scheduled Pipeline Options

When configuring a scheduled pipeline, you must specify how often, and for how long, you want your pipeline to run. You start this process by selecting the Schedule Type.

  1. Schedule Type options include:

  • Manual: Run the pipeline on demand

  • Weekly: Run once a week on a specific day; Vectorize decides the time the pipeline will run

  • Daily: Run every day, including weekends, starting at a set time and ending at a set time

  • Weekdays: Run Monday through Friday, starting at a set time and ending at a set time

  • Custom: Similar behavior to Daily and Weekdays, but you can pick the specific days of the week on which you want the schedule to run

  2. Set Days (for Weekly or Custom types)

    • For Weekly: Choose one day of the week

    • For Custom: Select multiple days as needed

  3. Set Time Range (except for Manual and Weekly types)

    • Start Time: When the pipeline should begin running

      • Start times can be configured in 15-minute increments

    • End Time: When the pipeline should stop running

    • Time ranges must be at least one hour long

    • Pipeline schedules are always in increments of whole hours

  4. Choose Timezone

    • Select your preferred timezone to ensure accurate scheduling

  5. Review Current Schedule

    • The scheduler will display a summary of your selected options

    • It will also show the estimated runtime and any associated costs
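The day and time-range rules listed above can be sketched as a small validation routine. This is an illustrative sketch only, not Vectorize's implementation or API: the `Schedule` class, its field names, the day abbreviations, and the `validate` function are all hypothetical, and the sketch assumes the end time falls later on the same day as the start time.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional, Tuple

# Hypothetical day labels; the real scheduler UI may name days differently.
ALL_DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")
SCHEDULE_TYPES = ("manual", "weekly", "daily", "weekdays", "custom")


@dataclass(frozen=True)
class Schedule:
    """Hypothetical model of a scheduled-pipeline configuration."""
    schedule_type: str
    days: Tuple[str, ...] = ()      # used by "weekly" and "custom" only
    start: Optional[time] = None    # not used by "manual" or "weekly"
    end: Optional[time] = None
    timezone: str = "UTC"


def validate(s: Schedule) -> List[str]:
    """Return a list of rule violations; an empty list means the schedule is valid."""
    if s.schedule_type not in SCHEDULE_TYPES:
        return ["unknown schedule type"]

    errors = []
    if any(day not in ALL_DAYS for day in s.days):
        errors.append("unknown day of the week")
    if s.schedule_type == "weekly" and len(s.days) != 1:
        errors.append("weekly schedules need exactly one day")
    if s.schedule_type == "custom" and not s.days:
        errors.append("custom schedules need at least one day")

    # Manual and Weekly schedules have no user-defined time range.
    if s.schedule_type in ("manual", "weekly"):
        return errors

    if s.start is None or s.end is None:
        errors.append("start and end times are required")
        return errors

    # Start times can be configured in 15-minute increments.
    if s.start.minute % 15 != 0 or s.start.second != 0:
        errors.append("start time must be on a 15-minute increment")

    # Assumption: the range does not cross midnight.
    minutes = (s.end.hour * 60 + s.end.minute) - (s.start.hour * 60 + s.start.minute)
    if minutes < 60:
        errors.append("time range must be at least one hour")
    elif minutes % 60 != 0:
        errors.append("time range must be a whole number of hours")
    return errors
```

For example, `validate(Schedule("daily", start=time(9, 0), end=time(17, 0)))` returns an empty list, while a 9:15 to 10:00 range is rejected for being shorter than one hour.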

Note: When calculating costs, the scheduler summary takes into account free hours for free plan users and the hourly rate for paid users. These estimates are reasonably accurate, but factors such as the number of days in a month will cause minor cost fluctuations from month to month.
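To see why the estimate shifts from month to month, the arithmetic can be sketched as follows. This is a rough illustration under stated assumptions, not Vectorize's billing logic: the function names are hypothetical, the hourly rate and free-hour values are placeholders rather than real prices, and the sketch ignores the initial one-hour backfill run.

```python
import calendar


def scheduled_hours(year, month, weekdays, hours_per_day):
    """Count pipeline hours in one month for a schedule that runs
    hours_per_day on the given weekdays (0 = Monday ... 6 = Sunday)."""
    days_in_month = calendar.monthrange(year, month)[1]
    run_days = sum(
        1
        for day in range(1, days_in_month + 1)
        if calendar.weekday(year, month, day) in weekdays
    )
    return run_days * hours_per_day


def estimated_cost(hours, hourly_rate, free_hours=0):
    """Bill only the hours beyond the free allowance (placeholder pricing)."""
    billable_hours = max(0, hours - free_hours)
    return billable_hours * hourly_rate


every_day = {0, 1, 2, 3, 4, 5, 6}
feb = scheduled_hours(2025, 2, every_day, hours_per_day=8)  # 28 days
mar = scheduled_hours(2025, 3, every_day, hours_per_day=8)  # 31 days
print(feb, mar)  # 224 248
```

With the same Daily schedule of 8 hours, February 2025 totals 224 hours and March 2025 totals 248 hours, so the monthly cost differs even though the configuration is unchanged.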

Note for Free Plan Users

If you are using the Vectorize free plan and you exceed your free hours for the month, you will need to manually restart your pipeline at the start of the next month, when your hours refresh.
