Upload Files to File Upload Connectors

Learn how to programmatically manage files in your File Upload connectors using the Vectorize API.

Before You Start

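This guide assumes you've already set up your Vectorize API client and have your API key and organization ID at hand. The examples assume the client module (`v` in Python, `vectorize` in Node.js), a configured API client (`api` in Python, `apiConfig` in Node.js), and the variables `organization_id` and `source_connector_id` (`organizationId` and `sourceConnectorId` in Node.js) are already defined.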

What are File Upload Connectors?

File Upload connectors allow you to manually upload files for processing by your RAG pipelines. Unlike automated connectors that sync from external sources (like AWS S3 or Google Drive), File Upload connectors give you direct control over which files to process and when.

List Files in a Connector

Use the Uploads API to list all files currently in your connector.

Python

# Create API instance
uploads_api = v.UploadsApi(api)

# List files
try:
    response = uploads_api.get_upload_files_from_connector(organization_id, source_connector_id)
    print(f"Found {len(response.files)} files in connector")

    # Print one line per file, plus its metadata if any is attached
    for file in response.files:
        print(f"  📄 {file.name} ({file.size:,} bytes, Uploaded: {file.last_modified})")
        if file.metadata:
            print(f"    Metadata: {file.metadata}")
        print()

except Exception as e:
    print(f"Error listing files: {e}")

Node.js

const { UploadsApi } = vectorize;

// Create API instance
const uploadsApi = new UploadsApi(apiConfig);

// List files
try {
    const response = await uploadsApi.getUploadFilesFromConnector({
        organizationId: organizationId,
        connectorId: sourceConnectorId
    });

    console.log(`Found ${response.files.length} files in connector`);

    // Print one line per file, plus its metadata if any is attached
    for (const file of response.files) {
        console.log(`  📄 ${file.name} (${file.size.toLocaleString()} bytes, Uploaded: ${file.lastModified})`);
        if (file.metadata) {
            // Serialize so an object does not print as "[object Object]"
            console.log(`    Metadata: ${JSON.stringify(file.metadata)}`);
        }
        console.log();
    }

} catch (error) {
    console.log(`Error listing files: ${error.message}`);
}
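
Once you have the listing, a small helper can summarize it. The following is a minimal sketch, assuming each entry exposes `name` and `size` as in the listing code above. `summarize_files` is a hypothetical helper, not part of the Vectorize client, and the `SimpleNamespace` objects only stand in for real response entries.

```python
from types import SimpleNamespace


def summarize_files(files):
    """Return (count, total_bytes, largest_name) for a list of file entries.

    Hypothetical helper: assumes each entry has `name` and `size` attributes,
    like the entries in `response.files` in the listing example.
    """
    files = list(files)
    total_bytes = sum(f.size for f in files)
    largest = max(files, key=lambda f: f.size, default=None)
    return len(files), total_bytes, largest.name if largest else None


# Stand-in data; in practice, pass response.files from the listing call.
sample = [
    SimpleNamespace(name="report.pdf", size=120_000),
    SimpleNamespace(name="notes.txt", size=2_048),
]
count, total_bytes, largest_name = summarize_files(sample)
print(f"{count} files, {total_bytes:,} bytes total, largest: {largest_name}")
```

An empty connector yields a count of 0, a total of 0 bytes, and `None` for the largest file name.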

Upload a File

Uploading a file to a connector is a two-step process:

1. Request a pre-signed upload URL from the API.
2. Upload your file to that URL with an HTTP PUT request.

Python

import urllib3
import os
import json

# Create API instance
uploads_api = v.UploadsApi(api)

# File details (placeholder path; replace with your own file)
file_path = "path/to/document.pdf"
file_name = os.path.basename(file_path)
content_type = "application/pdf"  # Set appropriate content type

# Optional metadata - all values as strings
metadata = {
    "category": "research",
    "tags": "machine-learning,2024",  # Store as comma-separated string
    "processed": "false"  # Store boolean as string
}

try:
    # Step 1: Get upload URL
    start_response = uploads_api.start_file_upload_to_connector(
        organization_id,
        source_connector_id,
        start_file_upload_to_connector_request=v.StartFileUploadToConnectorRequest(
            name=file_name,
            content_type=content_type,
            metadata=json.dumps(metadata) if metadata else None  # Convert to JSON string
        )
    )

    # Step 2: Upload file to the URL
    http = urllib3.PoolManager()

    with open(file_path, "rb") as f:
        response = http.request(
            "PUT",
            start_response.upload_url,
            body=f,
            headers={
                "Content-Type": content_type,
                "Content-Length": str(os.path.getsize(file_path))
            }
        )

    if response.status != 200:
        print(f"Upload failed: {response.data}")
    else:
        print(f"Successfully uploaded {file_name}")

except Exception as e:
    print(f"Error during upload: {e}")

Node.js

const fs = require("fs");
const path = require("path");
const { UploadsApi } = vectorize;

// Create API instance
const uploadsApi = new UploadsApi(apiConfig);

// File details (placeholder path; replace with your own file)
const filePath = "path/to/document.pdf";
const fileName = path.basename(filePath);
const contentType = "application/pdf"; // Set appropriate content type

// Optional metadata - all values as strings
const metadata = {
    "category": "research",
    "tags": "machine-learning,2024", // Store as comma-separated string
    "processed": "false" // Store boolean as string
};

try {
    // Step 1: Get upload URL
    const startRequest = {
        name: fileName,
        contentType: contentType,
        metadata: metadata ? JSON.stringify(metadata) : undefined // Convert to JSON string
    };

    const startResponse = await uploadsApi.startFileUploadToConnector({
        organizationId: organizationId,
        connectorId: sourceConnectorId,
        startFileUploadToConnectorRequest: startRequest
    });

    // Step 2: Upload file to the URL
    const fileBuffer = fs.readFileSync(filePath);
    const fileStats = fs.statSync(filePath);

    const uploadResponse = await fetch(startResponse.uploadUrl, {
        method: 'PUT',
        body: fileBuffer,
        headers: {
            'Content-Type': contentType,
            'Content-Length': fileStats.size.toString()
        }
    });

    if (uploadResponse.status !== 200) {
        const errorText = await uploadResponse.text();
        console.log(`Upload failed: ${errorText}`);
    } else {
        console.log(`Successfully uploaded ${fileName}`);
    }

} catch (error) {
    console.log(`Error during upload: ${error.message}`);
}
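
The upload examples hard-code the content type. When you upload mixed file types, you may prefer to derive it from the file name. The following is a minimal sketch using Python's standard `mimetypes` module; `guess_content_type` is a hypothetical helper, and the fallback to `application/octet-stream` is an assumption rather than documented connector behavior.

```python
import mimetypes


def guess_content_type(file_path, default="application/octet-stream"):
    """Guess a MIME type from the file extension, falling back to `default`.

    Hypothetical helper: the guess is based on the extension only,
    not on the file's actual contents.
    """
    content_type, _encoding = mimetypes.guess_type(file_path)
    return content_type or default


print(guess_content_type("paper.pdf"))
```

The result can be passed as `content_type` in both the start-upload request and the `Content-Type` header of the PUT request, so the two values stay in sync.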

Note: If a file with the same name already exists in the connector, it will be overwritten.
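
Because uploads overwrite by name, you may want to check the current listing before uploading. The following is a minimal sketch; `will_overwrite` is a hypothetical helper, and it assumes names are compared exactly and case-sensitively, which this guide does not confirm.

```python
def will_overwrite(existing_names, new_name):
    """Return True if `new_name` matches a file name already in the connector.

    Assumption: the connector matches names exactly (case-sensitive).
    """
    return new_name in set(existing_names)


# In practice, build this list from the listing call:
#   existing_names = [f.name for f in response.files]
existing_names = ["report.pdf", "notes.txt"]

for candidate in ("report.pdf", "summary.pdf"):
    if will_overwrite(existing_names, candidate):
        print(f"{candidate}: would replace an existing file")
    else:
        print(f"{candidate}: new file")
```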

Working with Metadata

Metadata allows you to attach additional information to your files that will be preserved throughout processing and can be used for filtering and organization in your RAG pipelines.

Metadata Examples

# Simple key-value pairs
metadata = {
    "department": "engineering",
    "year": 2024,
    "confidential": True
}

# Arrays and nested objects
metadata = {
    "authors": ["John Doe", "Jane Smith"],
    "project": {
        "name": "AI Research",
        "phase": "development"
    },
    "tags": ["ml", "nlp", "research"]
}
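
The upload example earlier stores every metadata value as a string, while the examples here use numbers, booleans, lists, and nested objects. If you want to follow the string-only convention, a conversion step can be sketched as below. `stringify_metadata` is a hypothetical helper; joining lists with commas mirrors the `tags` value in the upload example, while encoding nested objects as JSON text is an assumption of this sketch.

```python
import json


def stringify_metadata(metadata):
    """Return a copy of `metadata` in which every value is a string.

    Hypothetical helper: booleans become "true"/"false", lists become
    comma-separated strings, and nested dicts become JSON text.
    """
    flat = {}
    for key, value in metadata.items():
        if isinstance(value, bool):
            flat[key] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            flat[key] = ",".join(str(item) for item in value)
        elif isinstance(value, dict):
            flat[key] = json.dumps(value, sort_keys=True)
        else:
            flat[key] = str(value)
    return flat


metadata = {
    "department": "engineering",
    "year": 2024,
    "confidential": True,
    "tags": ["ml", "nlp", "research"],
}
# The upload request in this guide takes the metadata as a JSON string.
print(json.dumps(stringify_metadata(metadata)))
```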

Retrieving Files with Metadata

When you list files, the metadata is included in the response:

response = uploads_api.get_upload_files_from_connector(organization_id, source_connector_id)
for file in response.files:
    if file.metadata and file.metadata.get("department") == "engineering":
        print(f"Engineering file: {file.name}")
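
Since metadata is sent as a JSON string at upload time, it can be safer to write filters that accept either a dict or a JSON string. The following is a minimal sketch under that assumption; `metadata_as_dict` and `filter_by_metadata` are hypothetical helpers, and the `SimpleNamespace` objects stand in for entries of `response.files`.

```python
import json
from types import SimpleNamespace


def metadata_as_dict(metadata):
    """Normalize metadata to a dict, whether it arrives as a dict or a JSON string."""
    if not metadata:
        return {}
    if isinstance(metadata, str):
        try:
            parsed = json.loads(metadata)
        except json.JSONDecodeError:
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return dict(metadata)


def filter_by_metadata(files, key, value):
    """Return the entries whose metadata has `key` equal to `value`."""
    return [
        f for f in files
        if metadata_as_dict(getattr(f, "metadata", None)).get(key) == value
    ]


# Stand-in entries; in practice, pass response.files.
files = [
    SimpleNamespace(name="design.pdf", metadata={"department": "engineering"}),
    SimpleNamespace(name="budget.pdf", metadata='{"department": "finance"}'),
    SimpleNamespace(name="misc.txt", metadata=None),
]
for f in filter_by_metadata(files, "department", "engineering"):
    print(f"Engineering file: {f.name}")
```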
