
Configuring Neo4j Graph Database Connector

The Neo4j Graph Database Connector (Beta) enables you to integrate Neo4j as a destination for your pipeline data. This connector stores your vectorized content in Neo4j, laying the foundation for future graph-enhanced retrieval capabilities.

Prerequisites

Before configuring Neo4j in Vectorize, you'll need:

  1. Neo4j Instance: Either Neo4j Aura (cloud) or a self-hosted Neo4j database
  2. Connection Details:
    • Connection URI (e.g., neo4j+s://xxxxxxxx.databases.neo4j.io for Neo4j Aura, or neo4j://localhost:7687 for a local instance)
    • Username (typically neo4j for new instances)
    • Password

Configuring the Integration

  1. From the main menu, click on Vector Databases
  2. Click New Vector Database Integration
  3. Select the Neo4j card (Beta)
  4. Enter your configuration details

Authentication Configuration

Provide the following connection details:

  1. Integration Name: Enter a descriptive name for this Neo4j integration
  2. Connection URI: Your Neo4j instance connection string
    • For Neo4j Aura: neo4j+s://xxxxxxxx.databases.neo4j.io
    • For local development: neo4j://localhost:7687
    • For self-hosted: neo4j://your-server:7687
  3. Username: Your Neo4j username (default is usually neo4j)
  4. Password: Your Neo4j instance password
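Before pasting a Connection URI into the form, it can help to check that it parses as expected. The following is a minimal sketch using only the Python standard library; the function name describe_neo4j_uri is hypothetical and is not part of Vectorize or the Neo4j driver. It assumes the URI schemes accepted by the official Neo4j drivers (neo4j, bolt, and their +s and +ssc variants) and the default Bolt port 7687.

```python
from urllib.parse import urlsplit

# URI schemes accepted by the official Neo4j drivers. The "+s" variants
# request TLS with a CA-signed certificate; "+ssc" also accepts self-signed ones.
ENCRYPTED_SCHEMES = {"neo4j+s", "neo4j+ssc", "bolt+s", "bolt+ssc"}
PLAIN_SCHEMES = {"neo4j", "bolt"}
DEFAULT_BOLT_PORT = 7687


def describe_neo4j_uri(uri: str) -> dict:
    """Split a Neo4j connection URI and report whether it requests TLS."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in ENCRYPTED_SCHEMES | PLAIN_SCHEMES:
        raise ValueError(f"Unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("The URI must include a host name")
    return {
        "scheme": scheme,
        "host": parts.hostname,
        # Fall back to the default Bolt port when the URI omits one.
        "port": parts.port or DEFAULT_BOLT_PORT,
        "encrypted": scheme in ENCRYPTED_SCHEMES,
    }
```

For example, describe_neo4j_uri("neo4j://localhost:7687") reports an unencrypted connection to localhost on port 7687, while an Aura-style neo4j+s:// URI is reported as encrypted.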

Pipeline Configuration

When using Neo4j in a pipeline, you'll configure:

Label Name: A unique label for this pipeline's data nodes

  • This label organizes your data in the graph
  • Each pipeline should use a distinct label to separate its data
  • If the label doesn't exist, it will be created automatically
  • Example: DocumentChunks, ProductDocs, CustomerSupport
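One way to confirm that a pipeline has written data under its label is to count the nodes carrying that label. The sketch below only builds the Cypher text; the helper name count_nodes_query is hypothetical, and the query makes no assumption about which properties the connector stores on each node. The resulting string could be run in Neo4j Browser or through a Neo4j driver.

```python
def count_nodes_query(label: str) -> str:
    """Return a Cypher query that counts the nodes carrying the given label."""
    if not label or "`" in label:
        raise ValueError("Label must be non-empty and must not contain backticks")
    # Backticks let Cypher accept labels that contain spaces or other
    # special characters; they are harmless for plain PascalCase labels.
    return f"MATCH (n:`{label}`) RETURN count(n) AS nodeCount"


print(count_nodes_query("DocumentChunks"))
```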

Using Neo4j with Vectorize

Neo4j functions as a destination connector where:

  • Your pipeline data is stored in the Neo4j graph database
  • Documents are organized using the label you specify
  • The connector supports the standard Vectorize retrieval endpoint

To retrieve data from your Neo4j-backed pipeline, use the standard retrieval endpoint:

curl --location 'https://client.app.vectorize.io/api/gateways/service/{org-id}/{pipeline-id}/retrieve' \
--header 'Content-Type: application/json' \
--header 'Authorization: <token>' \
--data '{
"question": "What are the key features?",
"numResults": 5,
"rerank": true
}'
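The same request can be sketched in Python with the standard library. This mirrors the curl command above field for field; the helper name build_retrieval_request is hypothetical, and the org ID, pipeline ID, and token are placeholders you would supply. The shape of the response is not described on this page, so the sketch stops at building the request.

```python
import json
import urllib.request

BASE_URL = "https://client.app.vectorize.io/api/gateways/service"


def build_retrieval_request(org_id, pipeline_id, token, question,
                            num_results=5, rerank=True):
    """Build (but do not send) a POST request to the retrieval endpoint."""
    url = f"{BASE_URL}/{org_id}/{pipeline_id}/retrieve"
    payload = {
        "question": question,
        "numResults": num_results,
        "rerank": rerank,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": token,
        },
        method="POST",
    )


# To send the request, pass it to urllib.request.urlopen, for example:
#   req = build_retrieval_request(org_id, pipeline_id, token, "What are the key features?")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```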

Note: While Neo4j is a graph database, the current integration uses it as a vector store. Graph-specific retrieval features may be added in future releases.

Best Practices

Label Naming

  • Use descriptive, domain-specific labels
  • Follow Neo4j naming conventions (PascalCase)
  • Separate different data types with different labels
  • Examples: TechnicalDocs, CustomerTickets, ProductSpecs
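If label names are derived from free-form pipeline or project names, a small helper can normalize them to PascalCase. This is an illustrative sketch; to_pascal_case_label is a hypothetical name, and the rule that a label should start with a letter is a conservative assumption rather than a Vectorize requirement.

```python
import re


def to_pascal_case_label(name: str) -> str:
    """Convert a free-form name such as 'technical docs' to 'TechnicalDocs'."""
    # Split on anything that is not an ASCII letter or digit.
    words = re.findall(r"[A-Za-z0-9]+", name)
    if not words:
        raise ValueError("Name must contain at least one letter or digit")
    # Upper-case only the first character so existing capitals are preserved.
    label = "".join(word[0].upper() + word[1:] for word in words)
    if label[0].isdigit():
        raise ValueError("Label should start with a letter")
    return label
```

With this helper, "customer-tickets" becomes CustomerTickets, and a name that is already in PascalCase, such as ProductSpecs, is returned unchanged.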

Data Organization

  • Use separate pipelines (and labels) for different data domains
  • Each pipeline's data will be isolated under its own label
  • Consider your future graph structure needs when organizing data

Troubleshooting

Connection Issues

  • Connection refused: Verify your Neo4j instance is running and accessible
  • Authentication failed: Check the username and password, and ensure the user has appropriate permissions
  • SSL/TLS errors: For Neo4j Aura, use the neo4j+s:// scheme; for a local instance without TLS, use neo4j://
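For a "connection refused" error, a quick first check is whether the Bolt port accepts TCP connections at all. The sketch below uses only the standard library; can_reach is a hypothetical helper, and 7687 is the default Bolt port. Note that it tests reachability from the machine running the script, not from Vectorize, so a success here does not rule out a firewall or allow-list problem on the Vectorize side.

```python
import socket


def can_reach(host: str, port: int = 7687, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A False result for a self-hosted instance usually means the database is not running, is listening on a different port, or is blocked by a firewall.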

Connector Status

  • Check the Vector Databases page in the Vectorize dashboard to verify your connector status
  • If the connector shows an error, review your connection settings
  • Ensure your Neo4j instance allows connections from Vectorize IP addresses

What's Next?

For Neo4j-specific documentation and details on the Cypher query language, visit the Neo4j Documentation.
