Configuring Neo4j Graph Database Connector
The Neo4j Graph Database Connector (Beta) enables you to integrate Neo4j as a destination for your pipeline data. This connector stores your vectorized content in Neo4j, laying the foundation for future graph-enhanced retrieval capabilities.
Prerequisites
Before configuring Neo4j in Vectorize, you'll need:
- Neo4j Instance: Either Neo4j Aura (cloud) or a self-hosted Neo4j database
- Connection Details:
  - Connection URI (e.g., neo4j://your-instance.neo4j.io or neo4j://localhost:7687)
  - Username (typically neo4j for new instances)
  - Password
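Before entering these details in Vectorize, it can help to confirm that they work from your own machine. The sketch below assumes the official Neo4j Python driver is installed (pip install neo4j). The function name check_neo4j_credentials is illustrative and is not part of Vectorize.

```python
def check_neo4j_credentials(uri: str, username: str, password: str) -> None:
    """Raise an exception if the URI or credentials do not work.

    Assumes the official `neo4j` Python driver is installed. It is imported
    inside the function so this snippet loads even where the driver is absent.
    """
    from neo4j import GraphDatabase

    # The driver is a context manager; leaving the block closes its connections.
    with GraphDatabase.driver(uri, auth=(username, password)) as driver:
        # Opens a connection and raises (for example ServiceUnavailable or
        # AuthError from neo4j.exceptions) if the server cannot be reached
        # or rejects the credentials.
        driver.verify_connectivity()
```

Calling check_neo4j_credentials("neo4j://localhost:7687", "neo4j", "<password>") returns silently when the connection succeeds. A successful check from your machine does not guarantee that Vectorize can reach the same instance.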
Configuring the Integration
- From the main menu, click on Vector Databases
- Click New Vector Database Integration
- Select the Neo4j card (Beta)
- Enter your configuration details
Authentication Configuration
Provide the following connection details:
- Integration Name: Enter a descriptive name for this Neo4j integration
- Connection URI: Your Neo4j instance connection string
  - For Neo4j Aura: neo4j+s://xxxxxxxx.databases.neo4j.io
  - For local development: neo4j://localhost:7687
  - For self-hosted: neo4j://your-server:7687
- Username: Your Neo4j username (default is usually neo4j)
- Password: Your Neo4j instance password
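The URI scheme determines whether the connection is encrypted, so a quick sanity check on the URI can catch mistakes before you save the integration. The following is a minimal sketch using only the Python standard library. The helper name describe_neo4j_uri is hypothetical, and the scheme list reflects what Neo4j drivers accept in general; whether Vectorize accepts schemes other than the neo4j:// and neo4j+s:// forms shown above is an assumption you should verify.

```python
from urllib.parse import urlsplit

# URI schemes understood by Neo4j drivers. Only neo4j:// and neo4j+s://
# appear in this guide; the others are listed for completeness.
ENCRYPTED_SCHEMES = {"neo4j+s", "neo4j+ssc", "bolt+s", "bolt+ssc"}
PLAIN_SCHEMES = {"neo4j", "bolt"}
DEFAULT_BOLT_PORT = 7687


def describe_neo4j_uri(uri: str) -> dict:
    """Split a Neo4j connection URI into host, port, and TLS usage."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in ENCRYPTED_SCHEMES | PLAIN_SCHEMES:
        raise ValueError(f"Unsupported Neo4j URI scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"No host found in URI: {uri!r}")
    return {
        "host": parts.hostname,
        # Fall back to the default Bolt port when the URI omits one.
        "port": parts.port or DEFAULT_BOLT_PORT,
        "encrypted": scheme in ENCRYPTED_SCHEMES,
    }
```

For the Aura-style URI above, this reports an encrypted connection on port 7687; for neo4j://localhost:7687 it reports an unencrypted one.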
Pipeline Configuration
When using Neo4j in a pipeline, you'll configure:
Label Name: A unique label for this pipeline's data nodes
- This label organizes your data in the graph
- Each pipeline should use a distinct label to separate its data
- If the label doesn't exist, it will be created automatically
- Examples: DocumentChunks, ProductDocs, CustomerSupport
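One way to confirm that a pipeline has written data under its label is to count the nodes that carry it, for example by running a query in Neo4j Browser. The sketch below builds such a query. count_nodes_query is a hypothetical helper, and because this guide does not document which properties the connector stores on each node, the query only counts nodes.

```python
def count_nodes_query(label: str) -> str:
    """Build a read-only Cypher query that counts nodes carrying `label`."""
    if not label:
        raise ValueError("label must not be empty")
    # Cypher quotes identifiers with backticks; a literal backtick is doubled.
    quoted = "`" + label.replace("`", "``") + "`"
    return f"MATCH (n:{quoted}) RETURN count(n) AS nodes"
```

A count of zero after a pipeline run suggests the label name in the query does not match the one configured for the pipeline.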
Using Neo4j with Vectorize
Neo4j functions as a destination connector where:
- Your pipeline data is stored in the Neo4j graph database
- Documents are organized using the label you specify
- The connector supports the standard Vectorize retrieval endpoint
To retrieve data from your Neo4j-backed pipeline, use the standard retrieval endpoint:
curl --location 'https://client.app.vectorize.io/api/gateways/service/{org-id}/{pipeline-id}/retrieve' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: <token>' \
  --data '{
    "question": "What are the key features?",
    "numResults": 5,
    "rerank": true
  }'
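If you prefer to call the endpoint from code, the same request can be sketched in Python with the standard library. The URL, headers, and body fields mirror the curl command above; the function name build_retrieval_request and its parameter names are illustrative, and the response format is not described in this guide.

```python
import json
import urllib.request

BASE_URL = "https://client.app.vectorize.io/api/gateways/service"


def build_retrieval_request(org_id, pipeline_id, token, question,
                            num_results=5, rerank=True):
    """Build (but do not send) the POST request shown in the curl command."""
    body = {"question": question, "numResults": num_results, "rerank": rerank}
    return urllib.request.Request(
        url=f"{BASE_URL}/{org_id}/{pipeline_id}/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": token},
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body. Building the request separately makes it easy to inspect the URL and payload before anything goes over the network.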
Note: While Neo4j is a graph database, the current integration uses it as a vector store. Graph-specific retrieval features may be added in future releases.
Best Practices
Label Naming
- Use descriptive, domain-specific labels
- Follow the Neo4j naming convention for node labels (PascalCase)
- Separate different data types with different labels
- Examples: TechnicalDocs, CustomerTickets, ProductSpecs
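If label names are generated from team or project names, a small check can keep them consistent with the PascalCase convention. The sketch below is one possible approach; both helper names are hypothetical, and the pattern is deliberately stricter than what Neo4j itself accepts as a label.

```python
import re

# PascalCase as used in the examples above: an uppercase first letter,
# then letters and digits only (no spaces, underscores, or hyphens).
_PASCAL_CASE = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_pascal_case_label(label: str) -> bool:
    """Return True if the whole label matches the PascalCase pattern."""
    return _PASCAL_CASE.fullmatch(label) is not None


def to_pascal_case(text: str) -> str:
    """Turn free text such as 'customer tickets' into 'CustomerTickets'.

    The result can still start with a digit if the input does, so check it
    with is_pascal_case_label before using it.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(word[:1].upper() + word[1:] for word in words)
```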
Data Organization
- Use separate pipelines (and labels) for different data domains
- Each pipeline's data will be isolated under its own label
- Consider your future graph structure needs when organizing data
Troubleshooting
Connection Issues
- Connection refused: Verify your Neo4j instance is running and accessible
- Authentication failed: Check the username and password, and ensure the user has appropriate permissions
- SSL/TLS errors: For Neo4j Aura, use the neo4j+s:// scheme; for a local instance, use neo4j://
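When diagnosing a refused connection, a plain TCP check can separate network problems from authentication or TLS problems. The following is a minimal sketch using only the Python standard library. The function name can_reach is illustrative, and port 7687 is the default Bolt port used in the examples in this guide. It tests reachability only from the machine where it runs, not from Vectorize.

```python
import socket


def can_reach(host: str, port: int = 7687, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If this returns False for your host, check that the instance is running and that firewalls allow the port. If it returns True but the connector still fails, the cause is more likely the URI scheme or the credentials.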
Connector Status
- Check the Vector Databases page in the Vectorize dashboard to verify your connector status
- If the connector shows an error, review your connection settings
- Ensure your Neo4j instance allows connections from Vectorize IP addresses
What's Next?
- Using the Retrieval Endpoint - Learn about standard retrieval features
- Advanced Retrieval - Explore query rewriting and reranking
- Understanding Metadata - Learn how metadata becomes graph properties
For Neo4j-specific documentation and the Cypher query language, visit the Neo4j Documentation.