Retrieval Performance
The Retrieval Performance dashboard provides real-time monitoring and analysis of your retrieval system's effectiveness. This guide explains the key metrics and features available in the dashboard.
Why We Measure Top 2 Chunks
Our metrics focus on the top 2 chunks because this gives a reliable signal of system health while avoiding common measurement pitfalls:
Measuring too many chunks can dilute your metrics: poor relevance in lower-ranked chunks pulls the score down without necessarily indicating a system issue.
RAG systems usually form their answers primarily from the most relevant pieces of information. Having high relevance in the top 2 chunks means the LLM is more likely to generate accurate and focused responses.
Focusing on the top 2 chunks therefore keeps the metrics clean and actionable.
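The dilution effect can be illustrated with a small sketch. It assumes, purely for illustration, that a query's score is the plain mean of per-chunk relevance values; the guide does not say how the dashboard aggregates chunk scores, and the function name `mean_top_k` and the sample scores are hypothetical.

```python
def mean_top_k(relevance_scores, k=2):
    """Average the relevance of the k highest-ranked chunks.

    `relevance_scores` is assumed to be ordered by retrieval rank, best first.
    Using a plain mean is an assumption, not the dashboard's documented formula.
    """
    top = relevance_scores[:k]
    if not top:
        raise ValueError("no chunks to score")
    return sum(top) / len(top)


# Hypothetical per-chunk relevance scores for one query, in rank order.
scores = [0.9, 0.8, 0.2, 0.1, 0.0]

top2 = mean_top_k(scores, k=2)  # average of 0.9 and 0.8
top5 = mean_top_k(scores, k=5)  # average of all five scores
```

Here the top 2 chunks average about 0.85, while averaging all five chunks gives about 0.40, even though the chunks most likely to shape the answer are highly relevant.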
Core Metrics
The dashboard displays three primary metrics to help you evaluate retrieval performance:
Non-Rewritten Relevance
Measures the relevance of the top 2 chunks for queries in their original form
Ranges from 0 to 1, where higher values indicate better relevance
Color-coded for quick assessment:
Green (≥ 0.7): High relevance
Yellow (≥ 0.3 and < 0.7): Moderate relevance
Red (< 0.3): Low relevance
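The color thresholds above can be expressed as a small helper, which may be useful if you export scores and want to apply the same bands elsewhere. This is a minimal sketch; the function name `relevance_color` is hypothetical and not part of any dashboard API.

```python
def relevance_color(score):
    """Map a relevance score in [0, 1] to the dashboard's color bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.7:
        return "green"   # high relevance
    if score >= 0.3:
        return "yellow"  # moderate relevance
    return "red"         # low relevance
```

Checking the highest threshold first means a score of exactly 0.7 is green and a score of exactly 0.3 is yellow.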
Rewritten Question Relevance
Shows the relevance of the top 2 chunks for queries after they've been rewritten
Uses the same 0-1 scale and color coding as non-rewritten relevance
Helps evaluate if query rewriting improves retrieval quality
Overall Retrieval Health
Combines all relevance metrics into a single health score
Provides a high-level view of system performance
Uses the same color-coded thresholds to indicate overall health
Calculated as the running average of all available metrics
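One way to read "running average of all available metrics" is sketched below. It assumes that every available metric value counts equally and that a missing value (for example, no rewritten query for a given sample) is simply skipped; the dashboard's exact weighting is not documented here, and the names `RunningAverage`, `overall_health`, and the metric keys are hypothetical.

```python
class RunningAverage:
    """Incrementally maintained mean, updated one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Incremental mean update: shift the mean toward the new value.
        self.mean += (value - self.mean) / self.count
        return self.mean


def overall_health(samples):
    """Average every available metric value across all samples.

    Each sample is a dict of metric name -> score, with None marking a
    metric that is unavailable for that sample (assumed convention).
    Returns None when no metric values are available at all.
    """
    avg = RunningAverage()
    for sample in samples:
        for value in sample.values():
            if value is not None:
                avg.update(value)
    return avg.mean if avg.count else None


# Hypothetical samples: the first query was not rewritten.
samples = [
    {"non_rewritten": 0.8, "rewritten": None},
    {"non_rewritten": 0.6, "rewritten": 0.7},
]
health = overall_health(samples)  # mean of 0.8, 0.6 and 0.7
```

With these samples the health score is about 0.7, which falls on the green threshold.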
Interactive Visualization
The dashboard includes an interactive time series chart that shows how metrics change over time:
View performance trends over the desired time range
Toggle individual metrics by clicking on their cards
Hover over data points to see exact values
Compare rewritten vs non-rewritten performance
Track overall health progression
Usage Monitoring
The dashboard also includes usage statistics to help you track resource utilization:
Retrievals
Shows current usage against your monthly quota
Tracks standard retrieval operations
Helps monitor usage patterns and limits
Advanced Retrievals
Displays usage of enhanced retrieval features
Includes operations like rewritten queries
Helps manage resource allocation
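If you track these usage numbers outside the dashboard, a rough check against the monthly quota might look like the sketch below. It assumes usage accrues at a constant daily rate, which is a simplification, and the function name `quota_usage` and the example figures are hypothetical.

```python
import calendar
from datetime import date


def quota_usage(used, quota, today):
    """Return (fraction of quota used, projected month-end usage).

    The projection is linear: it assumes the rest of the month continues
    at the average daily rate observed so far.
    """
    if quota <= 0:
        raise ValueError("quota must be positive")
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    fraction_used = used / quota
    projected = used / today.day * days_in_month
    return fraction_used, projected


# Hypothetical figures: 5,000 of 10,000 retrievals used by April 10.
fraction, projected = quota_usage(5000, 10000, date(2024, 4, 10))
over_quota = projected > 10000
```

In this example half the quota is already used a third of the way through the month, so the linear projection exceeds the quota.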
Tips for Using the Dashboard
Metric Toggle: Click on any metric card to show/hide its line in the graph
Performance Analysis: Use the time series visualization to identify:
Sudden changes in performance
Impact of system updates
Time-based patterns
Health Monitoring: Keep an eye on the Overall Retrieval Health score for:
System-wide performance issues
Long-term trends
Impact of optimizations
Interpreting Results
When analyzing your metrics, consider:
An Overall Retrieval Health score that stays at or above 0.7 (green) indicates strong retrieval performance
Large gaps between rewritten and non-rewritten relevance suggest opportunities for query optimization
Sudden drops in metrics may indicate underlying issues that need investigation
Usage patterns can help with capacity planning and resource allocation
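Two of these checks can be automated on exported metric values. The sketch below is illustrative only: the function names are hypothetical, and the 0.2 drop threshold is an arbitrary assumption rather than a value the dashboard uses.

```python
def rewrite_gap(non_rewritten, rewritten):
    """Positive when rewriting improves relevance, negative when it hurts."""
    return rewritten - non_rewritten


def sudden_drops(series, threshold=0.2):
    """Return the indices where a metric fell by more than `threshold`
    compared with the previous data point."""
    return [
        i
        for i in range(1, len(series))
        if series[i - 1] - series[i] > threshold
    ]


# Hypothetical daily Overall Retrieval Health values.
history = [0.8, 0.78, 0.5, 0.52]

gap = rewrite_gap(non_rewritten=0.5, rewritten=0.75)
drops = sudden_drops(history)
```

Here `drops` flags the third data point (index 2), where the score fell from 0.78 to 0.5, and `gap` shows rewritten queries scoring 0.25 higher than the originals.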
Best Practices
Regular Monitoring: Check the dashboard regularly to catch issues early
Comparative Analysis: Compare rewritten vs non-rewritten metrics to optimize your system
Trend Analysis: Use the time series data to identify patterns and make informed improvements