What is Retrieval in AI?
Retrieval is the process of finding and fetching relevant documents or information from a knowledge base to augment AI model responses. It's a core component of RAG (Retrieval-Augmented Generation) systems.
Retrieval Methods
Sparse Retrieval
- Keyword-based (BM25, TF-IDF)
- Fast and interpretable
- Exact match focused
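As a rough illustration of keyword-based scoring, here is a minimal BM25 sketch in plain Python. The function name `bm25_scores`, the toy documents, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for illustration, and the IDF term uses one common smoothed variant; production systems normally rely on a search engine or library instead.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # exact match only: unseen terms contribute nothing
            # Smoothed IDF (one common variant; an assumption here).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (hypothetical), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "vector databases store embeddings".split(),
]
scores = bm25_scores("cat mat".split(), docs)
```

Only the first document scores above zero. The second contains "cats" but not "cat", which shows the exact-match focus noted above.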
Dense Retrieval
- Embedding-based
- Semantic similarity
- Vector databases
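The idea behind embedding-based search can be sketched with cosine similarity over a small in-memory index. The three-dimensional vectors and document IDs below are made up for illustration; a real system would get its embeddings from a model and store them in a vector database, typically with approximate nearest-neighbor search rather than this brute-force scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dense_search(query_vec, index, top_k=2):
    """Return the top_k (doc_id, similarity) pairs, most similar first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical embeddings; real models emit hundreds of dimensions.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.7, 0.3, 0.1],
    "doc_sql": [0.0, 0.2, 0.9],
}
hits = dense_search([1.0, 0.0, 0.0], index, top_k=2)
```

Here `hits` ranks `doc_cats` ahead of `doc_dogs`, and `doc_sql` falls outside the top 2 because its vector points in a different direction from the query.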
Hybrid Retrieval
- Combines sparse and dense
- Best of both approaches
- Improved recall
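One common way to combine a sparse and a dense ranking is reciprocal rank fusion (RRF). It is used here as an illustrative choice and is not the only option. The sketch below assumes each retriever returns an ordered list of document IDs; the constant `k=60` is a conventional default and the IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into a single ranking."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

sparse_ranking = ["a", "b", "c"]  # e.g. from a keyword retriever
dense_ranking = ["c", "a", "d"]   # e.g. from an embedding retriever
fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
```

Documents found by both retrievers ("a" and "c") rise to the top, while documents found by only one ("b" and "d") are still kept, which is where the recall gain comes from.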
Retrieval Pipeline
Query → Query Processing → Retrieval → Ranking → Results
The Retrieval stage queries the Vector DB / Search Index.
Key Metrics
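Recall@K
- Fraction of all relevant docs that appear in the top K results
Precision@K
- Fraction of the top K results that are relevant
MRR (Mean Reciprocal Rank)
- Average over queries of 1/rank of the first relevant result
NDCG (Normalized Discounted Cumulative Gain)
- Graded relevance of the ranking, discounted by position and normalized by the ideal ordering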
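These metrics are short enough to compute by hand. The following is a minimal sketch for a single query, assuming `ranked` is the ordered list of retrieved doc IDs, `relevant` is the set of relevant IDs, and `gains` maps doc IDs to graded relevance. All names and data are illustrative.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant docs that appear in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return len(set(ranked[:k]) & relevant) / k

def reciprocal_rank(ranked, relevant):
    """1/rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, gains, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    dcg = sum(gains.get(doc_id, 0) / math.log2(i + 1)
              for i, doc_id in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d5"}
gains = {doc_id: 1 for doc_id in relevant}  # binary relevance

recall = recall_at_k(ranked, relevant, 3)        # one of three relevant docs found
precision = precision_at_k(ranked, relevant, 3)  # one of three results relevant
rr = reciprocal_rank(ranked, relevant)           # first relevant doc at rank 2
ndcg = ndcg_at_k(ranked, gains, 4)
```

MRR is the mean of `reciprocal_rank` across a set of queries; the other metrics are likewise usually averaged over an evaluation set.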
Chunking Strategies
Fixed Size
- Simple implementation
- May split context
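A fixed-size chunker can be sketched in a few lines. The overlap parameter is an addition not mentioned above; it is a common way to soften the split-context problem, since text near a boundary appears in both neighboring chunks. Sizes here are in characters for simplicity, whereas real pipelines often count tokens.

```python
def chunk_fixed(text, size=200, overlap=50):
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks

chunks = chunk_fixed("abcdefghij", size=4, overlap=1)
```

With these toy values the result is `["abcd", "defg", "ghij"]`: each chunk starts with the last character of the previous one.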
Semantic
- Preserve meaning
- Variable sizes
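As a simplified stand-in for semantic chunking, the sketch below splits on sentence boundaries and packs whole sentences into chunks up to a character budget. This is an assumption made for illustration: fuller approaches often compare sentence embeddings to find topic shifts. It does show the two properties listed above, since no sentence is cut in half and chunk sizes vary.

```python
import re

def chunk_by_sentences(text, max_chars=80):
    """Group whole sentences into chunks of roughly max_chars characters."""
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

text = "Retrieval finds documents. Ranking orders them. Reranking refines the order."
chunks = chunk_by_sentences(text, max_chars=50)
```

The first two sentences fit in one chunk and the third starts a new one. A single sentence longer than `max_chars` is kept whole rather than split.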
Hierarchical
- Parent-child chunks
- Summary + details
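A parent-child layout can be sketched as small child chunks that each point back to a larger parent. In this hypothetical structure the children are what gets searched and the parent is what gets returned for context. Here the parent holds the full section text, though a summary could be stored there instead. The `Chunk` class, the ID scheme, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    chunk_id: str
    text: str
    parent_id: Optional[str] = None  # None marks a top-level (parent) chunk

def build_hierarchy(doc_id, sections):
    """Create one parent chunk per section and one child chunk per paragraph."""
    chunks = {}
    for s_idx, paragraphs in enumerate(sections):
        parent_id = f"{doc_id}/s{s_idx}"
        chunks[parent_id] = Chunk(parent_id, "\n".join(paragraphs))
        for p_idx, paragraph in enumerate(paragraphs):
            child_id = f"{parent_id}/p{p_idx}"
            chunks[child_id] = Chunk(child_id, paragraph, parent_id)
    return chunks

def expand_to_parent(chunks, chunk_id):
    """Given a matched chunk, return its parent (or itself if it has none)."""
    chunk = chunks[chunk_id]
    return chunks[chunk.parent_id] if chunk.parent_id else chunk

chunks = build_hierarchy("doc", [["Intro para.", "More intro."], ["Details para."]])
context = expand_to_parent(chunks, "doc/s0/p1")
```

A match on the small chunk `doc/s0/p1` is expanded to the whole section `doc/s0`, so the downstream model sees the surrounding text rather than an isolated paragraph.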
Ranking and Reranking
Initial Retrieval
- Fast, approximate ranking
Reranking
- Cross-encoder models
- Better relevance scoring
- More compute intensive
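The two-stage pattern can be sketched as a function that rescores a short candidate list with a more expensive scorer. A real cross-encoder is a model that reads the query and document together. The `overlap_score` function below is only a cheap stand-in so the example runs without dependencies, and every name here is hypothetical.

```python
def rerank(query, candidates, score_fn, top_n=3):
    """Rescore first-stage candidates with score_fn and keep the best top_n."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored[:top_n]]

def overlap_score(query, doc):
    """Stand-in scorer: fraction of query words that occur in the document.

    A cross-encoder model would be plugged in here instead.
    """
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words)

# Pretend these came back from a fast first-stage retriever, in its order.
candidates = [
    "vector databases store embeddings",
    "how to rerank search results",
    "rerank results with cross-encoder models",
]
top = rerank("rerank results with models", candidates, overlap_score, top_n=2)
```

Because the scorer runs once per query-document pair, cost grows with the number of candidates. That is why reranking is usually applied only to the top results of the initial retrieval rather than to the whole index.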
Best Practices
- Tune chunk size for the use case
- Use hybrid retrieval
- Implement reranking
- Monitor retrieval quality with metrics such as Recall@K and MRR
- Update the index regularly