RAG with BM25 Sparse Retrieval using Elasticsearch
Author: Venkata Sudhakar
RAG with BM25 sparse retrieval gives ShopMax India a fast, interpretable search layer for its product knowledge base. Dense vector search excels at semantic similarity, but BM25 shines when customers use exact product model numbers or brand names such as 'Sony WH-1000XM5' or 'Dell XPS 9530', terms that dense embeddings sometimes dilute. Because BM25 scores literal term overlap, a document containing the exact model number ranks highly even when it is semantically distant from the rest of the query.
BM25 scores documents based on term frequency, inverse document frequency, and document length normalization. In Python, the rank_bm25 library provides a lightweight BM25 implementation that operates on tokenized documents. For production systems, Elasticsearch and OpenSearch expose BM25 as their default full-text scoring algorithm and can be queried via their Python clients alongside a vector similarity field for hybrid ranking.
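Written out, one common form of the scoring function combines the three ingredients named above. Here f(q_i, D) is the frequency of query term q_i in document D, |D| is the document length in tokens, avgdl is the average document length in the corpus, N is the number of documents, and n(q_i) is the number of documents containing q_i. The IDF shown is the smoothed variant used by Lucene-based engines; other implementations, rank_bm25 included, may differ in IDF smoothing and in their default k_1 and b (Elasticsearch defaults to k_1 = 1.2 and b = 0.75).

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{|Q|} \mathrm{IDF}(q_i)\,
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(1 + \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}\right)
```

The parameter k_1 controls how quickly repeated occurrences of a term saturate, and b controls how strongly long documents are penalized.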
The following example builds a BM25 retriever for ShopMax India product FAQs using rank_bm25. The retriever tokenizes the document corpus, scores each document against a query, and returns the top-k most relevant passages for the LLM context window.
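As a rough illustration of the retrieval step only, the sketch below implements BM25 scoring from scratch with the standard library so that it runs without extra dependencies. It is a stand-in, not the rank_bm25 listing: the `BM25Retriever` class, the `tokenize` helper, the `DOCS` corpus (paraphrased from the sample answers in this article), and the k1 and b values are all assumptions made for illustration. With rank_bm25, `BM25Okapi(tokenized_corpus).get_scores(tokenized_query)` plays the role of the scoring loop here. The sketch does not call an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical corpus, paraphrased from the sample answers in this article.
DOCS = [
    "The Sony WH-1000XM5 headphones are priced at Rs 29,990 and support USB-C charging.",
    "The Dell XPS 15 9530 comes with 32GB DDR5 RAM and a 1TB NVMe SSD.",
    "The Apple iPhone 15 Pro supports USB-C charging.",
]


def tokenize(text):
    """Lowercase and keep hyphenated tokens such as 'wh-1000xm5' intact."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


class BM25Retriever:
    """Minimal BM25 scorer; k1 and b are assumed, untuned values."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = docs
        self.k1, self.b = k1, b
        self.tokenized = [tokenize(d) for d in docs]
        self.term_freqs = [Counter(toks) for toks in self.tokenized]
        self.avgdl = sum(len(toks) for toks in self.tokenized) / len(docs)
        # Document frequency: number of documents containing each term.
        df = Counter(term for toks in self.tokenized for term in set(toks))
        n_docs = len(docs)
        self.idf = {
            term: math.log(1 + (n_docs - count + 0.5) / (count + 0.5))
            for term, count in df.items()
        }

    def score(self, query, idx):
        tf = self.term_freqs[idx]
        # Length normalization: longer-than-average documents are penalized.
        norm = self.k1 * (1 - self.b + self.b * len(self.tokenized[idx]) / self.avgdl)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in tokenize(query)
            if t in tf
        )

    def top_k(self, query, k=3):
        """Return the k highest-scoring (score, document) pairs."""
        scored = [(self.score(query, i), doc) for i, doc in enumerate(self.docs)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


retriever = BM25Retriever(DOCS)
for score, doc in retriever.top_k("Sony WH-1000XM5 price", k=2):
    print(f"{score:.3f}  {doc}")
```

In a RAG pipeline, the documents returned by `top_k` would be concatenated into the prompt as context. Note that the tokenizer does no stemming, so 'price' in a query does not match 'priced' in a document; the match in this query comes from the brand and model tokens.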
The full example, which passes the retrieved passages to an LLM, gives the following output:
Q: What is the price of Sony WH-1000XM5?
A: The Sony WH-1000XM5 headphones are priced at Rs 29,990 and are available in Mumbai and Bangalore.
Sources: 3 docs retrieved
Q: Dell XPS 15 RAM and storage specs
A: The Dell XPS 15 9530 comes with 32GB DDR5 RAM and a 1TB NVMe SSD storage configuration.
Sources: 3 docs retrieved
Q: Which phones have USB-C charging?
A: The Apple iPhone 15 Pro supports USB-C charging. The Sony WH-1000XM5 headphones also support USB-C charging.
Sources: 3 docs retrieved
For ShopMax India at production scale, replace rank_bm25 with Elasticsearch BM25 to handle millions of product documents. Index products with both BM25 text fields and dense vector fields, then combine the two scores with a linear blend such as final_score = 0.4 * bm25_score + 0.6 * vector_score. BM25 scores are unbounded while vector similarities usually fall in a fixed range, so normalize both to a common scale before blending. The hybrid approach typically outperforms either method alone: BM25 catches exact model number matches, while vectors handle semantic queries like 'noise-cancelling headphones for travel'. Treat the 0.4/0.6 weights as a starting point and re-tune them using your RAGAS evaluation metrics on a held-out query set.
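The linear blend can be sketched in a few lines of Python. This is a minimal illustration rather than an Elasticsearch integration: the `min_max` and `blend` helpers, the document ids, and the score values are all hypothetical, and min-max normalization is one assumed way of putting unbounded BM25 scores on the same 0 to 1 scale as the vector scores before weighting.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no ranking signal, so contribute nothing.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def blend(bm25_scores, vector_scores, w_bm25=0.4, w_vec=0.6):
    """Blend normalized BM25 and vector scores; returns (doc_id, score) sorted descending."""
    bm25_n, vec_n = min_max(bm25_scores), min_max(vector_scores)
    doc_ids = set(bm25_n) | set(vec_n)
    fused = {
        d: w_bm25 * bm25_n.get(d, 0.0) + w_vec * vec_n.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for the query "Sony WH-1000XM5 noise cancelling".
bm25 = {"sony-xm5": 7.2, "bose-qc45": 1.1, "dell-xps": 0.0}
vector = {"sony-xm5": 0.71, "bose-qc45": 0.78, "dell-xps": 0.12}

for doc_id, score in blend(bm25, vector):
    print(f"{doc_id}: {score:.3f}")
```

In this made-up case the vector scores alone would rank the Bose product first, but the exact model number match in BM25 lifts the Sony product to the top of the blended ranking. A document missing from one of the two result lists is treated as scoring 0.0 for that retriever, which is a design choice worth revisiting when the two retrievers return very different candidate sets.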