LangChain Multi-Query Retriever - Improving Search with Query Variations
Author: Venkata Sudhakar
When ShopMax India customers search for products, a single embedding query often misses relevant results because the query wording does not match how the documents are indexed. LangChain's MultiQueryRetriever solves this by using an LLM to generate multiple variations of the original query, running all of them against the vector store, and deduplicating the merged results - dramatically improving recall without manual query tuning.
MultiQueryRetriever wraps any existing retriever. When invoked, it sends the original question to an LLM with a prompt asking for alternative phrasings, then runs each variation as a separate similarity search. Results from all queries are merged and deduplicated so each document is returned only once. This approach is especially useful when customers use informal language, abbreviations, or terminology that differs from the product catalog.
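The flow described above can be sketched in plain Python. Here `generate_variations` is a stub standing in for the LLM call, and `similarity_search` is whatever search function the underlying retriever provides; the dedupe key is a document ID for illustration:

```python
from typing import Callable, Dict, List


def generate_variations(query: str) -> List[str]:
    # Stub: MultiQueryRetriever does this with an LLM and a rewrite prompt.
    return [query, query + " hours", query + " on full charge"]


def multi_query_search(
    query: str,
    similarity_search: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Run every query variation, then merge results, skipping duplicates."""
    seen, merged = set(), []
    for variation in generate_variations(query):
        for doc in similarity_search(variation):
            if doc["id"] not in seen:  # dedupe across variations
                seen.add(doc["id"])
                merged.append(doc)
    return merged
```

Because each variation is an independent similarity search, documents that match only one phrasing still make it into the merged result set.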
The example below shows ShopMax India using MultiQueryRetriever on a product FAQ vector store to improve retrieval for a query about headphone battery life.
Running it produces output like the following:
INFO:langchain.retrievers.multi_query:Generated queries: ['What is the battery duration of Sony headphones?', 'Sony headphone playtime on full charge', 'How many hours does Sony WH-1000XM5 battery last?']
Sony WH-1000XM5 : Sony WH-1000XM5 offers up to 30 hours of battery life on a single charge.
Sony WH-1000XM5 : The WH-1000XM5 can be charged via USB-C and supports quick charge - 3 minu…
In production, set include_original=True to run the original query alongside the generated variations. Tune the number of variations by customizing the prompt - three is usually a good balance between recall and latency. MultiQueryRetriever adds one LLM call per retrieval, so cache frequently asked questions at the application level to avoid repeated query generation for common ShopMax product questions. Monitor which generated queries produce the most unique results to assess retrieval quality.