
RAG with Reranking using Cohere Rerank API

Author: Venkata Sudhakar

Reranking with the Cohere Rerank API adds a second-stage ranking pass to ShopMax India's RAG pipeline. The first-stage retriever (BM25 or vector search) quickly narrows the corpus to a candidate set of 20-50 documents. The Cohere Rerank model then scores each candidate against the original query using a cross-encoder architecture, which reads the query and document together rather than encoding them separately. This typically produces more accurate relevance scores than first-stage retrieval alone.

The reranking step is particularly valuable for ShopMax India's complex product comparison queries. A query like 'lightweight laptop with long battery under Rs 100000 for travel' requires the ranker to understand the combined intent, not just keyword overlap. Cohere's rerank-english-v3.0 and rerank-multilingual-v3.0 models are optimized for this task and integrate via a simple API call: send the query plus a list of documents, receive back each document with a relevance_score. Retain only documents above a score threshold for the LLM context.
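The call-and-filter step described above can be sketched as follows. This is a minimal illustration, not the article's original listing: the helper names `cohere_rerank` and `keep_relevant`, the `RerankFn` type, and the default threshold of 0.5 are assumptions made for the example. The `cohere_rerank` wrapper assumes the Cohere Python SDK is installed and that its `Client.rerank` call returns results exposing `index` and `relevance_score`. The filter itself accepts any callable, so it can be tried without network access.

```python
from typing import Callable, List, Tuple

# A reranker is any callable mapping (query, documents) to
# (index into documents, relevance_score) pairs.
RerankFn = Callable[[str, List[str]], List[Tuple[int, float]]]


def cohere_rerank(api_key: str, model: str = "rerank-english-v3.0") -> RerankFn:
    """Build a RerankFn backed by the Cohere SDK (assumes `pip install cohere`)."""
    import cohere  # imported lazily so the rest of the sketch runs without the SDK

    client = cohere.Client(api_key)

    def call(query: str, documents: List[str]) -> List[Tuple[int, float]]:
        response = client.rerank(model=model, query=query, documents=documents)
        # Each result carries the index of the input document and its score.
        return [(r.index, r.relevance_score) for r in response.results]

    return call


def keep_relevant(
    query: str,
    documents: List[str],
    rerank_fn: RerankFn,
    threshold: float = 0.5,  # hypothetical cut-off; tune on your own queries
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs scoring at or above threshold, best first."""
    scored = rerank_fn(query, documents)
    kept = [(documents[i], score) for i, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With a real key, `keep_relevant(query, docs, cohere_rerank(api_key))` would return only the documents worth placing in the LLM context. A suitable threshold depends on the model and the catalog, so it is best chosen by inspecting scores on real queries.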

The following example implements two-stage RAG for ShopMax India: BM25 first-stage retrieval followed by Cohere reranking. The pipeline fetches 5 candidates with BM25, reranks them, and passes only the top-2 reranked documents to Claude.
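As a rough, offline illustration of the two-stage shape described here, the sketch below pairs a small pure-Python BM25 index with a pluggable reranker. It is not the article's original listing. The names `BM25`, `tokenize`, `overlap_reranker`, and `two_stage_retrieve` are hypothetical, the `k1` and `b` values are commonly used defaults, and the IDF uses a common non-negative variant. `overlap_reranker` is only a stand-in that lets the code run without an API key. It is not a cross-encoder, and in practice a call to the Cohere Rerank API would take its place.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

RerankFn = Callable[[str, List[str]], List[Tuple[int, float]]]


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Minimal Okapi BM25 index over a non-empty list of documents."""

    def __init__(self, docs: List[str], k1: float = 1.5, b: float = 0.75):
        self.docs = docs
        self.k1, self.b = k1, b
        self.tfs = [Counter(tokenize(d)) for d in docs]
        self.lens = [sum(tf.values()) for tf in self.tfs]
        self.avg_len = sum(self.lens) / len(docs)
        n = len(docs)
        # Document frequency: number of documents containing each term.
        df = Counter(term for tf in self.tfs for term in tf)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query: str, i: int) -> float:
        tf, total = self.tfs[i], 0.0
        # Length normalization: longer-than-average documents are penalized.
        norm = self.k1 * (1 - self.b + self.b * self.lens[i] / self.avg_len)
        for term in tokenize(query):
            if term in tf:
                total += self.idf[term] * tf[term] * (self.k1 + 1) / (tf[term] + norm)
        return total

    def top_k(self, query: str, k: int) -> List[int]:
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


def overlap_reranker(query: str, documents: List[str]) -> List[Tuple[int, float]]:
    """Offline stand-in: fraction of distinct query terms present in each document."""
    q = set(tokenize(query))
    return [(i, len(q & set(tokenize(d))) / max(len(q), 1)) for i, d in enumerate(documents)]


def two_stage_retrieve(
    query: str,
    index: BM25,
    rerank_fn: RerankFn,
    first_stage_k: int = 5,
    final_k: int = 2,
) -> List[str]:
    """Stage 1: BM25 picks first_stage_k candidates. Stage 2: rerank and keep final_k."""
    candidates = [index.docs[i] for i in index.top_k(query, first_stage_k)]
    scored = rerank_fn(query, candidates)  # indices refer to positions in candidates
    best = sorted(scored, key=lambda pair: pair[1], reverse=True)[:final_k]
    return [candidates[i] for i, _ in best]
```

The documents returned by `two_stage_retrieve` would then be joined into the prompt sent to the LLM. The output that follows is from the article's own example, not from this offline sketch.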


The example gives the following output:

Q: Which is the lightest laptop with long battery life under Rs 100000 for travel?
Top docs after reranking:
  - LG Gram 14: 999g, 22-hour battery, Intel i5, Rs 82000, Del...
  - Acer Swift 5: 1.06kg, 15-hour battery, Intel i7, Rs 72000,...
A: The LG Gram 14 is the best option for travel - it is the lightest at 999g with an impressive 22-hour battery, priced at Rs 82,000. The Acer Swift 5 at Rs 72,000 is also excellent at 1.06kg with 15-hour battery life.

For ShopMax India, set first_stage_k to 10-20 to give the reranker a broad candidate pool, then keep only the top 2-3 reranked documents for the final LLM context. Because reranking is more expensive than first-stage retrieval, cache reranked results keyed by the query hash with a 30-minute TTL. Monitor the relevance scores from the reranker: if the top document scores below 0.3, the query is likely outside your product catalog and should be routed to a fallback response. For ShopMax India's Hindi and regional-language customers, Cohere also offers a multilingual reranker (rerank-multilingual-v3.0).
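The caching and fallback advice above might look like the following sketch. The names `RerankCache` and `route` are hypothetical, as are the in-memory dictionary store and the query normalization (lowercasing and collapsing whitespace before hashing). A production deployment would more likely use a shared cache such as Redis. The 30-minute TTL and the 0.3 cut-off are the values suggested in the text.

```python
import hashlib
import time
from typing import Callable, Dict, List, Optional, Tuple

Scored = List[Tuple[str, float]]  # (document, relevance_score), best first


class RerankCache:
    """In-memory TTL cache keyed by a hash of the normalized query."""

    def __init__(
        self,
        ttl_seconds: float = 30 * 60,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Scored]] = {}

    @staticmethod
    def key(query: str) -> str:
        # Assumed normalization: case-insensitive, whitespace-insensitive.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str) -> Optional[Scored]:
        k = self.key(query)
        entry = self._store.get(k)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[k]  # expired: drop it and report a miss
            return None
        return value

    def put(self, query: str, value: Scored) -> None:
        self._store[self.key(query)] = (self.clock(), value)


def route(
    query: str,
    rerank: Callable[[str], Scored],
    cache: RerankCache,
    min_top_score: float = 0.3,
) -> Tuple[str, Scored]:
    """Return ("answer", scored_docs), or ("fallback", []) when the best score is too low."""
    scored = cache.get(query)
    if scored is None:
        scored = rerank(query)  # the expensive call, made only on a cache miss
        cache.put(query, scored)
    if not scored or scored[0][1] < min_top_score:
        return "fallback", []
    return "answer", scored
```

Here `rerank` stands for whatever function runs retrieval plus reranking and returns documents sorted best first. Low-scoring results are cached as well, so repeated out-of-catalog queries reach the fallback without another reranker call.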


 
  


  