In Browser
	StumbleUpon
	del.icio.us
	Google
	Google Buzz
	reddit
	LinkedIn

	Facebook
	Twitter
	Linkedin
	E-Mail

Generative AI > Prompt Engineering > Context Stuffing - Maximizing Relevant Information in Prompts

Context Stuffing - Maximizing Relevant Information in Prompts

Author: Venkata Sudhakar

Context stuffing is the technique of packing the LLM prompt with the maximum amount of relevant information before asking a question, so the model has everything it needs to give a precise, grounded answer without hallucinating. ShopMax India applies this when handling customer queries about specific products - rather than relying on the model's training data, the system injects the full product specification sheet, current price, stock status, and recent reviews directly into the prompt.

The key challenge is fitting all relevant context within the model's context window while staying under token limits. Effective context stuffing involves ranking chunks by relevance using embeddings, truncating less relevant sections, and structuring the injected content so the model can extract the answer efficiently. Techniques like XML tags, section headers, and explicit 'Source:' labels help the model navigate dense context.

The following example shows ShopMax India building a context-stuffed prompt for a product Q and A system. The code retrieves product data from a local dictionary (representing a product database), constructs a structured context block, and sends it with the customer question to the Anthropic API.

import anthropic

client = anthropic.Anthropic(api_key="sk-ant-...")

products = {
    "LAPTOP-001": {
        "name": "Dell XPS 15 9530",
        "price": 135000,
        "stock": 8,
        "specs": "15.6 inch OLED, Intel Core i7-13700H, 32GB DDR5, 1TB NVMe SSD, Nvidia RTX 4060",
        "warranty": "1 year onsite, 2 year battery",
        "cities": ["Mumbai", "Bangalore", "Delhi"],
        "reviews_summary": "Excellent build quality. Battery life averages 6-7 hours. Runs warm under load."
    }
}

def answer_product_question(product_id, customer_question):
    p = products[product_id]
    context = f"""<product>
<name>{p["name"]}</name>
<price>Rs {p["price"]}</price>
<stock>{p["stock"]} units available</stock>
<specs>{p["specs"]}</specs>
<warranty>{p["warranty"]}</warranty>
<available_in>{", ".join(p["cities"])}</available_in>
<customer_reviews>{p["reviews_summary"]}</customer_reviews>
</product>"""
    prompt = f"Using only the product information above, answer the customer question.\n\nCustomer: {customer_question}"
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=300,
        system="You are a helpful ShopMax India product specialist. Answer based only on provided product data.",
        messages=[{"role": "user", "content": context + "\n\n" + prompt}]
    )
    return msg.content[0].text

questions = [
    "How much does this laptop cost and is it available in Chennai?",
    "What is the RAM and storage configuration?",
    "What do customers say about battery life?"
]

for q in questions:
    answer = answer_product_question("LAPTOP-001", q)
    print(f"Q: {q}")
    print(f"A: {answer}")
    print()

It gives the following output,

Q: How much does this laptop cost and is it available in Chennai?
A: The Dell XPS 15 9530 is priced at Rs 135,000. Currently, it is available in Mumbai, Bangalore, and Delhi. Chennai is not listed as an available city at this time.

Q: What is the RAM and storage configuration?
A: The Dell XPS 15 9530 comes with 32GB DDR5 RAM and a 1TB NVMe SSD for storage.

Q: What do customers say about battery life?
A: Customer reviews indicate that battery life averages 6-7 hours. The laptop runs warm under heavy load, which may affect battery performance during intensive tasks.

For ShopMax India at scale, pre-compute and cache context blocks for each product so they can be injected instantly without hitting the database on every query. Use token counting (tiktoken for OpenAI, Anthropic's token counter) to ensure the stuffed context plus the question stays within limits. For very long product catalogs, combine context stuffing with a retrieval step - first find the top-3 relevant products using embeddings, then stuff only those product blocks into the prompt.

Send your comments, suggestions or queries regarding this site to [email protected].