Claude Memory Patterns

Author: Venkata Sudhakar

Claude has no built-in memory between API calls - each conversation starts completely fresh. For a customer service chatbot, this means a customer who explained their problem yesterday must repeat everything today. Memory patterns solve this by storing important context outside the model and injecting it at the start of each new conversation. There are three practical patterns: conversation history replay (resend earlier messages with each call), entity memory (store key facts per customer), and summary memory (compress old sessions into a rolling summary). Entity memory is usually the most cost-effective choice for production business chatbots, because the injected record stays small compared with a full transcript.
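Conversation history replay, the first of the three patterns, can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: the in-memory `history_db`, the helper names, and the default window of 20 messages are assumptions made for the example.

```python
from typing import Dict, List

Message = Dict[str, str]

# Hypothetical in-memory store; a real system would persist this per customer.
history_db: Dict[str, List[Message]] = {}

def append_turn(customer_id: str, role: str, content: str) -> None:
    """Record one message so it can be replayed in later calls."""
    history_db.setdefault(customer_id, []).append({"role": role, "content": content})

def build_replay_messages(customer_id: str, user_message: str,
                          max_messages: int = 20) -> List[Message]:
    """Return the most recent stored turns followed by the new user message."""
    history = history_db.get(customer_id, [])
    recent = history[-max_messages:] if max_messages > 0 else []
    # Keep the replayed window starting on a user turn so that the
    # user/assistant roles still alternate cleanly after trimming.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent + [{"role": "user", "content": user_message}]
```

The returned list would be passed as the `messages` argument of `client.messages.create`. Every replayed message is billed as input on each call, which is the overhead that entity memory is meant to avoid.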

Entity memory works like this: after each session ends, a background job uses a cheap Claude call to extract key facts - customer name, account type, issues raised, preferences, pending actions. These are stored as a JSON record keyed to the customer ID in your database. At the start of the next session, retrieve the record and inject it into the system prompt. Claude now knows the customer without a replay of the full transcript, which could run to hundreds of tokens or more. The extraction call uses a small model (claude-haiku-4-5 here) and runs only once, when the session ends, so its cost stays low.
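For the storage step, any store that can map a customer ID to a JSON document will do. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a production database; the table name `customer_memory` and the three helper functions are assumptions made for illustration.

```python
import json
import sqlite3

def open_memory_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one JSON record per customer."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_memory ("
        "customer_id TEXT PRIMARY KEY, facts TEXT NOT NULL)"
    )
    return conn

def save_memory(conn: sqlite3.Connection, customer_id: str, facts: dict) -> None:
    """Insert the record, or overwrite the existing one for this customer."""
    conn.execute(
        "INSERT OR REPLACE INTO customer_memory (customer_id, facts) VALUES (?, ?)",
        (customer_id, json.dumps(facts)),
    )
    conn.commit()

def load_memory(conn: sqlite3.Connection, customer_id: str) -> dict:
    """Return the stored facts, or an empty dict for an unknown customer."""
    row = conn.execute(
        "SELECT facts FROM customer_memory WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}
```

Storing the record as a single JSON text column keeps the schema independent of which facts the extraction prompt asks for, at the cost of not being able to query individual fields efficiently.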

The example below shows a banking chatbot that extracts and stores customer memory after each session, then injects it into the next conversation so the customer never has to repeat their account details or previously raised issues.

import anthropic, json
from datetime import datetime

# With no arguments, the client reads the key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# Simulated DB - use Redis or PostgreSQL keyed by customer_id in production
customer_memory_db = {}

def extract_and_store_memory(conversation: list, customer_id: str) -> dict:
    history_text = "\n".join(
        msg["role"].upper() + ": " + msg["content"] for msg in conversation
    )
    resp = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=300,
        system=(
            "Extract key customer facts from this bank support conversation. "
            "Return ONLY valid JSON with these keys (null if not mentioned): "
            "customer_name, account_type, issues_raised (list), "
            "preferences (list), pending_actions (list)"
        ),
        messages=[{"role": "user", "content": history_text}]
    )
    try:
        new_facts = json.loads(resp.content[0].text)
    except json.JSONDecodeError:
        return customer_memory_db.get(customer_id, {})
    existing = customer_memory_db.get(customer_id, {})
    for key, val in new_facts.items():
        if val is None:
            continue
        if isinstance(val, list) and isinstance(existing.get(key), list):
            combined = existing[key] + [v for v in val if v not in existing[key]]
            existing[key] = combined
        else:
            existing[key] = val
    existing["last_contact"] = datetime.now().strftime("%Y-%m-%d")
    customer_memory_db[customer_id] = existing
    return existing

def build_system_with_memory(customer_id: str) -> str:
    base = (
        "You are a helpful customer service agent for PrimBank India. "
        "Be warm, precise, and proactive. Address the customer by name if known."
    )
    memory = customer_memory_db.get(customer_id)
    if not memory:
        return base
    lines = ["\n\nCUSTOMER CONTEXT (from previous sessions):"]
    if memory.get("customer_name"):
        lines.append("Name: " + memory["customer_name"])
    if memory.get("account_type"):
        lines.append("Account: " + memory["account_type"])
    if memory.get("issues_raised"):
        lines.append("Past issues: " + ", ".join(memory["issues_raised"]))
    if memory.get("pending_actions"):
        lines.append("Pending: " + ", ".join(memory["pending_actions"]))
    if memory.get("preferences"):
        lines.append("Preferences: " + ", ".join(memory["preferences"]))
    if memory.get("last_contact"):
        lines.append("Last contact: " + memory["last_contact"])
    return base + "\n" + "\n".join(lines)
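The extraction function above keeps the previous record unchanged whenever `json.loads` fails. Model replies sometimes wrap the JSON in a Markdown code fence or add a sentence around it, so a more forgiving parser can reduce how often a session's facts are silently dropped. The helper below is a hypothetical addition, not part of the original example; it simply parses the text between the first opening brace and the last closing brace.

```python
import json
from typing import Optional

def parse_json_reply(text: str) -> Optional[dict]:
    """Best-effort parse of a model reply that should contain one JSON object."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None  # no brace-delimited span to parse
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Under this assumption, `extract_and_store_memory` could call `parse_json_reply(resp.content[0].text)` and fall back to the existing record when the result is `None`.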

The following code simulates two sessions to show memory extraction and reuse:

def chat(customer_id: str, user_message: str) -> str:
    resp = client.messages.create(
        model="claude-haiku-4-5", max_tokens=200,
        system=build_system_with_memory(customer_id),
        messages=[{"role": "user", "content": user_message}]
    )
    return resp.content[0].text

CID = "CUST-4491"

# Session 1 - first contact, no prior memory
print("=== SESSION 1 (no memory) ===")
msg1 = ("Hi, I am Priya Sharma. I have a savings account. "
        "I was charged Rs 500 SMS alert fee I did not authorise. "
        "I prefer email alerts, not SMS.")
reply1 = chat(CID, msg1)
print("Priya:", msg1[:80])
print("Bot:  ", reply1[:180])

# Extract and store memory after session ends
session1 = [{"role": "user", "content": msg1},
            {"role": "assistant", "content": reply1}]
extract_and_store_memory(session1, CID)
print("\n[Memory stored:]")
print(json.dumps(customer_memory_db[CID], indent=2))

# Session 2 - next day, memory injected automatically
print("\n=== SESSION 2 (memory injected) ===")
reply2 = chat(CID, "Hi, just checking if my refund came through?")
print("Priya: Hi, just checking if my refund came through?")
print("Bot:  ", reply2[:250])

A sample run gives output like the following, showing memory extraction and reuse across sessions (the model's exact wording varies between runs):

=== SESSION 1 (no memory) ===
Priya: Hi, I am Priya Sharma. I have a savings account. I was charged Rs 50...
Bot:   I am sorry about the unauthorised Rs 500 SMS charge, Priya. I will
       raise a reversal request right now - it should reflect in 2-3 business
       days. I have also flagged your preference for email alerts only.

[Memory stored:]
{
  "customer_name": "Priya Sharma",
  "account_type": "savings account",
  "issues_raised": ["unauthorised Rs 500 SMS alert charge"],
  "preferences": ["email alerts preferred over SMS"],
  "pending_actions": ["Rs 500 SMS charge reversal pending"],
  "last_contact": "2025-04-01"
}

=== SESSION 2 (memory injected) ===
Priya: Hi, just checking if my refund came through?
Bot:   Hello Priya! Good to hear from you. Regarding the Rs 500 SMS charge
       reversal we raised - it typically reflects within 2-3 business days.
       If you have not seen it yet it should appear shortly. Your email
       notification preference has also been updated. Anything else I can
       help with?

# Priya typed only 8 words in Session 2
# Claude had her name, account type, exact refund amount, and preference in context
# All injected from the small memory JSON in the system prompt - a few extra input tokens instead of a full transcript replay

Memory pattern selection guide: use conversation history replay for short sessions under 20 messages where you need full context. Use entity memory for customer-facing chatbots where key facts matter across sessions - this is usually the most cost-efficient approach. Use summary memory for long-running agent sessions where you need to compress histories of 50 or more messages into a rolling summary of about 200 words. For production systems, combine entity and summary memory: structured facts for fast lookup, compressed narrative for nuanced context. Always store a timestamp with each memory record so you can audit what the chatbot knew at any interaction.
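Summary memory can be sketched in the same spirit. The function below is an illustrative assumption, not code from the example above: it builds a prompt from the previous summary and the latest session, hands it to a caller-supplied `summarize` function, and trims the result to a word budget. Passing the summarizer in as a parameter keeps the sketch independent of any particular API call.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

def update_rolling_summary(previous_summary: str, session: List[Message],
                           summarize: Callable[[str], str],
                           max_words: int = 200) -> str:
    """Fold the latest session into a running summary of bounded length."""
    transcript = "\n".join(m["role"].upper() + ": " + m["content"] for m in session)
    prompt = (
        f"Update the running summary of this customer's history in at most {max_words} words.\n\n"
        f"PREVIOUS SUMMARY:\n{previous_summary or '(none)'}\n\n"
        f"NEW SESSION:\n{transcript}"
    )
    summary = summarize(prompt)
    # Enforce the word budget locally in case the model overshoots.
    return " ".join(summary.split()[:max_words])

# In production, `summarize` might wrap a Claude call, for example:
#   lambda p: client.messages.create(
#       model="claude-haiku-4-5", max_tokens=400,
#       messages=[{"role": "user", "content": p}]).content[0].text
```

The returned summary would be stored next to the entity record and injected into the system prompt in the same way, giving the model both the structured facts and the compressed narrative.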
