tl  tr
  Home | Tutorials | Articles | Videos | Products | Tools | Search
Interviews | Open Source | Tag Cloud | Follow Us | Bookmark | Contact   
 Generative AI > Anthropic Claude API > Claude Memory Patterns

Claude Memory Patterns

Author: Venkata Sudhakar

Claude has no built-in memory between API calls - each conversation starts completely fresh. For a customer service chatbot this means a customer who explained their problem yesterday must repeat everything today. Memory patterns solve this by storing important context outside the model and injecting it at the start of each new conversation. There are three practical patterns: conversation history replay, entity memory (store key facts per customer), and summary memory (compress old sessions into a rolling summary). Entity memory is the most cost-effective for most production business chatbots.

Entity memory works like this: after each session ends, a background job uses a cheap Claude call to extract key facts - customer name, account type, issues raised, preferences, pending actions. These are stored as a JSON record keyed to the customer ID in your database. At the start of the next session, retrieve the record and inject it into the system prompt. Claude now knows the customer without replaying hundreds of expensive tokens. The extraction job is very cheap at claude-haiku-4-5 pricing and runs once per session end.

The below example shows a banking chatbot that extracts and stores customer memory after each session, then injects it into the next conversation so the customer never has to repeat their account details or previously raised issues.


Simulating two sessions to show memory extraction and reuse,


It gives the following output showing memory extraction and reuse across sessions,

=== SESSION 1 (no memory) ===
Priya: Hi, I am Priya Sharma. I have a savings account. I was charged Rs 50...
Bot:   I am sorry about the unauthorised Rs 500 SMS charge, Priya. I will
       raise a reversal request right now - it should reflect in 2-3 business
       days. I have also flagged your preference for email alerts only.

[Memory stored:]
{
  "customer_name": "Priya Sharma",
  "account_type": "savings account",
  "issues_raised": ["unauthorised Rs 500 SMS alert charge"],
  "preferences": ["email alerts preferred over SMS"],
  "pending_actions": ["Rs 500 SMS charge reversal pending"],
  "last_contact": "2025-04-01"
}

=== SESSION 2 (memory injected) ===
Priya: Hi, just checking if my refund came through?
Bot:   Hello Priya! Good to hear from you. Regarding the Rs 500 SMS charge
       reversal we raised - it typically reflects within 2-3 business days.
       If you have not seen it yet it should appear shortly. Your email
       notification preference has also been updated. Anything else I can
       help with?

# Priya said only 9 words in Session 2
# Claude knew her name, account type, exact refund amount, and preference
# All injected from the tiny memory JSON in the system prompt - zero extra tokens

Memory pattern selection guide: use conversation history replay for short sessions under 20 messages where you need full context. Use entity memory for customer-facing chatbots where key facts matter across sessions - this is the most cost-efficient approach. Use summary memory for long-running agent sessions where you need to compress 50-plus message histories into a 200-word rolling summary. For production systems combine entity and summary memory: structured facts for fast lookup, compressed narrative for nuanced context. Always store a timestamp with each memory record so you can audit what the chatbot knew at any interaction.


 
  


  
bl  br