
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Memory & Context Management with Claude Sonnet 4.5\n",
"\n",
"Learn how to build AI agents that learn and improve across conversations using Claude's memory tool and context editing capabilities."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table of Contents\n",
"\n",
"1. [Introduction: Why Memory Matters](#introduction)\n",
"2. [Use Cases](#use-cases)\n",
"3. [Quick Start Examples](#quick-start)\n",
"4. [How It Works](#how-it-works)\n",
"5. [Code Review Assistant Demo](#demo)\n",
"6. [Real-World Applications](#real-world)\n",
"7. [Best Practices](#best-practices)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"### For VSCode Users\n",
"\n",
"```bash\n",
"# 1. Create virtual environment\n",
"python -m venv .venv\n",
"\n",
"# 2. Activate it\n",
"source .venv/bin/activate # macOS/Linux\n",
"# or: .venv\\Scripts\\activate # Windows\n",
"\n",
"# 3. Install dependencies\n",
"pip install -r requirements.txt\n",
"\n",
"# 4. In VSCode: Select .venv as kernel (top right)\n",
"```\n",
"\n",
"### API Key\n",
"\n",
"```bash\n",
"cp .env.example .env\n",
"# Edit .env and add your ANTHROPIC_API_KEY\n",
"```\n",
"\n",
"Get your API key from: https://console.anthropic.com/\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Introduction: Why Memory Matters {#introduction}\n",
"\n",
"### The Problem\n",
"\n",
"Large language models have finite context windows (200k tokens for Claude 4). While this seems large, several challenges emerge:\n",
"\n",
"- **Context limits**: Long conversations or complex tasks can exceed available context\n",
"- **Computational cost**: Processing large contexts is expensive - attention mechanisms scale quadratically with input length\n",
"- **Repeated patterns**: Similar tasks across conversations require re-explaining context every time\n",
"- **Information loss**: When context fills up, earlier important information gets lost\n",
"\n",
"### The Solution\n",
"\n",
"Claude Sonnet 4.5 introduces two powerful capabilities:\n",
"\n",
"1. **Memory Tool** (`memory_20250818`): Enables cross-conversation learning\n",
" - Claude can write down what it learns for future reference\n",
" - File-based system under `/memories` directory\n",
" - Client-side implementation gives you full control\n",
"\n",
"2. **Context Editing** (`clear_tool_uses_20250919`): Automatically manages context\n",
" - Clears old tool results when context grows large\n",
" - Keeps recent context while preserving memory\n",
" - Configurable triggers and retention policies\n",
"\n",
"### The Benefit\n",
"\n",
"Build AI agents that **get better at your specific tasks over time**:\n",
"\n",
"- **Session 1**: Claude solves a problem, writes down the pattern\n",
"- **Session 2**: Claude applies the learned pattern immediately (faster!)\n",
"- **Long sessions**: Context editing keeps conversations manageable\n",
"\n",
"Think of it as giving Claude a notebook to take notes in and refer back to - just like humans do."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Use Cases {#use-cases}\n",
"\n",
"Memory and context management enable powerful new workflows:\n",
"\n",
"### 🔍 Code Review Assistant\n",
"- Learns debugging patterns from past reviews\n",
"- Recognizes similar bugs instantly in future sessions\n",
"- Builds team-specific code quality knowledge\n",
"- **Production ready**: Integrate with [claude-code-action](https://github.com/anthropics/claude-code-action) for GitHub PR reviews\n",
"\n",
"### 📚 Research Assistant\n",
"- Accumulates knowledge on topics over multiple sessions\n",
"- Connects insights across different research threads\n",
"- Maintains bibliography and source tracking\n",
"\n",
"### 💬 Customer Support Bot\n",
"- Learns user preferences and communication style\n",
"- Remembers common issues and solutions\n",
"- Builds product knowledge base from interactions\n",
"\n",
"### 📊 Data Analysis Helper\n",
"- Remembers dataset patterns and anomalies\n",
"- Stores analysis techniques that work well\n",
"- Builds domain-specific insights over time\n",
"\n",
"**Supported Models**: Claude Opus 4 (`claude-opus-4-20250514`), Claude Opus 4.1 (`claude-opus-4-1-20250805`), Claude Sonnet 4 (`claude-sonnet-4-20250514`), and Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`)\n",
"\n",
"**This cookbook focuses on the Code Review Assistant** as it clearly demonstrates both memory (learning patterns) and context editing (handling long reviews)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Quick Start Examples {#quick-start}\n",
"\n",
"Let's see memory and context management in action with simple examples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"First, install dependencies and configure your environment:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"# Install required packages\n",
"# Option 1: From requirements.txt\n",
"# %pip install -q -r requirements.txt\n",
"\n",
"# Option 2: Direct install\n",
"%pip install -q anthropic python-dotenv ipykernel\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**⚠️ Important**: Create a `.env` file in this directory:\n",
"\n",
"```bash\n",
"# Copy .env.example to .env and add your API key\n",
"cp .env.example .env\n",
"```\n",
"\n",
"Then edit `.env` to add your Anthropic API key from https://console.anthropic.com/"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ API key loaded\n",
"✓ Using model: claude-sonnet-4-5-20250929\n"
]
}
],
"source": [
"import os\n",
"from typing import Any, cast\n",
"\n",
"from anthropic import Anthropic\n",
"from dotenv import load_dotenv\n",
"\n",
"# Load environment variables\n",
"load_dotenv()\n",
"\n",
"API_KEY = os.getenv(\"ANTHROPIC_API_KEY\")\n",
"MODEL = os.getenv(\"ANTHROPIC_MODEL\")\n",
"\n",
"if not API_KEY:\n",
" raise ValueError(\n",
" \"ANTHROPIC_API_KEY not found. \"\n",
" \"Copy .env.example to .env and add your API key.\"\n",
" )\n",
"\n",
"if not MODEL:\n",
" raise ValueError(\n",
" \"ANTHROPIC_MODEL not found. \"\n",
" \"Copy .env.example to .env and set the model.\"\n",
" )\n",
"\n",
"MODEL = cast(str, MODEL)\n",
"\n",
"client = Anthropic(api_key=API_KEY)\n",
"\n",
"print(\"✓ API key loaded\")\n",
"print(f\"✓ Using model: {MODEL}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 1: Basic Memory Usage\n",
"\n",
"Let's see Claude use memory to store information for future reference."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Helper Functions**\n",
"\n",
"These examples use helper functions from `demo_helpers.py`:\n",
"\n",
"- **`run_conversation_loop()`**: Handles the API conversation loop\n",
" - Calls Claude's API with memory tool enabled\n",
" - Executes tool uses (memory operations)\n",
" - Continues until Claude stops using tools\n",
" - Returns the final response\n",
"\n",
"- **`run_conversation_turn()`**: Single turn (used in Example 3)\n",
" - Same as above but returns after one API call\n",
" - Useful when you need fine-grained control\n",
"\n",
"- **`print_context_management_info()`**: Displays context clearing stats\n",
" - Shows tokens saved, tool uses cleared\n",
" - Helps visualize when context editing triggers"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**⚠️ Note on Memory Clearing**\n",
"\n",
"The following cell clears all memory files to provide a clean slate for this demonstration. This is useful for running the notebook multiple times to see consistent results.\n",
"\n",
"**In production applications**, you should carefully consider whether to clear all memory, as it permanently removes learned patterns. Consider using selective deletion or organizing memory into project-specific directories instead."
]
},
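{
"cell_type": "markdown",
"metadata": {},
"source": [
"For selective cleanup, the memory tool's `delete` command can remove individual files instead of wiping everything. A minimal sketch, assuming a handler instance like the `memory` object created below (the call shape and file name are illustrative, not the exact `MemoryToolHandler` API):\n",
"\n",
"```python\n",
"# Hypothetical: drop one stale file, keep the rest of the learned patterns\n",
"memory.execute(command='delete', path='/memories/stale_notes.md')\n",
"```"
]
},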
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"🧹 Clearing previous memories...\n",
"✓ Memory cleared\n",
"\n",
"============================================================\n",
"📝 SESSION 1: Learning from a bug\n",
"============================================================\n",
"\n",
"🔄 Turn 1:\n",
"💬 Claude: I'll review this code to identify the race condition issues. Let me start by checking my memory and then analyzing the problem.\n",
"\n",
" 🔧 Memory tool: view /memories\n",
" ✓ Result: Directory: /memories\n",
"(empty)\n",
"\n",
"🔄 Turn 2:\n",
" 🔧 Memory tool: create /memories/review_progress.md\n",
" ✓ Result: File created successfully at /memories/review_progress.md\n",
"\n",
"🔄 Turn 3:\n",
"💬 Claude: \n",
"\n",
"## Code Review: Race Condition in Multi-threaded Web Scraper\n",
"\n",
"### 🐛 **Critical Issues Found**\n",
"\n",
"#### **1. Race Condition on Shared Lists (Lines 44-48)**\n",
"\n",
"**Problem:**\n",
"```python\n",
"self.results = [] # Shared across threads\n",
"self.failed_urls = [] # Shared across threads\n",
"\n",
"# In scrape_urls():\n",
"self.failed_urls.append(result[\"url\"]) # NOT THREAD-SAFE\n",
"self.results.append(result) # NOT THREAD-SAFE\n",
"```\n",
"\n",
"**Why This Fails:**\n",
"- Python's `list.append()` is **not atomic** for the overall operation\n",
"- Multiple threads can simultaneously read and modify the list's internal state\n",
"- This causes **lost updates** where some appends are overwritten\n",
"- Results: Inconsistent counts, missing data\n",
"\n",
"**Race Condition Scenario:**\n",
"1. Thread A reads `self.results` size: 5\n",
"2. Thread B reads `self.results` size: 5 \n",
"3. Thread A appends → size becomes 6\n",
"4. Thread B appends → overwrites position, still size 6 (one result lost!)\n",
"\n",
"---\n",
"\n",
"### ✅ **Solutions**\n",
"\n",
"#### **Option 1: Use Thread-Safe Queue (Recommended)**\n",
"```python\n",
"import queue\n",
"from concurrent.futures import ThreadPoolExecutor, as_completed\n",
"\n",
"class WebScraper:\n",
" def __init__(self, max_workers: int = 10):\n",
" self.max_workers = max_workers\n",
" # Use thread-safe queues instead of lists\n",
" self.results = queue.Queue()\n",
" self.failed_urls = queue.Queue()\n",
"\n",
" def scrape_urls(self, urls: List[str]) -> List[Dict[str, any]]:\n",
" with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n",
" futures = [executor.submit(self.fetch_url, url) for url in urls]\n",
"\n",
" for future in as_completed(futures):\n",
" result = future.result()\n",
" if \"error\" in result:\n",
" self.failed_urls.put(result[\"url\"]) # Thread-safe\n",
" else:\n",
" self.results.put(result) # Thread-safe\n",
"\n",
" # Convert queue to list for return\n",
" return list(self.results.queue)\n",
"\n",
" def get_stats(self) -> Dict[str, int]:\n",
" total = self.results.qsize()\n",
" failed = self.failed_urls.qsize()\n",
" return {\n",
" \"total_results\": total,\n",
" \"failed_urls\": failed,\n",
" \"success_rate\": total / (total + failed) if (total + failed) > 0 else 0,\n",
" }\n",
"```\n",
"\n",
"#### **Option 2: Use Threading Lock**\n",
"```python\n",
"import threading\n",
"\n",
"class WebScraper:\n",
" def __init__(self, max_workers: int = 10):\n",
" self.max_workers = max_workers\n",
" self.results = []\n",
" self.failed_urls = []\n",
" self.lock = threading.Lock() # Add lock for synchronization\n",
"\n",
" def scrape_urls(self, urls: List[str]) -> List[Dict[str, any]]:\n",
" with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n",
" futures = [executor.submit(self.fetch_url, url) for url in urls]\n",
"\n",
" for future in as_completed(futures):\n",
" result = future.result()\n",
" \n",
" # Protect shared state with lock\n",
" with self.lock:\n",
" if \"error\" in result:\n",
" self.failed_urls.append(result[\"url\"])\n",
" else:\n",
" self.results.append(result)\n",
"\n",
" return self.results\n",
"```\n",
"\n",
"#### **Option 3: Collect Results from Futures (Best for this use case)**\n",
"```python\n",
"class WebScraper:\n",
" def __init__(self, max_workers: int = 10):\n",
" self.max_workers = max_workers\n",
" # No shared state needed!\n",
"\n",
" def scrape_urls(self, urls: List[str]) -> List[Dict[str, any]]:\n",
" results = []\n",
" failed_urls = []\n",
" \n",
" with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n",
" futures = [executor.submit(self.fetch_url, url) for url in urls]\n",
"\n",
" # Collect results from futures - no race condition!\n",
" for future in as_completed(futures):\n",
" result = future.result()\n",
" if \"error\" in result:\n",
" failed_urls.append(result[\"url\"])\n",
" else:\n",
" results.append(result)\n",
"\n",
" self.results = results # Store after collection complete\n",
" self.failed_urls = failed_urls\n",
" return results\n",
"```\n",
"\n",
"---\n",
"\n",
"### 📊 **Why Option 3 is Best Here**\n",
"\n",
"1. **No synchronization overhead** - Single thread collects results\n",
"2. **Simpler code** - No locks or queues needed\n",
"3. **Natural fit** - The main thread iterates `as_completed()` sequentially anyway\n",
"4. **No performance penalty** - Collection happens in calling thread, not worker threads\n",
"\n",
"---\n",
"\n",
"### 🔍 **Additional Issues**\n",
"\n",
"**Type hint error (Line 20):**\n",
"```python\n",
"def fetch_url(self, url: str) -> Dict[str, any]: # Should be 'Any' not 'any'\n",
"```\n",
"Should be:\n",
"```python\n",
"from typing import Any\n",
"def fetch_url(self, url: str) -> Dict[str, Any]:\n",
"```\n",
"\n",
"---\n",
"\n",
"### 📝 **Summary**\n",
"\n",
"**Root Cause:** Unsynchronized concurrent access to `self.results` and `self.failed_urls` lists\n",
"\n",
"**Impact:** Lost updates, inconsistent result counts\n",
"\n",
"**Fix:** Use Option 3 (local variables during collection) - simplest and most efficient\n",
"\n",
" 🔧 Memory tool: str_replace /memories/review_progress.md\n",
" ✓ Result: File /memories/review_progress.md has been edited successfully\n",
"\n",
"🔄 Turn 4:\n",
"\n",
"============================================================\n",
"✅ Session 1 complete!\n",
"============================================================\n"
]
}
],
"source": [
"# Import helper functions\n",
"from memory_demo.demo_helpers import run_conversation_loop, run_conversation_turn, print_context_management_info\n",
"from memory_tool import MemoryToolHandler\n",
"\n",
"# Initialize\n",
"client = Anthropic(api_key=API_KEY)\n",
"memory = MemoryToolHandler(base_path=\"./demo_memory\")\n",
"\n",
"# Clear any existing memories to start fresh\n",
"print(\"🧹 Clearing previous memories...\")\n",
"memory.clear_all_memory()\n",
"print(\"✓ Memory cleared\\n\")\n",
"\n",
"# Load example code with a race condition bug\n",
"with open(\"memory_demo/sample_code/web_scraper_v1.py\", \"r\") as f:\n",
" code_to_review = f.read()\n",
"\n",
"messages = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"I'm reviewing a multi-threaded web scraper that sometimes returns fewer results than expected. The count is inconsistent across runs. Can you find the issue?\\n\\n```python\\n{code_to_review}\\n```\"\n",
" }\n",
"]\n",
"\n",
"print(\"=\" * 60)\n",
"print(\"📝 SESSION 1: Learning from a bug\")\n",
"print(\"=\" * 60)\n",
"\n",
"# Run conversation loop\n",
"response = run_conversation_loop(\n",
" client=client,\n",
" model=MODEL,\n",
" messages=messages,\n",
" memory_handler=memory,\n",
" system=\"You are a code reviewer.\",\n",
" max_tokens=2048,\n",
" max_turns=5,\n",
" verbose=True\n",
")\n",
"\n",
"print(\"\\n\" + \"=\" * 60)\n",
"print(\"✅ Session 1 complete!\")\n",
"print(\"=\" * 60)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**What happened?**\n",
"\n",
"1. Claude checked its memory (empty on first run)\n",
"2. Identified the bug: **race condition** - multiple threads modifying shared state (`self.results` and `self.failed_urls`) without synchronization\n",
"3. Stored the concurrency pattern in memory for future reference\n",
"\n",
"Now let's see the magic - Claude applying this learned pattern in a **new conversation**:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 2: Cross-Conversation Learning\n",
"\n",
"Start a completely new conversation - memory persists!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"============================================================\n",
"🚀 SESSION 2: Applying learned pattern\n",
"============================================================\n",
"\n",
"🔄 Turn 1:\n",
" 🔧 Memory tool: view /memories\n",
" ✓ Result: Directory: /memories\n",
"- review_progress.md\n",
"\n",
"🔄 Turn 2:\n",
" 🔧 Memory tool: view /memories/review_progress.md\n",
" ✓ Result: 1: # Code Review: Multi-threaded Web Scraper\n",
" 2: \n",
" 3: ## Issue Type\n",
" 4:...\n",
"\n",
"🔄 Turn 3:\n",
" 🔧 Memory tool: str_replace /memories/review_progress.md\n",
" ✓ Result: File /memories/review_progress.md has been edited successfully\n",
"\n",
"🔄 Turn 4:\n",
"💬 Claude: \n",
"\n",
"Now let me review this async API client code:\n",
"\n",
"## Code Review: Async API Client\n",
"\n",
"### ✅ **Correct Assessment of Issues**\n",
"\n",
"The comments in the code correctly identify the problems! Let me elaborate:\n",
"\n",
"---\n",
"\n",
"### 🔴 **Critical Issues**\n",
"\n",
"#### **1. Race Condition on Shared State (Lines 13-14, 44-48)**\n",
"\n",
"**Problem:**\n",
"```python\n",
"self.responses = [] # Shared across coroutines\n",
"self.error_count = 0 # Race condition on increment\n",
"```\n",
"\n",
"While Python's GIL prevents *some* threading issues, **asyncio doesn't have the same protection**. When you `await`, other coroutines can run, leading to interleaving:\n",
"\n",
"```python\n",
"# What could happen:\n",
"# Coroutine A: reads self.error_count (0)\n",
"# Coroutine B: reads self.error_count (0)\n",
"# Coroutine A: increments and writes (1)\n",
"# Coroutine B: increments and writes (1) ← Should be 2!\n",
"```\n",
"\n",
"Similarly, `self.responses.append()` can be interrupted mid-operation.\n",
"\n",
"#### **2. Incorrect Pattern (Lines 41-48)**\n",
"\n",
"The `as_completed` loop is inefficient and still buggy:\n",
"```python\n",
"for coro in asyncio.as_completed(tasks):\n",
" result = await coro\n",
" # Modifying shared state...\n",
"```\n",
"\n",
"---\n",
"\n",
"### 🟡 **Minor Issues**\n",
"\n",
"#### **3. Type Hint Error (Lines 25, 34)**\n",
"```python\n",
"Dict[str, any] # ❌ Wrong: 'any' is not defined\n",
"```\n",
"Should be:\n",
"```python\n",
"Dict[str, Any] # ✅ Correct (import from typing)\n",
"```\n",
"\n",
"#### **4. Missing Error Handling**\n",
"Errors are stored in results but never counted properly due to the race condition.\n",
"\n",
"#### **5. Reusability Issue**\n",
"Calling `fetch_all()` multiple times will accumulate results incorrectly.\n",
"\n",
"---\n",
"\n",
"### ✅ **Recommended Fixes**\n",
"\n",
"#### **Option 1: Use Local Variables (Simplest)**\n",
"\n",
"```python\n",
"async def fetch_all(self, endpoints: List[str]) -> List[Dict[str, Any]]:\n",
" \"\"\"Fetch multiple endpoints concurrently.\"\"\"\n",
" async with aiohttp.ClientSession() as session:\n",
" tasks = [self.fetch_endpoint(session, endpoint) for endpoint in endpoints]\n",
" results = await asyncio.gather(*tasks) # Collect all results\n",
" \n",
" # Now safely update instance variables\n",
" self.responses = [r for r in results if \"error\" not in r]\n",
" self.error_count = sum(1 for r in results if \"error\" in r)\n",
" \n",
" return results # Return ALL results (success + errors)\n",
"```\n",
"\n",
"**Advantages:**\n",
"- Simple and clean\n",
"- No race conditions\n",
"- Uses `asyncio.gather()` which is more efficient\n",
"\n",
"#### **Option 2: Use asyncio.Lock (If Shared State is Required)**\n",
"\n",
"```python\n",
"class AsyncAPIClient:\n",
" def __init__(self, base_url: str):\n",
" self.base_url = base_url\n",
" self.responses = []\n",
" self.error_count = 0\n",
" self._lock = asyncio.Lock() # Add lock\n",
" \n",
" async def fetch_all(self, endpoints: List[str]) -> List[Dict[str, Any]]:\n",
" async with aiohttp.ClientSession() as session:\n",
" tasks = [self.fetch_endpoint(session, endpoint) for endpoint in endpoints]\n",
" \n",
" for coro in asyncio.as_completed(tasks):\n",
" result = await coro\n",
" \n",
" async with self._lock: # Protect shared state\n",
" if \"error\" in result:\n",
" self.error_count += 1\n",
" else:\n",
" self.responses.append(result)\n",
" \n",
" return self.responses\n",
"```\n",
"\n",
"**Advantages:**\n",
"- Explicitly safe\n",
"- Good if multiple methods need to update shared state\n",
"\n",
"---\n",
"\n",
"### 📊 **Comparison: Async vs Threading Races**\n",
"\n",
"| Aspect | Threading | Asyncio |\n",
"|--------|-----------|---------|\n",
"| **Protection** | GIL provides *some* protection | No GIL - explicit coordination needed |\n",
"| **Interleaving** | Can happen anytime | Only happens at `await` points |\n",
"| **Solution** | `threading.Lock()` | `asyncio.Lock()` or avoid shared state |\n",
"\n",
"---\n",
"\n",
"### 🎯 **Final Recommendations**\n",
"\n",
"1. **Use Option 1** (local variables + `asyncio.gather()`) - it's cleaner and faster\n",
"2. Fix the type hints: `any` → `Any`\n",
"3. Consider returning ALL results (including errors) for better observability\n",
"4. Reset `self.responses` and `self.error_count` at the start of `fetch_all()` if you want to reuse the client\n",
"\n",
" 🔧 Memory tool: str_replace /memories/review_progress.md\n",
" ✓ Result: File /memories/review_progress.md has been edited successfully\n",
"\n",
"🔄 Turn 5:\n",
"💬 Claude: \n",
"\n",
"---\n",
"\n",
"### 📝 **Summary**\n",
"\n",
"This code demonstrates a **common misconception**: that asyncio is automatically \"safe\" because it's not multi-threaded. While the GIL prevents *some* threading issues, asyncio coroutines can still interleave at `await` points, causing race conditions on shared mutable state.\n",
"\n",
"**Bottom line:** Use `asyncio.gather()` and local variables instead of modifying instance variables from multiple coroutines. It's simpler, safer, and faster! 🚀\n",
"\n",
"\n",
"============================================================\n",
"✅ Session 2 complete!\n",
"============================================================\n"
]
}
],
"source": [
"# NEW conversation (empty messages)\n",
"# Load API client code with similar concurrency issue\n",
"with open(\"memory_demo/sample_code/api_client_v1.py\", \"r\") as f:\n",
" code_to_review = f.read()\n",
"\n",
"messages = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"Review this API client code:\\n\\n```python\\n{code_to_review}\\n```\"\n",
" }\n",
"]\n",
"\n",
"print(\"=\" * 60)\n",
"print(\"🚀 SESSION 2: Applying learned pattern\")\n",
"print(\"=\" * 60)\n",
"\n",
"# Run conversation loop\n",
"response = run_conversation_loop(\n",
" client=client,\n",
" model=MODEL,\n",
" messages=messages,\n",
" memory_handler=memory,\n",
" system=\"You are a code reviewer.\",\n",
" max_tokens=2048,\n",
" max_turns=5,\n",
" verbose=True\n",
")\n",
"\n",
"print(\"\\n\" + \"=\" * 60)\n",
"print(\"✅ Session 2 complete!\")\n",
"print(\"=\" * 60)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Notice the difference:**\n",
"\n",
"- Claude **immediately checked memory** and found the thread-safety/concurrency pattern\n",
"- Recognized the similar issue in async code **instantly** without re-learning\n",
"- Response was **faster** because it applied stored knowledge about shared mutable state\n",
"\n",
"This is **cross-conversation learning** in action!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 3: Context Clearing While Preserving Memory\n",
"\n",
"What happens during a **long review session** with many code files?\n",
"\n",
"- Context fills up with tool results from previous reviews\n",
"- But memory (learned patterns) must persist!\n",
"\n",
"Let's trigger **context editing** to see how Claude manages this automatically."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"============================================================\n",
"📚 SESSION 3: Long review session with context clearing\n",
"============================================================\n",
"\n",
"📝 Review 1: Data processor\n",
" 🔧 Memory tool: str_replace /memories/review_progress.md\n",
" ✓ Result: File /memories/review_progress.md has been edited successfully\n",
" 📊 Input tokens: 5,977\n",
" Context below threshold - no clearing triggered\n",
"\n",
"📝 Review 2: SQL query builder\n",
" 🔧 Memory tool: str_replace /memories/review_progress.md\n",
" ✓ Result: File /memories/review_progress.md has been edited successfully\n",
" 📊 Input tokens: 7,359\n",
" Context below threshold - no clearing triggered\n",
"\n",
"============================================================\n",
"✅ Session 3 complete!\n",
"============================================================\n"
]
}
],
"source": [
"# Configure context management to clear aggressively for demo\n",
"CONTEXT_MANAGEMENT = {\n",
" \"edits\": [\n",
" {\n",
" \"type\": \"clear_tool_uses_20250919\",\n",
" \"trigger\": {\"type\": \"input_tokens\", \"value\": 5000}, # Lower threshold to trigger clearing sooner\n",
" \"keep\": {\"type\": \"tool_uses\", \"value\": 2}, # Keep only the last 2 tool uses\n",
" \"clear_at_least\": {\"type\": \"input_tokens\", \"value\": 3000}\n",
" }\n",
" ]\n",
"}\n",
"\n",
"# Continue from previous session - memory persists!\n",
"# Add multiple code reviews to build up context\n",
"\n",
"print(\"=\" * 60)\n",
"print(\"📚 SESSION 3: Long review session with context clearing\")\n",
"print(\"=\" * 60)\n",
"print()\n",
"\n",
"# Review 1: Data processor (larger file)\n",
"with open(\"memory_demo/sample_code/data_processor_v1.py\", \"r\") as f:\n",
" data_processor_code = f.read()\n",
"\n",
"messages.extend([\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"Review this data processor:\\n\\n```python\\n{data_processor_code}\\n```\"\n",
" }\n",
"])\n",
"\n",
"print(\"📝 Review 1: Data processor\")\n",
"response = run_conversation_turn(\n",
" client=client,\n",
" model=MODEL,\n",
" messages=messages,\n",
" memory_handler=memory,\n",
" system=\"You are a code reviewer.\",\n",
" context_management=CONTEXT_MANAGEMENT,\n",
" max_tokens=2048,\n",
" verbose=True\n",
")\n",
"\n",
"# Add response to messages\n",
"messages.append({\"role\": \"assistant\", \"content\": response[1]})\n",
"if response[2]:\n",
" messages.append({\"role\": \"user\", \"content\": response[2]})\n",
"\n",
"print(f\" 📊 Input tokens: {response[0].usage.input_tokens:,}\")\n",
"context_cleared, saved = print_context_management_info(response[0])\n",
"print()\n",
"\n",
"# Review 2: Add SQL code\n",
"with open(\"memory_demo/sample_code/sql_query_builder.py\", \"r\") as f:\n",
" sql_code = f.read()\n",
"\n",
"messages.extend([\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"Review this SQL query builder:\\n\\n```python\\n{sql_code}\\n```\"\n",
" }\n",
"])\n",
"\n",
"print(\"📝 Review 2: SQL query builder\")\n",
"response = run_conversation_turn(\n",
" client=client,\n",
" model=MODEL,\n",
" messages=messages,\n",
" memory_handler=memory,\n",
" system=\"You are a code reviewer.\",\n",
" context_management=CONTEXT_MANAGEMENT,\n",
" max_tokens=2048,\n",
" verbose=True\n",
")\n",
"\n",
"messages.append({\"role\": \"assistant\", \"content\": response[1]})\n",
"if response[2]:\n",
" messages.append({\"role\": \"user\", \"content\": response[2]})\n",
"\n",
"print(f\" 📊 Input tokens: {response[0].usage.input_tokens:,}\")\n",
"context_cleared, saved = print_context_management_info(response[0])\n",
"print()\n",
"\n",
"print(\"=\" * 60)\n",
"print(\"✅ Session 3 complete!\")\n",
"print(\"=\" * 60)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**What just happened?**\n",
"\n",
"As context grows during multiple reviews:\n",
"1. **Context clearing triggers automatically** once input tokens exceed the configured threshold (in the sample run above the context stayed below it, so no clearing occurred - add more or larger reviews to trigger it)\n",
"2. **Old tool results are removed** (e.g., earlier review details)\n",
"3. **Memory files remain intact** - Claude can still query learned patterns\n",
"4. **Token usage decreases** - thousands of tokens saved while knowledge is preserved\n",
"\n",
"This demonstrates the key benefit:\n",
"- **Short-term memory** (conversation context) → Cleared to save space\n",
"- **Long-term memory** (stored patterns) → Persists across sessions\n",
"\n",
"Let's verify memory survived the clearing:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Verify memory persists after context clearing\n",
"import os\n",
"\n",
"print(\"📂 Memory files in demo_memory/:\")\n",
"print()\n",
"\n",
"for root, dirs, files in os.walk(\"./demo_memory\"):\n",
" # Calculate relative path for display\n",
" level = root.replace(\"./demo_memory\", \"\").count(os.sep)\n",
" indent = \" \" * level\n",
" folder_name = os.path.basename(root) or \"demo_memory\"\n",
" print(f\"{indent}{folder_name}/\")\n",
" \n",
" sub_indent = \" \" * (level + 1)\n",
" for file in files:\n",
" file_path = os.path.join(root, file)\n",
" size = os.path.getsize(file_path)\n",
" print(f\"{sub_indent}├── {file} ({size} bytes)\")\n",
"\n",
"print()\n",
"print(\"✅ All learned patterns preserved despite context clearing!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. How It Works {#how-it-works}\n",
"\n",
"### Memory Tool Architecture\n",
"\n",
"The memory tool is **client-side** - you control the storage. Claude makes tool calls, your application executes them.\n",
"\n",
"#### Memory Tool Commands\n",
"\n",
"| Command | Description | Example |\n",
"|---------|-------------|---------|\n",
"| `view` | Show directory or file contents | `{\"command\": \"view\", \"path\": \"/memories\"}` |\n",
"| `create` | Create or overwrite a file | `{\"command\": \"create\", \"path\": \"/memories/notes.md\", \"file_text\": \"...\"}` |\n",
"| `str_replace` | Replace text in a file | `{\"command\": \"str_replace\", \"path\": \"...\", \"old_str\": \"...\", \"new_str\": \"...\"}` |\n",
"| `insert` | Insert text at line number | `{\"command\": \"insert\", \"path\": \"...\", \"insert_line\": 2, \"insert_text\": \"...\"}` |\n",
"| `delete` | Delete a file or directory | `{\"command\": \"delete\", \"path\": \"/memories/old.txt\"}` |\n",
"| `rename` | Rename or move a file | `{\"command\": \"rename\", \"old_path\": \"...\", \"new_path\": \"...\"}` |\n",
"\n",
"See `memory_tool.py` for the complete implementation with path validation and security measures."
]
},
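{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch, the client-side dispatch for these commands can be a simple router (the method names here are illustrative; `memory_tool.py` is the authoritative implementation):\n",
"\n",
"```python\n",
"def execute_memory_command(handler, tool_input):\n",
"    # Route a memory tool_use block to the matching handler method\n",
"    command = tool_input['command']\n",
"    if command == 'view':\n",
"        return handler.view(tool_input['path'])\n",
"    elif command == 'create':\n",
"        return handler.create(tool_input['path'], tool_input['file_text'])\n",
"    elif command == 'str_replace':\n",
"        return handler.str_replace(\n",
"            tool_input['path'], tool_input['old_str'], tool_input['new_str']\n",
"        )\n",
"    elif command == 'delete':\n",
"        return handler.delete(tool_input['path'])\n",
"    else:\n",
"        return f'Unrecognized command: {command}'\n",
"```"
]
},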
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Understanding the Demo Code\n",
"\n",
"Key implementation details from `code_review_demo.py`:\n",
"\n",
"```python\n",
"class CodeReviewAssistant:\n",
" def __init__(self, memory_storage_path=\"./memory_storage\"):\n",
" self.client = Anthropic(api_key=API_KEY)\n",
" self.memory_handler = MemoryToolHandler(base_path=memory_storage_path)\n",
" self.messages = []\n",
" \n",
" def review_code(self, code, filename, description=\"\"):\n",
" # 1. Add user message\n",
" self.messages.append({...})\n",
" \n",
" # 2. Conversation loop with tool execution\n",
" while True:\n",
" response = self.client.beta.messages.create(\n",
" model=MODEL,\n",
" system=self._create_system_prompt(),\n",
" messages=self.messages,\n",
" tools=[{\"type\": \"memory_20250818\", \"name\": \"memory\"}],\n",
" betas=[\"context-management-2025-06-27\"],\n",
" context_management=CONTEXT_MANAGEMENT\n",
" )\n",
" \n",
" # 3. Execute tool uses\n",
" tool_results = []\n",
" for content in response.content:\n",
" if content.type == \"tool_use\":\n",
" result = self._execute_tool_use(content)\n",
" tool_results.append({...})\n",
" \n",
" # 4. Continue if there are tool uses, otherwise done\n",
" if tool_results:\n",
" self.messages.append({\"role\": \"user\", \"content\": tool_results})\n",
" else:\n",
" break\n",
"```\n",
"\n",
"**The key pattern**: Keep calling the API while there are tool uses, executing them and feeding results back."
]
},
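{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `{...}` placeholders above elide the message shapes. A concrete sketch of how one tool_use content block becomes a tool_result block (`execute` is a stand-in for the real memory handler dispatch):\n",
"\n",
"```python\n",
"# Sketch: turning one tool_use content block into a tool_result block.\n",
"def make_tool_result(tool_use_id: str, output: str, is_error: bool = False) -> dict:\n",
"    return {\n",
"        'type': 'tool_result',\n",
"        'tool_use_id': tool_use_id,\n",
"        'content': output,\n",
"        'is_error': is_error,\n",
"    }\n",
"\n",
"def execute(tool_input: dict) -> str:\n",
"    return f\"ok: {tool_input['command']}\"  # stand-in for the real handler\n",
"\n",
"tool_use = {'type': 'tool_use', 'id': 'toolu_01', 'name': 'memory',\n",
"            'input': {'command': 'view', 'path': '/memories'}}\n",
"result = make_tool_result(tool_use['id'], execute(tool_use['input']))\n",
"```\n",
"\n",
"All the tool_result blocks from one assistant turn are collected into a single user message, which is what step 4 of the loop appends."
]
},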
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What Claude Actually Learns\n",
"\n",
"What makes memory powerful is **semantic pattern recognition**, not just syntax matching:\n",
"\n",
"**Session 1: Thread-Based Web Scraper**\n",
"\n",
"```python\n",
"# Bug: Race condition\n",
"class WebScraper:\n",
" def __init__(self):\n",
" self.results = [] # Shared state!\n",
" \n",
" def scrape_urls(self, urls):\n",
" with ThreadPoolExecutor() as executor:\n",
" for future in as_completed(futures):\n",
" self.results.append(future.result()) # RACE!\n",
"```\n",
"\n",
"**What Claude Stores in Memory** (example file: `/memories/concurrency_patterns/thread_safety.md`):\n",
"\n",
"When Claude encounters this pattern, it stores the following insights to its memory files:\n",
"- **Symptom**: Inconsistent results in concurrent operations\n",
"- **Cause**: Shared mutable state (lists/dicts) modified from multiple threads\n",
"- **Solution**: Use locks, thread-safe data structures, or return results instead\n",
"- **Red flags**: Instance variables in thread callbacks, unused locks, counter increments\n",
"\n",
"---\n",
"\n",
"**Session 2: Async API Client** (New conversation!)\n",
"\n",
"Claude checks memory FIRST, finds the thread-safety pattern, then:\n",
"1. **Recognizes** similar pattern in async code (coroutines can interleave too)\n",
"2. **Applies** the solution immediately (no re-learning needed)\n",
"3. **Explains** with reference to stored knowledge\n",
"\n",
"```python\n",
"# Claude spots this immediately:\n",
"async def fetch_all(self, endpoints):\n",
" for coro in asyncio.as_completed(tasks):\n",
" self.responses.append(await coro) # Same pattern!\n",
"```\n",
"\n",
"---\n",
"\n",
"**Why This Matters:**\n",
"\n",
"- ❌ **Syntax checkers** miss race conditions entirely\n",
"- ✅ **Claude learns** architectural patterns and applies them across contexts\n",
"- ✅ **Cross-language**: Pattern applies to Go, Java, Rust concurrency too\n",
"- ✅ **Gets better**: Each review adds to the knowledge base"
]
},
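{
"cell_type": "markdown",
"metadata": {},
"source": [
"The fix Claude carries across both sessions is the same idea: have each worker return its result instead of appending to shared state. A minimal async sketch (the endpoint names and the `fetch_one` stub are illustrative):\n",
"\n",
"```python\n",
"import asyncio\n",
"\n",
"async def fetch_one(endpoint: str) -> dict:\n",
"    await asyncio.sleep(0)  # stand-in for a real network call\n",
"    return {'endpoint': endpoint, 'status': 200}\n",
"\n",
"async def fetch_all(endpoints: list) -> list:\n",
"    # gather() owns the result list and preserves input order,\n",
"    # so no coroutine mutates state shared with another.\n",
"    return await asyncio.gather(*(fetch_one(e) for e in endpoints))\n",
"\n",
"responses = asyncio.run(fetch_all(['/users', '/orders', '/items']))\n",
"```"
]
},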
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sample Code Files\n",
"\n",
"The demo uses these sample files (all have concurrency/thread-safety bugs):\n",
"\n",
"- `memory_demo/sample_code/web_scraper_v1.py` - Race condition: threads modifying shared state\n",
"- `memory_demo/sample_code/api_client_v1.py` - Similar concurrency bug in async context\n",
"- `memory_demo/sample_code/data_processor_v1.py` - Multiple concurrency issues for long session demo\n",
"\n",
"Let's look at one:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**`memory_demo/sample_code/web_scraper_v1.py`**\n",
"\n",
"```python\n",
"\"\"\"\n",
"Concurrent web scraper with a race condition bug.\n",
"Multiple threads modify shared state without synchronization.\n",
"\"\"\"\n",
"\n",
"import time\n",
"from concurrent.futures import ThreadPoolExecutor, as_completed\n",
"from typing import Any, Dict, List\n",
"\n",
"import requests\n",
"\n",
"\n",
"class WebScraper:\n",
" \"\"\"Web scraper that fetches multiple URLs concurrently.\"\"\"\n",
"\n",
" def __init__(self, max_workers: int = 10):\n",
" self.max_workers = max_workers\n",
" self.results = [] # BUG: Shared mutable state accessed by multiple threads!\n",
" self.failed_urls = [] # BUG: Another race condition!\n",
"\n",
"    def fetch_url(self, url: str) -> Dict[str, Any]:\n",
" \"\"\"Fetch a single URL and return the result.\"\"\"\n",
" try:\n",
" response = requests.get(url, timeout=5)\n",
" response.raise_for_status()\n",
" return {\n",
" \"url\": url,\n",
" \"status\": response.status_code,\n",
" \"content_length\": len(response.content),\n",
" }\n",
" except requests.exceptions.RequestException as e:\n",
" return {\"url\": url, \"error\": str(e)}\n",
"\n",
"    def scrape_urls(self, urls: List[str]) -> List[Dict[str, Any]]:\n",
" \"\"\"\n",
" Scrape multiple URLs concurrently.\n",
"\n",
" BUG: self.results is accessed from multiple threads without locking!\n",
" This causes race conditions where results can be lost or corrupted.\n",
" \"\"\"\n",
" with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n",
" futures = [executor.submit(self.fetch_url, url) for url in urls]\n",
"\n",
" for future in as_completed(futures):\n",
" result = future.result()\n",
"\n",
" # RACE CONDITION: Multiple threads append to self.results simultaneously\n",
" if \"error\" in result:\n",
" self.failed_urls.append(result[\"url\"]) # RACE CONDITION\n",
" else:\n",
" self.results.append(result) # RACE CONDITION\n",
"\n",
" return self.results\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Bug**: Multiple threads modify `self.results` and `self.failed_urls` without locking!\n",
"\n",
"Claude will:\n",
"1. Identify the race conditions\n",
"2. Store the pattern in `/memories/concurrency_patterns/thread_safety.md`\n",
"3. Apply this concurrency pattern to async code in Session 2"
]
},
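{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of the kind of fix Claude proposes: guard the shared list with a lock (`SafeScraper` is illustrative, and `fetch_url` is stubbed out here; the real version performs the HTTP request):\n",
"\n",
"```python\n",
"import threading\n",
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"class SafeScraper:\n",
"    def __init__(self, max_workers: int = 10):\n",
"        self.max_workers = max_workers\n",
"        self.results = []\n",
"        self._lock = threading.Lock()  # guards self.results\n",
"\n",
"    def fetch_url(self, url: str) -> dict:\n",
"        return {'url': url, 'status': 200}  # stub for requests.get(...)\n",
"\n",
"    def scrape_urls(self, urls: list) -> list:\n",
"        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n",
"            futures = [executor.submit(self.fetch_url, u) for u in urls]\n",
"            for future in futures:\n",
"                with self._lock:  # appends are now serialized\n",
"                    self.results.append(future.result())\n",
"        return self.results\n",
"```\n",
"\n",
"An even simpler variant Claude also suggests is to drop the instance state entirely and build the result list locally in `scrape_urls`, so there is no shared mutable state to protect."
]
},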
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demo Overview\n",
"\n",
"We've built a complete Code Review Assistant. The implementation is in `memory_demo/code_review_demo.py`.\n",
"\n",
"**To run the interactive demo:**\n",
"```bash\n",
"python memory_demo/code_review_demo.py\n",
"```\n",
"\n",
"The demo walks through three sessions:\n",
"1. **Session 1**: Review Python code with a bug → Claude learns the pattern\n",
"2. **Session 2**: Review similar code (new conversation) → Claude applies the pattern\n",
"3. **Session 3**: Long review session → Context editing keeps it manageable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Best Practices & Security {#best-practices}\n",
"\n",
"### Memory Management\n",
"\n",
"**Do:**\n",
"- ✅ Store task-relevant patterns, not conversation history\n",
"- ✅ Organize with clear directory structure\n",
"- ✅ Use descriptive file names\n",
"- ✅ Periodically review and clean up memory\n",
"\n",
"**Don't:**\n",
"- ❌ Store sensitive information (passwords, API keys, PII)\n",
"- ❌ Let memory grow unbounded\n",
"- ❌ Store everything indiscriminately\n",
"\n",
"### Security: Path Traversal Protection\n",
"\n",
"**Critical**: Always validate paths to prevent directory traversal attacks. See `memory_tool.py` for implementation.\n",
"\n",
"### Security: Memory Poisoning\n",
"\n",
"**⚠️ Critical Risk**: Memory files are read back into Claude's context, making them a potential vector for prompt injection.\n",
"\n",
"**Mitigation strategies:**\n",
"1. **Content Sanitization**: Filter dangerous patterns before storing\n",
"2. **Memory Scope Isolation**: Keep memory separate per user and per project\n",
"3. **Memory Auditing**: Log and scan all memory operations\n",
"4. **Prompt Engineering**: Instruct Claude to ignore instructions in memory\n",
"\n",
"See `memory_tool.py` for complete security implementation and tests in `tests/`."
]
},
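{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the path check (the real `memory_tool.py` does more; `validate_memory_path` and its exact behavior here are illustrative):\n",
"\n",
"```python\n",
"from pathlib import Path\n",
"\n",
"def validate_memory_path(base_path: str, memory_path: str) -> Path:\n",
"    # All memory paths must live under the virtual /memories root\n",
"    if memory_path != '/memories' and not memory_path.startswith('/memories/'):\n",
"        raise ValueError(f'Path must start with /memories: {memory_path}')\n",
"    base = Path(base_path).resolve()\n",
"    # Map the virtual prefix onto the real storage directory, then\n",
"    # resolve to collapse any ../ components before checking containment\n",
"    relative = memory_path.removeprefix('/memories').lstrip('/')\n",
"    candidate = (base / relative).resolve()\n",
"    if not candidate.is_relative_to(base):\n",
"        raise ValueError(f'Path escapes memory root: {memory_path}')\n",
"    return candidate\n",
"```\n",
"\n",
"Run this check on every path Claude supplies before touching the filesystem; a traversal attempt like `/memories/../../etc/passwd` resolves outside the base directory and is rejected."
]
},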
{
"cell_type": "markdown",
"metadata": {},
"source": "## Next Steps\n\n### Resources\n\n- **API Docs**: [docs.claude.com](https://docs.claude.com)\n- **GitHub Action**: [claude-code-action](https://github.com/anthropics/claude-code-action)\n- **Support**: [support.anthropic.com](https://support.anthropic.com)\n\n### Feedback\n\nMemory and context management are in **beta**. Share your feedback to help us improve!"
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}