mirror of
https://github.com/anthropics/claude-cookbooks.git
synced 2025-10-06 01:00:28 +03:00
suggested changes to memory introduction and agents tmp handling
@@ -29,12 +29,19 @@
   "source": [
    "### Introduction\n",
    "\n",
-   "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we're going to demonstrate a few different strategies for \"self-managed\" (llm-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing memory tooling, and we're excited to see how teams extend the ideas below.\n",
+   "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we demonstrate a few different strategies for \"self-managed\" (LLM-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing [memory tooling](https://www.anthropic.com/news/claude-4#:~:text=more%20on%20methodology.-,Model%20improvements,-In%20addition%20to), and we're excited to see how teams extend the ideas below.\n",
    "\n",
    "\n",
    "#### Why do we need to manage memory?\n",
    "\n",
-   "LLMs have finite context windows (200k tokens for Claude-4 Sonnet & Opus). Tactically this means that any request > 200k tokens will be truncated. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* context window of an LLM. Often, in practice, most tasks see performance degregation at thresholds significantly less that the maximum available context window. Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand."
+   "LLMs have finite context windows (200k tokens for Claude 4 Sonnet & Opus). Tactically, this means that any request greater than 200k tokens won't work. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* [context window](https://docs.anthropic.com/en/docs/build-with-claude/context-windows) of an LLM. See our tips for [long context prompting](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips) to learn more about effective context windows and best practices.\n",
+   "\n",
+   "In addition to the above, memory is important for the following reasons:\n",
+   "- **Context windows are a moving target:** Even if we could build infinitely long context windows, they'd never be enough—the real world produces far more data than any window can hold. When we expand from 200k to 2M tokens, users immediately need 20M for their growing codebases, longer conversations, and larger document collections.\n",
+   "- **Long context windows are computationally expensive:** Attention mechanisms scale quadratically—doubling context length quadruples compute cost. Most tasks only need a small fraction of available context, making it wasteful to process millions of irrelevant tokens. This is why humans don't memorize entire textbooks; we take notes and build mental models instead.\n",
+   "- **More efficient processing:** When LLMs write and maintain their own notes—saving successful strategies, key insights, and relevant context—they're effectively updating their capabilities in real-time without retraining. Models that excel at these operations can maintain coherent behavior over extremely long time horizons while using only a fraction of the computational resources required for full context windows.\n",
+   "\n",
+   "Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand."
   ]
  },
 {
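The revised introduction closes on the idea of discarding unnecessary tokens and keeping the relevant ones; that idea can be sketched as a simple context-budget trim. This sketch is illustrative rather than code from the notebook: the 4-characters-per-token heuristic and the function names are assumptions (a production system would use a real tokenizer).

```python
# Illustrative sketch: keep a message history inside a token budget by
# dropping the oldest messages first. The 4-chars-per-token estimate is
# a rough assumption, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    kept: list[str] = []
    used = 0
    # walk newest-to-oldest so recent context survives the trim
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["old note " * 100, "recent question?", "latest answer."]
trimmed = trim_to_budget(history, budget=50)
```

Walking newest-to-oldest is a deliberate choice here: when the budget runs out, it is the oldest material that gets dropped, which mirrors how the memory tools below offload stale context into notes.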
@@ -46,17 +53,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "# install deps\n",
     "%pip install -q -U anthropic python-dotenv nest_asyncio PyPDF2"
@@ -64,7 +63,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -92,28 +91,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "fatal: destination path '/tmp/anthropic-quickstarts' already exists and is not an empty directory.\n"
-     ]
-    }
-   ],
-   "source": [
-    "import sys \n",
-    "\n",
-    "# clone the agents quickstart implementation\n",
-    "!git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n",
-    "\n",
-    "# navigate to the agents quickstart implementation\n",
-    "!cd /tmp/anthropic-quickstarts\n",
-    "\n",
-    "sys.path.append(os.path.abspath('.'))"
-   ]
+   "outputs": [],
+   "source": "import sys \nimport os\n\n# Check if the repo already exists\nif not os.path.exists('/tmp/anthropic-quickstarts'):\n    # Clone the agents quickstart implementation\n    !git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\nelse:\n    print(\"Repository already exists at /tmp/anthropic-quickstarts\")\n\n# IMPORTANT: Insert at the beginning of sys.path to override any existing 'agents' modules\nif '/tmp/anthropic-quickstarts' not in sys.path:\n    sys.path.insert(0, '/tmp/anthropic-quickstarts')\n\n# Clear any cached imports of 'agents' module\nif 'agents' in sys.modules:\n    del sys.modules['agents']\nif 'agents.agent' in sys.modules:\n    del sys.modules['agents.agent']"
   },
   {
    "cell_type": "markdown",
@@ -124,17 +104,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Oh joy, another laptop problem. What's it doing? Blue-screening? Making strange noises? Becoming self-aware? I need details before I can wave my magical tech support wand.\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "import nest_asyncio\n",
     "nest_asyncio.apply()\n",
@@ -156,7 +128,7 @@
   "source": [
    "### Implementation 1: Simple Memory Tool\n",
    "\n",
-   "*Implementation borrowed from [Barry Zhang](https://github.com/ItsBarryZ)*. See the agents quick-start tools [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools) as well as the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).\n",
+   "*This implementation is a reflection of our agents quickstarts repo [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools). For more information on tool use, see the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).*\n",
    "\n",
    "The `SimpleMemory()` tool gives the model a scratchpad to manage memory. This is maintained as a single string that can be read or updated.\n",
    "\n",
@@ -164,7 +136,8 @@
    "\n",
    "\n",
    "<b>When would you use this?</b>\n",
-   "- You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
+   "\n",
+   "You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
    "\n",
    "<b><i>General Notes on Tool Use:</i></b> \n",
    "- Your tool descriptions should be clear and sufficiently detailed. The best way to guide model behavior around tools is by providing direction as to when / under what conditions tools should be used. \n",
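A minimal version of the single-string scratchpad this cell describes could look like the following. The class and method names are illustrative assumptions, not the quickstart repo's actual `SimpleMemory` implementation or its tool schema.

```python
# Sketch of a single-string scratchpad memory tool: the model can read the
# current contents or overwrite them. Names here are illustrative only.

class ScratchpadMemory:
    def __init__(self) -> None:
        self._contents = ""

    def read(self) -> str:
        """Return the current scratchpad contents."""
        return self._contents

    def update(self, new_contents: str) -> str:
        """Replace the scratchpad wholesale with new contents."""
        self._contents = new_contents
        return "memory updated"

memory = ScratchpadMemory()
memory.update("User prefers concise answers; project = agents demo.")
```

Because the whole string is rewritten on every update, the model itself decides what to keep and what to drop, which is exactly the "self-managed" behavior the introduction describes.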
@@ -173,7 +146,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -551,26 +524,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 57,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'type': 'file',\n",
-       " 'id': 'file_011CPN5QewZbKuHeB8gL1Fwr',\n",
-       " 'size_bytes': 32378962,\n",
-       " 'created_at': '2025-05-22T06:14:19.943000Z',\n",
-       " 'filename': 'SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf',\n",
-       " 'mime_type': 'application/pdf',\n",
-       " 'downloadable': False}"
-      ]
-     },
-     "execution_count": 57,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "import requests\n",
     "import mimetypes\n",
@@ -646,6 +602,7 @@
     " \n",
     "# example usage\n",
     "file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n",
+    "file_path = \"/Users/alexander/Downloads/Ground Lease - Desert Valley Medical Campus.pdf\"\n",
     "storage_manager = StorageManager(os.getenv(\"ANTHROPIC_API_KEY\"))\n",
     "uploaded = storage_manager.upload_file(file_path)\n",
     "storage_manager.get_file_metadata(uploaded['id'])"
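Before handing a hard-coded local path to an uploader like the `StorageManager` above, it can help to validate it first. The helper below is a hedged sketch built on the `mimetypes` and `os` modules the cell already imports; the size limit is an illustrative assumption, not an API constant.

```python
# Sketch: validate a local file before handing it to an uploader.
# The 500 MB cap below is illustrative, not a real API limit.
import mimetypes
import os

def describe_upload(file_path: str, max_bytes: int = 500 * 1024 * 1024) -> dict:
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"no such file: {file_path}")
    size = os.path.getsize(file_path)
    if size > max_bytes:
        raise ValueError(f"file too large to upload: {size} bytes")
    mime, _ = mimetypes.guess_type(file_path)
    return {
        "filename": os.path.basename(file_path),
        "size_bytes": size,
        "mime_type": mime or "application/octet-stream",
    }

# example with a throwaway file
import tempfile
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
    f.write(b"%PDF-1.4")
    tmp_path = f.name
info = describe_upload(tmp_path)
```

Failing fast on a missing or oversized file gives a clearer error than letting the upload request fail downstream.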
@@ -697,28 +654,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 55,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "memory\n",
-       "├── self_managed\n",
-       "│   ├── user_session_notes\n",
-       "│   │   ├── ongoing_projects.txt\n",
-       "│   │   └── preferences.txt\n",
-       "│   └── projects\n",
-       "│       └── building_agi.txt\n",
-       "└── files\n",
-       "    └── projects"
-      ]
-     },
-     "execution_count": 55,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "# example usage\n",
     "company_agent_memory = MemoryTree()\n",
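The removed output above shows the kind of hierarchical layout a file-based memory can take. The sketch below approximates it with nested dicts and a simple indented renderer; `add_path` and `render` are illustrative helpers, not the notebook's `MemoryTree` API.

```python
# Sketch: hierarchical memory as nested dicts, rendered as an indented tree
# (a simplified version of the box-drawing output shown in the diff).

def add_path(tree: dict, path: str) -> None:
    """Insert a slash-separated path into the nested-dict tree."""
    node = tree
    for part in path.split("/"):
        node = node.setdefault(part, {})

def render(tree: dict, indent: int = 0) -> list[str]:
    """Render the tree as indented lines, one node per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * indent + name)
        lines.extend(render(children, indent + 1))
    return lines

memory: dict = {}
add_path(memory, "self_managed/user_session_notes/preferences.txt")
add_path(memory, "self_managed/projects/building_agi.txt")
add_path(memory, "files/projects")
```

Nested dicts keep the sketch dependency-free; a real implementation would likely back each leaf with a file on disk, as the `files` branch in the output suggests.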
@@ -1034,31 +972,7 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/var/folders/40/m42jqbt54j90clf75tsn03kw0000gp/T/ipykernel_92531/3353802839.py:99: DeprecationWarning: on_submit is deprecated. Instead, set the .continuous_update attribute to False and observe the value changing with: mywidget.observe(callback, 'value').\n",
-      "  self.text_input.on_submit(self.on_send)\n"
-     ]
-    },
-    {
-     "data": {
-      "application/vnd.jupyter.widget-view+json": {
-       "model_id": "92bc4784ef0c462d9b737c14c040f508",
-       "version_major": 2,
-       "version_minor": 0
-      },
-      "text/plain": [
-       "HBox(children=(VBox(children=(Label(value='Chat'), Output(layout=Layout(border_bottom='1px solid #ccc', border…"
-      ]
-     },
-     "execution_count": 77,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "memory_tool = FileBasedMemoryTool() # or SimpleMemory() or CompactifyMemory(client) or FileBasedMemoryTool(storage_manager)\n",
     "model_config = {\n",
@@ -1100,4 +1014,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 2
-}
+}