claude-cookbooks/skills/retrieval_augmented_generation/guide.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Retrieval Augmented Generation\n",
    "\n",
    "Claude excels at a wide range of tasks, but it may struggle with queries specific to your unique business context. This is where Retrieval Augmented Generation (RAG) becomes invaluable. RAG enables Claude to leverage your internal knowledge bases or customer support documents, significantly enhancing its ability to answer domain-specific questions. Enterprises are increasingly building RAG applications to improve workflows in customer support, Q&A over internal company documents, financial & legal analysis, and much more.\n",
    "\n",
    "In this guide, we'll demonstrate how to build and optimize a RAG system using the Claude Documentation as our knowledge base. We'll walk you through:\n",
    "\n",
    "1) Setting up a basic RAG system using an in-memory vector database and embeddings from [Voyage AI](https://www.voyageai.com/).\n",
    "\n",
    "2) Building a robust evaluation suite. We'll go beyond 'vibes'-based evals and show you how to measure the retrieval pipeline and end-to-end performance independently.\n",
    "\n",
    "3) Implementing advanced techniques to improve RAG including summary indexing and re-ranking with Claude.\n",
    "\n",
    "Through a series of targeted improvements, we improved every metric below relative to a basic RAG pipeline, with the largest gains in MRR and end-to-end accuracy (we'll explain what all these metrics *mean* in a bit):\n",
    "\n",
    "- Avg Precision: 0.43 --> 0.44\n",
    "- Avg Recall: 0.66 --> 0.69\n",
    "- Avg F1 Score: 0.52 --> 0.54\n",
    "- Avg Mean Reciprocal Rank (MRR): 0.74 --> 0.87\n",
    "- End-to-End Accuracy: 71% --> 81%\n",
    "\n",
    "#### Note:\n",
    "\n",
    "The evaluations in this cookbook are meant to mirror a production evaluation system, so they can take a while to run. If you run them in full, you may hit rate limits unless you are in [Tier 2 or above](https://docs.claude.com/en/api/rate-limits). Consider skipping the full end-to-end eval if you're trying to conserve token usage.\n",
    "\n",
    "## Table of Contents\n",
    "\n",
    "1) Setup\n",
    "\n",
    "2) Level 1 - Basic RAG\n",
    "\n",
    "3) Building an Evaluation System\n",
    "\n",
    "4) Level 2 - Summary Indexing\n",
    "\n",
    "5) Level 3 - Summary Indexing and Re-Ranking"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "We'll need a few libraries, including:\n",
    "\n",
    "1) `anthropic` - to interact with Claude\n",
    "\n",
    "2) `voyageai` - to generate high quality embeddings\n",
    "\n",
    "3) `pandas`, `numpy`, `matplotlib`, `seaborn`, and `scikit-learn` - for data manipulation and visualization\n",
    "\n",
    "\n",
    "You'll also need API keys from [Anthropic](https://www.anthropic.com/) and [Voyage AI](https://www.voyageai.com/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install the libraries used in this guide (tqdm is imported later for progress bars)\n",
    "%pip install anthropic voyageai pandas numpy matplotlib seaborn tqdm\n",
    "%pip install -U scikit-learn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ['VOYAGE_API_KEY'] = \"VOYAGE KEY HERE\"\n",
    "os.environ['ANTHROPIC_API_KEY'] = \"ANTHROPIC KEY HERE\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "import anthropic\n",
    "import os\n",
    "\n",
    "client = anthropic.Anthropic(\n",
    "    # This is the default and can be omitted\n",
    "    api_key=os.getenv(\"ANTHROPIC_API_KEY\"),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Initialize a Vector DB Class\n",
    "\n",
    "In this example, we're using an in-memory vector DB, but for a production application, you may want to use a hosted solution. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import pickle\n",
    "import json\n",
    "import numpy as np\n",
    "import voyageai\n",
    "\n",
    "class VectorDB:\n",
    "    def __init__(self, name, api_key=None):\n",
    "        if api_key is None:\n",
    "            api_key = os.getenv(\"VOYAGE_API_KEY\")\n",
    "        self.client = voyageai.Client(api_key=api_key)\n",
    "        self.name = name\n",
    "        self.embeddings = []\n",
    "        self.metadata = []\n",
    "        self.query_cache = {}\n",
    "        self.db_path = f\"./data/{name}/vector_db.pkl\"\n",
    "\n",
    "    def load_data(self, data):\n",
    "        if self.embeddings and self.metadata:\n",
    "            print(\"Vector database is already loaded. Skipping data loading.\")\n",
    "            return\n",
    "        if os.path.exists(self.db_path):\n",
    "            print(\"Loading vector database from disk.\")\n",
    "            self.load_db()\n",
    "            return\n",
    "\n",
    "        texts = [f\"Heading: {item['chunk_heading']}\\n\\n Chunk Text:{item['text']}\" for item in data]\n",
    "        self._embed_and_store(texts, data)\n",
    "        self.save_db()\n",
    "        print(\"Vector database loaded and saved.\")\n",
    "\n",
    "    def _embed_and_store(self, texts, data):\n",
    "        batch_size = 128\n",
    "        result = [\n",
    "            self.client.embed(\n",
    "                texts[i : i + batch_size],\n",
    "                model=\"voyage-2\"\n",
    "            ).embeddings\n",
    "            for i in range(0, len(texts), batch_size)\n",
    "        ]\n",
    "        self.embeddings = [embedding for batch in result for embedding in batch]\n",
    "        self.metadata = data\n",
    "\n",
    "    def search(self, query, k=5, similarity_threshold=0.75):\n",
    "        if query in self.query_cache:\n",
    "            query_embedding = self.query_cache[query]\n",
    "        else:\n",
    "            query_embedding = self.client.embed([query], model=\"voyage-2\").embeddings[0]\n",
    "            self.query_cache[query] = query_embedding\n",
    "\n",
    "        if not self.embeddings:\n",
    "            raise ValueError(\"No data loaded in the vector database.\")\n",
    "\n",
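    "        # Voyage embeddings are normalized to unit length, so the dot product equals cosine similarity\n",
    "        similarities = np.dot(self.embeddings, query_embedding)\n",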
    "        top_indices = np.argsort(similarities)[::-1]\n",
    "        top_examples = []\n",
    "        \n",
    "        for idx in top_indices:\n",
    "            if similarities[idx] >= similarity_threshold:\n",
    "                example = {\n",
    "                    \"metadata\": self.metadata[idx],\n",
    "                    \"similarity\": similarities[idx],\n",
    "                }\n",
    "                top_examples.append(example)\n",
    "                \n",
    "                if len(top_examples) >= k:\n",
    "                    break\n",
    "        self.save_db()\n",
    "        return top_examples\n",
    "\n",
    "    def save_db(self):\n",
    "        data = {\n",
    "            \"embeddings\": self.embeddings,\n",
    "            \"metadata\": self.metadata,\n",
    "            \"query_cache\": json.dumps(self.query_cache),\n",
    "        }\n",
    "        os.makedirs(os.path.dirname(self.db_path), exist_ok=True)\n",
    "        with open(self.db_path, \"wb\") as file:\n",
    "            pickle.dump(data, file)\n",
    "\n",
    "    def load_db(self):\n",
    "        if not os.path.exists(self.db_path):\n",
    "            raise ValueError(\"Vector database file not found. Use load_data to create a new database.\")\n",
    "        with open(self.db_path, \"rb\") as file:\n",
    "            data = pickle.load(file)\n",
    "        self.embeddings = data[\"embeddings\"]\n",
    "        self.metadata = data[\"metadata\"]\n",
    "        self.query_cache = json.loads(data[\"query_cache\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Level 1 - Basic RAG\n",
    "\n",
    "To get started, we'll set up a basic RAG pipeline using a bare-bones approach, often called 'Naive RAG'. A basic RAG pipeline includes the following 3 steps:\n",
    "\n",
    "1) Chunk documents by heading, so that each chunk contains only the content under one subheading\n",
    "\n",
    "2) Embed each chunk\n",
    "\n",
    "3) Use cosine similarity to retrieve the chunks most relevant to the query, then pass them to Claude as context for answering it"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading vector database from disk.\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "import matplotlib.pyplot as plt\n",
    "import xml.etree.ElementTree as ET\n",
    "from tqdm import tqdm\n",
    "import logging\n",
    "from typing import Callable, List, Dict, Any, Tuple, Set\n",
    "\n",
    "# Load the evaluation dataset\n",
    "with open('evaluation/docs_evaluation_dataset.json', 'r') as f:\n",
    "    eval_data = json.load(f)\n",
    "\n",
    "# Load the Claude Documentation\n",
    "with open('data/anthropic_docs.json', 'r') as f:\n",
    "    anthropic_docs = json.load(f)\n",
    "\n",
    "# Initialize the VectorDB\n",
    "db = VectorDB(\"anthropic_docs\")\n",
    "db.load_data(anthropic_docs)\n",
    "\n",
    "def retrieve_base(query, db):\n",
    "    results = db.search(query, k=3)\n",
    "    context = \"\"\n",
    "    for result in results:\n",
    "        chunk = result['metadata']\n",
    "        context += f\"\\n{chunk['text']}\\n\"\n",
    "    return results, context\n",
    "\n",
    "def answer_query_base(query, db):\n",
    "    documents, context = retrieve_base(query, db)\n",
    "    prompt = f\"\"\"\n",
    "    You have been tasked with helping us to answer the following query: \n",
    "    <query>\n",
    "    {query}\n",
    "    </query>\n",
    "    You have access to the following documents which are meant to provide context as you answer the query:\n",
    "    <documents>\n",
    "    {context}\n",
    "    </documents>\n",
    "    Please remain faithful to the underlying context, and only deviate from it if you are 100% sure that you know the answer already. \n",
    "    Answer the question now, and avoid providing preamble such as 'Here is the answer', etc\n",
    "    \"\"\"\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-haiku-20240307\",\n",
    "        max_tokens=2500,\n",
    "        messages=[\n",
    "            {\"role\": \"user\", \"content\": prompt}\n",
    "        ],\n",
    "        temperature=0\n",
    "    )\n",
    "    return response.content[0].text"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Eval Setup\n",
    "\n",
    "When evaluating RAG applications, it's critical to evaluate the performance of the retrieval system and the end-to-end system separately.\n",
    "\n",
    "We synthetically generated an evaluation dataset of 100 samples, each of which includes:\n",
    "- A question\n",
    "- Chunks from our docs which are relevant to that question. This is what we expect our retrieval system to retrieve when the question is asked.\n",
    "- A correct answer to the question\n",
    "\n",
    "This is a relatively challenging dataset. Some of our questions require synthesis across more than one chunk to be answered correctly, so it's important that our system can load in more than one chunk at a time. You can inspect the dataset by opening `evaluation/docs_evaluation_dataset.json`.\n",
    "\n",
    "Run the next cell to see a preview of the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Preview of the first 3 items from evaluation/docs_evaluation_dataset.json:\n",
      "[\n",
      "  {\n",
      "    \"id\": \"efc09699\",\n",
      "    \"question\": \"How can you create multiple test cases for an evaluation in the Anthropic Evaluation tool?\",\n",
      "    \"correct_chunks\": [\n",
      "      \"https://docs.claude.com/en/docs/test-and-evaluate/eval-tool#creating-test-cases\",\n",
      "      \"https://docs.claude.com/en/docs/build-with-claude/develop-tests#building-evals-and-test-cases\"\n",
      "    ],\n",
      "    \"correct_answer\": \"To create multiple test cases in the Anthropic Evaluation tool, click the 'Add Test Case' button, fill in values for each variable in your prompt, and repeat the process to create additional test case scenarios.\"\n",
      "  },\n",
      "  {\n",
      "    \"id\": \"1305ea00\",\n",
      "    \"question\": \"What embeddings provider does Anthropic recommend for customized domain-specific models, and what capabilities does this provider offer?\",\n",
      "    \"correct_chunks\": [\n",
      "      \"https://docs.claude.com/en/docs/build-with-claude/embeddings#before-implementing-embeddings\",\n",
      "      \"https://docs.claude.com/en/docs/build-with-claude/embeddings#how-to-get-embeddings-with-anthropic\"\n",
      "    ],\n",
      "    \"correct_answer\": \"Anthropic recommends Voyage AI for embedding models. Voyage AI offers customized models for specific industry domains like finance and healthcare, as well as bespoke fine-tuned models for individual customers. They have a wide variety of options and capabilities.\"\n",
      "  },\n",
      "  {\n",
      "    \"id\": \"1811c10d\",\n",
      "    \"question\": \"What are some key success metrics to consider when evaluating Claude's performance on a classification task, and how do they relate to choosing the right model to reduce latency?\",\n",
      "    \"correct_chunks\": [\n",
      "      \"https://docs.claude.com/en/docs/about-claude/use-cases/classification#evaluation-metrics\",\n",
      "      \"https://docs.claude.com/en/docs/test-and-evaluate/strengthen-guardrails/reduce-latency#1-choose-the-right-model\"\n",
      "    ],\n",
      "    \"correct_answer\": \"When evaluating Claude's performance on a classification task, some key success metrics to consider include accuracy, F1 score, consistency, structure, speed, bias and fairness. Choosing the right model that fits your specific requirements in terms of speed and output quality is a straightforward way to reduce latency and meet the acceptable response time for your use case.\"\n",
      "  }\n",
      "]\n",
      "\n",
      "Total number of items: 100\n"
     ]
    }
   ],
   "source": [
    "# Preview our eval dataset\n",
    "import json\n",
    "\n",
    "def preview_json(file_path, num_items=3):\n",
    "    try:\n",
    "        with open(file_path, 'r') as file:\n",
    "            data = json.load(file)\n",
    "            \n",
    "        if isinstance(data, list):\n",
    "            preview_data = data[:num_items]\n",
    "        elif isinstance(data, dict):\n",
    "            preview_data = dict(list(data.items())[:num_items])\n",
    "        else:\n",
    "            print(f\"Unexpected data type: {type(data)}. Cannot preview.\")\n",
    "            return\n",
    "        \n",
    "        print(f\"Preview of the first {num_items} items from {file_path}:\")\n",
    "        print(json.dumps(preview_data, indent=2))\n",
    "        print(f\"\\nTotal number of items: {len(data)}\")\n",
    "        \n",
    "    except FileNotFoundError:\n",
    "        print(f\"File not found: {file_path}\")\n",
    "    except json.JSONDecodeError:\n",
    "        print(f\"Invalid JSON in file: {file_path}\")\n",
    "    except Exception as e:\n",
    "        print(f\"An error occurred: {str(e)}\")\n",
    "\n",
    "preview_json('evaluation/docs_evaluation_dataset.json')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Metric Definitions\n",
    "We'll evaluate our system based on 5 key metrics: Precision, Recall, F1 Score, Mean Reciprocal Rank (MRR), and End-to-End Accuracy.\n",
    "\n",
    "## Retrieval Metrics:\n",
    "\n",
    "### Precision\n",
    "Precision represents the proportion of retrieved chunks that are actually relevant. It answers the question: \"Of the chunks we retrieved, how many were correct?\"\n",
    "\n",
    "Key points:\n",
    "- High precision indicates an efficient system with few false positives.\n",
    "- Low precision suggests many irrelevant chunks are being retrieved.\n",
    "- Our baseline retriever requests the top 3 chunks per query (`k=3`), so precision is capped below 1.0 whenever all 3 slots are filled but fewer than 3 chunks are correct.\n",
    "\n",
    "Formula:\n",
    "$$\n",
    "\\text{Precision} = \\frac{\\text{True Positives}}{\\text{Total Retrieved}} = \\frac{|\\text{Retrieved} \\cap \\text{Correct}|}{|\\text{Retrieved}|}\n",
    "$$\n",
    "\n",
    "### Recall\n",
    "Recall measures the completeness of our retrieval system. It answers the question: \"Of all the correct chunks that exist, how many did we manage to retrieve?\"\n",
    "\n",
    "Key points:\n",
    "- High recall indicates comprehensive coverage of necessary information.\n",
    "- Low recall suggests important chunks are being missed.\n",
    "- Recall is crucial for ensuring the LLM has access to all needed information.\n",
    "\n",
    "Formula:\n",
    "$$\n",
    "\\text{Recall} = \\frac{\\text{True Positives}}{\\text{Total Correct}} = \\frac{|\\text{Retrieved} \\cap \\text{Correct}|}{|\\text{Correct}|}\n",
    "$$\n",
    "\n",
    "### F1 Score\n",
    "The F1 score provides a balanced measure between precision and recall. It's particularly useful when you need a single metric to evaluate system performance, especially with uneven class distributions.\n",
    "\n",
    "Key points:\n",
    "- F1 score ranges from 0 to 1, with 1 representing perfect precision and recall.\n",
    "- It's the harmonic mean of precision and recall, tending towards the lower of the two values.\n",
    "- Useful in scenarios where both false positives and false negatives are important.\n",
    "\n",
    "Formula:\n",
    "$$\n",
    "\\text{F1 Score} = 2 \\times \\frac{\\text{Precision} \\times \\text{Recall}}{\\text{Precision} + \\text{Recall}}\n",
    "$$\n",
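    "\n",
    "As a worked example (hypothetical numbers, not drawn from the evaluation set), suppose a query has 2 correct chunks and the retriever returns 3 chunks, 1 of which is correct:\n",
    "$$\n",
    "\\text{Precision} = \\frac{1}{3} \\approx 0.33, \\quad \\text{Recall} = \\frac{1}{2} = 0.50, \\quad \\text{F1 Score} = 2 \\times \\frac{\\frac{1}{3} \\times \\frac{1}{2}}{\\frac{1}{3} + \\frac{1}{2}} = 0.40\n",
    "$$\n",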
    "\n",
    "Interpreting F1 score:\n",
    "- An F1 score of 1.0 indicates perfect precision and recall.\n",
    "- An F1 score of 0.0 indicates the worst performance.\n",
    "- Generally, the higher the F1 score, the better the overall performance.\n",
    "\n",
    "### Balancing Precision, Recall, and F1 Score:\n",
    "- There's often a trade-off between precision and recall.\n",
    "- Always requesting the top 3 chunks per query, regardless of how many are actually relevant, favors recall over precision.\n",
    "- The optimal balance depends on the specific use case.\n",
    "- In many RAG systems, high recall is often prioritized, as LLMs can filter out less relevant information during generation.\n",
    "\n",
    "### Mean Reciprocal Rank (MRR) @k\n",
    "MRR measures how well our system ranks relevant information. It helps us understand how quickly a user would find what they're looking for if they started from the top of our retrieved results.\n",
    "\n",
    "Key points:\n",
    "- MRR ranges from 0 to 1, where 1 is perfect (correct answer always first).\n",
    "- It only considers the rank of the first correct result for each query.\n",
    "- Higher MRR indicates better ranking of relevant information.\n",
    "\n",
    "Formula:\n",
    "$$\n",
    "\\text{MRR} = \\frac{1}{|Q|} \\sum_{i=1}^{|Q|} \\frac{1}{\\text{rank}_i}\n",
    "$$\n",
    "\n",
    "Where:\n",
    "- |Q| is the total number of queries\n",
    "- rank_i is the position of the first relevant item for the i-th query\n",
    "\n",
    "## End to End Metrics:\n",
    "\n",
    "### End to End Accuracy\n",
    "We use an LLM-as-judge (Claude 3.5 Sonnet) to evaluate whether the generated answer is correct based on the question and ground truth answer.\n",
    "\n",
    "Formula:\n",
    "$$\n",
    "\\text{End to End Accuracy} = \\frac{\\text{Number of Correct Answers}}{\\text{Total Number of Questions}}\n",
    "$$\n",
    "\n",
    "This metric evaluates the entire pipeline, from retrieval to answer generation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining Our Metric Calculation Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "def calculate_mrr(retrieved_links: List[str], correct_links: Set[str]) -> float:\n",
    "    for i, link in enumerate(retrieved_links, 1):\n",
    "        if link in correct_links:\n",
    "            return 1 / i\n",
    "    return 0\n",
    "\n",
    "def evaluate_retrieval(retrieval_function: Callable, evaluation_data: List[Dict[str, Any]], db: Any) -> Tuple[float, float, float, float, List[float], List[float], List[float]]:\n",
    "    precisions = []\n",
    "    recalls = []\n",
    "    mrrs = []\n",
    "    \n",
    "    for i, item in enumerate(tqdm(evaluation_data, desc=\"Evaluating Retrieval\")):\n",
    "        try:\n",
    "            retrieved_chunks, _ = retrieval_function(item['question'], db)\n",
    "            retrieved_links = [chunk['metadata'].get('chunk_link', chunk['metadata'].get('url', '')) for chunk in retrieved_chunks]\n",
    "        except Exception as e:\n",
    "            logging.error(f\"Error in retrieval function: {e}\")\n",
    "            continue\n",
    "\n",
    "        correct_links = set(item['correct_chunks'])\n",
    "        \n",
    "        true_positives = len(set(retrieved_links) & correct_links)\n",
    "        precision = true_positives / len(retrieved_links) if retrieved_links else 0\n",
    "        recall = true_positives / len(correct_links) if correct_links else 0\n",
    "        mrr = calculate_mrr(retrieved_links, correct_links)\n",
    "        \n",
    "        precisions.append(precision)\n",
    "        recalls.append(recall)\n",
    "        mrrs.append(mrr)\n",
    "        \n",
    "        if (i + 1) % 10 == 0:\n",
    "            print(f\"Processed {i + 1}/{len(evaluation_data)} items. Current Avg Precision: {sum(precisions) / len(precisions):.4f}, Avg Recall: {sum(recalls) / len(recalls):.4f}, Avg MRR: {sum(mrrs) / len(mrrs):.4f}\")\n",
    "    \n",
    "    avg_precision = sum(precisions) / len(precisions) if precisions else 0\n",
    "    avg_recall = sum(recalls) / len(recalls) if recalls else 0\n",
    "    avg_mrr = sum(mrrs) / len(mrrs) if mrrs else 0\n",
    "    f1 = 2 * (avg_precision * avg_recall) / (avg_precision + avg_recall) if (avg_precision + avg_recall) > 0 else 0\n",
    "    \n",
    "    return avg_precision, avg_recall, avg_mrr, f1, precisions, recalls, mrrs\n",
    "\n",
    "def evaluate_end_to_end(answer_query_function, db, eval_data):\n",
    "    correct_answers = 0\n",
    "    results = []\n",
    "    total_questions = len(eval_data)\n",
    "    \n",
    "    for i, item in enumerate(tqdm(eval_data, desc=\"Evaluating End-to-End\")):\n",
    "        query = item['question']\n",
    "        correct_answer = item['correct_answer']\n",
    "        generated_answer = answer_query_function(query, db)\n",
    "        \n",
    "        prompt = f\"\"\"\n",
    "        You are an AI assistant tasked with evaluating the correctness of answers to questions about Anthropic's documentation.\n",
    "        \n",
    "        Question: {query}\n",
    "        \n",
    "        Correct Answer: {correct_answer}\n",
    "        \n",
    "        Generated Answer: {generated_answer}\n",
    "        \n",
    "        Is the Generated Answer correct based on the Correct Answer? You should pay attention to the substance of the answer, and ignore minute details that may differ. \n",
    "        \n",
    "        Small differences or changes in wording don't matter. If the generated answer and correct answer are saying essentially the same thing then that generated answer should be marked correct. \n",
    "        \n",
    "        However, if there is any critical piece of information which is missing from the generated answer in comparison to the correct answer, then we should mark this as incorrect. \n",
    "        \n",
    "        Finally, if there are any direct contradictions between the correect answer and generated answer, we should deem the generated answer to be incorrect.\n",
    "        \n",
    "        Respond in the following XML format:\n",
    "        <evaluation>\n",
    "        <content>\n",
    "        <explanation>Your explanation here</explanation>\n",
    "        <is_correct>true/false</is_correct>\n",
    "        </content>\n",
    "        </evaluation>\n",
    "        \"\"\"\n",
    "        \n",
    "        try:\n",
    "            response = client.messages.create(\n",
    "                model=\"claude-3-5-sonnet-20241022\",\n",
    "                max_tokens=1500,\n",
    "                messages=[\n",
    "                    {\"role\": \"user\", \"content\": prompt},\n",
    "                    {\"role\": \"assistant\", \"content\": \"<evaluation>\"}\n",
    "                ],\n",
    "                temperature=0,\n",
    "                stop_sequences=[\"</evaluation>\"]\n",
    "            )\n",
    "            \n",
    "            response_text = response.content[0].text\n",
    "            print(response_text)\n",
    "            evaluation = ET.fromstring(response_text)\n",
    "            is_correct = evaluation.find('is_correct').text.lower() == 'true'\n",
    "            \n",
    "            if is_correct:\n",
    "                correct_answers += 1\n",
    "            results.append(is_correct)\n",
    "            \n",
    "            logging.info(f\"Question {i + 1}/{total_questions}: {query}\")\n",
    "            logging.info(f\"Correct: {is_correct}\")\n",
    "            logging.info(\"---\")\n",
    "            \n",
    "        except ET.ParseError as e:\n",
    "            logging.error(f\"XML parsing error: {e}\")\n",
    "            is_correct = 'true' in response_text.lower()\n",
    "            results.append(is_correct)\n",
    "        except Exception as e:\n",
    "            logging.error(f\"Unexpected error: {e}\")\n",
    "            results.append(False)\n",
    "        \n",
    "        if (i + 1) % 10 == 0:\n",
    "            current_accuracy = correct_answers / (i + 1)\n",
    "            print(f\"Processed {i + 1}/{total_questions} questions. Current Accuracy: {current_accuracy:.4f}\")\n",
    "        # time.sleep(2)\n",
    "    accuracy = correct_answers / total_questions\n",
    "    return accuracy, results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Helper Function to Plot Performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "def plot_performance(results_folder='evaluation/json_results', include_methods=None, colors=None):\n",
    "    # Set default colors\n",
    "    default_colors = ['skyblue', 'lightgreen', 'salmon']\n",
    "    if colors is None:\n",
    "        colors = default_colors\n",
    "    \n",
    "    # Load JSON files\n",
    "    results = []\n",
    "    for filename in os.listdir(results_folder):\n",
    "        if filename.endswith('.json'):\n",
    "            file_path = os.path.join(results_folder, filename)\n",
    "            with open(file_path, 'r') as f:\n",
    "                try:\n",
    "                    data = json.load(f)\n",
    "                    if 'name' not in data:\n",
    "                        print(f\"Warning: {filename} does not contain a 'name' field. Skipping.\")\n",
    "                        continue\n",
    "                    if include_methods is None or data['name'] in include_methods:\n",
    "                        results.append(data)\n",
    "                except json.JSONDecodeError:\n",
    "                    print(f\"Warning: {filename} is not a valid JSON file. Skipping.\")\n",
    "    \n",
    "    if not results:\n",
    "        print(\"No JSON files found with matching 'name' fields.\")\n",
    "        return\n",
    "    \n",
    "    # Validate data\n",
    "    required_metrics = [\"average_precision\", \"average_recall\", \"average_f1\", \"average_mrr\", \"end_to_end_accuracy\"]\n",
    "    for result in results.copy():\n",
    "        if not all(metric in result for metric in required_metrics):\n",
    "            print(f\"Warning: {result['name']} is missing some required metrics. Skipping.\")\n",
    "            results.remove(result)\n",
    "    \n",
    "    if not results:\n",
    "        print(\"No valid results remaining after validation.\")\n",
    "        return\n",
    "    \n",
    "    # Sort results based on end-to-end accuracy\n",
    "    results.sort(key=lambda x: x['end_to_end_accuracy'])\n",
    "    \n",
    "    # Prepare data for plotting\n",
    "    methods = [result['name'] for result in results]\n",
    "    metrics = required_metrics\n",
    "    \n",
    "    # Set up the plot\n",
    "    plt.figure(figsize=(14, 6))\n",
    "    sns.set_style(\"whitegrid\")\n",
    "    \n",
    "    x = range(len(metrics))\n",
    "    width = 0.8 / len(results)\n",
    "    \n",
    "    # Create color palette\n",
    "    num_methods = len(results)\n",
    "    color_palette = colors[:num_methods] + sns.color_palette(\"husl\", num_methods - len(colors))\n",
    "    \n",
    "    # Plot bars for each method\n",
    "    for i, (result, color) in enumerate(zip(results, color_palette)):\n",
    "        values = [result[metric] for metric in metrics]\n",
    "        offset = (i - len(results)/2 + 0.5) * width\n",
    "        bars = plt.bar([xi + offset for xi in x], values, width, label=result['name'], color=color)\n",
    "        \n",
    "        # Add value labels on the bars\n",
    "        for bar in bars:\n",
    "            height = bar.get_height()\n",
    "            plt.text(bar.get_x() + bar.get_width()/2., height,\n",
    "                     f'{height:.2f}', ha='center', va='bottom', fontsize=8)\n",
    "    \n",
    "    # Customize the plot\n",
    "    plt.xlabel('Metrics', fontsize=12)\n",
    "    plt.ylabel('Values', fontsize=12)\n",
    "    plt.title('RAG Performance Metrics (Sorted by End-to-End Accuracy)', fontsize=16)\n",
    "    plt.xticks(x, metrics, rotation=45, ha='right')\n",
    "    plt.legend(title='Methods', bbox_to_anchor=(1.05, 1), loc='upper left')\n",
    "    plt.ylim(0, 1)\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluating Our Base Case"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  13%|\u2588\u258e        | 13/100 [00:00<00:04, 17.92it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 10/100 items. Current Avg Precision: 0.5000, Avg Recall: 0.8000, Avg MRR: 0.8333\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  23%|\u2588\u2588\u258e       | 23/100 [00:01<00:04, 15.81it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 20/100 items. Current Avg Precision: 0.3833, Avg Recall: 0.6500, Avg MRR: 0.6333\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  33%|\u2588\u2588\u2588\u258e      | 33/100 [00:01<00:04, 16.36it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 30/100 items. Current Avg Precision: 0.4000, Avg Recall: 0.6556, Avg MRR: 0.6667\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  43%|\u2588\u2588\u2588\u2588\u258e     | 43/100 [00:02<00:03, 16.35it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 40/100 items. Current Avg Precision: 0.4500, Avg Recall: 0.6917, Avg MRR: 0.7250\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  53%|\u2588\u2588\u2588\u2588\u2588\u258e    | 53/100 [00:03<00:02, 16.13it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 50/100 items. Current Avg Precision: 0.4333, Avg Recall: 0.6733, Avg MRR: 0.7200\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  63%|\u2588\u2588\u2588\u2588\u2588\u2588\u258e   | 63/100 [00:03<00:02, 16.34it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 60/100 items. Current Avg Precision: 0.4278, Avg Recall: 0.6722, Avg MRR: 0.7333\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  73%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e  | 73/100 [00:04<00:01, 16.44it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 70/100 items. Current Avg Precision: 0.4167, Avg Recall: 0.6440, Avg MRR: 0.7048\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  83%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e | 83/100 [00:05<00:01, 16.29it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 80/100 items. Current Avg Precision: 0.4396, Avg Recall: 0.6823, Avg MRR: 0.7354\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  93%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e| 93/100 [00:05<00:00, 16.72it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 90/100 items. Current Avg Precision: 0.4352, Avg Recall: 0.6750, Avg MRR: 0.7333\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [00:06<00:00, 16.47it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 100/100 items. Current Avg Precision: 0.4283, Avg Recall: 0.6592, Avg MRR: 0.7367\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   1%|          | 1/100 [00:05<08:35,  5.21s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect. While it provides general guidance about test case creation, it misses the specific, critical information about HOW to actually create multiple test cases in the Anthropic Evaluation tool. The correct answer clearly states that you need to click the 'Add Test Case' button and fill in values for variables in your prompt. The generated answer instead talks about theoretical steps like organizing test cases in spreadsheets or JSON files, which isn't mentioned in the correct answer and may not be accurate. The generated answer seems to be providing general testing best practices rather than the specific mechanics of creating multiple test cases in the Anthropic Evaluation tool.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   2%|\u258f         | 2/100 [00:10<08:21,  5.12s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct in substance compared to the Correct Answer. Both answers identify Voyage AI as Anthropic's recommended embeddings provider and both mention that Voyage AI offers customized/fine-tuned models for specific domains and individual customers. While the Generated Answer provides more specific details about Voyage AI's model offerings that aren't mentioned in the Correct Answer, this additional information doesn't contradict the Correct Answer - it merely elaborates on it. The core claims about Voyage AI's capabilities for domain-specific customization and bespoke fine-tuning are consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   3%|\u258e         | 3/100 [00:16<08:45,  5.41s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful details. Both answers mention the same key success metrics: accuracy, F1 score, consistency, structure, speed, and bias/fairness. Both answers also discuss how choosing the right model affects latency and performance. While the Generated Answer goes into more specific detail about model choices (mentioning claude-3-haiku and Sonnet specifically), this additional information doesn't contradict the Correct Answer - it simply elaborates on it. The core message about balancing speed and output quality is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   4%|\u258d         | 4/100 [00:20<08:18,  5.19s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the same two key benefits of Claude for Sheets mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers mention the ability to test prompts across evaluation suites in parallel (with the Correct Answer adding that this is faster than sequential chained prompts)\n",
      "\n",
      "2. Both answers mention that Claude for Sheets is better at office tasks like survey analysis and data processing\n",
      "\n",
      "While the Correct Answer explicitly mentions that parallel testing is \"faster than running chained prompts sequentially,\" this additional detail doesn't change the core point being made. The Generated Answer captures the essential substance of both key benefits, just with slightly different wording.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   5%|\u258c         | 5/100 [00:25<07:44,  4.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that missing the \"\\n\\nHuman:\" and \"\\n\\nAssistant:\" turns in the prompt will result in an API error. The Generated Answer actually provides slightly more context by explaining that these turns are expected to indicate the start of human input and assistant response, but this additional detail doesn't change the fundamental correctness of the answer. There are no contradictions between the two answers, and no critical information is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   6%|\u258c         | 6/100 [00:31<08:33,  5.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer:\n",
      "\n",
      "1. It correctly states that tool use requests are priced the same way as regular API requests\n",
      "2. It accurately lists all the additional token sources that contribute to the total cost:\n",
      "   - Tools parameter\n",
      "   - Tool use content blocks\n",
      "   - Tool result content blocks\n",
      "   - Special system prompt\n",
      "3. It explains that these additional tokens are added to the normal input/output tokens to calculate the total cost\n",
      "\n",
      "The Generated Answer actually provides slightly more detail than the Correct Answer, but doesn't contradict it in any way. The core message that tool use requests follow the same pricing structure but include additional tokens that affect the total cost is preserved in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   7%|\u258b         | 7/100 [00:35<07:37,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer - specifically the June 27th, 2024 release date and mentions all the key features (API usage, billing details, and rate limits). While the Correct Answer provides slightly more detail by mentioning the specific tabs (Usage, Cost, and Rate Limits), this is a minor difference that doesn't affect the core meaning. Both answers communicate the same fundamental information about what features are coming and when they will be available.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   8%|\u258a         | 8/100 [00:40<07:43,  5.03s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses a critical component from the Correct Answer. While both answers discuss latency implications of CoT, the Generated Answer fails to mention one of the key decision factors - whether the task requires in-depth thinking that a human would need to work through. The Generated Answer focuses heavily on performance and latency considerations, essentially repeating the same point twice, but doesn't address the fundamental question of whether the task's complexity actually warrants using CoT in the first place. This is a significant omission since it's one of the two key factors mentioned in the Correct Answer for determining when CoT is appropriate.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   9%|\u2589         | 9/100 [00:46<07:51,  5.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides more detailed steps and additional capabilities compared to the Correct Answer, the core functionality described is the same - being able to upload PDFs and have Claude summarize their content to make it easier to understand long documents without reading everything. The Generated Answer expands on this basic concept with additional features like question-answering and data extraction, but it doesn't contradict or miss any critical information from the Correct Answer. The extra detail simply provides more context and possibilities for how to use Claude with PDFs.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  10%|\u2588         | 10/100 [00:49<06:57,  4.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. According to the Correct Answer, rate limits can be viewed in the \"Rate Limits tab\" of the Developer Console. However, the Generated Answer states they can be found in the \"Plans and Billing section.\" These are two different locations, representing a direct contradiction. The Generated Answer provides incorrect information about where to find this specific information in the Claude Console.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 10/100 questions. Current Accuracy: 0.7000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  11%|\u2588         | 11/100 [00:56<07:54,  5.33s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect when compared to the correct answer. While the generated answer provides several valid metrics for evaluating a classification system (F1 score, consistency, structure, speed, bias/fairness), it misses the specific metrics mentioned in the correct answer. The correct answer specifically mentions two key metrics: 95th percentile response time and average cost per classification. While the generated answer does mention \"speed\" in general terms, it doesn't mention cost at all, which is a critical metric specified in the correct answer. The generated answer provides different metrics that, while potentially useful, are not the specific metrics outlined in what appears to be Anthropic's documentation.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  12%|\u2588\u258f        | 12/100 [01:02<08:05,  5.52s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately describes both methods of specifying system prompts:\n",
      "\n",
      "1. For Text Completions API: Both answers indicate that the system prompt goes before the first \"\\n\\nHuman:\" turn in the prompt text.\n",
      "\n",
      "2. For Messages API: Both answers specify that the system prompt is provided using the \"system\" parameter in the API request.\n",
      "\n",
      "The Generated Answer actually provides helpful concrete code examples to illustrate these concepts, which goes beyond but doesn't contradict the Correct Answer. The substance and core information about how to specify system prompts in both APIs is consistent between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 9, column 2\n",
      "Evaluating End-to-End:  13%|\u2588\u258e        | 13/100 [01:10<09:09,  6.32s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>\n",
      "The generated answer, while detailed and structured, misses a key element from the correct answer. The correct answer specifically mentions using tags like <thinking> and <answer> in combination with chain of thought reasoning where Claude explains its step-by-step thinking process. While the generated answer does discuss using XML tags and breaking down tasks into steps, it doesn't explicitly mention the core concept of using <thinking> tags to prompt Claude to show its reasoning process.\n",
      "\n",
      "The generated answer focuses more on a general methodology of breaking down tasks and using XML tags for structure, rather than the specific combination of XML tags with chain of thought reasoning that the correct answer describes. The correct answer provides a more focused and specific approach about using tags to explicitly prompt Claude's reasoning process.\n",
      "\n",
      "Additionally, the correct answer provides a specific example of how to prompt Claude (\"Before answering, explain your reasoning step-by-step in <thinking> tags\"), which is a crucial piece of information missing from the generated answer.\n",
      "</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  14%|\u2588\u258d        | 14/100 [01:16<08:59,  6.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect for several reasons:\n",
      "\n",
      "1. While it correctly identifies the three general types of metrics (accuracy, cost, and speed), the specific numerical results provided are significantly different from the correct answer:\n",
      "- Generated answer claims 92% accuracy vs correct 89.01%\n",
      "- Generated answer claims $0.03 cost vs correct $0.0004 per request\n",
      "- Generated answer claims 50ms speed vs correct 1.61 seconds (95th percentile)\n",
      "\n",
      "2. The speed metric is described differently - the correct answer specifically mentions \"95th percentile response time\" while the generated answer just refers to \"average latency\"\n",
      "\n",
      "3. The cost metric is described differently - the correct answer specifies \"cost per request routing\" while the generated answer refers to total cost\n",
      "\n",
      "These differences represent material discrepancies in the actual performance metrics, making the generated answer incorrect despite correctly identifying the three general categories of metrics being measured.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  15%|\u2588\u258c        | 15/100 [01:22<08:40,  6.12s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect. While it provides detailed steps for implementing Claude more broadly, it does not match the specific pre-prompt engineering requirements mentioned in the correct answer. The correct answer focuses on three key elements that should be in place before starting prompt engineering:\n",
      "1. Clear definition of success criteria\n",
      "2. Ways to empirically test against those criteria\n",
      "3. A first draft prompt to improve\n",
      "\n",
      "The generated answer instead discusses broader implementation steps like scoping use cases, designing integrations, and preparing data. While these may be useful steps generally, they are not the specific prerequisites for prompt engineering that Anthropic recommends according to the correct answer. The answers are fundamentally discussing different things.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  16%|\u2588\u258c        | 16/100 [01:28<08:12,  5.87s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is correct. The core information about how mid-response prompting works in both APIs matches the correct answer - specifically that the Messages API uses the assistant role in the last message to continue a response, while the Text Completions API allows pre-filling in the prompt string directly. While the generated answer includes additional details about streaming formats and response generation that aren't in the correct answer, these don't contradict the core information and are just supplementary details. The essential mechanism for handling mid-response prompting is accurately described in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  17%|\u2588\u258b        | 17/100 [01:34<08:27,  6.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential point made in the Correct Answer - that Claude's response with a role-based system prompt produces a more detailed, structured, and actionable financial analysis compared to not having a specific role. In fact, the Generated Answer goes into even more specific detail about how the analysis differs, breaking down concrete examples of the improvements (like flagging CAC concerns and providing strategic recommendations). While it provides more granular details than the Correct Answer, it does not contradict anything in the Correct Answer and maintains the same core message about the role-based prompt leading to more insightful and structured analysis. The key comparison point about the quality difference between role-based and non-role-based responses is preserved in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  18%|\u2588\u258a        | 18/100 [01:42<08:56,  6.55s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>\n",
      "The Generated Answer is correct as it covers the key aspects mentioned in the Correct Answer and expands on them appropriately:\n",
      "\n",
      "1. It mentions key quantitative metrics, including F1 score, accuracy, and other relevant metrics, which aligns with the Correct Answer's mention of \"F1 score, accuracy, precision, and recall.\"\n",
      "\n",
      "2. It discusses how targets should be determined, mentioning industry benchmarks and prior experiments, which directly matches the Correct Answer's statement about basing targets on \"industry benchmarks, prior experiments, AI research, or expert knowledge.\"\n",
      "\n",
      "While the Generated Answer provides more specific examples and additional metrics (like response time and toxicity), this additional detail doesn't contradict the Correct Answer - it merely expands upon it. The core substance of both answers is aligned: they both emphasize the importance of quantitative metrics and describe how targets should be determined based on industry standards and prior work.\n",
      "\n",
      "There are no critical omissions or contradictions between the two answers.\n",
      "</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 9, column 182\n",
      "Evaluating End-to-End:  19%|\u2588\u2589        | 19/100 [01:46<07:49,  5.80s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key elements from the Correct Answer:\n",
      "1. The core concept of combining XML tags with other prompt engineering techniques\n",
      "2. Specifically mentions multishot prompting using <examples> tags\n",
      "3. Mentions chain of thought using <thinking> and <answer> tags\n",
      "4. Notes that this creates \"super-structured, high-performance prompts\"\n",
      "\n",
      "While the wording is slightly different, the substance and meaning are identical. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  20%|\u2588\u2588        | 20/100 [01:53<08:16,  6.20s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key elements from the Correct Answer:\n",
      "\n",
      "1. It explains that you need to provide both the output to grade and a detailed rubric to Claude\n",
      "2. It indicates that the LLM should evaluate based on the rubric criteria\n",
      "3. It mentions that the output should be a simple correct/incorrect judgment\n",
      "\n",
      "While the Generated Answer goes into more detail and provides additional implementation steps, it doesn't contradict anything in the Correct Answer. The core concept - using Claude to evaluate outputs against a rubric and provide a correct/incorrect judgment - is preserved. The additional detail simply elaborates on how to implement this approach practically.\n",
      "\n",
      "There are no critical pieces of information from the Correct Answer that are missing from the Generated Answer, and there are no contradictions between the two.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 20/100 questions. Current Accuracy: 0.6000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  21%|\u2588\u2588        | 21/100 [01:58<07:52,  5.99s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it contains all the essential steps and information present in the Correct Answer. Both answers outline the same key process:\n",
      "1. Accessing/subscribing to the model on AWS Marketplace\n",
      "2. Selecting the model and agreeing to terms\n",
      "3. Getting the Product ARN for the region\n",
      "4. Using JupyterLab in SageMaker Studio\n",
      "5. Following notebook instructions to deploy using the ARN\n",
      "\n",
      "The Generated Answer actually provides slightly more detail in its step-by-step breakdown, but the core information and process remains the same. There are no contradictions between the two answers, and no critical information is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  22%|\u2588\u2588\u258f       | 22/100 [02:04<07:43,  5.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect because it misses some key points from the correct answer and provides different guidance. The correct answer emphasizes three critical elements:\n",
      "1. Using a single tool\n",
      "2. Setting tool_choice explicitly\n",
      "3. Ensuring tool names/descriptions are from the model's perspective\n",
      "\n",
      "The generated answer instead focuses on schema definition, error handling, and testing, which, while potentially useful, are not the key points emphasized in the correct answer about tool setup and prompting. The generated answer does not mention the important aspect of tool_choice configuration or the perspective from which tools should be defined. While both answers discuss using tools to generate JSON, the specific guidance and key implementation details differ substantially.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  23%|\u2588\u2588\u258e       | 23/100 [02:12<08:17,  6.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect for two key reasons:\n",
      "\n",
      "1. It directly contradicts the correct answer regarding speed. The correct answer states that Claude 3 Haiku is \"faster\" than Claude Instant 1.2, while the generated answer incorrectly states that \"Claude Instant 1.2 model is described as a 'fast and efficient' predecessor\" and implies it's faster than Claude 3 Haiku.\n",
      "\n",
      "2. It includes potentially incorrect or unverified information about costs and context windows that isn't mentioned in the correct answer, which could be misleading.\n",
      "\n",
      "While the generated answer correctly identifies some aspects like vision capabilities and improved performance of Claude 3 Haiku, the contradiction regarding speed is a significant error that makes the answer incorrect overall. The correct answer is simpler and clearly states that Claude 3 Haiku is faster, more performant, and more intelligent, with vision capabilities and more up-to-date training data.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  24%|\u2588\u2588\u258d       | 24/100 [02:16<07:10,  5.67s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers emphasize the same key point - that using examples helps reduce misinterpretation of instructions and leads to more accurate outputs from Claude. While the Generated Answer includes some additional details about improving consistency and handling complex tasks, the core benefit about reducing misinterpretation is present in both answers. There are no contradictions between the answers, and the Generated Answer includes all critical information from the Correct Answer, just with additional elaboration.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  25%|\u2588\u2588\u258c       | 25/100 [02:21<06:45,  5.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer focuses on different advantages (resource efficiency, cost, and speed) compared to the Correct Answer, which emphasizes the ability to adapt models through context in prompts without retraining. While both answers discuss advantages of prompt engineering over fine-tuning, they highlight completely different benefits. The Generated Answer misses the key point about providing domain-specific context in prompts being the main advantage, and instead discusses technical implementation benefits. Since the Generated Answer does not capture the core advantage specified in the Correct Answer about easy domain adaptation through contextual prompts, it should be marked as incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  26%|\u2588\u2588\u258c       | 26/100 [02:24<06:03,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core information as the Correct Answer - that users can get started quickly by making a copy of a provided Claude for Sheets template workbook. While the Generated Answer breaks this down into more detailed steps, the fundamental message is identical. There are no contradictions between the answers, and no critical information is missing from the Generated Answer compared to the Correct Answer. The additional detail in the Generated Answer doesn't change the essential correctness of the response.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  27%|\u2588\u2588\u258b       | 27/100 [02:30<06:15,  5.15s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential meaning of the Correct Answer. Both answers explain that:\n",
      "\n",
      "1. The \"index\" field identifies which specific content block the text delta applies to\n",
      "2. The field is used to track/update content for specific blocks in the response\n",
      "3. Both imply the relationship between the index and the streaming of text content\n",
      "\n",
      "While they use slightly different wording and structure, the fundamental explanation of how the index field relates to text streaming and content blocks is consistent between both answers. The Generated Answer may be more technical in its explanation about \"cumulative results\" and \"Message content array,\" but it doesn't contradict or miss any critical information from the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  28%|\u2588\u2588\u258a       | 28/100 [02:36<06:32,  5.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it contains a critical error in how images are included in API requests. The Correct Answer specifies that images must be provided as base64-encoded content blocks within the messages array, while the Generated Answer incorrectly states that image files can be uploaded directly to the API. This is a fundamental technical difference in how images should be formatted and included in the request.\n",
      "\n",
      "While the Generated Answer correctly lists the supported image formats (JPEG, PNG, GIF, and WebP) and provides additional useful information about file size limits and usage restrictions, it misses the crucial technical requirement of base64 encoding. This encoding requirement is essential for proper API implementation, making this a significant omission that affects the accuracy of the answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  29%|\u2588\u2588\u2589       | 29/100 [02:42<06:37,  5.60s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that TTFT is a specific component/measure of overall latency, specifically measuring the time to generate the first token of a response. The Generated Answer actually provides additional relevant context about factors affecting TTFT and latency, but this extra information doesn't contradict the Correct Answer. The key relationship between TTFT and latency is accurately captured in both answers - that TTFT is a subset/component of overall latency that specifically focuses on time to first token generation. The Generated Answer also maintains the emphasis on this being important for model performance evaluation, which aligns with the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  30%|\u2588\u2588\u2588       | 30/100 [02:49<07:01,  6.03s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that providing examples of edge cases to Claude can improve its performance in routing support tickets. The Generated Answer actually goes into more detail by breaking down specific types of edge cases (implicit requests, emotional prioritization, intent vs. routing, and issue prioritization) and explaining how each type of example can help improve Claude's performance. While it provides more detail than the Correct Answer, it doesn't contradict it and maintains the same fundamental point about examples improving Claude's ability to handle edge cases in ticket routing. The substance and main message are aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 30/100 questions. Current Accuracy: 0.6000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  31%|\u2588\u2588\u2588       | 31/100 [02:55<07:00,  6.10s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the essential elements from the Correct Answer. Both answers explain that:\n",
      "\n",
      "1. The \"tool_use\" stop_reason indicates Claude has determined a tool is needed for the query\n",
      "2. This occurs when Claude constructs a tool use request\n",
      "3. The client needs to extract the tool input from Claude's request\n",
      "4. The tool code needs to be executed client-side\n",
      "5. The results need to be sent back to Claude\n",
      "\n",
      "While the Generated Answer uses slightly different wording and adds some additional context about this being the \"second step\" in the workflow, it maintains all the critical information and doesn't contradict anything in the Correct Answer. The core workflow and meaning are preserved between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  32%|\u2588\u2588\u2588\u258f      | 32/100 [03:00<06:29,  5.72s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key elements from the Correct Answer:\n",
      "1. It correctly identifies the error event as \"overloaded_error\"\n",
      "2. It specifies that this occurs during periods of high usage\n",
      "3. It correctly states that this corresponds to HTTP 529 error code in non-streaming contexts\n",
      "4. It properly contextualizes this within streaming responses\n",
      "\n",
      "The Generated Answer simply rephrases the same information in a slightly different way, but maintains all the critical substance and technical details. There are no contradictions or missing pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  33%|\u2588\u2588\u2588\u258e      | 33/100 [03:04<05:51,  5.24s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It identifies both types of deltas that can be contained in a content_block_delta event: text_delta and input_json_delta. While the formatting and presentation are slightly different (using a numbered list instead of prose), the substance and key information are exactly the same as the Correct Answer. Both answers convey the same two specific delta types without any omissions or contradictions.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  34%|\u2588\u2588\u2588\u258d      | 34/100 [03:09<05:25,  4.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. According to the Correct Answer, Claude 3.5 Sonnet and tool use became generally available on different dates:\n",
      "- Claude 3.5 Sonnet: June 20th, 2024\n",
      "- Tool use: May 30th, 2024\n",
      "\n",
      "The Generated Answer incorrectly states that both became available on the same date (June 20th, 2024). This is a critical factual error as it misses the key distinction that these were separate releases with different availability dates. The difference in timing between these releases is an important piece of information that is missing from the Generated Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  35%|\u2588\u2588\u2588\u258c      | 35/100 [03:13<05:08,  4.75s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same essential information: that Anthropic launched Claude.ai and the Claude iOS app in Europe first (in May 2024) and then in Canada (in June 2024). The Generated Answer provides specific dates (May 13th and June 5th) while the Correct Answer uses more general timing (May and June), but this level of detail doesn't change the fundamental accuracy of the sequence of events. The core substance - the order of launches and the months they occurred in - is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  36%|\u2588\u2588\u2588\u258c      | 36/100 [03:18<05:14,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the essential elements from the Correct Answer:\n",
      "\n",
      "1. It correctly identifies that \"tool_use\" indicates Claude has decided to use a tool\n",
      "2. It outlines the same key steps that need to be taken:\n",
      "   - Extracting the tool name and input\n",
      "   - Executing the tool code on the client side\n",
      "   - Sending back a new message with a tool_result content block\n",
      "\n",
      "While the wording is slightly different, the substance and technical accuracy are completely aligned with the Correct Answer. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  37%|\u2588\u2588\u2588\u258b      | 37/100 [03:22<04:52,  4.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same essential information as the Correct Answer. Both answers indicate that the anthropic library is used to interact with Claude/Anthropic's AI capabilities. While the Generated Answer provides slightly more detail by explaining what the anthropic library does, the core substance - that the anthropic library is the Python library used in the example - is consistent between both answers. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  38%|\u2588\u2588\u2588\u258a      | 38/100 [03:27<04:49,  4.67s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both main authentication methods described in the Correct Answer:\n",
      "\n",
      "1. Direct credential provision (AWS access key and secret key)\n",
      "2. Using default AWS credential providers (including ~/.aws/credentials file and environment variables)\n",
      "\n",
      "While the Correct Answer mentions the optional aws_session_token and has slightly different wording, these are minor details that don't change the core substance of the answer. The Generated Answer accurately conveys the two primary authentication methods and provides the same essential information about credential providers and environment variables.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  39%|\u2588\u2588\u2588\u2589      | 39/100 [03:33<05:03,  4.98s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the same two key factors mentioned in the Correct Answer:\n",
      "\n",
      "1. The risk/potential of prompt leaks (protecting sensitive information)\n",
      "2. The impact on model performance due to added complexity\n",
      "\n",
      "While the Generated Answer elaborates more on each factor with additional examples and details, the core substance and trade-off described is identical to the Correct Answer. Both answers emphasize the need to balance protecting against leaks with maintaining model performance. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  40%|\u2588\u2588\u2588\u2588      | 40/100 [03:39<05:25,  5.42s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Anthropic offers different Claude models with varying capabilities and performance characteristics\n",
      "2. Selecting the right model allows you to optimize for your specific needs\n",
      "3. The choice helps balance speed, intelligence, and cost\n",
      "\n",
      "While the Generated Answer provides more specific details about different models (like Haiku, Sonnet, and Opus) and includes additional information about evaluations and enterprise use cases, these details don't contradict the Correct Answer - they merely expand upon it. The fundamental point about choosing the appropriate model to reduce latency remains consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 40/100 questions. Current Accuracy: 0.6750\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  41%|\u2588\u2588\u2588\u2588      | 41/100 [03:44<05:18,  5.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer and even provides more detailed implementation examples. Both answers highlight the key points that:\n",
      "\n",
      "1. You use the client.messages.stream() method\n",
      "2. You iterate over the stream.text_stream attribute\n",
      "\n",
      "The Generated Answer expands on this with a practical code example and additional context, but the core information matches the Correct Answer completely. There are no contradictions or missing critical pieces between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  42%|\u2588\u2588\u2588\u2588\u258f     | 42/100 [03:50<05:15,  5.44s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the two key points from the Correct Answer:\n",
      "\n",
      "1. It explains that you can guide/shape Claude's response by pre-filling content (though it describes this slightly differently as including text in the \"content\" field of the \"assistant\" message)\n",
      "\n",
      "2. It correctly identifies that the \"max_tokens\" parameter is used to generate short responses by limiting the length of the output\n",
      "\n",
      "While the exact wording differs, the substance and core concepts are the same. There are no critical missing pieces of information or contradictions between the two answers. Both answers effectively communicate how to pre-fill responses and use max_tokens to control response length.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  43%|\u2588\u2588\u2588\u2588\u258e     | 43/100 [03:55<04:56,  5.20s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message: that when building an eval set, it's better to have a larger number of test cases with automated grading rather than fewer test cases with high-quality human grading. The Generated Answer expands on this with additional details about automated grading methods, but the fundamental point matches exactly with the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information from the Correct Answer. While the Generated Answer provides more detail, this additional context doesn't change or contradict the main point.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  44%|\u2588\u2588\u2588\u2588\u258d     | 44/100 [03:59<04:36,  4.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. According to the Correct Answer, the two required fields are \"index\" and \"delta\" (where \"delta\" contains the type and text), but the Generated Answer incorrectly states that the required fields are \"type\" and \"text\". This is a substantive difference in the structure of the content_block_delta event, not just a minor wording variation. The Generated Answer misses the critical \"index\" field requirement and incorrectly elevates \"type\" and \"text\" (which are actually nested within the \"delta\" field) to top-level required fields.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  45%|\u2588\u2588\u2588\u2588\u258c     | 45/100 [04:03<04:23,  4.79s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect. While it correctly mentions the Claude Cookbooks as one interactive learning resource, it fails to mention the Developer Console and its prompt generator tool, which is a key component mentioned in the correct answer. Instead, it references the \"More Resources\" section and documentation, which weren't identified in the correct answer as interactive learning methods. The generated answer therefore misses one of the two main interactive learning tools specified in the correct answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  46%|\u2588\u2588\u2588\u2588\u258c     | 46/100 [04:08<04:20,  4.82s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. The core concept from the Correct Answer - that breaking tasks into subtasks improves accuracy because each subtask gets Claude's full attention and reduces errors compared to handling everything at once - is fully captured in the Generated Answer's first point about accuracy. While the Generated Answer goes on to provide additional points about clarity and traceability, these are supplementary details that don't contradict the core concept. The essential reasoning about improved accuracy through focused attention on subtasks is present and aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  47%|\u2588\u2588\u2588\u2588\u258b     | 47/100 [04:13<04:17,  4.85s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key distinction mentioned in the Correct Answer - that Messages streaming responses can contain multiple content blocks of varying types, making them more complex than Text Completions streaming. While the Generated Answer provides additional details about the specific implementation differences, its core message aligns with the Correct Answer's main point about the fundamental difference in complexity and structure between the two streaming formats. There are no contradictions between the answers, and the Generated Answer includes all critical information from the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  48%|\u2588\u2588\u2588\u2588\u258a     | 48/100 [04:17<04:00,  4.62s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. The Correct Answer states that users can experiment with Claude through claude.ai and Anthropic's web Console, while the Generated Answer mentions completely different methods - using the API Quickstart and Workbench. These are substantively different approaches and do not align with what's stated in the Correct Answer. The Generated Answer misses both key methods mentioned in the Correct Answer (claude.ai and web Console) and instead provides different information. This represents a material difference in the substance of the answers, not just minor wording variations.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  49%|\u2588\u2588\u2588\u2588\u2589     | 49/100 [04:23<04:12,  4.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that chain prompts help reduce errors and inconsistencies by breaking complex tasks into smaller, more manageable subtasks that Claude can focus on individually. While the Generated Answer provides more detailed explanations and additional benefits (like traceability and debugging), it doesn't contradict the Correct Answer. The fundamental principle - that breaking tasks into smaller pieces helps reduce errors and maintain consistency - is preserved in both answers. The additional details in the Generated Answer simply elaborate on the basic concept without changing its essential meaning.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  50%|\u2588\u2588\u2588\u2588\u2588     | 50/100 [04:27<03:47,  4.54s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers state that an overloaded_error event corresponds to HTTP status code 529 in a non-streaming context for the Claude API. While the Correct Answer uses slightly more formal language (\"would normally correspond to\"), the core information - the 529 status code - is identical in both answers. The difference in phrasing does not change the fundamental meaning or accuracy of the response.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 50/100 questions. Current Accuracy: 0.6800\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  51%|\u2588\u2588\u2588\u2588\u2588     | 51/100 [04:31<03:36,  4.42s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the exact same two ways to specify the embedding format as mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers indicate that leaving the format unspecified will return embeddings as lists of floating-point numbers\n",
      "2. Both answers state that setting the format to \"base64\" will return the embeddings as Base64 encodings\n",
      "\n",
      "The Generated Answer simply presents the information in a more structured bullet-point format, but conveys the same essential information as the Correct Answer. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  52%|\u2588\u2588\u2588\u2588\u2588\u258f    | 52/100 [04:37<03:57,  4.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same essential information as the Correct Answer. Both answers explain that:\n",
      "\n",
      "1. Tool use content blocks are sent as partial JSON strings in content_block_delta events\n",
      "2. The client needs to accumulate these partial JSON strings\n",
      "3. The complete JSON can be parsed once a content_block_stop event is received\n",
      "4. Parsing can be done using Pydantic or SDK helpers\n",
      "\n",
      "The Generated Answer actually provides additional helpful detail by showing an example of the delta structure, but this doesn't contradict anything in the Correct Answer. The core concepts and process are described accurately and consistently between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  53%|\u2588\u2588\u2588\u2588\u2588\u258e    | 53/100 [04:41<03:42,  4.73s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies both tutorials (GitHub and Google Sheets) and correctly characterizes their key differences. The GitHub tutorial is described as more in-depth with examples, while the Google Sheets version is described as \"lighter weight,\" which aligns with the Correct Answer. The Generated Answer captures all the essential information from the Correct Answer, just with slightly different wording. There are no contradictions or missing critical pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  54%|\u2588\u2588\u2588\u2588\u2588\u258d    | 54/100 [04:50<04:35,  5.98s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more comprehensive detail than the Correct Answer. It covers all the key points mentioned in the Correct Answer:\n",
      "\n",
      "1. The 200K token context window\n",
      "2. Tool use capabilities for integration with specialized applications\n",
      "3. Multimodal input capabilities\n",
      "4. Enterprise-grade security and data handling for sensitive information\n",
      "\n",
      "The Generated Answer expands on these points and provides additional relevant details about Claude's enterprise capabilities, but does not contradict any information in the Correct Answer. While it is more detailed, the core capabilities mentioned in the Correct Answer are all present and accurately represented in the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  55%|\u2588\u2588\u2588\u2588\u2588\u258c    | 55/100 [04:53<03:53,  5.19s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it omits a key region where Claude.ai API and iOS app are available - the United States. While the Generated Answer correctly mentions Canada and Europe, leaving out the United States represents a significant omission of information. The availability in the United States is a critical piece of information present in the Correct Answer but missing from the Generated Answer. Therefore, despite getting some regions correct, the Generated Answer is incomplete and thus incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  56%|\u2588\u2588\u2588\u2588\u2588\u258c    | 56/100 [04:59<03:56,  5.37s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer:\n",
      "\n",
      "1. It correctly identifies the two main approaches (push-based with webhooks and pull-based)\n",
      "2. It accurately describes that the push-based approach is more scalable\n",
      "3. It mentions the security implications of exposing a public endpoint for the push-based approach\n",
      "4. It correctly states that the pull-based approach is easier to implement but has efficiency drawbacks due to unnecessary system calls\n",
      "\n",
      "The Generated Answer actually provides more detail and context than the Correct Answer, but all the core information matches. There are no contradictions between the two answers, and no critical pieces of information are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  57%|\u2588\u2588\u2588\u2588\u2588\u258b    | 57/100 [05:03<03:29,  4.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it omits a critical piece of information - the release date (May 10th, 2024). While the Generated Answer correctly states that the tool is available through the Developer Console interface, the timing of the release is an important factual detail that was included in the Correct Answer but missing from the Generated Answer. When dealing with product releases and announcements, the timing is typically considered crucial information.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  58%|\u2588\u2588\u2588\u2588\u2588\u258a    | 58/100 [05:09<03:36,  5.16s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it directly contradicts the Correct Answer. The Correct Answer specifically states that Claude 3 Sonnet provides the best balance of intelligence and speed for high-throughput tasks, while the Generated Answer claims that Claude 3 Haiku is the best model for these tasks. This is a fundamental disagreement about which model is most appropriate for these use cases. While both answers discuss balance between speed and intelligence, they reach opposite conclusions about which model achieves the optimal balance for high-throughput tasks like sales forecasting and targeted marketing. The fact that the Generated Answer provides detailed reasoning doesn't make it correct when it contradicts the core claim of the Correct Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  59%|\u2588\u2588\u2588\u2588\u2588\u2589    | 59/100 [05:13<03:25,  5.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information:\n",
      "\n",
      "1. They both state that you can use either dot product or cosine similarity to calculate the similarity between Voyage embedding vectors\n",
      "2. They both explain that these methods are equivalent because Voyage embeddings are normalized to length 1\n",
      "\n",
      "While the Generated Answer presents the information in a slightly different order and with additional explanation, the core substance is identical to the Correct Answer. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  60%|\u2588\u2588\u2588\u2588\u2588\u2588    | 60/100 [05:19<03:31,  5.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer and even expands on them in a complementary way. Both answers emphasize that examples help:\n",
      "1. Reduce misinterpretation of instructions\n",
      "2. Enforce consistent structure and style\n",
      "3. Guide Claude toward desired output/performance\n",
      "\n",
      "The Generated Answer provides additional details and examples, but these don't contradict the core message of the Correct Answer - they simply elaborate on it. The substance of both answers is fundamentally the same, even though they're worded differently. There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 60/100 questions. Current Accuracy: 0.6833\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  61%|\u2588\u2588\u2588\u2588\u2588\u2588    | 61/100 [05:25<03:29,  5.38s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies and describes both types of content block deltas:\n",
      "\n",
      "1. It correctly identifies \"Input JSON delta\" and explains that it contains partial JSON strings for tool input, which aligns with the Correct Answer's description of deltas containing a \"partial_json\" field with parts of the JSON object for tool input.\n",
      "\n",
      "2. It correctly identifies \"Text delta\" and explains that it contains text updates, which matches the Correct Answer's description of text deltas containing text strings.\n",
      "\n",
      "While the exact wording differs between the two answers, the substance and key information about both types of deltas are effectively the same. The Generated Answer even provides some additional context about how the deltas work (like mentioning content_block_stop events), but this extra information doesn't contradict the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  62%|\u2588\u2588\u2588\u2588\u2588\u2588\u258f   | 62/100 [05:30<03:20,  5.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it focuses on different capabilities than those mentioned in the Correct Answer. The Correct Answer specifically highlights Claude's question answering and text analysis capabilities (including sentiment analysis and preference understanding) as key enablers for interactive systems and personalization. In contrast, the Generated Answer discusses multimodal input and tool use/function calling capabilities. While these are indeed capabilities of Claude, they are not the specific capabilities highlighted in the Correct Answer for building interactive systems and personalizing user experiences. The Generated Answer, therefore, misses the core capabilities mentioned in the Correct Answer and instead provides different ones.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  63%|\u2588\u2588\u2588\u2588\u2588\u2588\u258e   | 63/100 [05:35<03:15,  5.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the essential elements and the correct sequence of events in a raw HTTP stream response that were mentioned in the Correct Answer:\n",
      "\n",
      "1. It mentions the message_start event coming first\n",
      "2. It describes the content block sequence (start, delta, stop)\n",
      "3. It includes the message_delta events\n",
      "4. It notes the final message_stop event\n",
      "5. It mentions that ping events can occur throughout\n",
      "\n",
      "The Generated Answer actually provides slightly more detail by mentioning that the message_start contains a Message object with empty content, but this additional detail doesn't contradict the Correct Answer. The substance and sequence of events is identical between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  64%|\u2588\u2588\u2588\u2588\u2588\u2588\u258d   | 64/100 [05:39<02:56,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys the same key information as the Correct Answer - specifically that the Claude API allows up to 20 images per request while the claude.ai interface has a lower limit of 5 images per turn. While the Generated Answer is more concise and uses slightly different wording, it captures the essential numerical limits accurately and maintains the key comparison between the two interfaces. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  65%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 65/100 [05:45<03:03,  5.24s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. The two answers provide fundamentally different solutions to the problem:\n",
      "\n",
      "1. Correct Answer: Simply increase the max_tokens parameter to get the complete response from Claude, including the full tool use block.\n",
      "\n",
      "2. Generated Answer: Extract partial information from the incomplete tool use, execute the tool client-side, and continue the conversation with the tool results.\n",
      "\n",
      "These are contradictory approaches. The Correct Answer provides a straightforward solution of just increasing max_tokens, while the Generated Answer suggests a more complex workaround that involves parsing incomplete tool use blocks and executing them client-side, which is not the recommended approach according to the documentation.\n",
      "\n",
      "The Generated Answer misses the key point that the simple solution is to increase max_tokens, and instead proposes a different and more complicated approach that could potentially lead to errors or unexpected behavior.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  66%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 66/100 [05:50<02:48,  4.95s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While both answers mention \"develop your test cases\" as one of the steps, they differ on the second step. The Correct Answer states that you need to \"take a look at Anthropic's guide to developing test cases\" while the Generated Answer states you need to \"build a strong input prompt.\" These are substantively different steps. The Generated Answer misses the critical piece about consulting Anthropic's guide, which is specifically mentioned in the Correct Answer, and instead introduces a different step that isn't mentioned in the Correct Answer. This represents a meaningful difference in the substance of what needs to be done before running a classification evaluation.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  67%|\u2588\u2588\u2588\u2588\u2588\u2588\u258b   | 67/100 [05:54<02:38,  4.82s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it discusses using the content parameter to influence Claude's responses, it misses the critical specific detail from the Correct Answer about needing to place the content in the last position of the messages list with an \"assistant\" role. The Generated Answer instead talks more generally about using content to simulate conversations with user and assistant messages, which is not the same specific mechanism described in the Correct Answer. The key technical detail about position and role is missing, which makes this answer incomplete and potentially misleading.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  68%|\u2588\u2588\u2588\u2588\u2588\u2588\u258a   | 68/100 [06:00<02:40,  5.03s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both key advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. It accurately conveys that prompt engineering is more effective at helping models understand and utilize external content/retrieved documents compared to fine-tuning.\n",
      "\n",
      "2. It correctly explains that prompt engineering preserves the model's general knowledge/capabilities, while fine-tuning risks catastrophic forgetting.\n",
      "\n",
      "The Generated Answer even uses very similar phrasing and presents the same core concepts, just structured slightly differently with additional quoted references. There are no missing critical pieces of information and no contradictions between the two answers. They are substantively identical in their key points.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  69%|\u2588\u2588\u2588\u2588\u2588\u2588\u2589   | 69/100 [06:05<02:34,  4.99s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it describes some technical aspects of using the API, it misses one of the key initial setup steps specified in the Correct Answer - installing and configuring the AWS CLI. The Generated Answer jumps straight into authentication and client creation details, but skips over the fundamental prerequisite of having the AWS CLI installed and configured. Additionally, the Correct Answer mentions the need to install an SDK for accessing Bedrock, which is not explicitly mentioned in the Generated Answer. These are important initial setup steps that are materially different from the authentication and client creation steps described in the Generated Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 3, column 601\n",
      "Evaluating End-to-End:  70%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 70/100 [06:09<02:27,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is completely correct. It provides the exact same AWS CLI command as the Correct Answer (`aws bedrock list-foundation-models --region=<region> --by-provider anthropic --query \"modelSummaries[*].modelId\"`), explains that you need to replace `<region>` with your desired region (giving the same example of `us-west-2`), and correctly states that this will list the available Claude models in that region. The substance and technical details are identical between both answers, with only minor differences in phrasing that don't affect the accuracy of the information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 70/100 questions. Current Accuracy: 0.6429\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  71%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 71/100 [06:14<02:19,  4.80s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key information from the Correct Answer - specifically that the `input_type` argument can be passed with values of either \"query\" or \"document\" to specify the type of input text. In fact, the Generated Answer provides additional helpful context about how this parameter affects the embedding process, but this extra detail doesn't contradict or invalidate the core correct information. The substance of both answers is the same - they both accurately describe the `input_type` argument and its valid values.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  72%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f  | 72/100 [06:19<02:15,  4.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is missing a critical piece of information that is present in the Correct Answer. While it correctly describes the basic difference between tool_use deltas (partial JSON strings for input field) and text deltas (simple text updates), it fails to mention that tool_use deltas may have delays between streaming events as the model emits one complete key-value pair at a time. This timing/delay characteristic is an important distinction mentioned in the Correct Answer that is completely absent from the Generated Answer. Since this represents a meaningful omission of a key technical detail about how the streaming works, the Generated Answer cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  73%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e  | 73/100 [06:23<02:07,  4.72s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It provides the exact same file size limits as the Correct Answer - 5MB for API uploads and 10MB for claude.ai uploads. The Generated Answer simply presents this information in a slightly different format (bullet points) and adds a minor detail about error messages, but the core information about the file size limits matches perfectly with the Correct Answer. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  74%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d  | 74/100 [06:28<02:05,  4.81s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core concept as the Correct Answer - the need to choose an appropriate model that balances performance requirements (specifically speed/latency) with capabilities. While the Generated Answer provides more specific examples (mentioning Claude 3 Haiku, Sonnet, and Opus), and goes into more detail about model sizes, the fundamental message about selecting a model that meets latency requirements while maintaining necessary performance is consistent with the Correct Answer. The Generated Answer doesn't contradict anything in the Correct Answer, and includes the key consideration of evaluating models based on specific use case requirements.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  75%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 75/100 [06:33<02:00,  4.83s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the key points from the Correct Answer:\n",
      "1. It correctly identifies voyage-code-2 as the recommended embedding model\n",
      "2. It correctly states that according to Voyage AI, the model offers 17% better performance compared to alternatives\n",
      "\n",
      "The only minor detail missing from the Generated Answer is that the model \"achieves state-of-the-art results on general-purpose corpora.\" However, this is a supplementary detail rather than a critical piece of information about the core recommendation and performance claim. The essential substance about the model choice and its comparative performance advantage is preserved.\n",
      "\n",
      "There are no contradictions between the answers, and they convey the same fundamental information about the model recommendation and its performance benefits.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  76%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 76/100 [06:37<01:48,  4.52s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides slightly more detail than the Correct Answer by mentioning the integration of external tools and functions, both answers highlight the key point about the Cookbook providing interactive Jupyter notebooks that demonstrate PDF handling and embeddings functionality. The Generated Answer captures the essential elements from the Correct Answer and expands upon them without contradicting the core information. The differences are in elaboration rather than substance.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  77%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b  | 77/100 [06:43<01:56,  5.08s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the key relationship between context window size and RAG effectiveness described in the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. A larger context window enables better utilization of retrieved information\n",
      "2. The size of the context window directly impacts how much retrieved information can be used\n",
      "3. This affects the quality and effectiveness of the generated responses\n",
      "\n",
      "While the Generated Answer uses slightly different wording and provides some additional details about coherence and consistency, it maintains the core concept that context window size determines how much retrieved information can be effectively utilized in RAG. There are no contradictions with the Correct Answer, and no critical pieces of information are missing.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  78%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a  | 78/100 [06:50<02:00,  5.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the key elements from the Correct Answer and even expands on them in a helpful way. Both answers emphasize:\n",
      "\n",
      "1. The tool's ability to identify edge cases where prompts might not perform well\n",
      "2. The capability to rate individual results to assess prompt performance\n",
      "3. The importance of ensuring consistent performance across different inputs\n",
      "4. The ability to refine prompts based on evaluation results\n",
      "5. The goal of building more robust AI applications\n",
      "\n",
      "The Generated Answer actually provides additional helpful detail about the iterative nature of the process, but this doesn't contradict the Correct Answer - it merely elaborates on it. The core substance and main points are aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  79%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589  | 79/100 [06:54<01:46,  5.05s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers state that Claude 3 Haiku has the fastest comparative latency. The Generated Answer conveys exactly the same information as the Correct Answer, just with slightly different phrasing. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  80%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 80/100 [07:01<01:51,  5.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key concept from the Correct Answer that you need to send the full conversation history with each request to maintain context across multiple turns. The Generated Answer actually provides more detailed implementation guidance with code examples, but the core concept matches the Correct Answer - that the API is stateless and requires sending the complete message history each time. The Generated Answer demonstrates this by showing how to append each new message to the conversation history array and include that full history in subsequent API calls. There are no contradictions between the answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 80/100 questions. Current Accuracy: 0.6750\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  81%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 81/100 [07:08<01:56,  6.14s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the core message of the Correct Answer. Both answers emphasize that using XML tags to provide a specific role context (like General Counsel) helps Claude catch critical legal issues and risks in contract analysis that might otherwise be missed. While the Generated Answer provides more detail and additional benefits (like improved focus and parseability), it doesn't contradict the Correct Answer and includes the key point about helping to identify critical legal issues that could save the company from significant risks. The essence of both answers is the same - role prompting with XML tags improves Claude's ability to analyze legal contracts by providing important context that leads to better identification of crucial legal issues.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  82%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f | 82/100 [07:12<01:39,  5.55s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the core distinction between how the two models handle missing information in tool calls, though it uses slightly different wording. Both answers convey that Opus is more likely to ask for clarification/missing information, while Sonnet is more likely to make inferences/assumptions about missing parameters. While the Generated Answer includes some additional context about capabilities that isn't in the Correct Answer, this doesn't contradict or detract from the key point about how they handle missing information differently. The substance of the distinction is preserved.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  83%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e | 83/100 [07:20<01:47,  6.30s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful detail. Both answers emphasize:\n",
      "\n",
      "1. Implementing retry logic for error handling\n",
      "2. Conducting thorough staging/testing\n",
      "3. Load testing\n",
      "4. Error handling and logging setup\n",
      "5. Gradual rollout process\n",
      "6. Documentation and training\n",
      "7. Monitoring and alerting\n",
      "\n",
      "The Generated Answer expands on these points with more specific implementation details, but the core recommendations align perfectly with the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  84%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d | 84/100 [07:26<01:40,  6.26s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer partially aligns with the correct answer but goes well beyond it. The correct answer specifies three key metrics: accuracy, cost, and speed. The generated answer includes accuracy and speed, but does not explicitly mention cost. While the generated answer includes additional metrics like F1 score, consistency, structure, and bias/fairness considerations, which may be valuable, it misses one of the three core metrics (cost) specified in the correct answer. Though the generated answer does mention \"overall cost\" briefly at the end, it's not emphasized as one of the main evaluation criteria as indicated in the correct answer. Since we're missing one of the three critical components from the correct answer, this should be marked as incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  85%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 85/100 [07:31<01:24,  5.63s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer and Correct Answer are essentially referring to the same two methods, though using slightly different wording. Both mention the interactive tutorial and a Google/Claude for Sheets tutorial/workbench. While the specific terminology differs slightly, the core substance of both answers points to the same two recommended learning methods - an interactive tutorial and a spreadsheet-based tutorial. The minor differences in phrasing don't change the fundamental meaning or recommendations being conveyed.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  86%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 86/100 [07:37<01:23,  5.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the key substantive differences outlined in the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Pretrained LLMs are initially trained on unlabeled text data\n",
      "2. These base models are not inherently good at following instructions/answering questions\n",
      "3. Claude has undergone additional training/fine-tuning (including RLHF) to make it more capable at various tasks\n",
      "\n",
      "While the Generated Answer includes additional details about interpretability and adaptability that aren't mentioned in the Correct Answer, these additions don't contradict the core message. The Generated Answer maintains the essential contrast between basic pretrained models and Claude's enhanced capabilities through additional training.\n",
      "\n",
      "The substance and main points align between both answers, even though they are worded differently.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  87%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b | 87/100 [07:45<01:23,  6.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides a more detailed expansion of the key points mentioned in the Correct Answer. It covers all the main advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. Cost and resource efficiency (points 1 and 2)\n",
      "2. Speed and time efficiency (point 4)\n",
      "3. Less data requirements (point 5)\n",
      "4. Flexibility and rapid iteration (point 6)\n",
      "5. Preservation of general knowledge (point 9)\n",
      "6. Transparency (point 10)\n",
      "\n",
      "The Generated Answer not only includes all the core concepts from the Correct Answer but also provides additional relevant details and examples. There are no contradictions between the two answers, and the Generated Answer doesn't miss any critical information from the Correct Answer. While the Generated Answer is more verbose and detailed, the substance and main points align perfectly with the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  88%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a | 88/100 [07:49<01:08,  5.75s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that you need to run the command `gcloud auth application-default login` to authenticate with GCP before accessing Claude models on Vertex AI. The Generated Answer simply provides slightly more context by explaining that this authenticates your local environment, but the fundamental instruction and key information is identical. There are no contradictions or missing critical pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  89%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589 | 89/100 [07:54<00:59,  5.43s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the core information about the Prompt Generator tool being introduced on May 10th, 2024, and its main purpose of helping users create tailored prompts for specific tasks. While the Correct Answer provides additional context about the Claude iOS app and Claude Team plan, these are supplementary details that don't affect the central claim about the Prompt Generator's capabilities. The Generated Answer accurately conveys the key functionality of the tool - helping users create high-quality, customized prompts - which aligns with the essential information in the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes the critical information about what was introduced and its main purpose.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  90%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 90/100 [07:57<00:49,  4.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys exactly the same information as the Correct Answer - that both Claude 3.5 Sonnet and the Artifacts feature became available on June 20th, 2024. While the wording is slightly different (omitting \"both\" and having a slightly different sentence structure), the core information and meaning are identical. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 90/100 questions. Current Accuracy: 0.7000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  91%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 91/100 [08:02<00:42,  4.71s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information - that to limit Claude's response to a single token, you should use \"max_tokens\": 1 in the request. The Generated Answer uses slightly different wording by referring to it as a \"header\" rather than just a request parameter, but this minor difference doesn't change the fundamental correctness of the technical information being conveyed. Both answers specify the exact same parameter name (\"max_tokens\") and the exact same value (1) to achieve the desired result of limiting the response to a single token.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  92%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f| 92/100 [08:05<00:35,  4.48s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that temperature controls randomness in the model's output generation. The Generated Answer simply provides more detail and elaboration about what higher and lower temperatures do specifically, but the fundamental meaning matches the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information from the Correct Answer while expanding on it.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  93%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e| 93/100 [08:10<00:31,  4.54s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses one of the key ways to specify API parameters mentioned in the Correct Answer. While it correctly identifies that parameters can be added as additional arguments after the prompt and model (like max_tokens), it completely omits the second way mentioned in the Correct Answer - the ability to pass in an API key for a specific cell using \"api_key\". Instead, the Generated Answer discusses how to make a simple prompt call, which wasn't one of the two parameter specification methods described in the Correct Answer. Since it's missing this critical piece of information about API key specification, the answer cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  94%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d| 94/100 [08:14<00:26,  4.35s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key points:\n",
      "1. Prefilling with { makes Claude skip the preamble/explanation\n",
      "2. It causes Claude to output directly as JSON\n",
      "3. The result is more concise and easier for programs to parse\n",
      "\n",
      "The Generated Answer captures all the essential information from the Correct Answer, just using slightly different wording. There are no contradictions or missing critical pieces of information. The substance and meaning are equivalent.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  95%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 95/100 [08:19<00:22,  4.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is partially correct but contains extra information that is not verified by the correct answer. The first two points about the multimodal cookbook and API reference documentation match the correct answer's substance. However, the third point about the developer community is not mentioned in the correct answer and cannot be verified as accurate. Since this additional information goes beyond what's confirmed in the correct answer but doesn't contradict the core accurate information, and since all the key elements from the correct answer (cookbook and API documentation) are present in the generated answer, this can still be considered substantially correct.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  96%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 96/100 [08:26<00:20,  5.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect for several reasons:\n",
      "\n",
      "1. While it correctly shows that you can specify the API key either through environment variables or directly when creating the client, the TypeScript example is completely wrong as it uses OpenAI's SDK syntax instead of Anthropic's.\n",
      "\n",
      "2. The core substance of how to specify the API key is technically present, but showing incorrect SDK usage could be very misleading for developers.\n",
      "\n",
      "3. The correct answer provides a simpler, more general explanation that captures the key points without implementation details, while the generated answer includes specific code examples but gets the TypeScript implementation wrong.\n",
      "\n",
      "Even though the Python portion is accurate and the general concept of specifying API keys is present, providing incorrect SDK information makes this answer problematic enough to be considered incorrect overall.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  97%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b| 97/100 [08:30<00:14,  4.97s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the same two key benefits mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers mention identifying edge cases where prompts might fail/falter\n",
      "2. Both answers discuss ensuring consistent performance across different inputs/test cases\n",
      "\n",
      "The Generated Answer breaks these points down in a slightly different format but conveys the same core information. The substance and meaning are equivalent, with both answers focusing on how the tool helps identify problems and ensure reliability across different scenarios. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  98%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a| 98/100 [08:37<00:10,  5.47s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct in substance. Both answers emphasize the key distinction that the pretrained model lacks inherent capabilities for following instructions and being helpful, while the final Claude model available through the API has been enhanced through fine-tuning and RLHF to be more capable and aligned with human needs. While the Generated Answer goes into more detail about specific aspects like prompt engineering, cost, and latency, these additional points don't contradict the core message of the Correct Answer. The fundamental transformation from a basic language model to a helpful AI assistant through fine-tuning and RLHF is captured in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  99%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589| 99/100 [08:39<00:04,  4.60s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is exactly identical to the Correct Answer, stating that Anthropic's IPv6 address range is 2607:6bc0::/48. There are no differences in wording or substance, and all critical information is present.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [08:45<00:00,  5.25s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It describes the same two methods for specifying the API key as mentioned in the Correct Answer:\n",
      "\n",
      "1. Passing the API key directly when initializing the Anthropic client\n",
      "2. Setting it as an environment variable named ANTHROPIC_API_KEY\n",
      "\n",
      "The Generated Answer even provides helpful code examples to illustrate both methods, though these weren't required to match the Correct Answer. The substance and key information is identical between both answers, just expressed in slightly different words.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 100/100 questions. Current Accuracy: 0.7100\n",
      "Detailed results saved to evaluation/csvs/evaluation_results_one.csv\n",
      "Average Precision: 0.4283\n",
      "Average Recall: 0.6592\n",
      "Average MRR: 0.7367\n",
      "Average F1: 0.5193\n",
      "End-to-End Accuracy: 0.7100\n",
      "Evaluation complete. Results saved to evaluation_results_one.json, evaluation_results_one.csv\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "import pandas as pd\n",
    "\n",
    "avg_precision, avg_recall, avg_mrr, f1, precisions, recalls, mrrs = evaluate_retrieval(retrieve_base, eval_data, db)\n",
    "e2e_accuracy, e2e_results = evaluate_end_to_end(answer_query_base, db, eval_data)\n",
    "\n",
    "# Create a DataFrame\n",
    "df = pd.DataFrame({\n",
    "    'question': [item['question'] for item in eval_data],\n",
    "    'retrieval_precision': precisions,\n",
    "    'retrieval_recall': recalls,\n",
    "    'retrieval_mrr': mrrs,\n",
    "    'e2e_correct': e2e_results\n",
    "})\n",
    "\n",
    "# Save to CSV\n",
    "df.to_csv('evaluation/csvs/evaluation_results_one.csv', index=False)\n",
    "print(\"Detailed results saved to evaluation/csvs/evaluation_results_one.csv\")\n",
    "\n",
    "# Print the results\n",
    "print(f\"Average Precision: {avg_precision:.4f}\")\n",
    "print(f\"Average Recall: {avg_recall:.4f}\")\n",
    "print(f\"Average MRR: {avg_mrr:.4f}\")\n",
    "print(f\"Average F1: {f1:.4f}\")\n",
    "print(f\"End-to-End Accuracy: {e2e_accuracy:.4f}\")\n",
    "\n",
    "# Save the results to a file\n",
    "with open('evaluation/json_results/evaluation_results_one.json', 'w') as f:\n",
    "    json.dump({\n",
    "        \"name\": \"Basic RAG\",\n",
    "        \"average_precision\": avg_precision,\n",
    "        \"average_recall\": avg_recall,\n",
    "        \"average_f1\": f1,\n",
    "        \"average_mrr\": avg_mrr,\n",
    "        \"end_to_end_accuracy\": e2e_accuracy\n",
    "    }, f, indent=2)\n",
    "\n",
    "print(\"Evaluation complete. Results saved to evaluation_results_one.json, evaluation_results_one.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABXAAAAJOCAYAAAAeWC/9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8pXeV/AAAACXBIWXMAAA9hAAAPYQGoP6dpAAC1MElEQVR4nOzdd3gUVf/+8XuTkEKvCUXaAySQECAQqaEG0Ehv0pSiKDwKFqoUqVLlERREBESKUqWE3pv0Gqr0LhpCL+nJ/P7gl/2yJIEEkuwC79d1cV3szJmZz5Yzu7n37BmTYRiGAAAAAAAAAAA2x87aBQAAAAAAAAAAEkaACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAANs8wDGuX8MrjMX418DwCAAC8ehysXQAAvIz27Nmjdu3aJbo+Xbp0ypIli9zd3dW6dWvVrVv3qfsLCwtTlSpV9PDhQ/n7+2vSpElJqsMwDO3evVvLli3TsWPH9O+//yoiIkLZs2dXqVKlVLduXQUEBMje3j7J961WrVr6+++/4y23s7OTs7Oz3NzcVKFCBX344YcqUKBAkvf7PO7cuaPhw4dr27ZtCg0NlZubm9asWSMHB96+0tLjr3d7e3tt375d2bNnT7R9cHCwatSoodjYWLm5uWnbtm0vdPwVK1Zo48aNGjduXJK3ef/997V37179+uuvqly58gsdPyWEhISoUaNG6tixoz766COLdYcPH9a8efO0d+9ehYSEKF26dHJzc1PFihXVtm1bFSlSJNXrO3LkiL755hvNmTMn1frXwoULNWDAADVp0kSjRo16aturV6/K398/RV4/yfGsc/uTZs2apQoVKqR4HdHR0fLy8pIknTp1Ksnbbd++Xb/88ot+/fXXFK/pcR4eHklu27VrV3Xr1i1V6mjdurUOHjyY7Odh6dKl6tOnjyTpxx9/VO3atVOlvtfRgwcP9Pbbb6tZs2b68ssvrV0OAACvDP4CBoAXkD59evn7+8dbfu/ePZ09e1Y7d+7Uzp079eWXX6pLly6J7mfNmjV6+PChnJyctGXLFgUHB8vNze2px7569aq+/PJLHTlyRJJUpEgR+fr6Kl26dLpy5YrWrVuntWvXaurUqZo8ebLy5MmTrPtWuXJl5ciRw3zbMAyFhYXpr7/+0rx587RixQr9+uuvKlWqVLL2mxwjRozQsmXLlCNHDtWsWVNZs2YlvLWymJgYrV+/Xi1btky0zerVqxUbG5six9u3b5969OihsmXLpsj+rKV///7KnDmzOnToYLF84sSJmjhxouzs7OTt7S1vb289fPhQFy5c0O+//6758+dr4MCBT328U8K7777LyM3HJHZuf1LOnDnToJqkuXbtmj788MNnvnekpNq1a8vFxeWpbZIT9qaVRYsWycnJSREREZo7dy4BbgrKmDGjevbsqb59+6patWoqV66ctUsCAOCVwF/BAPACsmXLprFjxya4LjY2VjNmzNDo0aM1YcIENWrUKNEQddGiRZKkDh066Oeff9aCBQueOmIpODhYbdq0UXBwsPz8/NS3b18VLVrUos3Vq1f1zTffaPPmzfr000+1YMGCZIWfXbp0SXBEU3R0tPr376+lS5dqwIABWrZsWZL3mVyHDx+WJI0bNy5VRrkheTJnzqx79+5pzZo1zwxw06VLp6ioqBc+5vMGwaNHj1ZYWJjy5s37wjW8qNWrV2vr1q368ccflS5dOvPyXbt2acKECcqbN69mzpxpMaLdMAytWLFCffr00eDBg1W6dGkVL1481WokvLX0tHO7rUqpL02So2/fvnrjjTfS/Lgv4sqVK9q3b5/8/Px08+ZN7dixQ1euXFH+/PmtXdoro1GjRpoxY4YGDRqkpUuX8sUrAAApgDlwASCV2NnZ6YMPPlDJkiUV
HR2tP//8M8F2ly9f1v79++Xh4aG2bdvKzs5OCxcuVExMTKL7Hjx4sIKDg1WnTh1NmTIlXngrSW+88YZ+/PFHeXl56fjx41q3bl2K3C8HBwf169dP9vb2OnXqlC5fvpwi+01IXACYO3fuVDsGkq5o0aIqWLCg9u7dq9u3byfY5u+//1ZQUJCqVq2axtVZyps3r4oUKfLM0YGpLTo6Wt99950KFSoUb5RfYGCgJOnTTz+NNx2JyWRSgwYN1KZNG8XGxmr+/PlpVjPwKlu8eLEMw5Cfn58CAgJkGAb9K4WZTCZ16tRJZ86c0ZIlS6xdDgAArwQCXABIZfny5ZP0aD7XhCxatEiGYejtt982z3sZHBysTZs2Jdj+8uXL2rRpk9KnT6+hQ4c+dX5be3t7ffHFF2rZsmWKhqBZsmRRlixZJEk3b960WHfx4kV99dVXqlatmkqWLKlq1aqpf//+Cc6rW6tWLfn6+ur06dNq2rSpSpYsqZo1a6p9+/by8PAwb1O3bl15eHhoz5495m3Pnj2r3r17q2rVqipZsqT8/PzUq1cvnT17Nt5x3n//fXl4eOj06dNq166dvL295efnp1WrVmnPnj3y8PDQ6NGjdfbsWX366acqX768ypYtq/bt2+vo0aOSpP379+v999+Xj4+PqlWrpq+++kq3bt2Kd6wLFy5o4MCBeuutt1SmTBmVKlVKtWvX1pAhQxQcHGzRdvHixfLw8NCMGTN0+PBhderUSW+++abKlCmj1q1bJ/oaCAkJ0ejRo/XWW2+pVKlSqlmzprp3765z587FaxsREaGpU6eqYcOGKl26tMqVK6d27doluu9nCQgIUHR0tDZs2JDg+pUrV0qS6tevn+g+klrTV199ZZ6P9ODBg/Lw8ND7778vSebnbfjw4Zo9e7YqV66s0qVLmwPPuOd8586dFvuMjIzUjBkz1LRpU/n4+KhSpUpq165dgvOsLlu2TO+9954qV66sUqVKqU6dOhoyZIj++eefJD9ea9as0eXLl9W8efN46+L6jslkSnT7hg0bqmHDhnJ3d4+3LiX6wNSpUy1+4u7l5RXvJ+/37t3TuHHj9Pbbb8vb21sVKlRQ586dtX///gRrvn//vr777jvVqVNHpUqVUr169bRw4cJE7+OzXL16VZ9//rl8fX1VtmxZdejQQdu3b7do06VLF3l4eCQaxE2ZMkUeHh6aOHHic9fxLF999ZU8PDz0119/KTAwUM2aNVOZMmVUvnx5devWTWfOnElwu9WrV6tVq1YqW7asKlasqIEDB+ru3bvJOvaECRPMUz4EBwfLw8NDtWrVsmiTnNdLannec96xY8f0ySefqFKlSvLx8dFHH32k06dPJ/v4sbGxWrp0qUwmk9566y01aNBAJpNJixYtUmRkZKLbPXjwQBMnTlT9+vVVpkwZVa1aVV26dFFQUFC8tkk9xyR2jpL+73Hq2bOnedmzznlS8t5/Ht/vJ598Ij8/P/n4+KhBgwb66aefFBoaKunRY+/h4aFq1aolOMo7IiJCb775pnx8fMzbSI/et7Nmzapp06Yxwh8AgBRAgAsAqejhw4c6cOCAJKlYsWLx1sf9MWlnZ6dGjRpJkpo0aSJJmjdvXoL7XL58uSSpZs2aT72QVJxq1app6NChKTqH6M2bN83h5eM/Ud+1a5eaNGmiJUuWKEuWLKpVq5ayZMmiP/74Q02bNtWxY8fi7SsqKkoff/yx7t27p+rVq8tkMqlWrVpq0KCB0qdPL0ny9/dXgwYNzPNNbtq0SU2bNlVgYKCyZcsmf39/Zc+eXcuWLVOzZs20efPmBOvu1q2bzp8/r+rVq8vBwUElS5Y0rzt+/LhatGih48ePq3z58sqVK5d2796tdu3aaeHChWrXrp3u3LmjKlWqKCIiQkuWLFGnTp0s/jDdv3+/mjRpovnz5ytDhgyqVq2afHx8dOPGDc2ZM0etWrXSgwcP4tW1e/dutW3bVufOnVP58uVVsGBBHTx4UP/9
73+1du1ai7anT59WkyZNNH36dEVHR6tGjRrKli2bVq5cqWbNmpkDZ+lR6PDee+9p7NixCgkJUcWKFVWqVCnzvidMmPDM5/pJAQEBkh4FkwlZvXq18uXLp9KlSye4Pjk1+fj4mC9Alj17djVo0CDeBcm2bdum4cOHq2jRoipTpowKFiwoO7uEP948fPhQ77//vkaOHKmrV6+qUqVKKl68uA4cOKCPPvpIv//+u7ntpEmT1KtXLx0/flyenp6qXr26YmJiNGfOHDVv3lwhISFJerziRp8lNJ9q3JQI48eP17Zt2xIMR0qVKqVvv/1WrVu3tlieUn2gdOnSatCggXl9/fr1LW7/+++/at68uSZPnqywsDBVrVpVxYoV07Zt2/T+++/HC2bv3r2rtm3b6ueff1ZERIRq1KghZ2dnDRgwQDNmzEjSY/a4hw8fqnXr1tq+fbvKly8vLy8v7d69Wx9++KHF8xUXkC9dujTB/SxZskQmk8l8fk1NP/74o3r37q3o6GhVrVpVLi4uWrdunVq1aqUrV65YtP3+++/1xRdf6NixYypbtqxKly6twMDAZF1ITXo0z2zcCG8XFxc1aNDAYsT3875eUktyznlbt25V69attXHjRhUoUEBVq1bViRMn1Lp1a127di1Zx925c6euXbumChUqKE+ePMqTJ48qVaqkW7duJforleDgYDVv3lwTJkzQnTt3VLVqVeXPn1+bN2+OFzon5xzzvBI75z3P+8+0adPUvn17bd68WQULFlSVKlV0+/ZtjR8/Xp06dVJkZKRKliyp4sWLKzg4WLt3745Xz8aNG3Xv3j0FBASY37MlydHRUX5+frp48aL5cxAAAHgBBgAg2Xbv3m24u7sbNWvWjLcuJibGuHPnjrFz506jZcuWhru7u9GkSRMjOjo6XtutW7ca7u7uRocOHczLwsPDDV9fX8PDw8O4fPlyvG06d+5suLu7G/PmzUvZO/X/1axZ03B3dzd2796d4PrQ0FBzDW3btjUvv3XrllG+fHmjRIkSxqpVqyy2mTdvnuHu7m74+/sbERER8Y7VpEkT8/KYmJh46y9evGhedv36daNMmTKGh4eHsXjxYovjLFy40PDw8DB8fHyMf//917z8vffeM9zd3Y3q1asbt2/ftjhO3HPp7u5ufPnll0ZkZKRhGIYRERFhNGvWzLzuxx9/tKjB19fXcHd3N44cOWJeXr9+fcPd3T3e/b9+/br5vgQGBpqXL1q0yLz/ESNGmI9tGIYxcuRIw93d3WjatKl5WUxMjNG4cWPD3d3dGDNmjMVj9dtvvxnu7u5G/fr1zcv69OljuLu7G927dzcePnxoXn7hwgVzPTt27DCeJe4xatWqlWEYhvHWW28Znp6exq1btyzanT9/3nB3dzfGjh1rXLlyxXB3dzeqVq1q0Sa5NT157CeXu7u7G1OnTrV4jAzj/57zx/c1bNgww93d3WjXrp1x79498/LDhw8bpUuXNjw9PY0bN24YERERRunSpY3y5csbwcHB5nZRUVFG165dDXd3d+OHH3545uMWFhZmeHt7G5UqVUpwfXBwsFG1alXz/ahUqZLRvXt3Y+7cucbZs2cT3W9K9wHDMMw1REVFWewvbrtRo0ZZvD6DgoIMX19fw8vLyzhz5ox5+dChQw13d3fjv//9rxEeHm5evmDBAvMx+vTp85RH7ZG414+7u7tRr149IyQkxLzuzz//NLy8vIySJUsaV65cMQzj0XNTqVIlw93dPd558/Dhw4a7u7vRvn37Zx73aef2Z4l7bZcoUcJYuXKleXl4eLjRqlUr8+P4eF0eHh5G+fLljVOnTpmXX7582ahRo4b5/idVYn3ueV4vzxJXW9zjn1TJPec9ePDAqFKliuHh4WEsX77cvPzhw4fGBx98YN5XYu9XT/riiy/inYdXrFgR7/3scV26dDGfsx5//9qwYYNRvHhxo3z58ub7kdRzjGEkfI568nHq0aOHedmzznnJff85cuSI
Ubx4caNs2bLG/v37zctDQ0PNtU2fPt0wDMOYOXOm4e7ubvTq1SterZ06dTLc3d2Nffv2xVs3d+5cw93d3fjuu+8SemgBAEAyMAIXAF7A33//LQ8PD4t/JUqUUPny5dWhQwcdOnRI1atX19SpUxOc6mDx4sWSpKZNm5qXOTk5qX79+jIMI8FRuHE/307sSuMjRoxQz5494/0bMWJEsu7b5MmTLbbv0aOHPvjgA/n5+Wnz5s3KkSOHhg0bZm6/cOFC3blzR23atDGP0ozTsmVL1axZU1euXNH69evjHevdd9+Vo6OjJCU6ejLO/PnzFRoaqiZNmsQbTde8eXM1adJEDx8+1Ny5c+Nt26BBA2XNmjXB45hMJg0YMMB8kSlHR0e99dZbkqQ8efKoc+fO5ra5cuWSj4+PJJnnAH748KFKliypZs2axbv/uXLlMo+Gu3r1ary6cuTIoV69ellc4CpuBN7jP7s+dOiQTpw4oWLFiqlnz54W96Ft27YqX768MmbMqFu3bik4OFjLli1Trly5NGzYMIuRUYUKFdJXX30lSfrll1/i1fMsiU2jsGrVKklSvXr1EtwuNWqyt7dXmzZtzLcTe/1ERkZq0aJFcnBw0JgxY5QpUybzulKlSqlt27Zyd3fX6dOndf/+fYWFhcnFxUXZsmUzt3NwcFCPHj00ePBg1axZ85m1BQUFKSIiItGLj7m6umru3LmqXr26pEcj21esWKFBgwbpnXfeUc2aNTVu3Lh4o+ZSqw886fDhw9q7d6+KFy8e7/VZunRpffLJJ4qKitKsWbMkPXqMFy9erHTp0umbb76Rk5OTuX2LFi2S9Jgl5OuvvzaPvpckPz8/tWrVyvycSo+em7hfMTw5CjduFPTj59lnSejc/uS/Tz75JMFta9WqpXfeecd828nJyXzRv8f78/z582UYhj799FOLKTLy58+vvn37JrnWZ3mR18uz+Pv7P/Ux8vX1TXC7pJ7zNmzYoJCQENWuXdtiWpb06dNr1KhRFts/y927d7VhwwZlypRJdevWNS+vU6eOsmbNqn379sWbTiJuOqOsWbNq+PDh5vepuPv+zjvvqECBArp48WKyzjEvIqFz3vO8/8yfP1+xsbHq0qWLypUrZ17u4uKir776SgUKFDD/0qBhw4ZydHTU+vXrLaZJCAkJ0Y4dO1SoUKEEn+u4c9/j0x8BAIDnwyVBAeAFpE+f3vzTaMMw9O+//5rnhaxXr54+++wzFSpUKMFt79y5o40bNypz5swWf0xKj/6onjNnjhYvXqzPP//c4o/GuJ9ZG4nMKbdhw4YE55vNly+f+vXrl+T79uS8fPb29sqQIYMKFiwoPz8/vf/++8qVK5d5fdwfaBUqVEhwf1WrVtXmzZu1Z8+eeAFfYgFXQvbt2ydJ5nD1Se+8844WL16svXv3xlv3tOMUKFAg3pQUcbeLFSsWL4CP++M8IiJCkpQhQwaNHDky3n6Dg4P1119/6eTJk5KU4DyLXl5e8a7S7erqat5/bGys7OzszPepRo0aCc6bOnv2bPP/V65cqZiYGHl7e1sEpXGqVKkiOzs7HThwQDExMU+dS/lJAQEBmjRpktauXasWLVqYl69evVpFihRR8eLFEwyq9+/fn+I1FShQIMF9Peno0aMKDQ1V6dKlE/zyo1evXha3//Of/+j8+fNq1qyZGjRooGrVqsnDw0OFChVKtE8/Ke7Llrh5sBOSL18+TZkyRZcuXdKmTZu0e/duHThwQPfv39e1a9c0efJkBQYGavbs2cqfP7+k1OsDT4rr02+++WaCYW/VqlU1atQo83HiHmMfH58Ep3epXbt2sn+q7+rqmuA5pVatWpo9e7bFfWzevLmmT5+uwMBAdevWTdKj/rZq1SplzJgx3nn2aR4/tyfm8SlYHpfQ9CFx/TksLMy8LO55rFatWrz2NWrUkIODg6Kjo5Ncc2Je5PXyLLVr137qhQIT65tJPec97THKlSuXSpcunehczE9asWKF
IiMj1aRJEzk7O5uXOzo6qkGDBpo9e7bmzZunAQMGmNfF9YHKlStbbBPnf//7n/n/Bw4cSNY55nkldM57nvefuOf7ybmSpUfPz+NftmbNmlX+/v5avXq11q1bp8aNG0t6NE94TExMolOTvPHGG5IeTcUCAABeDAEuALyAbNmyaezYsRbLDhw4oI8//lgrV66Uu7u7unTpkuC2y5cvV2RkpJydndWpU6d46+3s7HTr1i2tXbvWYk5KV1dXnT59Wjdu3Ehwv09eBObSpUvJCi7izJo1K9EwNiFxYVXXrl2f2i6hP+TiLoiWFNevX5eUeCgW9wdjQnOUPu04Ca2LC0mftu5JBw8e1IIFC3T8+HFdvnxZ4eHhFu0TCt4zZ84cb9nj4UZcmBF3n/LkyZPo/YgTNzfkpk2b4l2U6nFhYWG6e/dukuZTjuPu7q6iRYtq9+7dunv3rrJkyaJTp07pzJkz5uAsrWqKG036LMl57KRH89J269ZNp06d0qlTpzR27FjlypVLtWrV0rvvvptoePe4uIuUZcyY8ZltCxYsqI4dO6pjx46KjY3V8ePHtWbNGs2ZM0f//POPunfvbp5vNrX6wJPinq/Zs2dbfDnwpLg+HVdXYr8OiKsrORK7j3EXZXz8wkxFihSRj4+PDh06pP3798vX11ebN2/WnTt31LJlywQDuMQkdG5PqoQe47gvIx6f5/hpj5ejo6NcXV0t5nidP3++OdB8XKtWrRId6fr4cZL6elm3bl2C88HWrVs33ntJ3759n+t5Teo5LymvqaQGuHGjtfft22e+GGKcuDndAwMD1aNHD3MonZzzRnLPMc/raee85Lz/xNX7+Dz2T9O8eXOtXr1aS5cuNQe4cXP4x91+UtwXnU9e7BQAACQfAS4ApLBy5cpp9OjR+vTTTzVu3Djlz58/wZ+Ux/0xee/evaeOfJo3b55FgFuiRAlt375dBw8etBj9aG0xMTGSHl1c7WmBVdGiReMte9ZPuR+X2MjjOHEByeOjlpNynCdHgz2PIUOGaM6cObKzs1Px4sUVEBCgIkWKqHTp0tqxY4cmT56c4HaJhcFPSs5ovLjHoWjRoipRokSSt0uqgIAATZgwQRs2bFCzZs2eOX1CatWU1Mcu7vWZVB4eHlq1apV27NihzZs3a9euXbp48aLmz5+vBQsWqF+/fs+80FTc85XQxclCQ0N19uxZ2dvby8vLy2KdnZ2dvL295e3trYCAALVs2VJHjhzRhQsXVLhw4VTrA4ntx9vb+6mjjuOeg2c9F8/Txx6fhiEp+2zWrJkOHTqkZcuWydfXV4GBgZKSN33Ci0rqa/JZ7Z4cgX7o0CHzRSwfV7ly5acGuMl9vZw6dSrB4xQsWPC5vgxMSGo9Rok5efKkjh8/Lkk6f/68zp8/n2C7e/fuaeXKleaL4iXnvJHcc8zz7iuxxyS57z/JHd1duXJl5c2bV3v27FFwcLBu3bql06dPy8/Pz/yFypPiXlsp+dgAAPC6IsAFgFRQu3ZtNW/eXH/88YcGDx6sN9980/zzUEn666+/9Ndff8nNzU1btmxJMFQJCQlR9erVtX//fp09e9YcfDZq1EhTp07Vhg0b1Ldv3wRHMlmDq6urLl68qHbt2qly5cqpepwLFy7o77//VrFixeKtj7vKe44cOVKthoTs3btXc+bMUZ48eTRt2rR4QXViVzhPjrjX0OOjDh+3a9cu3bhxQ+XLlzdPb1GiRInnHkn4NHEB7po1a9SsWTOtXr1aXl5eKly4cKLbpHZNTxN37MQeuwsXLujAgQPy9vY2jw52cHBQ9erVzXPUXrt2TbNmzdKvv/6qcePGqVWrVgmGpHHiRmLevn073rpTp06pVatWKlKkiDn8TkjcFeCPHTumu3fvSkq7PhD3mFWpUkVffvnlM9vHjZJ8fNTo4+JGUyZHYtvETRPz5OjBd955RyNGjNCGDRvUu3dvbd++Xf/5z39UpkyZZB87tcWdM69du6YiRYpYrIuN
jY33K4tRo0Zp1KhRz3Wc5LxeunXr9tSR9Gkp7jWV0LRAUtJfU3FfmH788cfq0aNHgm1++eUXjRkzRvPmzTMHuM86bxw9elTnzp1T2bJlk32OiQtiEwo379+/n6T7Fed53n9y5cqlv//+W//++2+C5+158+bJ1dXVPMWCnZ2dmjRpoh9//FEbNmwwP/bNmjVLtK64kc228jkFAICXGRcxA4BU0qdPH+XKlUv37t2LNzdd3B+T77zzTqIj4nLlyiU/Pz9Jsri4TLFixVS3bl3du3dP/fv3T3B03+PiRh2ltjfffFOStHXr1gTXjxkzRo0bN9aCBQtS5Dhr165NcP3q1aslSeXLl3+h4yRXUFCQpEc/M37yj+eYmBjt3r1b0rNHwz1N2bJlJUnbtm1LcP24cePUs2dP3bp1y/w47du3z2LezThHjx5V3bp11a1bt+eqqUiRInJ3d9euXbu0e/duXbp0yeLCTQl5npqSOlLvWby8vOTo6Khjx44l+HPeRYsWqX///tq1a5d27dqlgIAAff311xZt8ubNq6+++kqZM2dWaGio7ty589Rjxo1aTSjQcXd3V4YMGXTu3Dnt2rUr0X1ERETo2rVrSpcunXl/adUH4o7z559/JnieWb9+vQICAjR48GBJj8LmzJkz6/jx4wmGuFu2bEl2DRcuXEhwPuW4QOrJ+5ghQwa9/fbbunnzpsaPH6+IiIg0HX2bHHFfdCUUru3ZsyfBPvI0ifUVWz1nJkWlSpUkKcGLX96/f18HDhx45j4iIyPNI4ofvxDakxo0aCB7e3sdPXrU/L4Zd87dtWtXgvOXT58+XX369NHp06eTdY6R/m9+4ITaxr2fJNXzvP/E3beE3rPPnTunQYMG6fvvv7dY3rRpU5lMJq1fv14bNmxQlixZzBdIS0hcyPu0L/YAAEDSEOACQCrJnDmz+vTpI0latWqV+aJgSf1jUpJ5XrnAwECLP+aHDx+uAgUKaN26dXrvvfd09OjReNteuXJFAwcONI82Ss4cp8+jZcuWSp8+vX777TetXLnSYt2mTZs0a9YsnTx5Ut7e3i90nHfffVfp06fXkiVLzFeXj7No0SIFBgYqffr0iV5UJbVky5ZN0qM/9B9/rsLCwvT111+br6wed9Gz51GxYkUVKVJEf/31lyZOnGjxx/icOXN0+PBhubu7q0SJEsqfP7/8/f3177//qn///nrw4IG57c2bN9W/f39dunRJefLkee6QNCAgQFFRURoyZIhMJtNTp0+Q9Fw1xf2E/vG2zyNDhgxq0qSJoqKi1K9fP4vn6OjRo/rtt9/k7Oyst956Sx4eHrp8+bICAwPjBURbtmzRvXv3lDdvXouL+CWkVKlScnBw0NGjR+ONssuQIYM++OADSdLnn3+uFStWxAtJb926ZQ7kmzVrZp77MjX6QNzj/PjIvwoVKqhEiRI6fvy4xowZYxFgXbp0Sd98843Onz9vDmfSpUunNm3aKCYmRr1797Z4ztauXZvgz/KfxTAMffXVV/H2tWjRImXKlCnBaWTiRgT+/vvvsre3V6NGjZJ93LTQtm1bpUuXTlOmTLGYx/X69esaOnRosvcX9xyGhoZavJZs9ZyZFLVq1VKBAgW0c+dOzZgxw7w8MjJSAwYMUGho6DP3sWnTJt2+fVvu7u5PnXvb1dVVVapUkfR/X5rGXbTz5s2bGjZsmMW0A5s3b9aaNWuUI0cOValSJVnnGOn/Lig4b948i761Zs2aBAPrp3me95+2bdvKZDJp0qRJOnHihHn5w4cPza+/hg0bWhznjTfeUMWKFbVnzx6dPXtW9erVe+qvEA4dOiTp/8JiAADw/JhCAQBSUYMGDbRo0SLt2rVLQ4YM0fLly7VhwwbduXNHhQoVeuaFkPz9/ZU5c2bdu3dPK1asMIcVmTNn1sKFC/X1119r3bp1at68ufLnz6/ChQvLyclJV65cMV912snJSW3atNFnn32WqvfVzc1No0ePVvfu3dW9e3f9
+OOP+s9//qN//vlHx44dkyT169fvhec+ffw4X331lWbMmKHChQvrwoULOnnypFxcXDRmzJhEL9iTWgICAjRx4kSdPn1atWvXVpkyZRQZGalDhw7p/v37KlasmM6cOZPoxeeSws7OTt999506dOigCRMmaMWKFXJ3d9fly5f1119/KUOGDBo3bpy5/bBhw3Tp0iWtXLlSO3bskLe3t0wmk/bv36/Q0FCVLVs2ST+Nf9p9/v7773X+/HmVLVs2SRfvSW5Nb7zxhuzt7XX69Gm1b99eHh4e6tev33PV27t3bx07dkxbtmxRrVq15Ovrq7t372r//v2KiYnR6NGjzfehV69eGjlypNq2basyZcrI1dVVwcHBCgoKkr29vQYOHPjM4DtDhgwqX768du7cqRMnTsT78uLTTz/VjRs3NHfuXPXo0UPDhw+Xl5eXMmbMqOvXr+vIkSOKiopS9erV1b9/f/N2qdEHChYsqNOnT6tdu3YqVKiQRo8erfTp02vcuHFq3769fv31V61cuVJeXl4KDw/X/v37FRUVpbfeekvvvfeeeT+ffPKJDh48qL1796p27dp68803dePGDR08eNB8gbHkKFy4sM6cOaM6derI19dXISEhOnTokNKlS6cxY8YkOE2Er6+v+fGoUaOGxfQ1SXX79m317Nnzme3efPNNtWzZMtn7lx7NBd2vXz8NHTpU7dq105tvvqkMGTJo9+7dypkzp7Jnz27+CXpSZM+e3fx+0apVKxUoUEBjx45N1XPmyJEjzRf8elpdz9tnnZycNHbsWHXq1EkjR47U0qVLVaBAAR05ckS3bt2Sp6enRfiYkLhfvDzrCyZJatKkibZt26aVK1fqq6++UsaMGTV8+HC1bdtWCxYs0Pbt2+Xt7a3r16/r0KFDcnBw0HfffWd+DJJzjnn33Xf1+++/69ChQ6pbt65KlSqlK1eu6MSJE2rSpEm8sP1pnuf9x8fHR59//rnGjx+vFi1ayNfXVxkyZFBQUJBu3rypKlWqqGPHjvGO1bx5c/Mo4meNbo+76F7cNAwAAOD5EeACQCobNGiQGjZsqIsXL2rq1KnmAONZo2+lR3+8BgQEaP78+Zo3b57FaLOsWbNqwoQJOnLkiJYtW6b9+/fr6NGjevDggbJly6YqVaqocuXKatq0aaqPvo1Tt25dLVq0SNOmTdPu3bu1ZcsW5ciRQzVr1lTHjh1VoUKFFDvOH3/8oalTp2rPnj06d+6ccuXKpebNm+uDDz6IN59kWsiYMaMWLFig77//Xrt379bWrVuVIUMGeXp6qlWrVqpYsaIqV66s7du3KyoqSunSpXuu4xQvXlxLlizR5MmTtW3bNm3atEmZMmVS/fr11bVrV4ufqubIkUMLFizQzJkztXr1au3bt0+Ojo4qXLiwGjVqpJYtW8rZ2fm573PhwoVVokQJ/fXXX0kKR56nphw5cmj48OGaOHGiDhw4oGvXrj13GJQxY0b9/vvvmjFjhlauXKktW7bIwcFB5cuXV6dOncxTlkhShw4d5Orqqrlz5+rkyZM6evSosmXLpnfeeUedOnWKd+GxxLRo0UI7d+7U2rVr4wW4JpNJgwcPVsOGDbVkyRLt379fhw8fVlhYmLJmzaqqVauqUaNGevvtt+PtN6X7wPDhwzV48GCdOXNG169f15UrV+Th4aHChQtr6dKlmjZtmjZu3KgdO3YoQ4YMKlmypN599101bNjQ4kJSTk5O+uWXXzRjxgwtWbJEW7dulaurq3r27KmSJUuqQ4cOyaord+7c+vHHHzVq1Cht375ddnZ2qlmzprp16/bU56Bs2bK6cOHCc0+fEBoamqQRww4ODs8d4EpSmzZtVKhQIU2ZMkXHjh2TyWRS9erV9dVXX6l169bJ2pednZ3Gjh2r0aNH68SJE7py5Yru3r2rLFmypNo5c8OGDc9sky9fvufus5JUunRpLViwQD/++KP5YoJeXl767rvvtGDBgqcGuMHBwdqxY4ekpAW4
j39pGhgYqLZt2yp37txatGiRpkyZog0bNmjTpk1ycXFRzZo19cknn6hUqVLm7ZNzjsmbN6/mzZun77//Xnv27NHWrVtVrFgxjRs3Th4eHskKcJ/3/ee///2vPD09NXPmTB09elRhYWF644039N5776lTp04JTvFUrlw5SY+mgXnaL2oePHig3bt3q2jRouZtAADA8zMZLzIZHwAAABIVGxurBg0a6Pbt29qyZctTf26MlBEZGalq1arJ3t5eW7Zsee4vSwDEN2PGDI0cOVIDBgzQ+++/n2i73377TcOGDdO4ceOeOT86AAB4NubABQAASCV2dnbq2rWrbt68qVWrVlm7nFdWbGysIiMjFR0drbFjx+r27dtq1aoV4S2QAsLDwyVJp0+f1tSpU5UxY8anzpkcGxurOXPmyN3dPcFfEAAAgORjCgUAAIBUFBAQoKVLl+qHH35QQECA+WJTSDnR0dHy8fGRyWRSVFSU3Nzckj1dA4CETZo0STNmzDBfBK13797KmDFjou0XLlyoixcvas6cOQlOwwAAAJLPZt5RIyMjVb9+fe3ZsyfRNidOnFCLFi1UunRpNWvWzHxRHAAAAFs2YsQIRUREaPr06dYu5ZXk6Oio4sWLy2QyycfHR9OmTVOmTJmsXRbwSihRooTs7e2VPXt2ffrpp/rggw8SbfvgwQN9//336tKli8qUKZN2RQIA8IqziTlwIyIi1KNHD61fv16zZs1K8CI3oaGhqlu3rho0aKDmzZtr7ty5Wr16tdavX6/06dNboWoAAAAAAAAASF1WH4F79uxZvfvuu7p8+fJT261atUpOTk7q3bu3ihQpov79+ytDhgxas2ZNGlUKAAAAAAAAAGnL6gHu3r17VaFCBc2fP/+p7Q4fPqxy5crJZDJJkkwmk8qWLaugoKA0qBIAAAAAAAAA0p7VL2LWpk2bJLULCQlR0aJFLZblyJFDZ86cSfKxYmNjFR0dLTs7O3MQDAAAAAAAANtmGIZiY2Pl4ODARRLx2rF6gJtUYWFhcnR0tFjm6OioyMjIJO8jOjpaR48eTenSAAAAAAAAkAa8vb3j5UPAq+6lCXCdnJzihbWRkZFydnZO8j7ivqHx9PSUvb19itaHV0dMTIxOnDjB6wRIBfQvIPXQv4DUQ/8CUhd9DEkR9zph9C1eRy9NgOvm5qYbN25YLLtx44ZcXV2TvI+4aRMcHR15U0CiYmJiJPE6AVID/QtIPfQvIPXQv4DURR9DUsS9TpgSE6+jl+Zri9KlS+vQoUMyDEPSo7lPDh48qNKlS1u5MgAAAAAAAABIHTYd4IaEhCg8PFyS9Pbbb+vevXsaPny4zp49q+HDhyssLEwBAQFWrhIAAAAAAAAAUodNB7h+fn5atWqVJCljxoz6+eefdeDAATVt2lSHDx/WlClTlD59eitXCQAAAAAAAACpw6bmwD116tRTb5cqVUpLlixJy5IAAAAAAAAAmxETE6OoqChrl4EXlC5duiTP+21TAS4AAAAAAACA+AzD0L///qs7d+5YuxSkkKxZsyp37tzPvDgfAS4AAAAAAABg4+LCW1dXV6VPn/6ZoR9sl2EYCg0N1fXr1yVJefLkeWp7AlwAAAAAAADAhsXExJjD2xw5cli7HKQAFxcXSdL169fl6ur61OkUbPoiZgAAAAAAAMDrLm7O2/Tp01u5EqSkuOfzWXMaE+ACAAAAAAAALwGmTXi1JPX5JMAFAAAAAAAAABtFgAsAAAAAAADgmTw8POTh4aFr167FWzd37lx5eHhowoQJSdrXzZs3tXr1aot979mzJ8VqrVWrlhYvXpxi+7MmAlwAAAAAAAAASZIuXTpt2rQp3vINGzYka4qHsWPHauvWrSlZ2iuLABcAAAAAAABAkvj6+sYLcB88eKBDhw7J09MzyfsxDCOlS3tlEeACAAAAAAAASBJ/f3/t3btXDx48MC/bsmWLfH19lSFDBou28+bNU61ateTj46P3339fp06dkiRN
mDBBS5Ys0ZIlS1SrVi1z+/3796tBgwby9vbWe++9p7///tu87ty5c/rwww9VtmxZVa1aVRMnTlRsbKzFsWrUqKGyZctq0qRJFnWcPHlSrVq1UunSpc3bvkwIcAEAAAAAAAAkibu7u9zc3LRt2zbzsvXr16t27doW7TZt2qSJEyfq66+/1pIlS1SuXDm1a9dOd+/e1QcffKCAgAAFBATojz/+MG+zcOFCDRgwQH/88Yfu3r2rsWPHSpJu3bqlNm3ayNXVVQsXLtSgQYP022+/adasWZKkP//8U8OHD9cXX3yh+fPn6+jRoxbhb+/evVWiRAmtWLFCw4cP17Rp016q6RsIcAEAAAAAAAAkmb+/v3kahcjISO3YsUP+/v4WbaZNm6bOnTurZs2aKlSokL744gvly5dPy5YtU4YMGeTs7CxnZ2dlz57dvM1///tfVahQQR4eHmrevLlOnjwpSVqxYoVcXFw0bNgwFSlSRLVr19bnn3+uadOmSXoU/DZo0ECNGzdWsWLFNGLECDk5OZn3+/fffytr1qzKly+fqlWrpl9//TVZ0z1YGwEuAAAAAAAAgCTz9/fXn3/+qejoaO3atUvu7u7KkSOHRZtz587p22+/lY+Pj/nfyZMndfHixUT3W6BAAfP/M2XKpIiICPO+vLy85ODgYF7v4+OjkJAQ3bt3T+fOnVOJEiXM67Jly6b8+fObb3fu3Fk//fST/Pz81K9fP0VGRipXrlwv+jCkGYdnNwEAAAAAAACAR8qVKydJOnDggDZs2KA6derEaxMTE6N+/fqpUqVKFsszZsyY6H7t7BIea/r4aNo4cfPfxsTESIp/UbR06dKZ///xxx8rICBAGzZs0KZNm9S+fXsNGzZMLVq0SLQWW8IIXAAAAAAAAABJ5uDgoOrVq2vTpk3avHlzvPlvJalw4cL6999/VbBgQfO/yZMnKygoSJJkMpmSfLzChQvr+PHjioqKMi87dOiQsmfPrqxZs6pYsWI6evSoed2DBw906dIlSVJERIS++eYbOTo6qmPHjpo9e7beffddrV279jnvfdojwAUAAAAAAACQLP7+/lq4cKFy5MhhMV1BnI4dO2rmzJlaunSpLl++rG+//VarV69WkSJFJEkuLi76+++/FRwc/MxjNWjQQJGRkRo4cKDOnTunDRs2aMKECWrdurVMJpPee+89rV69WgsWLNC5c+c0cOBAhYeHS3o0evfgwYMaNmyYzp8/r6NHj2r//v0v1Ry4TKEAAAAAAAAAIFn8/PwUHR2d4OhbSXrnnXd048YN/fDDD7px44aKFi2qn376SYUKFZIkNWrUSJ9++qkaNmyo3bt3P/VYGTNm1LRp0zR8+HA1btxY2bNnV/v27dW5c2dJkq+vr0aOHKnx48fr1q1batasmcWcuOPGjdPQoUPVvHlzOTg46O2339Ynn3ySMg9EGjAZT04Q8QqLiYlRUFCQypQpI3t7e2uXAxvF6wRIPfQvIPXQv4DUQ/8CUhd9DEnxur9OwsPDdeHCBRUuXFjOzs7WLgcpJKnPK1MoAAAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAACQ4jw8PCz+VaxYUQMGDNDDhw9feN979uyRh4dHsre7evVqvLq8vLzk5+enYcOGKTIyMt42EyZMkIeHh3bt2pXgPqOjo/XLL7+oYcOGKlOmjHx9fdWpUycdOHAg2fUlxCFF9gIAAAAAAAAgTcUahuxMJps+3oQJE+Tj46PY2Fj9888/GjhwoMaMGaMhQ4a8UC0+Pj7avn37c2+/cOFC5cmTR5IUERGhvXv3atCgQcqWLZu6du1q0XbFihUqUKCAli5dqkqVKlmsi42NVefOnfXXX3+pT58+Klu2rEJDQxUYGKgOHTpo1qxZ8vHx
ee46JQJcAAAAAAAA4KVkZzJp2cX7uhkenerHyuHsoIaFMiV7uyxZsihXrlySJDc3N3Xu3FlDhgx54QDX0dHRvN/nkT17dovt33jjDR08eFAbNmywCHCPHz+uy5cva/jw4Ro2bJgGDhyoDBkymNfPnTtXBw4c0PLly5U/f37z8t69e+vu3bv6+eefNXny5OeuU2IKBQAAAAAAAOCldTM8WsFhMan+L6VCYhcXF4vbwcHB+uyzz/Tmm2+qZMmSatKkicXUA7NmzVLNmjXl7e2tpk2bav/+/ZLiT6Fw6dIlffjhh/Lx8VGNGjU0a9asZNfm6Ogoe3t7i2UrVqxQ8eLF9dZbbykqKkrr1q2zWL9o0SI1bdrUIryN06NHD40dOzbZdTyJABcAAAAAAABAqrt165Zmz56thg0bmpf17NlTMTExmjdvnpYuXSo3NzcNHjxYknTixAmNGTNGgwYN0urVq+Xr66svvvhCsbGxFvuNiIjQBx98oAwZMmjBggUaOHCgxo0bp82bNyepLsMwtGfPHi1fvlxvvfWWxfLVq1erZs2aypAhgypVqqQlS5aY10dGRurEiRPy9fVNcL/Zs2dXxowZk/rwJIopFAAAAAAAAACkio8++kj29vYyDENhYWHKmjWrOaA1DEO1a9fWW2+9pdy5c0uS2rZtq48//liS9Pfff8tkMilv3rx644039MUXX6hmzZrxAtzt27fr1q1bGjFihDJmzKhixYppwIABsrNLfOxq/fr1Zfr/8/lGRkYqe/bsateunT788ENzmwMHDuiff/5R7dq1JUl169bV119/rb///lv58uXTnTt3ZBiGsmTJYt7mwoULatq0qcWxDh069JyP3iMEuAAAAAAAAABSxTfffKPSpUvLMAzdvn1bv/32m1q3bq3ly5crR44cat26tVatWqWDBw/qwoULOnbsmDmg9fPzk7u7uxo0aCBPT0/5+/urRYsWcnCwjDQvXLigwoULW4x2bdas2VPrmjJlitzc3HTt2jUNHTpUxYsXV5cuXSymUFi5cqXy5csnT09PSZK/v78GDhyowMBAffLJJ+bg9t69e+Zt3njjDS1dulSSdPjwYfXq1ev5H7z/jykUAAAAAAAAAKQKNzc3FSxYUIUKFZKPj49GjhypsLAwrV69WrGxsfrggw80ffp05c2bVx9++KHGjBlj3tbFxUULFy7UzJkzVb58eS1evFhNmzZVcHCwxTGeDHSTIm/evCpYsKAqVaqkn3/+WVu2bNHo0aPN62NiYrRmzRpdu3ZNnp6e8vT0lJ+fn2JjYxUYGChJcnJykoeHh8UI23Tp0qlgwYIqWLCg3Nzckl1XQghwAQAAAAAAAKQJOzs7GYahmJgYnT17Vvv27dOMGTPUpUsX1ahRQ9evX5f0aHqFQ4cO6eeff1bFihXVt29frVmzRhERERYXOZOkQoUK6dKlSwoLCzMvGz16tL755psk1VSgQAF169ZNv/32mw4fPixJ2rVrl27duqUffvhBS5cuNf/76quvdPHiRR08eFCS1LJlSy1evFj//PNPvP0+GTQ/LwJcAAAAAAAAAKni7t27CgkJUUhIiC5evKihQ4cqJiZGtWrVUubMmWVnZ6eVK1fq77//1po1azRhwgRJj+aldXZ21o8//qiFCxfq6tWrWrlypUJDQ+Xh4WFxDD8/P+XMmVMDBw7UuXPntHHjRs2bN09+fn5JrrNdu3YqUqSIhg4dqtjYWK1cuVLFihVT3bp15e7ubv7Xpk0bZc2a1TxNQuvWrVWhQgW1atVKS5Ys0aVLl3Ty5El9++236tevn8qVK/fCjyFz4AIAAAAAAAAvqRzOaRPvPe9xunXrZv6/i4uLSpYsqalTpyp//vySpMGDB+vHH3/Ud999p8KFC2vAgAHq06ePTpw4IR8fHw0fPlyTJk3S0KFDlTdvXn377bcqUqSIbty4Yd6vg4ODuU2TJk2UM2dO9e7dWzVq1EhynQ4ODhowYIA6dOigBQsW
aP369eratWu8dk5OTmratKn++OMP9e/fX05OTpo4caIWLFigOXPmaOjQoTKZTCpRooSGDRumhg0bPtfj9jiTYRjGC+/lJRETE6OgoCCVKVPGYkJi4HG8ToDUQ/8CUg/9C0g99C8gddHHkBSv++skPDzcfKEuZ2dn8/JYw5CdyZRmdaT18V51iT2vT2IKBQAAAAAAAOAllNZhKuGtdRDgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAA8BKIjY21dglIQUl9Ph1SuQ4AAAAAAAAAL8DR0VF2dna6du2acuXKJUdHR5lMJmuXhedkGIYiIyMVEhIiOzs7OTo6PrU9AS4AAAAAAABgw+zs7FS4cGH9888/unbtmrXLQQpJnz69ChQoIDu7p0+SQIALAAAAAAAA2DhHR0cVKFBA0dHRiomJsXY5eEH29vZycHBI0khqAlwAAAAAAADgJWAymZQuXTqlS5fO2qUgDXERMwAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG2X1ADciIkL9+vWTr6+v/Pz8NH369ETbrl+/XgEBAfLx8VHr1q11/PjxNKwUAAAAAAAAANKW1QPcMWPG6NixY5o5c6YGDRqkiRMnas2aNfHanTlzRj169FDnzp0VGBioEiVKqHPnzgoLC7NC1QAAAAAAAACQ+qwa4IaGhmrhwoXq37+/vLy8VKdOHXXq1Em///57vLY7duxQ0aJF1bhxYxUoUEDdu3dXSEiIzp49a4XKAQAAAAAAACD1WTXAPXnypKKjo+Xj42NeVq5cOR0+fFixsbEWbbNmzaqzZ8/qwIEDio2N1eLFi5UxY0YVKFAgrcsGAAAAAAAAgDThYM2Dh4SEKFu2bHJ0dDQvy5kzpyIiInTnzh1lz57dvPydd97Rpk2b1KZNG9nb28vOzk4///yzsmTJkuzjxsTEpEj9eDXFvT54nQApj/4FpB76F5B66F9A6qKPISl4feB1ZtUANywszCK8lWS+HRkZabH89u3bCgkJ0cCBA1W6dGnNnTtXffv21ZIlS5QjR45kHffo0aMvVjheC7xOgNRD/wJSD/0LSD30LyB10ccAIGFWDXCdnJziBbVxt52dnS2Wjx07Vu7u7mrbtq0kadiwYQoICNCiRYv08ccfJ+u43t7esre3f4HK8SqLiYnR0aNHeZ0AqYD+BaQe+heQeuhfQOqijyEp4l4nwOvIqgGum5ubbt++rejoaDk4PColJCREzs7Oypw5s0Xb48eP6/333zfftrOzU/HixXXt2rVkH9fe3p43BTwTrxMg9dC/gNRD/wJSD/0LSF30MQBImFUvYlaiRAk5ODgoKCjIvOzAgQPy9vaWnZ1laa6urjp37pzFsgsXLuiNN95Ii1IBAAAAAAAAIM1ZNcB1cXFR48aNNXjwYB05ckQbNmzQ9OnT1a5dO0mPRuOGh4dLkt59910tWLBAS5cu1aVLlzR27Fhdu3ZNTZo0seZdAAAAAAAAAIBUY9UpFCSpb9++Gjx4sNq3b6+MGTOqW7duqlu3riTJz89PI0eOVNOmTfXOO+/o4cOH+vnnn/Xvv/+qRIkSmjlzZrIvYAYAAAAAAAAALwurB7guLi4aPXq0Ro8eHW/dqVOnLG63aNFCLVq0SKvSAAAAAAAAAMCqrDqFAgAAAAAAAAAgcQS4AAAAAAAAAGCjCHABAAAAAAAA
wEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRDtYuAAAAAACsafXq1Zo4caKioqLUsGFDde3a1bwuODhYH330kcLDw+Xs7KzQ0FAFBwdrz549Sp8+vSTpwYMHaty4sYYPH64KFSpY624AAIBXFAEuAAAAgNdWSEiIxowZo0WLFilTpkz66KOP9Oeff6pq1aqSJDc3Ny1ZskRBQUEqXbq0OnfurI8++sgc3krSsGHDdO/ePWvdBQAA8IpjCgUAAAAAr60dO3aoYsWKyp49u9KlS6fGjRtr1apVCbZdsWKFoqOj1bJlS/OyVatWKUOGDPLw8EirkgEAwGuGABcAAADAa+v69etydXU133Z1dVVwcHC8drGxsZo0aZJ69uxpXnbt2jXNnDlTvXv3TpNaAQDA64kAFwAAAMBrKzY2Nt4yk8kUb9mxY8fk6uoqb29v83b9+/fX119/LWdn51SvEwAAvL4IcAEAAAC8tnLnzq2QkBDz7evXryt37tzx2u3fv1/16tUz3z5//rzOnz+v/v37q1GjRjp27JgGDBignTt3pkndAADg9UGACwAAAOC1ValSJe3evVs3btxQVFSUli1bpho1asRrd+rUKZUvX958u2jRotq6dasCAwMVGBiokiVL6ptvvlHlypXTsHoAAPA6IMAFAAAA8Npyc3NTr1691LFjR9WvX18eHh6qU6eO+vfvr40bN5rbXb9+XXny5LFipQAA4HXlYO0CAAAAAMCaAgICFBAQYLFs+PDhFrd//fVXOTk5JbqP2bNnp0ptAAAAjMAFAAAAgGdwcXGxdgkAAOA1xQhcAAAA4BUSaxiyM5msXcYrxd7eXp6entYu45XFaxYAgKcjwAUAAABeIXYmk5ZdvK+b4dHWLgV4phzODmpYKJO1ywAAwKYR4AIAAACvmJvh0QoOi7F2GQAAAEgBzIELAAAAAAAAADaKABcAAAAAAAAAbBRTKAAAAAAAgFSxevVqTZw4UVFRUWrYsKG6du1qXhccHKyPP/5YhmEoPDxcsbGxCg4O1p49e5QuXToNGjRIQUFBMplMGjFihEqXLm3FewIA1kOACwAAAAAAUlxISIjGjBmjRYsWKVOmTProo4/0559/qmrVqpIkNzc3BQYGKiYmRocOHdKkSZP00UcfKX369JoxY4YMw9CqVat09uxZffrpp1q5cqUcHIgxALx+mEIBAAAAAACkuB07dqhixYrKnj270qVLp8aNG2vVqlWJto2OjlbLli0lSZs3b1aTJk0kSUWLFpWbm5sOHTqUZrUDgC3hqysAAAAAAJDirl+/LldXV/NtV1dXBQcHx2sXGxurxYsX64cffjAvCw4Olpubm8W2//77b+oWDAA2ihG4AAAAAAAgxcXGxsZbZjKZ4i3btWuXsmXLppIlS5qXGYYRr52dHREGgNcTZz8AAAAAAJDicufOrZCQEPPt69evK3fu3PHabdy4UZUrV7ZY5ubmZrFtSEiIxYhcAHidEOACAAAAAIAUV6lSJe3evVs3btxQVFSUli1bpho1asRrd/DgQXl6elosq1GjhhYtWiRJOnfunC5fvqxSpUqlRdkAYHMIcAEAAAAAQIpzc3NTr1691LFjR9WvX18eHh6qU6eO+vfvr40bN5rbXblyRTly5LDY9r333pOdnZ3q1aunzz//XCNGjJCjo2Na3wUAsAlcxAwAAAAAAKSKgIAABQQEWCwbPny4xe0DBw4oKCjIYpmjo2O8dgDwumIELgAAAAAAsCoXFxdrlwAANosRuAAAAAAAJEGsYcjOZLJ2Ga8ce3v7eHPgImXwmgVeDQS4AAAAAAAkgZ3JpGUX7+tmeLS1SwGeKYezgxoWymTtMgCkAAJcAAAAAACS6GZ4tILDYqxdBgDgNcIcuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAA
G0WACwBIMatXr1a9evVUt25dTZw4Md76kJAQjRkzRk2aNFGrVq109epVSdKDBw/Uo0cPNWrUSI0bN9bx48fTunQAAAAAAGwSAS4AIEXEhbOzZ8/WypUrtX//fv35558Wbfr06aOyZctqyZIlatSokcaMGSNJGjlypPLkyaPAwEB1795dAwcOtMZdAAAAAADA5jhYuwAAwKthx44dqlixorJnzy5Jaty4sVatWqWqVatKkm7duqVTp06pW7dukqRmzZqpUqVKMgxD69at08aNGyVJ1apVU+7cua1zJwAAAAAAsDGMwAUApIjr16/L1dXVfNvV1VXBwcHm21euXFHevHn122+/qUmTJurWrZvSpUunmzdvytHRUXPmzFHjxo31/vvvKzY21hp3AQAAAAAAm0OACwBIEQmFriaTyfz/6OhoHT9+XCVKlNCSJUtUu3ZtffXVV4qJidGNGzeUPn16LV26VF26dNGnn36alqUDAAAAAGCzCHABACkid+7cCgkJMd++fv26xVQIuXLlkouLi3x9fSVJ9evX15EjR5QtWzY5ODiofv36kqQqVaooNDRUN2/eTNs7AAAAAACADSLABQCkiEqVKmn37t26ceOGoqKitGzZMtWoUcO8vkCBAsqTJ48OHDggSdq6das8PT3l6OioypUra+XKlZKkI0eOyMXFRdmyZbPG3QAAAAAAwKZwETMAQIpwc3NTr1691LFjR0VGRqpWrVqqU6eO+vfvr1q1asnf318TJkxQz549FRgYqIwZM2rUqFGSpOHDh2vgwIGaP3++7O3t9b///U92dnzHCAAAAAAAAS4AIMUEBAQoICDAYtnw4cPN/y9cuLC+/vprlSlTRvb29ublrq6umjx5cprVCQAAAADAy4LhTQCANOXi4mLtEgAAAAAAeGkwAhcAEhFrGLIzmaxdxivF3t5enp6e1i7jlcVrFgAAAABePQS4AJAIO5NJyy7e183waGuXAjxTDmcHNSyUydplAAAAAABSGAEuADzFzfBoBYfFWLsMAAAAAADwmmIOXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABslNUD3IiICPXr10++vr7y8/PT9OnTE2176tQptW7dWqVKlVKDBg20e/fuNKwUAAAAAAAAANKW1QPcMWPG6NixY5o5c6YGDRqkiRMnas2aNfHa3b9/Xx988IGKFi2q5cuXq06dOuratatu3rxphaoBAAAAAAAAIPVZNcANDQ3VwoUL1b9/f3l5ealOnTrq1KmTfv/993htlyxZovTp02vw4MEqWLCgPvvsMxUsWFDHjh2zQuUAAAAAAAAAkPocrHnwkydPKjo6Wj4+PuZl5cqV0+TJkxUbGys7u//Ll/fu3St/f3/Z29ubly1atChN6wUAAAAAAACAtGTVEbghISHKli2bHB0dzcty5sypiIgI3blzx6LtlStXlD17dn399deqUqWK3n33XR04cCCNKwYAAAAAAACAtGPVEbhhYWEW4a0k8+3IyEiL5aGhoZoyZYratWunqVOnauXKlfrwww+1evVq5cmTJ1nHjYmJebHC8UqLe33wOsHjI/6BlwXnrtcX71+Iw/sXXkYvy7mL/oWX0cvSv57lVbkfwPOwaoDr5OQUL6iNu+3s7Gyx3N7eXiVKlNBnn30mSfL09NSOHTsUGBioLl26JOu4R48efYGq8brgdfJ6c3Fxkaenp7XLAJLt1KlTCgsLs3YZsCLev15vvH/hZfUyvH/Rv/Cyehn6F4Cns2qA6+bmptu3bys6OloODo9KCQkJkbOzszJnzmzRNleuXPrPf/5jsaxQ
oUL6559/kn1cb29vvjlFomJiYnT06FFeJwBeSh4eHtYuAVbC+xeAlxnvX0DqeVX6V9xnHeB1ZNUAt0SJEnJwcFBQUJB8fX0lSQcOHJC3t7fFBcwkqUyZMtq3b5/FsvPnz6t+/frJPq69vT1/2OCZeJ0AeBlx3gLvXwBeRpy3gNRD/wJefla9iJmLi4saN26swYMH68iRI9qwYYOmT5+udu3aSXo0Gjc8PFyS1KpVK506dUoTJkzQpUuX9P333+vKlStq1KiRNe8CAAAAAAAAAKQaqwa4ktS3b195eXmpffv2GjJkiLp166a6detKkvz8/LRq1SpJUr58+TRt2jRt3rxZ9evX1+bNmzVlyhS5ublZs3wAAAAAAAAASDVWnUJBejQKd/To0Ro9enS8dadOnbK4Xa5cOS1evDitSgMAAAAAAAAAq7L6CFwAAAAAAAAAQMIIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG/VcAe7evXsVFBQkSbp27Zq6dOmiBg0a6Mcff0zJ2gAAAAAAAADgtZbsAHfp0qVq37691q9fL0kaOHCg9uzZo4IFC2ry5MmaMmVKihcJAAAAAAAAAK+jZAe4M2bMUJMmTdSrVy+FhIRo586d6tq1qyZOnKgvv/xSixYtSo06AQAAAAAAAOC1k+wA9/z582rcuLEkaevWrTIMQ/7+/pIkb29v/fPPPylaIAAAAAAAAAC8rpId4GbOnFkPHjyQJP3555/KmzevChUqJEm6fPmysmXLlqIFAgAAAAAAAMDryiG5G1SoUEETJ07U2bNntXHjRnXs2FGStHbtWn3//ffy8/NL8SIBAABed6tXr9bEiRMVFRWlhg0bqmvXrhbrt2/fru7duytfvnwymUzy9PTUyJEjdfv2bfXv319Xr16VYRjq0qWL6tWrZ6V7AQAAACC5kh3g9u/fX7169dLEiRNVqVIlde7cWZI0cuRI5c2bVz169EjxIgEAAF5nISEhGjNmjBYtWqRMmTLpo48+0p9//qmqVaua2xw9elSNGzdWv379ZG9vb17+ww8/yNPTU5MmTVJISIiaNGmiChUqKGfOnNa4KwAAAACSKdkBbvbs2fXLL7/EWz5nzhzlzZs3RYoCAADA/9mxY4cqVqyo7NmzS5IaN26sVatWWQS4x44d0/Xr19W0aVPlzZtXgwYNUu7cuVWtWjWVLFlSkpQrVy5lzZpVN27cIMAFAAAAXhLJngM3zrlz5zRr1iyNHTtWwcHBunbtmnluXAAAAKSc69evy9XV1Xzb1dVVwcHBFm0yZ86s+vXra/Hixapatar5V1E1a9ZUrly5JEkrV65UZGSkihYtmnbFAwAAAHghyR6BGxsbq4EDB2rRokUyDEMmk0kBAQGaNGmSLl26pN9//125c+dOjVoBAABeS7GxsfGWmUwmi9sjR45UUFCQJKlNmzb67rvvdP/+fWXKlEmSFBgYqG+//VbTpk2Tg0OyPwICAAAAsJJkj8CdNGmSli9frm+++UY7duyQYRiSpF69eskwDI0bNy7FiwQAAHid5c6dWyEhIebb169ft/jCPCIiQlOmTLHYxjAMc1A7ZcoUff/995o5c6aKFy+eNkUDAAAASBHJDnAXLVqkzz77TM2aNVPWrFnNy0uUKKHPPvtMO3bsSMn6AAAAXnuVKlXS7t27dePGDUVFRWnZsmWqUaOGeb2Tk5OWLl2qAwcOSHr0ea1MmTJycXHR4sWLtWTJEs2fP19FihSx0j0AAAAA8LyS/fu5GzduqESJEgmuc3Nz07179164KAAAAPwfNzc39erVSx07dlRkZKRq1aqlOnXqqH///qpVq5b8/f01duxY9enTR0uXLlXOnDk1evRoSdK4ceNk
MpnUqVMn8/6GDh2q0qVLW+vuAAAAAEiGZAe4BQsW1NatW1W5cuV46/bu3auCBQumSGEAAAD4PwEBAQoICLBYNnz4cPP/PT09NWzYMJUpU0b29vbm5X/++Wea1QgAAAAg5SU7wG3fvr0GDhyoqKgo1axZUyaTSZcuXdKePXs0ffp0ffXVV6lRJwAAAJ7BxcXF2iUAAAAASGHJDnBbtGihW7du6aefftLcuXNlGIa6d++udOnSqVOnTmrdunVq1AkAAF4hsYYhO5PJ2mW8Uuzt7eXp6WntMl5ZvGYBAABgLckOcCWpc+fOatu2rQ4ePKi7d+8qc+bMKl26tMVFzQAAABJjZzJp2cX7uhkebe1SgGfK4eyghoUyWbsMAAAAvKaeK8CVpIwZM6patWopWQsAAHiN3AyPVnBYjLXLAAAAAACbluwAt127ds9sM2vWrOcqBgAAAAAAAADwf5Id4BqGEW9ZaGiozp07p/Tp06tu3bopUhgAAAAAAAAAvO6SHeDOnj07weV3797VRx99pP/85z8vXBQAAAAAAAAAQLJLqR1lyZJFH3/8sWbMmJFSuwQAAAAAAACA11qKBbhxbt68mdK7BAAAAAAAAIDXUrKnUNi3b1+8ZTExMfr33381adIkeXl5pUhhAAAAAAAAAPC6S3aA+/7778tkMsVbbhiG8uTJo379+qVIYQAAAAAAAADwukt2gDtr1qx4y0wmkzJmzCgPDw/Z2aX4rAwAAAAAAAAA8FpKdoBbvnz51KgDAAAAAAAAAPCEJAW4ffv2TfIOTSaTRowY8dwFAQAAAAAAAAAeSVKAu2fPniTvMKH5cQEAAAAAAAAAyZekAHfTpk2pXQcAAAAAAAAA4AkpesWx0NBQbdu2LSV3CQAAAAAAAACvrWRfxOzvv//W4MGDtXfvXkVGRibY5q+//nrhwgAAAAAAAADgdZfsAHfkyJE6ePCgWrRooYMHD8rFxUVlypTRjh07dPr0aU2YMCE16gQAAAAAAACA106yp1DYt2+fvvzySw0YMEBNmzaVk5OTevXqpUWLFunNN9/Uxo0bU6NOAAAAAAAAAHjtJDvAffjwoTw8PCRJ//nPf3TixAlJkr29vdq0aaPdu3enbIUAAAAAAAAA8JpKdoDr6uqqGzduSJIKFiyou3fvKiQkRJKUNWtW3bx5M2UrBAAAAAAAAIDXVLID3OrVq2v8+PE6dOiQ8uXLp9y5c2v69Ol68OCBFi1aJDc3t9SoEwAAAAAAAABeO0kKcN9//30tW7ZMERER+uyzz5Q5c2Z9//33kqQvv/xSM2fO1Jtvvqnly5erY8eOqVowAAAAAAAAALwuHJLS6M6dO+rdu7eGDRum+vXra9CgQeaRtg0bNlTevHkVFBSkUqVKqXz58qlaMAAAAAAAAAC8LpIU4C5fvlzHjx/XkiVLtGrVKs2bN08eHh5q0aKFGjRoIF9fX/n6+qZ2rQAAAAAAAADwWknyHLheXl4aMGCAtm3bpokTJyp//vwaNWqUqlatqp49e2r37t2pWScAAAAAAAAAvHaSNALXYgMHB/n7+8vf3193797VihUrtGzZMnXo0EH58+dXs2bN1KVLl9SoFQAAAAAAAABeK0kegZuQLFmyqG3btpo/f75mz54te3t788XNAFu1evVq1atXT3Xr1tXEiRMTbXfixAmVLFnSfPvatWtq166dGjZsqBYtWuivv/5Ki3IBAAAAAADwGnuhADckJEQzZsxQ8+bN1a5dO0VGRuqTTz5JqdqAFBcSEqIxY8Zo9uzZWrlypfbv368///wzXruIiAgNHz5cUVFR5mWjRo1SgwYNtGzZMnXr1k1DhgxJy9IBAAAAAADwGkr2FAoPHz7UunXrtHz5cu3Zs0f29vaqXbu2vvzyS1WuXFkmkyk16gRSxI4dO1SxYkVlz55dktS4cWOtWrVKVatWtWj322+/qV27djp06JB52fjx483/v3r1qjJnzpwmNQMAAAAAAOD1laQANzo6
Wlu3btXy5cu1ZcsWhYeHq0SJEurbt68aNGigLFmypHadQIq4fv26XF1dzbddXV0VHBxs0WbTpk2KjIzUW2+9ZbHczu7RgPW6devq2rVr+umnn1K/YAAAAAAAALzWkhTgVqlSRffu3VPmzJnVrFkzNWvWTJ6enqldG5DiYmNj4y17fNR4SEiIfv75Z33++eeJ7mPdunU6fvy4PvzwQ61Zs0ZZs2ZNjVIBAAAAAACApAW4Xl5eatasmerUqSNHR8fUrglINblz59bevXvNt69fv67cuXObb2/ZskV37tzRsGHD5OzsLElq1KiRZs+erb1798rPz0/Ozs7y8vJSvnz5dOXKFQJcAAAAAAAApJokBbjTp09P7TqANFGpUiX98MMPunHjhrJkyaJly5apdevW5vUtWrRQ06ZNFRQUpDJlysjT01OBgYGSpIULFyo4OFht27bV6dOndfPmTRUpUsRadwUAAAAAAACvgWRfxAx4mbm5ualXr17q2LGjIiMjVatWLdWpU0f9+/dXrVq15O/vn+i2gwYNUr9+/bRgwQI5OTnpu+++U/r06dOwegAAAAAAALxuCHDx2gkICFBAQIDFsuHDhyfY9tSpU+b/582bVzNmzEjN0gAAAAAAAAALdtYuALBFLi4u1i4BAAAAAAAAYATuyy7WMGRnMlm7jFeKvb29PD09rV3GK4nXKwAAAAAAQPIQ4L7k7EwmLbt4XzfDo61dCvBUOZwd1LBQJmuXAQAAAAAA8FIhwH0F3AyPVnBYjLXLAAAAAAAAAJDCmAMXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUVYPcCMiItSvXz/5+vrKz89P06dPf+Y2V69elY+Pj/bs2ZMGFQIAAAAAAACAdVj9ImZjxozRsWPHNHPmTF27dk19+vRR3rx59fbbbye6zeDBgxUaGpqGVQIAAAAAAABA2rNqgBsaGqqFCxdq6tSp8vLykpeXl86cOaPff/890QB32bJlevjwYRpXCgAAAAAAAABpz6pTKJw8eVLR0dHy8fExLytXrpwOHz6s2NjYeO1v376tb7/9VkOHDk3LMgEAAAAAAADAKqwa4IaEhChbtmxydHQ0L8uZM6ciIiJ0586deO1HjRqlJk2aqFixYmlYJQAAAAAAAABYh1WnUAgLC7MIbyWZb0dGRlos37lzpw4cOKAVK1a88HFjYmJeeB+2wt7e3tolAMnyMvU/+hdeRi9LH6N/4WVE/wJSD/0LSD0vS/96llflfgDPw6oBrpOTU7ygNu62s7OzeVl4eLgGDhyoQYMGWSx/XkePHn3hfdgCFxcXeXp6WrsMIFlOnTqlsLAwa5fxTPQvvKxehj5G/8LLiv4FpB76F5B6Xob+BeDprBrgurm56fbt24qOjpaDw6NSQkJC5OzsrMyZM5vbHTlyRFeuXNFnn31msf1HH32kxo0bJ3tOXG9vb745BazEw8PD2iUArzT6GJB66F9A6qF/AannVelfMTExr8yAPCC5rBrglihRQg4ODgoKCpKvr68k6cCBA/L29pad3f9Nz1uqVCmtW7fOYtu6devqm2++UZUqVZJ9XHt7ewJcwEroe0Dqoo8BqYf+BaQe+heQeuhfwMvPqgGui4uLGjdurMGDB2vEiBG6fv26pk+frpEjR0p6NBo3U6ZMcnZ2VsGCBeNt7+bmphw5cqR12QAAAAAAAACQJuye3SR19e3bV15eXmrfvr2GDBmibt26qW7dupIkPz8/rVq1ysoVAgAAAAAAAIB1WHUErvRoFO7o0aM1evToeOtOnTqV6HZPWwcAAAAAAAAArwKrj8AFAAAAAAAAACSMABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEE
uAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG0WACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG0WACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG0WACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEZZPcCNiIhQv3795OvrKz8/P02fPj3Rtlu2bFGjRo3k4+OjBg0aaOPGjWlYKQAAAAAAAACkLasHuGPGjNGxY8c0c+ZMDRo0SBMnTtSaNWvitTt58qS6du2qZs2aaenSpWrVqpU+//xznTx50gpVAwAAAAAAAEDqc7DmwUNDQ7Vw4UJNnTpVXl5e8vLy0pkzZ/T777/r7bfftmi7YsUKVaxYUe3atZMkFSxYUJs2bdLq1atVvHhxa5QPAAAAAAAAAKnKqgHuyZMnFR0dLR8fH/OycuXKafLkyYqNjZWd3f8NEG7SpImioqLi7eP+/ftpUisAAAAAAAAApDWrBrghISHKli2bHB0dzcty5sypiIgI3blzR9mzZzcvL1KkiMW2Z86c0a5du9SqVatkHzcmJub5i7Yx9vb21i4BSJaXqf/Rv/Ayeln6GP0LLyP6F5B66F9A6nlZ+tezvCr3A3geVg1ww8LCLMJbSebbkZGRiW5369YtdevWTWXLlpW/v3+yj3v06NFkb2OLXFxc5Onpae0ygGQ5deqUwsLCrF3GM9G/8LJ6GfoY/QsvK/oXkHroX0DqeRn6F4Cns2qA6+TkFC+ojbvt7Oyc4DY3btxQx44dZRiGfvjhB4tpFpLK29ubb04BK/Hw8LB2CcArjT4GpB76F5B66F9A6nlV+ldMTMwrMyAPSC6rBrhubm66ffu2oqOj5eDwqJSQkBA5Ozsrc+bM8doHBwebL2I2a9YsiykWksPe3p4AF7AS+h6QuuhjQOqhfwGph/4FpB76F/DyS/7w1RRUokQJOTg4KCgoyLzswIED8vb2jjeyNjQ0VJ06dZKdnZ1+++03ubm5pXG1AAAAAAAAAJC2rBrguri4qHHjxho8eLCOHDmiDRs2aPr06eZRtiEhIQoPD5ck/fzzz7p8+bJGjx5tXhcSEqL79+9brX4AAAAAAAAASE1WnUJBkvr27avBgwerffv2ypgxo7p166a6detKkvz8/DRy5Eg1bdpUa9euVXh4uFq0aGGxfZMmTTRq1ChrlA4AAAAAAAAAqcrqAa6Li4tGjx5tHln7uFOnTpn/v2bNmrQsCwAAAAAAAACszqpTKAAAAAAAAAAAEkeACwAAAAAAAAA2igAXAAAAAAAAAGwU
AS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG0WACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFEEuAAAAAAAAABgowhwAQAAAAAAAMBGEeACAAAAAAAAgI0iwAUAAAAAAAAAG0WACwAAAAAAAAA2igAXAAAAAAAAAGwUAS4AAAAAAAAA2CgCXAAAAAAAAACwUQS4AAAAAAAAAGCjCHABAAAAAAAAwEYR4AIAAAAAAACAjSLABQAAAAAAAAAbRYALAAAAAAAAADaKABcAAAAAAAAAbBQBLgAAAAAAAADYKAJcAAAAAAAAALBRBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAAAAAAAAANooAFwAAAAAAAABsFAEuAAAAAAAAANgoAlwAAAAAAAAAsFFWD3AjIiLUr18/+fr6ys/PT9OnT0+07YkTJ9SiRQuVLl1azZo107Fjx9KwUgAAAAAAAABIW1YPcMeMGaNjx45p5syZGjRokCZOnKg1a9bEaxcaGqqPP/5Yvr6+Wrx4sXx8fNS5c2eFhoZaoWoAAAAAAAAASH1WDXBDQ0O1cOFC9e/fX15eXqpTp446deqk33//PV7bVatWycnJSb1791aRIkXUv39/ZciQIcGwFwAAAAAAAABeBVYNcE+ePKno6Gj5+PiYl5UrV06HDx9WbGysRdvDhw+rXLlyMplMkiSTyaSyZcsqKCgoLUsGAAAAAAAAgDRj1QA3JCRE2bJlk6Ojo3lZzpw5FRERoTt37sRr6+rqarEsR44c+vfff9OiVAAAAAAAAABIcw7WPHhYWJhFeCvJfDsyMjJJbZ9s9zSGYZj3bW9v/zwl2xx7e3vldDTJzjBZuxTgqbI7mhQTE6OYmBhrl5Jk9C+8TF62Pkb/wsuE/gWkHvoXkHpetv71LHH3Iy7bAV4nVg1wnZyc4gWwcbednZ2T1PbJdk8TNy3DiRMnnqdcm5X3//8DbFqoFHTb2kUkH/0LL42XsI/Rv/DSoH8BqYf+BaSel7B/JcWTU24CrwOrBrhubm66ffu2oqOj5eDwqJSQkBA5Ozsrc+bM8dreuHHDYtmNGzfiTavwNA4ODvL29padnZ15Ll0AAAAAAADYNsMwFBsba86PgNeJVV/1JUqUkIODg4KCguTr6ytJOnDggDlkfVzp0qU1depUGYYhk8kkwzB08OBBdenSJcnHs7OzizcNAwAAAAAAAADYKqtexMzFxUWNGzfW4MGDdeTIEW3YsEHTp09Xu3btJD0ajRseHi5Jevvtt3Xv3j0NHz5cZ8+e1fDhwxUWFqaAgABr3gUAAAAAAAAASDUmw8qzP4eFhWnw4MFat26dMmbMqA8//FAdOnSQJHl4eGjkyJFq2rSpJOnIkSMaNGiQzp07Jw8PDw0ZMkSenp5WrB4AAAAAAAAAUo/VA1wA
AAAAAAAAQMKsOoUCAAAAAAAAACBxBLgAAAAAAAAAYKMIcAEAAAAAAADARhHgAgAAAAAAAICNIsAFAAAAAAAAABtFgAsAeCFTp05VUFCQtcsAAOCFxcbGWrsE4JVhGMZTbwMAko4AFwDw3G7cuKEZM2bol19+0fHjx61dDvBK2bVrl+7cuWPtMoDXwvr16yVJdnb8eQSkFJPJJEnmz4hxtwEAyccnFLzyEhtJwTfAwItZu3atcubMqXnz5uny5cv66aefCHGBFLJ37171799f69ev171796xdDvBKmzZtmgYMGKBjx45ZuxTglbNz506NHDnS/F7G32AA8HwIcPFKMwzDPJJi7dq1+uOPP3T48GFFR0fLZDLxAQJ4Trt379bnn3+un3/+Wfnz59fEiRN16dIlQlwghRQrVkz37t3TjBkztHbtWt2/f9/aJQGvpNWrV+vUqVMaMWKESpYsae1ygFdOsWLFdP78ec2fP18So3AB4HkR4OKVFRsba/6AMGLECA0YMEBjx47VkCFDNGXKFEVGRhLiAs+pYsWKGjFihMaPH6/Jkycrf/78mjRpEiEu8ILCwsIkSdHR0cqYMaNy5cqlyZMna82aNYS4QAq7cuWKpk+fruXLl2vnzp2KioqSxAhB4Hkl1Hdy5cqlfv36acuWLTp37pwVqgKAVwMBLl5ZcSNvL1y4oAsXLmj27NlatmyZqlatqj179mjatGmEuMBziJuWpGnTpvrmm28IcYEU0qtXL+3cuVPR0dE6c+aMsmbNqhkzZuidd97RlClTCHGBFDRu3DiNHz9eU6dOVd26dbV//35t377d/CstAMkX13eWL1+uGTNmmJd7e3vLzs5Op06dksTFAgHgeRDg4pW2Zs0atW/fXmFhYcqdO7dcXV3VuXNnlSlTRrt37ybEBZIpNjbW4gIvzZo109ChQxMMcSdPnkyICyTR6NGjtW3bNvn7+8vBwUHFixfX22+/LcMw1KNHD9WtW5cQF0ghmzZt0p49e1SnTh1lzZpVY8eOVc6cOfXzzz9rz549io6OlsRIXCCpHg9kIyMjtXbtWi1atEj169fXxo0b5erqqqZNm2rEiBG6desWFwsEgOfAmROvlLgP2oZhKDY2VoUKFVKxYsXM3/ZKUvr06dWlSxeVLVtW+/bt0/jx4xltASTB4+Ht/v37tXHjRj148EDvvvtugiNxr1y5olGjRun8+fNWrhywfU5OTvLy8lJoaKgWL16s7Nmzq3Pnzub3pl69ehHiAing+PHj+u2333T+/HkVLVpUkuTo6KiffvpJGTJk0MSJE7V3714+GwJJ9Pjnw6CgIJ08eVKdO3fWL7/8olKlSumXX37Re++9J0dHR5UtW1Zr1qyRxBckAJBcJoMzJ14Rj394iIyMlL29vezt7XXp0iV1795dERERCgwMlL29vaRH8wz+73//kyT179+fD+nAUxiGYe4jY8aM0R9//KF06dJJkn799Ve5u7vrjz/+0IABA/TFF1+oS5cuunDhgn766SeNGjWKkRbAM6xbt06//PKLJOnw4cPasWOHcuTIYf5CMu6969tvv9WGDRvUtm1bNW/eXOnTp7dm2cBLJ2504Lhx41SkSBF9//335n4UGRmpTz/9VJcvX9a3336rUqVKWbla4OUxevRobdiwQQ8fPpS/v786d+6sN954Q0FBQdq5c6dmzpyp2NhYFS1aVHPnzpVk+fkSAPB0BLh4JTz+5j916lTt2bNHoaGhKlq0qD755BPdv39f/fr1U3h4uAIDA81hUkREhBwdHc1TKPABAni66dOna/r06Ro9erS8vb3VoUMHhYeHa/z48XJ3d9eiRYv09ddf64MPPlDPnj3N2z059QKA+P773/9q27Ztqlevnnr16qVcuXKZ18XExJhD3IEDB+revXsaN24c71tAEi1btkzh4eEqXLiw3nzzTa1atUozZ85U4cKFNWjQILm4uEh6FOL+73//U+/evc19DsDTrV+/XkOGDNHU
qVPl6uqqO3fuqEiRIpL+7/3r5MmT2rNnj3799Ve1adNGH3/8sZWrBoCXCwEuXimTJk3Sb7/9pg4dOsjR0VFz5sxR9uzZ1bNnT2XLls0cKC1atMgiTCK8BZ4tKipKn376qfz8/NSuXTsdOnRI3bt3V+bMmXX9+nXNnDlT7u7umj9/vpYsWaK5c+fSr4AkiImJkclk0rfffqssWbJo8+bNKl26tFq3bq3ChQub2z3+RUjc+xbvX8CzjRs3TrNnz1bWrFmVOXNmNW7cWB06dNDKlSs1Z84cFShQQAMHDjSHuHEe/+IEQOIWLlyoFStWaPr06RZ9Zvv27Tp27Jg6deokBwcHGYahP/74Q/v27dOwYcPMA2kAAM/GcCi8tJYvX27+v2EYunXrlrZt26bBgwfr448/Nn8wt7Oz07fffqvcuXPrhx9+UEhIiPr27WuxLz44APE9eYVgk8mk8PBwOTs76++//9acOXPUunVrLV26VDlz5tSXX36pFStWqGXLlpo3bx4XBwSSyM7OTnZ2durTp4+6dOmili1b6tChQ5o3b54uXLhg0S6uXxLeAkl38eJFzZgxQzNmzNA777yj5cuXa/r06apXr57atGmjq1evqkePHoqMjLTYjvAWiO/Jz4eSdPfuXZ08edLcZ6KioiRJ+/bt05YtWyzeu9KnT6+9e/fqzp07vIcBQDIQ4OKltHXrVv3yyy8WHwZiY2N1/fp18+iJiIgIpUuXTtOnT9elS5c0d+5c5c+fX/Pnz9eIESOsWT5g8x4f6Xfu3DmdPn1aMTEx+uijj+Tt7a1jx47p4cOH8vPzk2EYcnNzU0hIiNatW2cR2vLBHHi2uH4S957WtGlTtWnTRgcPHtT8+fPjhbhPbgcgYUuXLtWPP/6oq1evysXFRQUKFFDjxo1Vu3ZtrVy5Ur/++qvq1aunRo0aydXVVQ4ODtYuGbBpj38+vHLlii5evChJat26tfLmzasePXqY/waTpAoVKkh6dO0RSbpz544uXbqksLAw+hsAJBNnTbyUqlevrqpVq8rOzk5BQUEqU6aMMmfOLAcHB23cuFFVq1aVk5OToqKi5OjoKA8PD4WHh0uS8uXLJ4mfxQFPE/fh/H//+5/WrFkjwzBUsmRJffPNN8qYMaMWLFigLFmyyNPTU5KULl06TZgwQeXLl2dkIPCc7OzszH2nSZMmMplMmjt3ru7du6du3bopT5481i4ReGmMHTtW8+fPl5ubm86ePastW7YoT548cnV1VbNmzWQymbR69Wo9fPhQXbt2VfPmzSUxZzuQGMMwzH1j3LhxWrFihSTJ19dXo0eP1n//+1/9+uuv+vzzz9W/f3+FhYVp1qxZyp49uzJnzixJypQpk6pXr6569eopR44cVrsvAPAyIsDFS8vOzk4nT55Uq1atzFe97927t/r376/s2bPrs88+U7p06WQYhsLCwswfHOIQ3gJPt2zZMi1ZskTjx49XlixZFBkZqYwZM0qSMmfOrDVr1mjmzJlav3697t+/L19fX/NoeP74BZ7P41+ANG7cWKGhoTpx4oTc3NysXRrw0vjnn3907949TZ8+Xe7u7hozZow2bdqkDBkymEfbNm3aVA8fPtQ///xj8aUj719AwuL6yA8//KD58+erZ8+eypQpk3r06KGsWbOqb9++ypIli77//ns1bdpUefPmlYuLi2bPnm3+fGhvby8vLy8r3xMAeDlxETO8VBIKhhYuXKghQ4aoe/fueu+997R06VKNHj1aPj4+yps3ry5cuKDbt29r6dKl/FQHSIb//e9/unLlisaPH2+xfNmyZSpUqJCWLFmic+fOKXPmzBo3bpzSpUvHyHYghTweKHHBMiBpDMPQzp079eGHHyp37tyaMGGCvL29FRsbq6FDh+r06dOqX7++GjVqpAwZMuj27dvKmjUr/Qt4isf7RnBwsDp37qwvv/xS1atX1+7du/XJJ58oMjJSb7/9tsaOHStJOnbsmNKnT69ChQrJzs5O
0dHR/B0GAC+IsyheGo+Ht9u3b9c///wjT09PtWjRQg4ODurXr5/s7OzUoUMHlSxZUr/88ouioqJUvHhx9enTRw4ODoRLQDI4ODjo5s2bun//vjJlymRevmvXLv3xxx+aNWuWHjx4YB6Vy4dzIGFPG5We2DqTyWTuUyaTSZGRkXJ0dEztUoGXmslkUpUqVdS1a1dNnDhRp06dkru7u5ycnDRw4EANGzZMq1evVmhoqN577z1ly5ZNkghvgUQ83jf+/fdfRUREKCoqStmzZ9f58+f122+/6bPPPlOZMmXUqlUrpU+fXt27d5eXl5fF/O58PgSAF8eZFC+NuD9wx4wZo8DAQOXIkUNly5ZVsWLF1KRJExmGof79+ysyMlIff/yx/ve//1lsT7gEJOzxAOnx/5crV04LFy7Upk2b9NZbb8nZ2VmS5OXlpfv370uSObw1DIP+BSTg8T61atUqBQcHKyIiQhUrVlSZMmUSDXYf71MLFy6U9OjiZnwJCSQu7ov6rl27KjQ0VEOGDFGGDBnk7+8vR0dHff311+rVq5euXr0qJycn83aEt0B8j4e348eP18mTJzVq1ChVq1ZNzs7O2r59uzJnzix/f385OzsrZ86cWrBggWJjY/XNN9+Y98O0JACQMvhrGzZt586dqly5svn2hg0btGzZMk2aNEmlS5fWrVu3dPPmTd26dUsBAQFKnz69evbsqYcPH+rLL7+02BfhEhDf4xekmDFjhs6cOaN//vlH1atXV/PmzdWuXTuNGPH/2rvzuCrr/P//j3PggAKaQhnigpOToEMq6mQjKuQ6ueXXhWxsSHPJBDEHcU/SSQxJDcQURagRXEIN13Ryy62sHMkFdVzQD8FkKpgiynLO+f3RjTNgbs0vBe15v9283eqc67p4X9xuF9f7/bze1+sdSVFREU2bNqVu3brs3LnTNmuplAa/IrdWen1FR0eTlpZGQEAA2dnZbNy4ke7duzNixAjg1iUTAFauXElERARxcXEKb0Xuws7OzvbQZNy4cVitVsaPH09UVJQtxH3vvfdUlkTkHpReG9u3b+fLL7/ktddeo0aNGowdOxY7OzsiIyNp37499erV48cff6Rly5YMGzaMxo0bV3DLRUQeTUq0pNJas2YNS5cuZc2aNbYOxMWLF2nUqBHNmjVj//79fPzxx3z55ZcUFRXRs2dPJkyYwKRJk9iwYYM65SL3oPQamTt3LitXrmTUqFF4eHiwYsUKPv/8cxITEykpKWHZsmW8//77PPHEEwAsWLAA0GunIvciIyODzz77jLi4OHx9fdm4cSMTJkzAx8eHnJwcPDw8bhnerlixgujoaObNm0enTp0q8hREHhpGo9EW4o4fPx6DwcCkSZOIiIige/fumEwmLbgpcgdl70PHjx/no48+IjMzk4YNG9q+B7h69SqHDx8mIyODOXPmkJ+fT5MmTTAajSpbJyJyH2gRM6nUSm/+J0+e5Omnn+bgwYO8/PLLNG/enMOHD9OuXTs6d+5MlSpVmDx5MitWrMDb21szK0Tuouw1kpeXxxtvvEFwcDDt27fn888/Z/To0URGRtKoUSMaNmxIdnY2OTk5FBcX89xzz2FnZ6eyJCK3cXMwtHv3bqZNm8bWrVvZunUr48ePJzw8HD8/PxISEhg2bBh169Ytt19peBsZGUnXrl0r6lREHlplr6epU6dy9uxZ/vGPf1Rwq0QeHqdPn6Zhw4asW7eOuXPn0qhRI2JiYmwltdLT0wkODsbV1RVnZ2eWLl2KyWTSwxERkftEAa5USmXDpW+++Ya//vWvvPvuu/Tu3ZudO3fy2Wef4e/vj7+/P46OjhQWFvLyyy8TERFBs2bNyh1DRMore23k5eVhMpno2LEj69at49SpU4waNYqxY8fSq1cvpk2bxvPPP0+3bt3KHUMzK0Rurez1lZGRQZMmTbh48SLh4eH4+vqSlJTExIkTCQwMJDs7m65duzJ37lw6d+5sO4bCW5FfR9kgSf1C
kXu3c+dOYmNjGTp0KN26dWP9+vUkJyfz1FNP8fbbb9tqSF+9epWLFy/i6emJ0WjUw30RkftIf12l0inb2TYYDPzxj3/kb3/7G2+99RYmk4nu3bsTEBBATk4Ou3bt4sknnyQ2Nhaj0cgzzzxjO4466SI/V/b6io6OJicnh/feew9/f3+mTZvGvn37eOutt+jbty9Wq5UTJ07g7u7+swBX4a3Iz5W9vo4ePcrYsWMJDQ3lhRdeoFq1asTHxzN48GACAwMBcHR0pFGjRri6utqOsWnTJqKionj33XcV3orco9vN+CsbKBkMBoqKinBwcKiAFopUbjc/4Khfvz6NGjVi9erVGAwGevbsicViYeXKlUybNo2IiAgcHR2pVq0a1apVA366DhXeiojcP/oLK5VK2Q74unXrOHPmDG5ubvTu3Ruj0Uh4eDhWq5UePXqQmZnJ5MmTqV+/Pk5OTixfvrxc3TMR+bnSa+Pbb7/l2LFjhIWF2R5+fPjhh7Rv356+ffsCUFxcTLVq1ahdu3ZFNlnkoVB2QcBFixaRnp7O2bNniYmJwcnJiXfeeYcLFy5w+vRp5s+fT5MmTUhOTgagefPmtuM0aNCAuLg4/Pz8KuI0RCq9TZs2cf78eQoLC3nuuedo3rz5bft9VqvVFiilpqYC0KdPHz2EFLnJzRNfnnrqKUaOHMnChQtZuXIlAC+++CIAq1atIiwsjDlz5pR7IKLxl4jI/aUAVyqV0hv/e++9x9q1a/Hx8aFatWo0atSIoUOHcuPGDcaNG4fRaKRbt26kpaVhb2/PE088gcFg0Gs7IndhtVr58ssvGTx4MB4eHlSpUgWDwcArr7zCxYsX+eabbwgKCuKZZ57h4MGDXLlyxTZbUERur3Twm5CQQGJiIhEREQQEBHDo0CHi4uIIDg5m/vz5xMfHs3HjRnbu3ImbmxsrV67Ezs4Os9mM0WikSZMmFXwmIpVXdHQ0aWlpBAQEkJ2dzcaNG+nevTsjRowAys8iLPvfK1euJCIigri4OIW3IreRmJjIwYMHmTdvHvDTLNzXX3+d+Ph4UlJScHBw4MUXX+TGjRscO3ZMYy4RkQdMNXCl0jl+/DghISFERUXRsmVLrly5QvXq1QHIzc1ly5YtzJgxg4iICPr372/bTzNvRe5dXFwccXFxRERE0Lt3b6pWrYrVamXr1q188cUXXLx4EXd3d8aNG4e9vb1q3orco/DwcDw9PQkJCQEgJyeHDRs2sHnzZkaPHo2/vz8lJSVcuXKFmjVr6uGjyD3KyMjgzTffJCoqCl9fXzZu3MiECRNYsGABTz31FB4eHrZty4a3pTWl33333XK1pkV+624eO+3YsYOwsDC6du3KzJkzbZ9nZ2czatQoSkpKeO211+jdu/dtjyEiIveP/tpKpWS1Wm01AUvD2yNHjvC3v/2NTp068eqrr5KWllZuH3UeRO7ObDYDEBISwtChQ5kxYwY7duygqKgIg8FA586dmTp1Ku+//z6TJk3C3t6ekpIShbcit1D2GbjVaqW4uJizZ89y/vx52+ceHh50794dFxcXpk+fzoYNG7C3t8fV1dW2WKfCW5G7u3TpEhaLBV9fX7Zu3crUqVOZPHkynp6exMfH89133wE/BUo3h7eRkZEKb0XKKBu8njx5kqNHj+Lt7c38+fPZvn07EyZMsG1bp04dfH19cXBwICsrq9y9T+MvEZEHR39xpUJZLJaffWY0Gvnhhx84fvw48N/AqaCggMzMTPLy8ggPD7fVDhSRe2dnZ2e77saOHUtQUBATJkxg27ZttmsNynfIFS6J/FzZkKikpITi4mJMJhOBgYEcOHCA3bt327atU6cODRo0wMnJidTUVHbu3Gn7TgtuitxZRkYGAI0bN6ZevXrExsYSHh7O+PHjGTBgAEajkdWrV3Ps2DHgv/evsuGtFgQUKa/0Opk1axajR48mKCiIlJQUPDw8mD17Ntu2bWPSpEnAT2Owy5cv07t3b0JCQnTfEhGp
IBqVS4Up++T3zJkzWCwWatasSaNGjQgODmbatGlUq1aNtm3bAuDt7Y2rqyvXrl0DsM1cUidC5Jcpu9jfuHHjAJg0aRKFhYX06tVLsylE7qLs/SsxMZH09HScnJwICgqia9eu7N27l5SUFCwWC/7+/uTn5/PDDz/w/PPPc/HiRfbv309AQEDFnoTIQ+Do0aOMHTuW0NBQXnjhBapVq0Z8fDyDBw+21Wd3dHSkUaNGtje34KeFzqKionj33XcV3orcxtq1a1m7di2LFi3C3t4ei8WCp6cnnp6exMTEMHHiRNq1a4ezszN2dnbMmjVL4y8RkQqkGrhS4ebOncs///lPDAYDBQUF9O/fnz/+8Y9s376dZcuWERwcTM2aNfnss8/Iy8vj448/VsAk8isoG0JNnTqVs2fP8o9//KOCWyXy8IiKimLVqlUEBARw7do1Tp8+TUJCAgALFixg586duLu720qUrF+/nqSkJLZs2cLSpUsxmUwVfAYildeiRYtIT09nx44deHp6MnHiRHx9fXn99depUaMGPj4+NGnShOTkZPLy8khNTbWV+8nIyCAvLw8/P78KPguRymvRokUcPnzYtmhZqb179/L999/Tvn17Vq1ahbOzM3/5y1+0JoKISAVTgCsV6uOPPyYmJob33nuPP/3pT4SHh9tmLj322GN89tlnrFixgho1alCtWjVmz56NyWRSwXyRX0nZa0kzKkTu3b59+5g0aRLx8fF4eXmRmprKW2+9haenJ7GxsXh5eXHw4EEOHDhAjRo16NevHwAzZswgNzeXmTNn4uDgUMFnIVI5JSQkkJCQQEREBFevXuXQoUOcOHGC4OBgmjZtSnx8PLt378bZ2Rk3NzfmzZuHyWTCbDZjNBp1LxO5ya3GTlFRUezYsYPNmzcDUFRUhL29PW+//TZ5eXk/C3a14KaISMVSgCsP1M2dh8jISCwWC1OmTGHr1q1MmDCBiRMnUr9+fa5cuULHjh3Jz8+natWqtg65Og8it3anBxt3+q7sNVVUVKRQSeQWbr6GNm3axLJly0hOTubYsWNERkbypz/9iZycHL7++mtiYmLw9vamuLiYjIwMvvzySy5cuEBaWhrJycl4e3tX4NmIVG7h4eF4enoSEhICQE5ODhs2bGDz5s2MHj0af39/SkpKuHLlCjVr1lT/UOQOyt6/9u/fz9WrV3F3d6dKlSqMGjWKDh06EB4ebtt+9erVrFu3joULF1K1atWKaraIiNxEUxjlgbFarbbOw+7du8nNzaVatWq4urqybds2wsPDCQsLo2/fvhw+fJjIyEjy8/NxcnLCzs5Oq3WL3EHZzvmmTZtISkpi4cKFpKenA7dfJbjsNZWamsratWvLLWYmIuXvXzExMaxbtw4XFxccHR3Jzc1l/fr1NGvWjKCgIFq3bs25c+fo3bs3n3/+OWazmUuXLrFlyxby8/MV3orcgdVqpbi4mLNnz3L+/Hnb5x4eHnTv3h0XFxemT5/Ohg0bsLe3x9XVVf1DkbsovX9FRUUxadIkIiMjWblyJZcvX+bll1/myy+/ZNq0aRQUFJCdnc0///lPatWqpfBWRKSSUU9HHoiyr2anpKSwZMkSli9fjru7O5GRkRQXFxMREUH//v0BqFGjBu7u7jg6OpYLnvRKnMitlV4n0dHRpKWlERAQQHZ2Nhs3bqR79+6MGDECKH8tlv3vlStXEhERQVxcnGqbidyk9DrZvHkza9euJSIigvbt29OwYUOMRiP79+8nODgYFxcXnnzySbp168Zzzz2Hn58f9vb2dOjQAX9/f4VMIndQUlKCxWLBwcGBwMBAkpKS2L17N+3atQOgTp06NGjQwFbv1sXFxbYYoPqHIne2c+dO1q9fT2JiIo8//jgFBQXUrVuXVq1a4eLiwrJly+jQoQNPPPEEDg4OxMXFASqvJSJSmWgUIQ9E6Y1/06ZNbNy4kddff50nn3yS/v37c+7cORITE6lbty7fffcdNWrUYNOmTbi5
uWmBF5FfICMjg88++4y4uDh8fX3ZuHEjEyZMwMfHh5ycHDw8PG4Z3q5YsYLo6GjmzZtHp06dKvIURCqtbdu2kZKSQtOmTfH39wd+mhV45MgRTp06Rb169bBarSQmJuLs7ExgYCDw3xIlejAicnuJiYmkp6fj5OREUFAQXbt2ta2JYLFY8Pf3Jz8/nx9++IHnn3+eixcvsn//fluAKyJ3dvXqVWrXrk2dOnVwdnbG1dUVgK+++oqMjAyWLVvG119/jYuLCz4+PtjZ2aksiYhIJaO/yPLAXLx4kfT0dNLT08utCjx27FiuXLnChAkTMBgMVK9eHaPRSGpqKqAnvyK3c3NNzkuXLmGxWPD19WXr1q1MnTqVyZMn4+npSXx8PMOGDaNu3brl9isNbyMjI+ncuXNFnYpIpXPzvcfBwQGTycTevXvZvn07HTp0wGAw4O3tTadOnejZsydPP/00AGvWrLEdQ4NfkTuLiopi1apVBAQEcOXKFcaMGUNCQgJhYWEsWLCAiRMn4u7uTlFREQaDgYULF5KUlMSWLVsoLi7Ww36Rm9xq3YMff/yRrKwsqlSpAvx3zYMTJ05w8OBBgHLjM7PZrPuXiEglo7/Kct/c3Hl4/PHHGT58OAUFBcTHx/OHP/zBNnNi+vTp/Otf/+LKlSuYzWYCAgL05FfkDsrW5MzIyKBJkyY0btyYevXqERsbS1JSEhMnTiQwMJDs7GxWr15N27ZtqVu37i3D265du1bk6YhUKmXvX0VFRdjZ2dGuXTs8PDyIjo4mJSWFKlWq0KZNG0wmE5MnT6Zr164UFBTQs2dP3b9E7tG+ffv49NNPSU5OxsvLi9TUVLZv387QoUOJjY0lMjKSgwcPcuDAAWrUqEG/fv2AnxY1q1OnDlqLWaS8svevM2fOUFRUhLe3N3/5y1/4+OOPGT58OEuWLLEtWOvl5QX8NEPXzc3Ndhy9NSIiUvloZCH3RdlwKS0tjf/85z8YDAZ69+5NWFgYjo6OTJs2DaPRSPv27QFo0aJFuWPoya/IrZXtnB89epSxY8cSGhrKCy+8QLVq1YiPj2fw4MG2V7gdHR1p1KiR7XU5+KmcSVRUFO+++67CW5Eyyt6/EhISOHbsGJmZmXTs2JEePXowbtw45syZQ3JyMlarFT8/P1xdXenSpYvtGLp/idyby5cvU7duXby8vDh27Bjr1q0jNDSUnJwcQkNDiYmJwdfXFx8fHzIyMoiPj+fChQukpaWRnJxsC6FEpPz9a86cOaxfv55r167Rpk0bZs2axeTJk5k5cyYvv/wy06ZN48aNGyQlJeHm5laujygiIpWTwapH1/IrKxsuvffee6xevZrGjRtz+fJlzpw5Q3R0NL///e9JTk5mx44dTJ8+nbZt26pUgsg9KHudLFq0iPT0dHbs2IGnpycTJ07E19eX119/nRo1auDj40OTJk1ITk62LfpSOqMiIyODvLy8cq/Lich/xcTEsGzZMsaPH8/169dZtWoVAJ988gnffvstCQkJWK1W+vXrpzqcIr9QTEwMv/vd76hRowYfffQR0dHRJCQkYDQaGTFiBDt27CA8PByA+Ph4Wrduzb59+4iLi6NRo0YMGjQIb2/vCj4LkcopLi6OlJQUW/mR119/nT59+hAWFsZ3333H9OnTycrKonr16lSvXp0PP/wQk8l0y9ILIiJSeSjAlfvm3LlzxMbG8tJLL/Hss88CMHv2bJYuXcqiRYto0qQJc+bMITU1leTkZJo1a1bBLRZ5eCQkJJCQkEBERARXr17l0KFDnDhxguDgYJo2bUp8fDy7d+/G2dkZNzc35s2bh8lkwmw2YzQa9bBE5A7y8/N54403GDx4MB06dGDPnj2EhIQQGRlJgwYNqF+/PpmZmcyaNYvmzZsTFhZW0U0WeWhs3ryZWbNmERERgb+/P9nZ2Tg7OzNkyBCCg4Pp0KEDX331FStWrOC5556jT58+thntZrNZtaVFbsNqtXLhwgXeeOMNQkJC
eP755zl48CBDhw6luLiYgIAAYmJiMBgMnD17FqPRaCutpbI/IiKVn/5Ky6/OarWyb98+hgwZUm4lboCwsDAKCgqYOHEia9eu5dVXX6V+/fr4+PhUYItFHj4nTpzglVde4YUXXgCgbdu2bNiwgdjYWEaPHs3EiRMJDw/nypUr1KxZE4PBoM65yG2UndleUFCAxWLh5MmTeHh48MUXXzBq1CjCw8Pp2LEjU6dOpUWLFrz00ktMnDhRswBFfoFt27aRkpJC06ZN8ff3B8DDw4MjR45w6tQp6tWrh9VqJTExsVwfsvT+pbqcIrdnMBiws7PDYrFgb2/P2bNnSUpKIiwsDD8/P3r27MmUKVMICgqiYcOGtj5h6fYiIlK56R0J+dUZDAb8/PwIDQ3l2rVrnDt3Dvhp1gRA//79sVqt/Oc//8HT05NBgwZhZ2dn+15Eyiv7ooTVaqW4uJizZ89y/vx52+ceHh50794dFxcXpk+fzoYNG7C3t8fV1RWDwaAZSyJ3UBrepqSksHHjRqpXr06XLl2Ijo5mxIgRTJkyhYEDB+Lo6Eh2djYZGRkANGnSBKPRiMViqcjmizw0HBwcMJlM7N27l+3btwM/XX/e3t506tSJnj170qtXL7Kzs3n33XcBdP8S+QWcnJzo3LkzHh4e7N+/H2dnZ1q3bk2NGjV44oknWL16NampqeWuKZVNEBF5OOivtfzqSoPYkSNHMmTIECIjI/niiy9ssyacnZ2pUqUKJSUl5fbTrAqRn7NYLLZwqaSkhOLiYkwmE4GBgRw4cIDdu3fbtq1Tpw4NGjTAycmJ1NRUdu7caftOJRNE7u7kyZPMnz8fs9mMn58f2dnZtGnTxrbQX1FREQaDAU9Pz3L7afArcmdFRUWYzWbatWvH5MmTadmyJSkpKezbtw8Ak8nEpEmTiImJYciQIXzyySeYTCZKSkp0/xK5RxaLhapVqzJ8+HAaNmzIjh07qFWrFg0bNsTBwYFnnnmG5ORkJk6cWNFNFRGR/4EeZ8uvrvTVHaPRSHh4OGazmTfeeIO//vWv1KlTh+3bt+Pi4kKjRo0quqkilVrZxSQSExNJT0/HycmJoKAgunbtyt69e0lJScFiseDv709+fj4//PADzz//PBcvXmT//v1aXEnkNm61WEtoaChnzpwhNTWVAQMGkJWVxc6dO20LJp06dYr8/HyCgoIqqNUiD5+EhASOHTvG2bNn6dChAz179iQ8PJy5c+eSnJwMQJs2bXBzc7M9LIGfJgRo5q3IvSu9p9nb21NSUkJBQQGnT59mz549JCUlceXKFVq0aIHRaMRsNmvyjIjIQ0aLmMl9U3ZwPHfuXOLj42nevDnPPvssb775pjoPIvcoKiqKVatWERAQwLVr1zh9+jQJCQkALFiwgJ07d+Lu7m6bHbh+/XqSkpLYsmULS5cuxWQyVfAZiFRep0+fpmbNmri6ulJYWMicOXPKXWN79uwhPT2dc+fO4eHhwahRo7C3t9f9S+QexMTEsGzZMsaPH8/169dZtWoVAJ988gnffvstCQkJWK1W+vXrpweOInfxf//3f9SvX/+u25XWdT9z5gyvvvoqTzzxBE5OTiQlJWEymW75AFNERCo/PdaW+6a0LqDRaGTMmDHY29uzaNEigoKCMBqNWK1WDX5F7mLfvn18+umnJCcn4+XlRWpqKtu3b2fo0KHExsYSGRnJwYMHOXDgADVq1KBfv34A5OTkUKdOHfSMTqS8sgPXvXv3EhYWxrPPPkufPn0ICAggODiYHj16sGjRIoYPH07btm1p27ZtuWNoQUCRu8vPz+ebb75h5syZdOjQgT179pCZmUlkZCQZGRk0bNiQ4cOHM2vWLA4cOKAAV+QOFi5cyNdff82YMWPuuvizwWDAYrHw1FNPsXnzZvLz86lVq5YWtBURecjpr7fcV2VD3FGjRlFQUMCUKVMoKiqiZ8+eCnBFbnLzrIjLly9Tt25dvLy8OHbsGOvWrSM0NJScnBxC
Q0OJiYnB19cXHx8fMjIyiI+P58KFC6SlpZGcnIyDg0MFno1I5VL2+vrqq69wc3PjpZdeomrVqoSEhNC7d286derE+PHj2bNnDzk5OdSuXftnNTg1+BW5s4KCAiwWCydPnsTDw4MvvviCUaNGER4eTseOHZk6dSotWrTgpZdeYuLEiXh7e1d0k0Uqtfr165Oenk5SUhKvvvoqTZs2veP2pZNlnJ2dcXZ2tn2usZeIyMNL707IL3KnlbZv953RaKS4uBiA8ePH061bN2bPns3169fvSxtFHlZWq9UWLsXExLBu3TpcXFxwdHQkNzeX9evX06xZM4KCgmjdujXnzp2jd+/efP7555jNZi5dusSWLVvIz88nOTlZA2KRMspeX1FRUYwYMYKQkBAOHjyIj48PaWlpmM1mPvjgA2bPns3Bgwc5deqUFlAS+YVSUlLYuHEj1atXp0uXLkRHRzNixAimTJnCwIEDcXR0JDs7m4yMDACaNGlie+AvIrfWrVs3XnnlFcxmMx9++CHHjx//RfufPHmSa9eu6Z4mIvIQU4Ar96zszKVNmzaRlJTEwoULSU9PB26/CrfVarXV4FyzZg2dO3dm1apVuLi4PJB2izwsSjvVmzdvZu3atTz22GO0b9+e6dOnYzQa2b9/Py1atMDFxYUnn3ySbt26MX36dPz8/KhSpQodOnQgNTWVd955R+GtyE1Kr6/FixeTlpbGkiVLeP/993F0dGTmzJlcvnyZmTNnMnv2bP74xz+SlZVlq9epUiQi9+7kyZPMnz8fs9mMn58f2dnZtGnTxrZAWWm9dk9Pz3L7qSanyM+Vvf9cuXKFoqIi/vnPf/LBBx9w9OjRO+5Xet9LSUlh/PjxXLp06b63V0RE7h/1lOSelXaso6OjmTFjBqdOneLLL7/krbfeYuHChbbtynY0ynYeVq5cyaRJkyguLubJJ598sI0XeUhs27aNlJQUmjZtir+/PwAeHh5kZWVx6tQp6tWrh9VqJTExEYPBQGBgoG21Yfjp1Ti93i1ya2azmcOHDzNmzBh8fX2xWq0cP36c2rVrExUVxc6dO6lXrx5RUVEsX76c999/H0AzlkR+gdDQUOrXr09qaipdu3alX79+XL16lUGDBjFlyhSCgoLIy8sjKCioopsqUumV3n9iYmKIjIzk2WefJTAwkO+//54lS5Zw6NChn+1Tdvy1YsUK5syZw9ChQ+9pATQREam8FODKL5KRkcFnn31GXFwcM2bMoH///pw9exYfHx9ycnKA/3Y0bu48zJo1i3nz5tGpU6cKa79IZXPzzD4HBwdMJhN79+5l+/btwE/XlLe3N506daJnz5706tWL7Oxs3n33XdsxFNqK3N3169c5evQoFouFvLw8PvroI1577TXGjRuH2Wxm1qxZfPjhhwA0bdoUo9GI2Wyu2EaLPAROnz5Nbm4uAM7OzjRu3JitW7cCMHToUEaMGEFAQACFhYW0bt2atLQ07O3tdX2J3IPCwkIOHTrEqFGjGDRoEFOnTiU0NBQHBweSkpI4duyYbVuz2Vxu/BUdHc3MmTPp1q1bRTVfRER+JQpw5Y5urkd26dIlLBYLvr6+bN26lalTpzJ58mQ8PT2Jj4/nu+++s+13c+chMjKSzp07P/BzEKmsyl4nRUVFmM1m2rVrx+TJk2nZsiUpKSns27cPAJPJxOTJk4mNjWXIkCGkpaVhMpkoKSnR7ECRe+Ti4kJUVBTe3t4cOHCAgoICWrVqRaNGjXB3d8fR0ZFdu3aVe7CiBV9E7mzv3r0MHDiQt99+m507d+Lo6EhwcDD//ve/WbRoEQBt27YlJCSE6OhoxowZY3tzRNeXyN0VFRVx5swZrly5Yvusbdu29OzZk4yMjHIl7UqvqZUrV9rGX126dKmIZouIyK9MAa7cVtkFX0oXmmjcuDH16tUjNjaW8PBwxo8fz4ABAzAajaxevdr2BLh0v7LhbWntMxEpf30lJCQwceJE+vfvz/z587G3t2fc
uHFUrVqV5ORk9u7dC4CrqytdunShd+/e2NnZYTabNfNW5Bdq2bIlzZs359NPP8VkMvHMM89w48YNCgsLGTx4sK08iereitzdV199hZubGy+99BJNmjQhJCSEKVOm8K9//Yvx48eTmZlJTk7OLa8n3b9Efu7myTNWq5Vq1arRq1cvNm/ebBuTAfj5+eHu7s6RI0fYs2eP7fPly5czc+ZMZs6cqfGXiMgjRAGu3FLZmYFHjx4lLCyMTz/9lMcff5xq1aoRHx/PwIEDCQwMBMDR0ZFGjRrh6upqO8amTZuIiopSeCtyC2Vrmi1evBg/Pz/69u3L1q1befPNN3nqqacYNmwYdnZ2LF++nJ07d/7sGJq5JPLLlV57LVq04Pjx4yxYsICRI0dy9epVunfvDpQvASQitxYVFcWIESMICQnh4MGD+Pj4kJaWhtls5oMPPmD27NkcPHiQU6dO6XoSuQdlH+4vXbqUadOmERcXR25uLn369MHV1ZXFixdz/PhxAPLz86latSovv/wywcHBAPzf//0f//znP4mKitLMWxGRR4zBqikmcpOyA9dFixaRnp7Ojh078PT0ZOLEifj6+vL6669To0YNfHx8aNKkCcnJyeTl5ZGammoLlTIyMsjLy8PPz68iT0ek0srPz+eNN95g8ODBdOjQgT179hASEkJkZCQNGjSgfv36ZGZmMmvWLJo3b05YWFhFN1nkkZGbm8uiRYs4ePAgtWrVYs6cOZhMJsxmsx6OiNzF4sWLSUxM5IMPPsBkMhETE0NOTg7Tpk2jVatWZGVlERcXx4YNG+jYsSOxsbF6MCJyBxaLxRbezp07l+XLl9O4cWPy8vKoUqUKixcv5vTp0yxatIhDhw7h7e1tqzu9evXqcvet8+fPa8FoEZFHkAJcua2EhAQSEhKIiIjg6tWrHDp0iBMnThAcHEzTpk2Jj49n9+7dODs74+bmxrx582yDX6PRqE66yE3KDl4LCgooKSmhS5cufPjhh+Tl5TFy5EjGjh1Lv379mDp1Ki1atOCll14iIyMDb29vW8deRH49hYWFODg4YDAYKCkp0WvdIndhNpsZM2YMbdu2JTAwkMOHDzNy5Ei8vLz48ccfCQ4OJiAgAIBDhw7h4+Oj+5fIPcrKyiI+Pp7AwECaNm3Kt99+y/z587l8+TLx8fFUqVKFrVu3cubMGWrUqMHAgQNtCwJq/CUi8mhTgCu3FR4ejqenJyEhIQDk5OSwYcMGNm/ezOjRo/H396ekpIQrV65Qs2ZNDX5F7lFKSgoODg7079+fqVOnkp2dzTfffMPUqVPp27cvAK+88goNGzZk2rRptv3Kzs4QkV+XZgeK3Jv8/HxefPFFhg0bRteuXZkxYwZ/+MMf8PPzY8KECdy4cYPAwEAGDRpk20cz20XuzGq1sm/fPoYMGWKbGNOiRQsA0tPTmT9/Pnl5ecTFxeHu7l5uX42/RER+G5QECEC5xSWsVivFxcWcPXuW8+fP2z738PCge/fuuLi4MH36dDZs2IC9vT2urq62BV/UeRC5u5MnTzJ//nzMZjN+fn5kZ2fTpk0bW63ooqIiDAYDnp6e5fZTeCty/yi8Fbk3Li4uREVF4e3tzYEDBygoKKBVq1Y0atQId3d3HB0d2bVrV7m+pcJbkZ8re40YDAb8/PwYOXIkly5d4sSJExQVFQHQvHlzQkJCqFWrFoGBgVy6dKnccTT+EhH5bdBfeyk3q6+kpASLxYKDgwOBgYEkJSWxe/du2rVrB0CdOnVo0KCBrd6ti4uL7TU5DX5Ffu5Ws2ZDQ0M5c+YMqampDBgwgKysLHbu3MmgQYPw9vbm1KlT5OfnExQUVEGtFhERub2WLVtiMBgICwvDZDLxzDPPcOPGDQoLCxk8eDC9evUCNLNd5HbK9g/z8vK4ceMGtWvXJjQ0FIAZM2bw2GOP0blzZ0wmE82aNeO1115jx44d1KhRowJbLiIiFUUB7m9c2c5DYmIi6enpODk5ERQURNeu
Xdm7dy8pKSlYLBb8/f3Jz8/nhx9+4Pnnn+fixYvs37/fFuCKyM+VXl+nT5+mZs2auLq64uzsTOPGjdm6dSsDBgxg6NCheHt7k56ezrlz52jdujWjRo2y1TTTzCUREalMSkPZFi1a8I9//IMFCxbw9ddfk5+fT/fu3QGFtyK3Y7Vabf3DuLg4Pv/8cy5fvoybmxuDBg0iNDSU4uJixo8fj8FgoHPnztjb29OqVStatWoFqCyJiMhvkWrgCgBRUVGsWrWKgIAArl27xunTp0lISABgwYIF7Ny5E3d3d9ur3evXrycpKYktW7awdOlSTCZTBZ+BSOVS9uHI3r17CQsL49lnn6VPnz4EBARw5coVevTowSuvvMLw4cNveQzVNBMRkcosNzeXRYsWcfDgQWrVqsWcOXNsC9oqXBK5s6SkJBYvXszbb7/Nn/70J4KCgrhx4wZLlizBw8OD2bNnk5ycTEREBD179tQ1JSLyG6dkQNi3bx+ffvopycnJeHl5kZqayvbt2xk6dCixsbFERkZy8OBBDhw4QI0aNejXrx/w06JmderUQc8ARMorG95+9dVXuLm58dJLL1G1alVCQkLo3bs3nTp1Yvz48ezZs4ecnBxq1679s5lKCm9FRKQyc3V1ZcKECRQWFuLg4KAFbUXuQel6I+np6bz++ut06dKFffv2ce7cOaZPn05WVhYFBQWEhYWRl5fHmjVr6N27d0U3W0REKph6V79BN9fkvHz5MnXr1sXLy4tjx46xbt06QkNDycnJITQ0lJiYGHx9ffHx8SEjI4P4+HguXLhAWloaycnJODg4VODZiFQuZV+Li4qKYuXKlbi6uuLh4cHw4cNJS0tjyZIlfPDBB1y8eBEHBwdOnTqFh4dHBbdcRETkf+Po6AigBW1F7oHBYMDBwQF7e3see+wxtm/fTlhYGGPHjqVHjx5MnjyZH374gcWLF/POO+9osoyIiACgJc1/Y8qGSzExMaxbtw4XFxccHR3Jzc1l/fr1NGvWjKCgIFq3bs25c+fo3bs3n3/+OWazmUuXLrFlyxby8/NJTk7G29u7gs9IpHIpnUW7ePFiW1j7/vvv4+joyMyZM7l8+TIzZ85k9uzZ/PGPfyQrK4tVq1YBqIMuIiIPNdW8Fbl3jz/+OLGxsYSHhzNhwgQGDhwIwJNPPlmuPJ3BYFAfUURENAP3t6a0Y71582bWrl1LREQE7du3p2HDhhiNRvbv309wcDAuLi48+eSTdOvWjeeeew4/Pz/s7e3p0KED/v7+mmEhcgdms5nDhw8zZswYfH19OXz4MMePH8fLy4uoqCiCg4MJCAggKiqKgQMH4uPjA2jgKyIiIvJbMWHCBDIyMiguLrYtFl21alW++eYbGjRoUG5b9RFFREQzcH+Dtm3bRkpKCk2bNsXf3x8ADw8PsrKyOHXqFPXq1cNqtZKYmIjBYCAwMBB7e3tKSkoAsLOzU3grcgfXr1/n6NGjWCwW8vLy+Oijj3jttdcYN24cZrOZWbNm8eGHHwLQtGlTjEYjZrO5YhstIiIiIg+E2WzGYDAwf/58XF1dGTJkCC+//DIDBgwgLy+PqVOnAno7S0RE/ksp3G+A1Wot99TWwcEBk8nE3r172b59Ox06dMBgMODt7U2nTp3o2bMnTz/9NABr1qyxHUOhrci9cXFxISoqCnt7ew4cOEBBQQGtWrWiUaNGuLu785///Iddu3bx6quv2q5NrSwsIiIi8ttgZ2eHxWKhevXqrF27lvXr15OXl4eDgwP9+vWzTZ7R+EtEREoZrHqs90gru2BZUVERdnZ22NnZcfr0aaKjoykuLmbIkCG0adMGgNzcXL755hsKCgro2bMndnZ26jyI/A9KH5yEhYVRUlJCTEwMN27cIDg4mBdffJFevXqV205EREREflvMZvMtH+Jr/CUiIjdTgPsIKxsMJSQkcOzYMTIzM+nYsSM9evTAbDYzZ84cLBYLAwcOxM/P72fHuF2nQkTuTUpKCv/4
xz/o3bs3X3/9Nfn5+Sxfvhw7OzuFtyIiIiKPkLKTZ+7l81Iac4mIyN0owP0NiImJYdmyZYwfP57r16/bVrz/5JNP+Pbbb0lISMBqtdKvXz8CAgIqtrEij5jc3FwWLVrEwYMHqVWrFnPmzMFkMqmjLiIiIvIIKRvSZmZm2tYN8fDwAG7/1lXZz48cOYK7uzuPP/74g2u4iIg8FBTgPuLy8/N54403GDx4MB06dGDPnj2EhIQQGRlJgwYNqF+/PpmZmcyaNYvmzZsTFhZW0U0WeSQVFhbi4OCAwWDQa3EiIiIij6g5c+awY8cOcnNzcXV1xd/fn7FjxwI/D3HL/n9ycjJLly7lgw8+oGHDhhXSdhERqbyUIDxiynYCCgoKsFgsnDx5Eg8PD7744gtGjRpFeHg4HTt2ZOrUqbRo0YKXXnqJiRMn4u3tXcGtF3l0OTo6AloQUERERORRlZKSwpo1a5g9ezYWi4Xz588zffp0fvzxR/7+979jMBhs47Wy47YVK1bw/vvvM336dIW3IiJyS0oRHjGlnYCUlBQcHBzo378/Xbp0ITo6mm+++YapU6fSt29fALKzs6lSpQoATZo0Ae5en0lE/v9RzVsRERGRh19WVhaPPfYY1atXt42hTpw4QZ8+fWjdurVtu9q1azNixAieeuopBg8efMvwNjo6msjISLp27VpRpyMiIpWckrpH1MmTJ5k/fz5msxk/Pz+ys7Np06aNrVNQVFSEwWDA09Oz3H4Kb0VERERERG6vuLiYzZs3s3v3bgDOnDkDwPHjx7lw4YJtO7PZTOvWrenfvz9ffvklhYWFmM1mW3i7cuVKhbciInJPNAP3EXCrWbOhoaGcOXOG1NRUBgwYQFZWFjt37mTQoEF4e3tz6tQp8vPzCQoKqqBWi4iIiIiIPHxMJhPnzp1j9erVrF27lpKSEhITE3nxxRdZunQpu3bton379rYFa11cXCgpKbGV1IKfwtsZM2bw3nvv0aVLl4o6FREReUhouuUjoDS8PX36NLm5uQA4OzvTuHFjtm7dCsDQoUMZMWIEAQEBFBYW0rp1a9LS0rC3t8dsNldY20VERERERB4277zzDi4uLuzatYvnnnsOgDZt2uDl5cWyZcvYsWMHAD/++CPffvst9erVA35aD+H8+fPs2bNH4a2IiNwzg9VqtVZ0I+R/U3bm7d69ewkLC+PZZ5+lT58+BAQEcOXKFXr06MErr7zC8OHDb3mMkpISLagkIiIiIiJyj4qLiykuLuatt96ipKSEU6dOMWLECHr27MmRI0dYvnw527Ztw83NDaPRiMFgYPXq1ZhMJtsxcnNzcXV1rcCzEBGRh4kC3IdU2fD2q6++onr16nz66adUrVqVuLg4evfuTadOnbh27Rp79uxh1KhR1K5dWwsoiYiIiIiI/IreeustvvrqK0aNGkWPHj24du0ap06d4tChQzz22GN069YNe3t7SkpKsLOz05hMRER+MQW4D6Gyq5ZGRUWxcuVKXF1d8fDwYPjw4bi7u7NkyRJOnz7NxYsXcXBwYNKkSbRv376CWy4iIiIiIvJoKPs249SpU/n6668JDg7mT3/6ExcuXMDb29u2rdlsttXEFRER+aUU4D7EFi9eTGJiIh988AEmk4mYmBhycnKYNm0arVq1Iisri7i4ODZs2EDHjh2JjY0tF/6KiIiIiIjI/65sMDt16lS++OILzGYzTz/9NAsXLtTYS0REfhUKcB9SZrOZMWPG0LZtWwIDAzl8+DAjR47Ey8uLH3/8keDgYAICAgA4dOgQPj4+tpILIiIiIiIicnsbNmzg+eefx9nZ+a7blg1x09LSuHTpEq+++qrWGhERkV+NEr2H1PXr1zl69CgWi4W8vDw++ugjXnvtNcaNG4fZbGbWrFl8+OGHADRt2hSj0YjZbK7YRouIiIiIiFRyhw4dYuzYsSxevJjr16/fdXs7OzvbWKt3794MGTLEVvNWRETk16AA9yHl4uJCVFQU3t7e
HDhwgIKCAlq1akWjRo1wd3fH0dGRXbt2UXaCtWouiYiIiIiI3FnTpk2Ji4tj0aJFxMfHk5+ff9d9yoa4pTQDV0REfi0KcB9iLVu2pHnz5nz66aeYTCaeeeYZbty4QWFhIYMHDyYxMRGDwYCqZIiIiIiIiNxd6azZTp06MWPGDOLj41m2bNldQ1yr1WqbMLNhwwb+/e9/3/e2iojIb4cC3IdYaUH8Fi1acPz4cRYsWMDIkSO5evUq3bt3B9CiZSIiIiIiIveodNZsVFQUR44cwcXFhTlz5pCQkMC1a9duuU/ZMdfKlSsZO3Ys58+ff2BtFhGRR5/e6XgEvPDCC2RlZbFz505q1apFfHy87RUelU0QERERERG5d5s2bWL16tUsWLCAvn37kpmZyZQpU7C3t2fw4MHlFjYrG96uWLGC9957j9jYWNq1a1dRzRcRkUeQAtxHgKurKxMmTKCwsBAHBwcMBgMlJSWquSQiIiIiIvILZWVl0aRJE1q2bAlAkyZNqFatGqNHj8ZisfDaa6/h4uLys/A2OjqayMhIunTpUpHNFxGRR5BKKDxCHB0dbTVvFd6KiIiIiIjcmcVi+dlnrq6uXLt2jezsbNs27du3Z/jw4SxatIjFixeTn59frmxCaXjbtWvXB9p+ERH5bVCA+whSzVsREREREZE7s1gsGI0/DYnPnTvH1atXAfD19eXixYusXr2aK1eu2LZxc3Pjd7/7HSdOnLCVUUhJSeGdd95h5syZCm9FROS+0TRNERERERER+U2xWq22YHb27Nls2bKFwsJCXnnlFV599VWmTJnC6NGjuXHjBq1bt+b3v/89W7ZsoU+fPgwaNMg2acbZ2ZlZs2apbIKIiNxXBqvVaq3oRoiIiIiIiIg8CGVr127bto2IiAgmT57Mv/71L77++mv8/Pz429/+xt69e4mPjyczM5PHH38ce3t7Vq5ciclk0oLRIiLyQCnAFRERERERkd+cTZs2sXXrVpo0acLQoUMBWL58OampqTz33HMEBwdjNBr5/vvvKSoq4umnn8ZoNGrBaBEReeB01xEREREREZFHXtmZt/n5+Rw4cICtW7fy+OOP27Z5+eWXAVi9ejV2dnb07duX3/3ud7bvLRaLwlsREXngdOcRERERERGRR1rZBcuysrJwdHTE39+fxx57jMTERNq1a0e7du2An0Jco9FIfHw8tWvXpkGDBrbjlB5DRETkQVIJBREREREREflNmDt3Lrt37+batWtUr16dp556iho1avDZZ5/xzjvv0KZNG9u2W7du5fnnn1etWxERqXB6fCgiIiIiIiKPvKVLl7JixQomT57Mhx9+iI+PD2vXrsXPz49OnTrx1ltvsW/fPtv2nTp1ws7ODrPZXIGtFhERUYArIiIiIiIivwFnz55lwIABtGzZkqNHj7J+/XreffddXFxcePzxx+nZsycjRozgyJEj5fbTDFwREaloCnBFRERERETkkWW1WrFarfznP//B2dmZI0eOEB4ezpgxY+jduzeHDh1i//79+Pv7M2rUKBo3blzRTRYRESlHi5iJiIiIiIjII8tgMADw4osvMnnyZObMmcOsWbPo1asXAEVFRRQVFeHr64uvry8AZrNZM29FRKTS0AxcEREREREReeS1a9eOvn37Ur9+fR577DEAfvzxRw4cOIC7u3u5bRXeiohIZWKwWq3Wim6EiIiIiIiIyP126dIlEhISWL58OXXq1MFqtWIymVi1ahUmkwmr1WqbsSsiIlJZKMAVERERERGR34ySkhJOnDjBsWPHqFatGp06dcLOzo6SkhLs7VVlUEREKh8FuCIiIiIiIvKbppq3IiJSmSnAFREREREREREREamktIiZiIiIiIiIiIiISCWlAFdERERERERERESkklKAKyIiIiIiIiIiIlJJKcAVERERERERERERqaQU4IqIiIiIiIiIiIhUUgpwRURERERERERERCopBbgiIiIiDzmr1VrRTRARERERkftE
Aa6IiIjIA/DXv/4VLy8vBgwYcNttxowZg5eXFxMmTLjn4x44cIDhw4ffdbt58+bh5eV1z8cVEREREZHKwb6iGyAiIiLyW2E0GklPT+f777/H3d293HcFBQXs2LHjFx8zNTWV06dP33W7/v37065du198fBERERERqViagSsiIiLygDRp0gRHR0c2b978s+927NhB1apVefLJJ+/Lz3Z3d6d58+b35dgiIiIiInL/KMAVEREReUCcnJzw9/e/ZYC7adMmunbtir39f1+QslgsLFq0iM6dO+Pj40PXrl1ZunSp7fsJEybwySefkJ2djZeXF2vWrOG7777Dy8uLpKQk/vznP9OsWTNWr159yxIKaWlp/L//9/9o1qwZAQEBzJ49m6KiIgBu3LjB22+/Tfv27fHx8eHPf/4zS5YsuU+/GRERERERuR0FuCIiIiIPULdu3WxlFErl5+eza9cuevToUW7bt99+m9jYWHr16sXChQv585//TGRkJPPnzwdg5MiR+Pv788QTT7By5UoCAgJs+86bN49hw4Yxa9Ys/Pz8ftaOlJQUxo8fzx/+8Afi4uIYPnw4S5cu5Z133gEgMjKSXbt2MX78eJYsWULHjh2ZNWsWq1evvg+/FRERERERuR3VwBURERF5gAICAqhatSqbN29m0KBBAHz22We4ubnRsmVL23aZmZl8/PHH/O1vf7MtUta2bVsMBgPx8fH85S9/oX79+ri6uuLg4GArj1BQUADACy+8QN++fW/ZBovFwvz58+nUqZMtsAW4fv06GzdupLi4mK+++go/Pz+6d+8OQOvWrXFycsLNze3X/pWIiIiIiMgdaAauiIiIyANUpUoVOnToUK6MwsaNG3nhhRcwGAy2z7788kusVisdOnSgpKTE9q9Dhw4UFhZy4MCBO/6cxo0b3/a7zMxMLl26ROfOnct9PmTIENasWYPJZKJ169Z8/PHHDBs2jOTkZLKysggODi43y1dERERERO4/zcAVERERecBeeOEFQkJC+P7773F0dOSLL77gzTffLLfN5cuXAWwzYG92/vz5O/4MJyen235Xeuw7zaadPHky7u7urFu3jr///e/8/e9/x9fXl7fffhtvb+87/mwREREREfn1KMAVERERecDat2+Ps7MzmzdvxsnJibp16+Lj41Num+rVqwPw0Ucf4ezs/LNjeHh4/M8/v/TYubm55T7Py8sjIyMDX19fnJyceOONN3jjjTfIyclhx44dfPDBB4SFhbFx48b/+WeLiIiIiMgvoxIKIiIiIg+Yg4MDnTp1YsuWLXz66ae3nGXbqlUr4KdQ9ZlnnrH9y83NJSYmxjaL1mj85d25p556ipo1a7Jjx45yn69du5bhw4eTn59P165dSUxMBH4KiwcOHEj37t3Jycn5xT9PRERERET+d5qBKyIiIlIBunXrxuuvv47RaGTKlCk/+97Ly4tevXrx1ltvkZ2djY+PD5mZmcydO5e6devSoEED4KfZtBcvXuTzzz+/Y93bsuzs7Bg1ahTTp0/Hzc2NDh06kJmZSWxsLAMHDqRWrVr84Q9/IC4uDpPJhJeXF5mZmXzyySd07dr11/w1iIiIiIjIXSjAFREREakAbdq0oXr16tSuXZuGDRvecpuZM2cSHx/PihUr+P7773Fzc6Nbt268+eab2NnZAdCnTx8+//xzgoODCQ0NpVu3bvf08wcOHIiTkxNLlixh5cqVuLu7M2zYMIYNGwbA9OnTef/990lMTOTChQu4ubnRr18/Ro8e/ev8AkRERERE5J4YrFartaIbISIiIiIiIiIiIiI/pxq4IiIiIiIiIiIiIpWUAlwRERERERERERGRSkoBroiIiIiIiIiIiEglpQBXREREREREREREpJJSgCsiIiIiIiIiIiJSSSnAFREREREREREREamkFOCKiIiIiIiIiIiIVFIKcEVEREREREREREQqKQW4IiIiIiIiIiIiIpWUAlwRERERERERERGRSkoBroiI
iIiIiIiIiEglpQBXREREREREREREpJL6/wDQGDe9RctWtgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 1400x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#let's visualize our performance\n",
    "plot_performance('evaluation/json_results', ['Basic RAG'], colors=['skyblue'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Level 2: Document Summarization for Enhanced Retrieval\n",
    "\n",
    "In this section, we'll implement an improved approach to our retrieval system by incorporating document summaries. Instead of embedding chunks directly from the documents, we'll create a concise summary for each chunk and use this summary along with the original content in our embedding process.\n",
    "\n",
    "This approach aims to capture the essence of each document chunk more effectively, potentially leading to improved retrieval performance.\n",
    "\n",
    "Key steps in this process:\n",
    "1. We load the original document chunks.\n",
    "2. For each chunk, we generate a 2-3 sentence summary using Claude.\n",
    "3. We store both the original content and the summary for each chunk in a new json file: `data/anthropic_summary_indexed_docs.json`\n",
    "\n",
    "This summary-enhanced approach is designed to provide more context during the embedding and retrieval phases, potentially improving the system's ability to understand and match the most relevant documents to user queries."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generating the Summaries and Storing Them"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from anthropic import Anthropic\n",
    "from tqdm import tqdm\n",
    "\n",
    "def generate_summaries(input_file, output_file):\n",
    " \n",
    "    # Load the original documents\n",
    "    with open(input_file, 'r') as f:\n",
    "        docs = json.load(f)\n",
    "\n",
    "    # Prepare the context about the overall knowledge base\n",
    "    knowledge_base_context = \"This is documentation for Anthropic's, a frontier AI lab building Claude, an LLM that excels at a variety of general purpose tasks. These docs contain model details and documentation on Anthropic's APIs.\"\n",
    "\n",
    "    summarized_docs = []\n",
    "\n",
    "    for doc in tqdm(docs, desc=\"Generating summaries\"):\n",
    "        prompt = f\"\"\"\n",
    "        You are tasked with creating a short summary of the following content from Anthropic's documentation. \n",
    "\n",
    "        Context about the knowledge base:\n",
    "        {knowledge_base_context}\n",
    "\n",
    "        Content to summarize:\n",
    "        Heading: {doc['chunk_heading']}\n",
    "        {doc['text']}\n",
    "\n",
    "        Please provide a brief summary of the above content in 2-3 sentences. The summary should capture the key points and be concise. We will be using it as a key part of our search pipeline when answering user queries about this content. \n",
    "\n",
    "        Avoid using any preamble whatsoever in your response. Statements such as 'here is the summary' or 'the summary is as follows' are prohibited. You should get straight into the summary itself and be concise. Every word matters.\n",
    "        \"\"\"\n",
    "\n",
    "        response = client.messages.create(\n",
    "            model=\"claude-3-haiku-20240307\",\n",
    "            max_tokens=150,\n",
    "            messages=[\n",
    "                {\"role\": \"user\", \"content\": prompt}\n",
    "            ],\n",
    "            temperature=0\n",
    "        )\n",
    "\n",
    "        summary = response.content[0].text.strip()\n",
    "\n",
    "        summarized_doc = {\n",
    "            \"chunk_link\": doc[\"chunk_link\"],\n",
    "            \"chunk_heading\": doc[\"chunk_heading\"],\n",
    "            \"text\": doc[\"text\"],\n",
    "            \"summary\": summary\n",
    "        }\n",
    "        summarized_docs.append(summarized_doc)\n",
    "\n",
    "    # Save the summarized documents to a new JSON file\n",
    "    with open(output_file, 'w') as f:\n",
    "        json.dump(summarized_docs, f, indent=2)\n",
    "\n",
    "    print(f\"Summaries generated and saved to {output_file}\")\n",
    "\n",
    "# generate_summaries('data/anthropic_docs.json', 'data/anthropic_summary_indexed_docs.json')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary-Indexed Vector Database Creation\n",
    "\n",
    "Here, we're creating a new vector database that incorporates our summary-enhanced document chunks. This approach combines the original text, the chunk heading, and the newly generated summary into a single text for embedding.\n",
    "\n",
    "Key features of this process:\n",
    "1. We create embeddings for the combined text (heading + summary + original content) using the Voyage AI API.\n",
    "2. The embeddings and full metadata (including summaries) are stored in our vector database.\n",
    "3. We implement caching mechanisms to improve efficiency in repeated queries.\n",
    "4. The database is saved to disk for persistence and quick loading in future sessions.\n",
    "\n",
    "This summary-indexed approach aims to create more informative embeddings, potentially leading to more accurate and contextually relevant document retrieval."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "import pickle\n",
    "import json\n",
    "import voyageai\n",
    "\n",
    "class SummaryIndexedVectorDB:\n",
    "    def __init__(self, name, api_key=None):\n",
    "        if api_key is None:\n",
    "            api_key = os.getenv(\"VOYAGE_API_KEY\")\n",
    "        self.client = voyageai.Client(api_key=api_key)\n",
    "        self.name = name\n",
    "        self.embeddings = []\n",
    "        self.metadata = []\n",
    "        self.query_cache = {}\n",
    "        self.db_path = f\"./data/{name}/summary_indexed_vector_db.pkl\"\n",
    "\n",
    "    def load_data(self, data_file):\n",
    "        # Check if the vector database is already loaded\n",
    "        if self.embeddings and self.metadata:\n",
    "            print(\"Vector database is already loaded. Skipping data loading.\")\n",
    "            return\n",
    "        # Check if vector_db.pkl exists\n",
    "        if os.path.exists(self.db_path):\n",
    "            print(\"Loading vector database from disk.\")\n",
    "            self.load_db()\n",
    "            return\n",
    "\n",
    "        with open(data_file, 'r') as f:\n",
    "            data = json.load(f)\n",
    "\n",
    "        texts = [f\"{item['chunk_heading']}\\n\\n{item['text']}\\n\\n{item['summary']}\" for item in data]  # Embed Chunk Heading + Text + Summary Together\n",
    "        # Embed more than 128 documents with a for loop\n",
    "        batch_size = 128\n",
    "        result = [\n",
    "            self.client.embed(\n",
    "                texts[i : i + batch_size],\n",
    "                model=\"voyage-2\"\n",
    "            ).embeddings\n",
    "            for i in range(0, len(texts), batch_size)\n",
    "        ]\n",
    "\n",
    "        # Flatten the embeddings\n",
    "        self.embeddings = [embedding for batch in result for embedding in batch]\n",
    "        self.metadata = data  # Store the entire item as metadata\n",
    "        self.save_db()\n",
    "        # Save the vector database to disk\n",
    "        print(\"Vector database loaded and saved.\")\n",
    "\n",
    "    def search(self, query, k=3, similarity_threshold=0.75):\n",
    "        query_embedding = None\n",
    "        if query in self.query_cache:\n",
    "            query_embedding = self.query_cache[query]\n",
    "        else:\n",
    "            query_embedding = self.client.embed([query], model=\"voyage-2\").embeddings[0]\n",
    "            self.query_cache[query] = query_embedding\n",
    "\n",
    "        if not self.embeddings:\n",
    "            raise ValueError(\"No data loaded in the vector database.\")\n",
    "\n",
    "        similarities = np.dot(self.embeddings, query_embedding)\n",
    "        top_indices = np.argsort(similarities)[::-1]\n",
    "        top_examples = []\n",
    "        \n",
    "        for idx in top_indices:\n",
    "            if similarities[idx] >= similarity_threshold:\n",
    "                example = {\n",
    "                    \"metadata\": self.metadata[idx],\n",
    "                    \"similarity\": similarities[idx],\n",
    "                }\n",
    "                top_examples.append(example)\n",
    "                \n",
    "                if len(top_examples) >= k:\n",
    "                    break\n",
    "        self.save_db()\n",
    "        return top_examples\n",
    "    \n",
    "    def save_db(self):\n",
    "        data = {\n",
    "            \"embeddings\": self.embeddings,\n",
    "            \"metadata\": self.metadata,\n",
    "            \"query_cache\": json.dumps(self.query_cache),\n",
    "        }\n",
    "\n",
    "        # Ensure the directory exists\n",
    "        os.makedirs(os.path.dirname(self.db_path), exist_ok=True)\n",
    "        \n",
    "        with open(self.db_path, \"wb\") as file:\n",
    "            pickle.dump(data, file)\n",
    "\n",
    "    def load_db(self):\n",
    "        if not os.path.exists(self.db_path):\n",
    "            raise ValueError(\"Vector database file not found. Use load_data to create a new database.\")\n",
    "        \n",
    "        with open(self.db_path, \"rb\") as file:\n",
    "            data = pickle.load(file)\n",
    "        \n",
    "        self.embeddings = data[\"embeddings\"]\n",
    "        self.metadata = data[\"metadata\"]\n",
    "        self.query_cache = json.loads(data[\"query_cache\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Enhanced Retrieval Using Summary-Indexed Embeddings\n",
    "\n",
    "In this section, we implement the retrieval process using our new summary-indexed vector database. This approach leverages the enhanced embeddings we created, which incorporate document summaries along with the original content.\n",
    "\n",
    "Key aspects of this updated retrieval process:\n",
    "1. We search the vector database using the query embedding, retrieving the top k most similar documents.\n",
    "2. For each retrieved document, we include the chunk heading, summary, and full text in the context provided to the LLM.\n",
    "3. This enriched context is then used to generate an answer to the user's query.\n",
    "\n",
    "By including summaries in both the embedding and retrieval phases, we aim to provide the LLM with a more comprehensive and focused context. This could potentially lead to more accurate and relevant answers, as the LLM has access to both a concise overview (the summary) and the detailed information (the full text) for each relevant document chunk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "def retrieve_level_two(query, db):\n",
    "    results = db.search(query, k=3)\n",
    "    context = \"\"\n",
    "    for result in results:\n",
    "        chunk = result['metadata']\n",
    "        context += f\"\\n <document> \\n {chunk['chunk_heading']}\\n\\nText\\n {chunk['text']} \\n\\nSummary: \\n {chunk['summary']} \\n </document> \\n\" #show model all 3 items\n",
    "    return results, context\n",
    "\n",
    "def answer_query_level_two(query, db):\n",
    "    documents, context = retrieve_base(query, db)\n",
    "    prompt = f\"\"\"\n",
    "    You have been tasked with helping us to answer the following query: \n",
    "    <query>\n",
    "    {query}\n",
    "    </query>\n",
    "    You have access to the following documents which are meant to provide context as you answer the query:\n",
    "    <documents>\n",
    "    {context}\n",
    "    </documents>\n",
    "    Please remain faithful to the underlying context, and only deviate from it if you are 100% sure that you know the answer already. \n",
    "    Answer the question now, and avoid providing preamble such as 'Here is the answer', etc\n",
    "    \"\"\"\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-haiku-20240307\",\n",
    "        max_tokens=2500,\n",
    "        messages=[\n",
    "            {\"role\": \"user\", \"content\": prompt}\n",
    "        ],\n",
    "        temperature=0\n",
    "    )\n",
    "    return response.content[0].text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading vector database from disk.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  12%|\u2588\u258f        | 12/100 [00:00<00:05, 16.06it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 10/100 items. Current Avg Precision: 0.5000, Avg Recall: 0.8000, Avg MRR: 0.8500\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  22%|\u2588\u2588\u258f       | 22/100 [00:01<00:04, 15.74it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 20/100 items. Current Avg Precision: 0.4000, Avg Recall: 0.6750, Avg MRR: 0.6667\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  32%|\u2588\u2588\u2588\u258f      | 32/100 [00:01<00:04, 16.51it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 30/100 items. Current Avg Precision: 0.4333, Avg Recall: 0.7000, Avg MRR: 0.7222\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  42%|\u2588\u2588\u2588\u2588\u258f     | 42/100 [00:02<00:03, 17.05it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 40/100 items. Current Avg Precision: 0.4667, Avg Recall: 0.7125, Avg MRR: 0.7667\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  52%|\u2588\u2588\u2588\u2588\u2588\u258f    | 52/100 [00:03<00:02, 16.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 50/100 items. Current Avg Precision: 0.4600, Avg Recall: 0.7200, Avg MRR: 0.7700\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  62%|\u2588\u2588\u2588\u2588\u2588\u2588\u258f   | 62/100 [00:03<00:02, 17.23it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 60/100 items. Current Avg Precision: 0.4611, Avg Recall: 0.7361, Avg MRR: 0.8000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  72%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f  | 72/100 [00:04<00:01, 17.01it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 70/100 items. Current Avg Precision: 0.4429, Avg Recall: 0.7060, Avg MRR: 0.7595\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  82%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f | 82/100 [00:05<00:01, 15.70it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 80/100 items. Current Avg Precision: 0.4583, Avg Recall: 0.7302, Avg MRR: 0.7896\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  92%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f| 92/100 [00:05<00:00, 15.71it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 90/100 items. Current Avg Precision: 0.4593, Avg Recall: 0.7287, Avg MRR: 0.7889\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [00:06<00:00, 16.18it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processed 100/100 items. Current Avg Precision: 0.4533, Avg Recall: 0.7142, Avg MRR: 0.7733\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   1%|          | 1/100 [00:04<07:26,  4.51s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key information from the Correct Answer about creating multiple test cases - specifically that you need to click the 'Add Test Case' button and fill in values for variables in your prompt for each new test case. While the Generated Answer provides additional details about running the evaluation suite and updating prompts, these extra details don't contradict the core information, they just provide supplementary context. The fundamental process described for creating multiple test cases is the same in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   2%|\u258f         | 2/100 [00:10<08:26,  5.17s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key points from the Correct Answer:\n",
      "1. It correctly identifies Voyage AI as Anthropic's recommended embeddings provider\n",
      "2. It mentions that Voyage AI offers customized/domain-specific models\n",
      "3. It notes that Voyage AI provides bespoke fine-tuned models for individual customers\n",
      "4. It mentions specific domains like finance and healthcare\n",
      "\n",
      "While the Generated Answer provides more specific details about Voyage AI's model offerings that aren't mentioned in the Correct Answer, this additional information doesn't contradict anything in the Correct Answer - it simply provides more detail. The core substance of both answers is the same: Voyage AI is recommended and offers both domain-specific and customized models.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   3%|\u258e         | 3/100 [00:15<08:43,  5.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful details. Both answers mention the same key success metrics: accuracy, F1 score, consistency, structure, speed, and bias/fairness. Both answers also address the relationship between model choice and latency. While the Generated Answer provides more specific details about model options (mentioning claude-3-haiku and Sonnet specifically), this additional detail doesn't contradict the Correct Answer but rather expands upon it. The core message about choosing the right model to reduce latency while meeting performance requirements is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   4%|\u258d         | 4/100 [00:19<07:45,  4.84s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is partially correct but misses a key element from the correct answer. While both answers correctly identify parallel evaluation/testing as one advantage, the second point differs significantly. The correct answer specifically mentions Claude for Sheets' excellence at office tasks like survey analysis and online data processing, while the generated answer instead talks about an integrated workflow and centralized environment. This represents a substantial difference in the functionality being described. Since one of the two key advantages is missing from the generated answer, it cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   5%|\u258c         | 5/100 [00:24<07:29,  4.73s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that missing the \"\\n\\nHuman:\" and \"\\n\\nAssistant:\" turns in the prompt will result in an API error. The Generated Answer actually provides slightly more context by explaining that these turns are expected to indicate the start of human input and assistant response, but this additional detail doesn't change the fundamental correctness of the answer. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   6%|\u258c         | 6/100 [00:30<08:16,  5.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer:\n",
      "\n",
      "1. It correctly states that tool use requests are priced the same way as regular API requests\n",
      "2. It accurately lists all the additional token sources that contribute to the total cost:\n",
      "   - Tools parameter\n",
      "   - Tool use content blocks\n",
      "   - Tool result content blocks\n",
      "   - Special system prompt\n",
      "3. It explains that these additional tokens are added to the normal input/output tokens to calculate the total cost\n",
      "\n",
      "The Generated Answer actually provides slightly more detail than the Correct Answer, but doesn't contradict it in any way. The core message that tool use requests follow the same pricing structure but include additional tokens that affect the total cost is preserved in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   7%|\u258b         | 7/100 [00:34<07:15,  4.69s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer - specifically the release date (June 27th, 2024) and what features will be available (API usage, billing details, and rate limits). While the Correct Answer provides slightly more detail by mentioning the specific tabs (Usage, Cost, and Rate Limits), this is a minor detail that doesn't change the core meaning. Both answers convey the same fundamental information about what will be available and when.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   8%|\u258a         | 8/100 [00:39<07:21,  4.80s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses a critical element from the Correct Answer. While both answers discuss latency implications of CoT, the Generated Answer fails to mention one of the key decision factors - whether the task requires in-depth thinking that a human would need to work through. The Generated Answer focuses heavily on performance and latency considerations, essentially repeating the same point twice, but doesn't address the fundamental question of whether the task's complexity actually warrants using CoT in the first place. This is a significant omission since it's one of the two key factors mentioned in the Correct Answer for determining when CoT is appropriate.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   9%|\u2589         | 9/100 [00:43<07:10,  4.73s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that Claude can be used to summarize PDF documents, making it easier to understand long documents without reading everything. While the Generated Answer provides additional details about text analysis capabilities and mentions the Claude Cookbooks, these are supplementary details that don't contradict the core message. The essential functionality - uploading PDFs and getting summaries to more easily digest long documents - is accurately captured in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  10%|\u2588         | 10/100 [00:47<06:44,  4.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers indicate that you can view the API rate limits in a \"Rate Limits\" tab within Anthropic's console interface. While the Correct Answer specifically mentions \"Developer Console\" and the Generated Answer just says \"Claude Console,\" this is a minor difference in terminology that doesn't change the core substance of the answer. Both answers convey the same essential information - that rate limits can be viewed in a dedicated Rate Limits tab.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 10/100 questions. Current Accuracy: 0.8000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  11%|\u2588         | 11/100 [00:54<07:41,  5.19s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect when compared to the correct answer. While the generated answer provides several reasonable metrics for evaluating a ticket classification system, it misses the specific key metrics mentioned in the correct answer: the 95th percentile response time and average cost per classification. The generated answer discusses cost and speed in more general terms, but doesn't mention these specific metrics that were identified in the correct answer. While the additional metrics suggested in the generated answer (like robustness, explainability, adaptability, etc.) might be useful, they don't align with the specific metrics outlined in the correct answer. Since the generated answer is missing these critical pieces of information from the correct answer, it should be marked as incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  12%|\u2588\u258f        | 12/100 [00:59<07:39,  5.22s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately describes both methods of specifying system prompts:\n",
      "\n",
      "1. For Text Completions API: Both answers indicate that the system prompt goes before the first \"\\n\\nHuman:\" turn in the prompt text.\n",
      "\n",
      "2. For Messages API: Both answers specify that the system prompt is provided using the \"system\" parameter in the API request.\n",
      "\n",
      "The Generated Answer actually provides helpful concrete code examples to illustrate these concepts, which goes beyond but doesn't contradict the Correct Answer. The substance and core information about how to specify system prompts in both APIs is consistent between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 9, column 2\n",
      "Evaluating End-to-End:  13%|\u2588\u258e        | 13/100 [01:07<08:35,  5.92s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>\n",
      "The generated answer, while detailed and structured, misses a key element from the correct answer. The correct answer specifically mentions using tags like <thinking> and <answer> in combination with chain of thought reasoning where Claude explains its step-by-step thinking process. While the generated answer does discuss using XML tags and breaking down tasks into steps, it doesn't explicitly mention the core concept of using <thinking> tags to prompt Claude to show its reasoning process.\n",
      "\n",
      "The generated answer focuses more on a general methodology of breaking down tasks and using XML tags for structure, rather than the specific combination of XML tags with chain of thought reasoning that the correct answer describes. The correct answer provides a more focused and specific approach about using tags to explicitly prompt Claude's reasoning process.\n",
      "\n",
      "Additionally, the correct answer provides a specific example of how to prompt Claude (\"Before answering, explain your reasoning step-by-step in <thinking> tags\"), which is a crucial piece of information missing from the generated answer.\n",
      "</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  14%|\u2588\u258d        | 14/100 [01:13<08:36,  6.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect for several reasons:\n",
      "\n",
      "1. While it correctly identifies that accuracy, cost, and response time are measured, it fails to provide the specific values that were given in the correct answer (89.01% accuracy, 1.61 seconds for 95th percentile response time, $0.0004 per request).\n",
      "\n",
      "2. The response time metric is described incorrectly - the correct answer specifically mentions \"95th percentile response time\" while the generated answer refers to \"average time.\"\n",
      "\n",
      "3. The cost metric is described differently - the correct answer specifies \"cost per request routing\" while the generated answer refers to \"total cost.\"\n",
      "\n",
      "4. The generated answer includes placeholder text ([RESULT_ACCURACY], [RESULT_COST], [RESULT_RESPONSE_TIME]) instead of actual values.\n",
      "\n",
      "These differences and omissions make the generated answer incomplete and partially incorrect compared to the correct answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  15%|\u2588\u258c        | 15/100 [01:18<07:52,  5.55s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all three key elements from the Correct Answer:\n",
      "1. Having clear success criteria\n",
      "2. Having ways to empirically test against those criteria\n",
      "3. Having a first draft prompt to improve\n",
      "\n",
      "The Generated Answer even presents these points in the same order as the Correct Answer. While it adds an additional detail about using the prompt generator in the Claude Console, this extra information doesn't contradict the core message and doesn't affect the fundamental correctness of the answer. The substance and main requirements are identical between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  16%|\u2588\u258c        | 16/100 [01:22<07:11,  5.14s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key distinction between how mid-response prompting works in both APIs:\n",
      "1. For the Text Completions API, it mentions that you can pre-fill part of the response in the prompt\n",
      "2. For the Messages API, it explains that you can continue a response by setting the last message to have the assistant role\n",
      "\n",
      "The Generated Answer essentially communicates the same information as the Correct Answer, just with slightly more detailed wording. There are no contradictions or missing critical pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  17%|\u2588\u258b        | 17/100 [01:29<07:51,  5.68s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential point made in the Correct Answer - that Claude's response with a role-based system prompt produces a more detailed, structured, and actionable financial analysis compared to not having a specific role. In fact, the Generated Answer goes into even more specific detail about how the analysis differs, breaking down concrete examples of the improvements (like flagging CAC concerns and providing strategic recommendations). While it provides more granular details than the Correct Answer, it does not contradict anything in the Correct Answer and maintains the same core message about the role-based prompt leading to more insightful and structured analysis. The key comparison point about the quality difference between role-based and non-role-based responses is preserved in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  18%|\u2588\u258a        | 18/100 [01:37<08:42,  6.38s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>\n",
      "The Generated Answer is correct as it captures the key elements from the Correct Answer:\n",
      "\n",
      "1. It mentions important quantitative metrics (F1 score is explicitly mentioned in both answers)\n",
      "2. It explains that specific targets should be determined based on industry benchmarks and prior experiments (mentioned in both answers)\n",
      "\n",
      "While the Generated Answer provides more specific examples and detailed metrics (like response time and percentage of non-toxic outputs) that aren't in the Correct Answer, this additional detail doesn't contradict the core message. It's simply elaborating on the basic framework established in the Correct Answer.\n",
      "\n",
      "The Generated Answer maintains the essential substance of the Correct Answer - that quantitative metrics should be used to evaluate sentiment analysis models, and that targets should be set based on industry standards and prior experience. There are no critical omissions or contradictions between the two answers.\n",
      "</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 9, column 182\n",
      "Evaluating End-to-End:  19%|\u2588\u2589        | 19/100 [01:41<07:41,  5.70s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key elements from the Correct Answer:\n",
      "1. The core concept of combining XML tags with other prompt engineering techniques\n",
      "2. Specifically mentions multishot prompting using <examples> tags\n",
      "3. Mentions chain of thought using <thinking> and <answer> tags\n",
      "4. Notes that this creates \"super-structured, high-performance prompts\"\n",
      "\n",
      "While the wording is slightly different, the substance and meaning are identical. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  20%|\u2588\u2588        | 20/100 [01:48<08:10,  6.13s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential elements from the Correct Answer. Both answers describe using Claude to grade LLM outputs by:\n",
      "\n",
      "1. Providing it with the output to be graded\n",
      "2. Using a detailed rubric as evaluation criteria\n",
      "3. Having the LLM evaluate the output against the rubric to determine correctness\n",
      "\n",
      "While the Generated Answer goes into more implementation details (like specific functions and steps), the core concept matches the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer. The differences are mainly in the level of detail provided, not in the fundamental approach described.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 20/100 questions. Current Accuracy: 0.7000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  21%|\u2588\u2588        | 21/100 [01:53<07:43,  5.87s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it contains all the essential steps and information present in the Correct Answer. Both answers outline the same key process:\n",
      "1. Accessing/subscribing to the model on AWS Marketplace\n",
      "2. Selecting the model and agreeing to terms\n",
      "3. Obtaining the Product ARN for the region\n",
      "4. Creating a JupyterLab space in SageMaker Studio\n",
      "5. Using Voyage's notebook to deploy the model with the ARN\n",
      "\n",
      "The Generated Answer actually provides slightly more detail in its step-by-step breakdown, but the core substance matches the Correct Answer completely. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  22%|\u2588\u2588\u258f       | 22/100 [02:00<07:52,  6.06s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect because it misses several key points from the correct answer and provides different guidance. Specifically:\n",
      "\n",
      "1. The correct answer emphasizes using a SINGLE tool, while the generated answer talks about tools in plural without this key specification.\n",
      "\n",
      "2. The correct answer mentions explicitly setting tool_choice to instruct the model to use the tool, which is completely missing from the generated answer.\n",
      "\n",
      "3. The correct answer emphasizes that tool names/descriptions should be from the model's perspective since it will pass the input to the tool - this important perspective consideration is missing from the generated answer.\n",
      "\n",
      "Instead, the generated answer focuses more on the general process of tool usage and implementation details that weren't part of the core guidance in the correct answer. While some of the implementation details provided might be useful, it misses the specific key points that were identified as critical for getting JSON output using tools.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  23%|\u2588\u2588\u258e       | 23/100 [02:07<08:11,  6.39s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more detailed information than the Correct Answer while maintaining all the key points. The Generated Answer covers all the essential elements mentioned in the Correct Answer:\n",
      "\n",
      "1. Vision capabilities of Claude 3 Haiku\n",
      "2. Faster and more performant nature compared to Claude Instant 1.2\n",
      "3. Higher intelligence/better capabilities\n",
      "4. More up-to-date capabilities (implied through the description of advanced features)\n",
      "\n",
      "The Generated Answer then goes beyond this to provide additional details about context windows, specific pricing, and other technical specifications. While these details aren't in the Correct Answer, they don't contradict it and only serve to provide more context. The core message and key differences between the models are accurately represented in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  24%|\u2588\u2588\u258d       | 24/100 [02:10<07:02,  5.56s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers emphasize the same key point - that using examples helps reduce misinterpretation of instructions when working with Claude. While the Generated Answer adds some additional detail about enforcing uniform structure and style, this doesn't contradict the core message, and the fundamental benefit of reducing misinterpretation is clearly stated in both answers. The Generated Answer effectively captures the essential concept presented in the Correct Answer, just with slightly different wording and additional context.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  25%|\u2588\u2588\u258c       | 25/100 [02:16<06:58,  5.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer, while providing additional details about resource efficiency and other benefits, does not directly address the key advantage mentioned in the Correct Answer - which is the ability to adapt models to new domains by providing domain-specific context in prompts without retraining. While the Generated Answer isn't wrong in what it states, it misses this critical piece of information about domain adaptation through context provision that is central to the Correct Answer. The Generated Answer focuses more on operational benefits (resource efficiency, speed, etc.) rather than the core functional advantage of domain adaptation through contextual prompting that was specified in the Correct Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  26%|\u2588\u2588\u258c       | 26/100 [02:20<06:24,  5.20s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that users can get started quickly by making a copy of Anthropic's provided Claude for Sheets template workbook. While the Generated Answer provides additional details about next steps after copying the template, the fundamental starting point matches the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes the key piece of information about making a copy of the template.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  27%|\u2588\u2588\u258b       | 27/100 [02:25<06:15,  5.15s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential meaning of the Correct Answer. Both answers explain that:\n",
      "\n",
      "1. The \"index\" field identifies which specific content block the text delta applies to\n",
      "2. The field is used to track/update content for specific blocks in the response\n",
      "3. Both imply the relationship between the index and the streaming of text content\n",
      "\n",
      "While they use slightly different wording and structure, the fundamental explanation of how the index field relates to text streaming and content blocks is consistent between both answers. The Generated Answer may be more technical in its explanation about \"cumulative results\" and \"Message content array,\" but it doesn't contradict or miss any critical information from the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  28%|\u2588\u2588\u258a       | 28/100 [02:31<06:27,  5.39s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides additional helpful details beyond the Correct Answer. Both answers agree on the key points:\n",
      "\n",
      "1. Images must be base64-encoded\n",
      "2. The supported formats are JPEG, PNG, GIF, and WebP\n",
      "3. Images are included as part of the message content\n",
      "\n",
      "The Generated Answer provides extra information about file size limits and maximum number of images per request, but this additional information doesn't contradict the Correct Answer - it simply provides more detail. The slight differences in how they describe the technical implementation (e.g., \"image content block\" vs \"content field with type set to image\") are minor variations in wording that describe the same fundamental concept.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  29%|\u2588\u2588\u2589       | 29/100 [02:37<06:28,  5.47s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that TTFT is a specific component of overall latency, measuring specifically the time to generate the first token of a response. The Generated Answer actually provides additional relevant context about factors affecting TTFT and latency, but this extra information doesn't contradict the Correct Answer - it merely elaborates on it. The key relationship between TTFT and latency is accurately captured in both answers, with both emphasizing that TTFT is a component of overall latency that specifically measures the time to first token generation. The Generated Answer maintains the same essential meaning as the Correct Answer, just expressed with slightly different wording.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  30%|\u2588\u2588\u2588       | 30/100 [02:44<06:44,  5.78s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that providing examples of edge cases to Claude can improve its performance in routing support tickets. The Generated Answer actually goes into more detail by breaking down specific types of edge cases (implicit requests, emotional prioritization, intent vs. routing, and issue prioritization) and explaining how each type of example can help improve Claude's performance. While it provides more detail than the Correct Answer, it doesn't contradict it and maintains the same fundamental point about examples improving Claude's ability to handle edge cases in ticket routing. The substance and main message are aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 30/100 questions. Current Accuracy: 0.7333\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  31%|\u2588\u2588\u2588       | 31/100 [02:50<06:51,  5.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the essential elements of the Correct Answer. Both answers describe:\n",
      "\n",
      "1. That Claude determines when a tool is needed and generates a tool use request\n",
      "2. That this results in a stop_reason of \"tool_use\"\n",
      "3. That the user needs to extract the tool input from Claude's request\n",
      "4. That the tool execution happens client-side\n",
      "5. That the results need to be sent back to Claude\n",
      "\n",
      "The Generated Answer actually provides slightly more detail in some areas, but doesn't contradict anything in the Correct Answer. The core workflow and relationship between the stop_reason=\"tool_use\" and the overall tool integration process is accurately represented in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  32%|\u2588\u2588\u2588\u258f      | 32/100 [02:54<06:08,  5.43s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key elements from the Correct Answer:\n",
      "1. It correctly identifies the error event as \"overloaded_error\"\n",
      "2. It specifies that this occurs during periods of high usage\n",
      "3. It correctly states that this corresponds to HTTP 529 error code in non-streaming contexts\n",
      "4. It properly contextualizes this within streaming responses\n",
      "\n",
      "The Generated Answer simply rephrases the same information in a slightly different way, but maintains all the critical substance and technical details. There are no contradictions or missing pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  33%|\u2588\u2588\u2588\u258e      | 33/100 [02:58<05:34,  5.00s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It identifies both types of deltas that can be contained in a content_block_delta event: text_delta and input_json_delta. While the formatting and presentation are slightly different (using a numbered list instead of prose), the substance and key information are exactly the same as the Correct Answer. Both answers convey the same two specific delta types without any omissions or contradictions.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  34%|\u2588\u2588\u2588\u258d      | 34/100 [03:03<05:20,  4.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. According to the Correct Answer, Claude 3.5 Sonnet and tool use became generally available on different dates:\n",
      "- Claude 3.5 Sonnet: June 20th, 2024\n",
      "- Tool use: May 30th, 2024\n",
      "\n",
      "The Generated Answer incorrectly states that both became available on the same date (June 20th, 2024). This is a critical factual error as it misses the key distinction that these were separate releases with different availability dates. The difference in timing between these releases is an important piece of information that is missing from the Generated Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  35%|\u2588\u2588\u2588\u258c      | 35/100 [03:06<04:44,  4.38s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct in substance. While it doesn't include the specific timing (May 2024 for Europe and June 2024 for Canada), it accurately captures the key information about the order of launches - that Anthropic launched Claude.ai and the Claude iOS app in Europe first, followed by Canada. The omission of specific months doesn't change the fundamental accuracy of the sequence of events described.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  36%|\u2588\u2588\u2588\u258c      | 36/100 [03:11<04:56,  4.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the essential elements from the Correct Answer:\n",
      "\n",
      "1. It correctly identifies that \"tool_use\" indicates Claude has decided to use a tool\n",
      "2. It outlines the same key steps that need to be taken:\n",
      "   - Extracting the tool name and input\n",
      "   - Executing the tool code client-side\n",
      "   - Sending back results in a tool_result content block\n",
      "\n",
      "While the wording is slightly different, the substance and technical accuracy are completely aligned with the Correct Answer. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  37%|\u2588\u2588\u2588\u258b      | 37/100 [03:15<04:42,  4.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same essential information as the Correct Answer. Both answers indicate that the anthropic library is used to interact with Claude/Anthropic's AI capabilities. While the Generated Answer provides slightly more detail by explaining what the anthropic library does, the core substance - that the anthropic library is the Python library used in the example - is consistent between both answers. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  38%|\u2588\u2588\u2588\u258a      | 38/100 [03:20<04:48,  4.66s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both main authentication methods described in the Correct Answer:\n",
      "\n",
      "1. Direct provision of AWS credentials (access key, secret key, and optional session token)\n",
      "2. Using default AWS credential providers (including both the ~/.aws/credentials file and environment variables)\n",
      "\n",
      "The Generated Answer conveys the same essential information as the Correct Answer, just with slightly different wording. There are no missing critical pieces of information and no contradictions between the two answers. The substance and meaning are equivalent.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  39%|\u2588\u2588\u2588\u2589      | 39/100 [03:25<04:51,  4.78s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the same two key factors mentioned in the Correct Answer:\n",
      "\n",
      "1. The risk/potential of prompt leaks (protecting sensitive information)\n",
      "2. The impact on model performance due to added complexity\n",
      "\n",
      "While the Generated Answer elaborates more on each factor with additional examples and details, the core substance and trade-off described is identical to the Correct Answer. Both answers emphasize the need to balance protecting against leaks with maintaining model performance. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  40%|\u2588\u2588\u2588\u2588      | 40/100 [03:31<04:57,  4.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Anthropic offers different Claude models with varying capabilities and performance characteristics\n",
      "2. Selecting the right model that matches your specific needs helps optimize for speed and performance\n",
      "3. The choice of model affects the balance of performance and output quality\n",
      "\n",
      "While the Generated Answer provides additional details about model families and the model overview page, these don't contradict the Correct Answer but rather expand upon it. The fundamental point about choosing the appropriate model to reduce latency is preserved in both answers. There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 40/100 questions. Current Accuracy: 0.7750\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  41%|\u2588\u2588\u2588\u2588      | 41/100 [03:36<05:02,  5.13s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer and even provides more detailed implementation examples. Both answers highlight the key points that:\n",
      "\n",
      "1. You use the client.messages.stream() method\n",
      "2. You iterate over the stream.text_stream attribute in a for loop\n",
      "\n",
      "The Generated Answer expands on this with a practical code example and additional context, but the core information matches perfectly with the Correct Answer. There are no contradictions or missing critical pieces between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  42%|\u2588\u2588\u2588\u2588\u258f     | 42/100 [03:42<04:59,  5.16s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both key points from the Correct Answer:\n",
      "\n",
      "1. It explains that you can guide Claude's response by pre-filling part of it in the messages list (though it specifically mentions the \"assistant\" message, which is just a more detailed explanation of the same concept)\n",
      "\n",
      "2. It correctly identifies that the \"max_tokens\" parameter is used to generate short responses by limiting the length of the output\n",
      "\n",
      "The substance and main concepts are the same between both answers, even though the exact wording differs slightly. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  43%|\u2588\u2588\u2588\u2588\u258e     | 43/100 [03:46<04:50,  5.09s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message: that when building an eval set, it's better to have a larger number of test cases with automated grading rather than fewer test cases with high-quality human grading. The Generated Answer expands on this with additional details about automated grading methods, but the fundamental point matches exactly with the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information from the Correct Answer. While the Generated Answer provides more detail, this additional context doesn't change or contradict the main point.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  44%|\u2588\u2588\u2588\u2588\u258d     | 44/100 [03:51<04:28,  4.79s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses a critical required field mentioned in the Correct Answer. According to the Correct Answer, the required fields are \"index\" and \"delta\" (where \"delta\" contains the type and text). The Generated Answer instead lists \"type\" and \"text\" as the required fields, which is not accurate according to the Correct Answer. While the Generated Answer correctly identifies that there needs to be text content and a type specification, it fails to mention the required \"index\" field and misunderstands the structure where \"type\" and \"text\" are nested within a \"delta\" field.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  45%|\u2588\u2588\u2588\u2588\u258c     | 45/100 [03:55<04:21,  4.76s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses a critical piece of information from the Correct Answer. While it correctly mentions the Claude Cookbooks as one interactive way to learn Claude's capabilities, it completely fails to mention the Developer Console and its prompt generator tool, which is the second key interactive learning method specified in the Correct Answer. Instead, it incorrectly references \"Claude for Sheets usage examples\" as the second method, which wasn't mentioned in the Correct Answer at all. The omission of the Developer Console and the inclusion of incorrect information makes this answer incomplete and partially inaccurate.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  46%|\u2588\u2588\u2588\u2588\u258c     | 46/100 [04:00<04:19,  4.81s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. The core concept from the Correct Answer - that breaking tasks into subtasks improves accuracy because each subtask gets Claude's full attention and reduces errors compared to handling everything at once - is fully captured in the Generated Answer's first point about accuracy. While the Generated Answer goes on to provide additional points about clarity and traceability, these are supplementary details that don't contradict the core concept. The essential reasoning about improved accuracy through focused attention on subtasks is present and aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  47%|\u2588\u2588\u2588\u2588\u258b     | 47/100 [04:06<04:28,  5.06s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key distinction mentioned in the Correct Answer - that Messages streaming responses can contain multiple content blocks of varying types, making them more complex than Text Completions streaming. While the Generated Answer provides additional details about the specific implementation differences, its core message aligns with the Correct Answer's main point about the fundamental difference in complexity and structure between the two streaming formats. There are no contradictions between the answers, and the Generated Answer includes all critical information from the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  48%|\u2588\u2588\u2588\u2588\u258a     | 48/100 [04:10<04:12,  4.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is partially incorrect. While it correctly mentions claude.ai (the web Console) as one way to experiment with Claude, it incorrectly lists the Quickstart guide/API call as the second method instead of just the web Console. The Correct Answer specifically states that the two ways are claude.ai and Anthropic's web Console, which are essentially referring to the same interface. The Generated Answer introduces a different method (API calls) that wasn't mentioned in the Correct Answer. This represents a substantive difference in the information provided, not just a difference in wording.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  49%|\u2588\u2588\u2588\u2588\u2589     | 49/100 [04:16<04:20,  5.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that chain prompts help reduce errors and inconsistencies by breaking complex tasks into smaller, more manageable subtasks that Claude can focus on individually. While the Generated Answer provides more detailed explanations and additional benefits (like traceability and debugging), it doesn't contradict the Correct Answer. The fundamental principle - that breaking tasks into smaller pieces helps reduce errors and maintain consistency - is preserved in both answers. The additional details in the Generated Answer simply elaborate on the basic concept without changing its essential meaning.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  50%|\u2588\u2588\u2588\u2588\u2588     | 50/100 [04:21<04:10,  5.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers state that an overloaded_error event corresponds to HTTP status code 529 in a non-streaming context for the Claude API. While the Correct Answer uses slightly more formal language (\"would normally correspond to\"), the core information - the 529 status code - is identical in both answers. The difference in phrasing does not change the fundamental meaning or accuracy of the response.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 50/100 questions. Current Accuracy: 0.7600\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  51%|\u2588\u2588\u2588\u2588\u2588     | 51/100 [04:25<03:53,  4.77s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the exact same two ways to specify the embedding format as mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers indicate that leaving the format unspecified will return embeddings as lists of floating-point numbers\n",
      "2. Both answers state that setting the format to \"base64\" will return the embeddings as Base64 encodings\n",
      "\n",
      "The Generated Answer simply presents the information in a more structured bullet-point format, but conveys the same essential information as the Correct Answer. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  52%|\u2588\u2588\u2588\u2588\u2588\u258f    | 52/100 [04:31<04:08,  5.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same essential information as the Correct Answer. Both answers explain that:\n",
      "\n",
      "1. Tool use content blocks are sent as partial JSON strings in content_block_delta events\n",
      "2. The client needs to accumulate these partial JSON strings\n",
      "3. The complete JSON can be parsed once a content_block_stop event is received\n",
      "4. Parsing can be done using Pydantic or SDK helpers\n",
      "\n",
      "The Generated Answer actually provides additional helpful detail by showing an example of the delta structure, but this doesn't contradict anything in the Correct Answer. The core concepts and process are described accurately and consistently between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  53%|\u2588\u2588\u2588\u2588\u2588\u258e    | 53/100 [04:35<03:45,  4.79s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies and describes both tutorials that Anthropic offers:\n",
      "1. The GitHub tutorial which covers prompt engineering concepts with examples\n",
      "2. The Google Sheets tutorial which is described as a lighter-weight version\n",
      "\n",
      "The Generated Answer captures the key distinctions between the two tutorials and matches the substance of the Correct Answer. While the exact wording differs slightly, the essential information about both tutorials and their differences is preserved. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  54%|\u2588\u2588\u2588\u2588\u2588\u258d    | 54/100 [04:44<04:38,  6.05s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more comprehensive detail than the Correct Answer. It covers all the key points mentioned in the Correct Answer:\n",
      "\n",
      "1. The 200K token context window\n",
      "2. Tool use capabilities for integration with specialized applications\n",
      "3. Multimodal input capabilities\n",
      "4. Enterprise-grade security and data handling for sensitive information\n",
      "\n",
      "The Generated Answer then goes beyond these points to provide additional relevant details about enterprise capabilities, such as HIPAA compliance, SOC II certification, reliability features, and global language support. While it contains more information than the Correct Answer, it doesn't contradict any points and includes all the critical elements specified in the Correct Answer. The additional information simply provides more context and depth to the core capabilities mentioned in the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  55%|\u2588\u2588\u2588\u2588\u2588\u258c    | 55/100 [04:47<03:50,  5.12s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it omits a key region where Claude.ai API and iOS app are available - the United States. While the Generated Answer correctly mentions Canada and Europe, leaving out the United States represents a significant omission of information. The availability in all three regions (United States, Canada, and Europe) is a critical part of the complete and accurate answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  56%|\u2588\u2588\u2588\u2588\u2588\u258c    | 56/100 [04:54<04:06,  5.60s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer and even provides additional helpful detail while maintaining the same core information:\n",
      "\n",
      "1. It correctly identifies the two main approaches (push-based with webhooks and pull-based)\n",
      "2. It accurately describes that push-based is more scalable but has security implications due to requiring a public endpoint\n",
      "3. It correctly states that pull-based is easier to implement but has the drawback of making unnecessary calls to the support ticket system\n",
      "\n",
      "The Generated Answer expands on these points with more detail, but does not contradict or omit any critical information from the Correct Answer. The substance and main distinctions between the two approaches are preserved.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  57%|\u2588\u2588\u2588\u2588\u2588\u258b    | 57/100 [04:58<03:39,  5.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is completely correct. It contains all the key information from the Correct Answer: the release date (May 10th, 2024), what was released (a prompt generator tool), and where it's available (through the Developer Console). The wording is slightly different but conveys exactly the same information and meaning. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  58%|\u2588\u2588\u2588\u2588\u2588\u258a    | 58/100 [05:03<03:34,  5.10s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers identify the Claude 3 Sonnet model as providing the best balance of intelligence and speed for high-throughput tasks like sales forecasting and targeted marketing. While the Generated Answer provides additional details and comparisons with other models, its core conclusion matches exactly with the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information present in the Correct Answer. The additional context provided in the Generated Answer doesn't detract from its correctness.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  59%|\u2588\u2588\u2588\u2588\u2588\u2589    | 59/100 [05:07<03:26,  5.02s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information:\n",
      "\n",
      "1. They both state that you can use either dot product or cosine similarity to calculate the similarity between Voyage embedding vectors\n",
      "2. They both explain that these methods are equivalent because Voyage embeddings are normalized to length 1\n",
      "3. The Generated Answer actually provides slightly more explanation about why this equivalence exists, but this additional detail doesn't change the core correctness\n",
      "\n",
      "While the Generated Answer presents the information in a slightly different order and with different phrasing, the fundamental technical content and meaning is identical to the Correct Answer. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  60%|\u2588\u2588\u2588\u2588\u2588\u2588    | 60/100 [05:14<03:44,  5.61s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer and even expands on them in a complementary way. Both answers emphasize that examples help:\n",
      "1. Reduce misinterpretation of instructions\n",
      "2. Enforce consistent structure and style\n",
      "3. Guide Claude toward desired output/performance\n",
      "\n",
      "The Generated Answer provides additional details and examples, but these don't contradict the core message of the Correct Answer - they simply elaborate on it. The substance of both answers is fundamentally the same, even though they're worded differently. There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 60/100 questions. Current Accuracy: 0.7833\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  61%|\u2588\u2588\u2588\u2588\u2588\u2588    | 61/100 [05:20<03:32,  5.45s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies and describes the two types of content block deltas:\n",
      "\n",
      "1. It correctly explains that input JSON deltas contain partial JSON strings for tool use inputs\n",
      "2. It correctly identifies text deltas as containing text content updates\n",
      "\n",
      "While the wording is slightly different from the Correct Answer, the substance and key information is the same. Both answers convey that there are two types (text and input JSON deltas), and both explain what each type contains. The Generated Answer even provides some additional context about how the JSON deltas can be accumulated, but this extra detail doesn't contradict the core information in the Correct Answer.\n",
      "\n",
      "There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  62%|\u2588\u2588\u2588\u2588\u2588\u2588\u258f   | 62/100 [05:25<03:28,  5.48s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it focuses on different capabilities than those mentioned in the Correct Answer. The Correct Answer specifically highlights question answering and text analysis as key capabilities that enable interactive systems and personalization. In contrast, the Generated Answer discusses text/code generation and tool use capabilities. While these are valid capabilities of Claude, they are not the specific ones identified in the Correct Answer as enabling interactive systems and personalized experiences. Additionally, the Correct Answer emphasizes understanding sentiment and preferences as part of personalization, which is not mentioned in the Generated Answer. The answers are discussing different aspects of Claude's capabilities without substantial overlap in their core points.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  63%|\u2588\u2588\u2588\u2588\u2588\u2588\u258e   | 63/100 [05:30<03:18,  5.38s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key elements from the Correct Answer and presents them in essentially the same order:\n",
      "\n",
      "1. Both answers mention the message_start event coming first\n",
      "2. Both describe the content blocks structure with start, delta, and stop events\n",
      "3. Both mention message_delta events\n",
      "4. Both include the final message_stop event\n",
      "5. Both note that ping events may be dispersed throughout\n",
      "\n",
      "The Generated Answer actually provides slightly more detail by explicitly mentioning that the message_start contains a Message object with empty content, but this additional detail doesn't contradict the Correct Answer. The core sequence and components are identical between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  64%|\u2588\u2588\u2588\u2588\u2588\u2588\u258d   | 64/100 [05:34<02:56,  4.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys the same key information as the Correct Answer - specifically that the Claude API allows up to 20 images per request while the claude.ai interface has a 5 image limit. While the Correct Answer uses slightly different wording (\"per turn\" vs \"per request\"), the substance and numerical limits stated are identical. There are no critical missing pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  65%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 65/100 [05:38<02:46,  4.74s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key substance of the Correct Answer, which is that when Claude's response contains an incomplete tool use block due to hitting the max_tokens limit, you should retry with a higher max_tokens value. The Generated Answer conveys the same essential instruction and solution as the Correct Answer, just with slightly different wording. There are no missing critical pieces of information or contradictions between the two answers. Both answers identify the problem (incomplete tool use block due to max_tokens limit) and provide the same solution (retry with higher max_tokens).</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  66%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 66/100 [05:42<02:32,  4.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it correctly identifies \"develop your test cases\" as one of the steps, it differs significantly on the other required step. The Correct Answer states that reviewing Anthropic's guide to developing test cases is the second step, while the Generated Answer claims \"build a strong input prompt\" is needed. This represents a substantive difference in the steps required, not just a minor wording variation. The Generated Answer is missing the critical component of consulting Anthropic's guide, and instead introduces a different step that isn't mentioned in the Correct Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  67%|\u2588\u2588\u2588\u2588\u2588\u2588\u258b   | 67/100 [05:50<02:57,  5.37s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is partially correct but includes additional information that goes beyond what is specified in the correct answer and may not be accurate according to Anthropic's documentation. While it correctly mentions that you can pre-fill Claude's response using the \"Assistant\" role in messages, it adds several other claims about system prompts, simulating conversations, and max_tokens that aren't mentioned in the correct answer and may not be accurate implementations of the content parameter specifically. The core functionality - using the content parameter with assistant role to pre-fill responses - is present in the generated answer, but it's mixed with other unverified claims. Since we want to be strict about accuracy when dealing with documentation, and the answer includes potential misinformation alongside the correct information, it should be marked as incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  68%|\u2588\u2588\u2588\u2588\u2588\u2588\u258a   | 68/100 [05:55<02:48,  5.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both key advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. It correctly states that prompt engineering preserves general knowledge while fine-tuning risks catastrophic forgetting\n",
      "2. It accurately notes that prompt engineering is more effective at helping models understand and utilize external content/retrieved documents\n",
      "\n",
      "The Generated Answer essentially restates the same two key points from the Correct Answer, just with slightly different wording. There are no missing critical pieces of information and no contradictions between the two answers. The substance and meaning are identical.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  69%|\u2588\u2588\u2588\u2588\u2588\u2588\u2589   | 69/100 [05:59<02:38,  5.12s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. The key difference is that the Generated Answer describes steps for using Anthropic's direct API (obtaining an Anthropic account/API key), while the Correct Answer specifically addresses Bedrock API integration, which requires AWS CLI configuration and a Bedrock SDK. These are fundamentally different authentication and setup processes. The Generated Answer misses the critical AWS-specific components required for Bedrock integration and instead describes a different API access path entirely. This represents a substantive difference in the technical requirements and implementation approach, not just a difference in wording.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  70%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 70/100 [06:05<02:35,  5.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It provides the exact same command structure and functionality as the Correct Answer, including:\n",
      "1. The correct AWS CLI command `aws bedrock list-foundation-models`\n",
      "2. The correct use of the `--region` parameter\n",
      "3. The correct use of `--by-provider anthropic`\n",
      "4. The correct query parameter to get model IDs\n",
      "5. A specific example using `us-west-2` region\n",
      "\n",
      "The Generated Answer conveys the same essential information and instructions as the Correct Answer, just with slightly different wording in the explanatory text. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 70/100 questions. Current Accuracy: 0.7571\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  71%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 71/100 [06:09<02:24,  4.99s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers identify that the `input_type` argument/parameter can be passed to specify whether the input text is a query or document. The Generated Answer actually provides additional detail about how the input_type parameter affects processing, but the core information matches exactly with the Correct Answer. Both answers specifically mention that \"query\" and \"document\" are the possible values. There are no contradictions between the answers, and the Generated Answer includes all critical information present in the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  72%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f  | 72/100 [06:14<02:15,  4.84s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is missing a critical piece of information that is present in the Correct Answer. While it correctly describes the basic difference between tool_use deltas (partial JSON strings for input field) and text deltas (simple text updates), it fails to mention that tool_use deltas may have delays between streaming events as the model emits one complete key-value pair at a time. This timing/delay characteristic is an important distinction mentioned in the Correct Answer that is completely absent from the Generated Answer. Since this represents a meaningful omission of a key technical detail about how the streaming works, the Generated Answer cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  73%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e  | 73/100 [06:19<02:09,  4.81s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It provides the exact same file size limits as the Correct Answer - 5MB for API uploads and 10MB for claude.ai uploads. The Generated Answer simply presents this information in a slightly different format (bullet points) and adds a minor detail about error messages, but the core information about the file size limits matches perfectly with the Correct Answer. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  74%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d  | 74/100 [06:23<02:02,  4.70s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers emphasize the key point of choosing a model that appropriately balances requirements for the specific use case. The Generated Answer actually provides more detail by mentioning Claude 3 Haiku as a specific example, but the core message about selecting a model based on the balance of speed/latency and output quality is present in both answers. There are no contradictions between the two answers, and the Generated Answer captures the essential consideration mentioned in the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  75%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 75/100 [06:28<01:58,  4.75s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the key points from the Correct Answer:\n",
      "1. It correctly identifies voyage-code-2 as the recommended embedding model\n",
      "2. It correctly states that according to Voyage AI, the model offers 17% better performance compared to alternatives\n",
      "\n",
      "The only minor difference is that the Generated Answer doesn't mention that the model achieves state-of-the-art results on general-purpose corpora. However, this is a supplementary detail rather than a critical piece of information about the core recommendation and performance comparison. The essential substance about the model recommendation and its 17% performance improvement is accurately conveyed.\n",
      "\n",
      "Since the Generated Answer maintains the core accuracy of the information without any contradictions, just omitting a non-critical detail, it should be considered correct.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  76%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 76/100 [06:32<01:52,  4.70s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is essentially correct. Both answers highlight that the Claude Cookbooks provide interactive Jupyter notebooks that demonstrate API functionality, specifically mentioning PDF uploads and embeddings. While the Generated Answer splits this into two points and adds some additional context about hands-on learning, the core information matches the Correct Answer. There are no contradictions or missing critical pieces of information between the two answers - they're conveying the same fundamental message about how the Cookbook helps developers learn through interactive notebooks and demonstrations.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  77%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b  | 77/100 [06:38<01:56,  5.08s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the core concept presented in the Correct Answer - that the size of the context window directly impacts how much retrieved information can be utilized in RAG. Both answers emphasize that larger context windows allow for better utilization of retrieved information, leading to improved performance. While the Generated Answer goes into more detail about specific implications and practical considerations (like computational resources and latency), it doesn't contradict the Correct Answer. The fundamental relationship between context window size and RAG effectiveness is consistently presented in both answers. The Generated Answer simply elaborates on the basic principle established in the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  78%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a  | 78/100 [06:44<01:56,  5.29s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the key points from the Correct Answer and even expands on them in a helpful way. Both answers emphasize:\n",
      "\n",
      "1. The tool's ability to identify edge cases where prompts might not perform well\n",
      "2. The capability to rate individual results to assess prompt performance\n",
      "3. The importance of ensuring consistent performance across different inputs\n",
      "4. The ability to review results and spot patterns for making improvements\n",
      "5. The ultimate goal of creating more robust and reliable AI applications\n",
      "\n",
      "The Generated Answer adds some additional context about the beta status and feedback process, but this doesn't contradict anything in the Correct Answer - it just provides extra information. The core substance and main points about how the Evaluation tool helps improve prompts are consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  79%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589  | 79/100 [06:47<01:38,  4.68s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers state that Claude 3 Haiku has the fastest comparative latency. The Generated Answer simply adds a bit more context by specifying \"among the Claude models\" but the core information - that Claude 3 Haiku is the fastest - is identical. There are no contradictions between the answers, and no critical information is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  80%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 80/100 [06:54<01:47,  5.35s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys the same core concept as the Correct Answer - that to have a multi-turn conversation using the Anthropic Messages API, you need to send the full conversation history with each request because the API is stateless. The Generated Answer actually provides more detail through code examples and step-by-step instructions, but the fundamental principle matches the Correct Answer. Both answers emphasize the importance of maintaining and sending the complete conversation history for each API call. There are no contradictions between the answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 80/100 questions. Current Accuracy: 0.7750\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  81%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 81/100 [07:02<01:56,  6.15s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the core message of the Correct Answer. Both answers emphasize that using XML tags to provide a specific role context (like General Counsel) helps Claude catch critical legal issues and risks in contract analysis that might otherwise be missed. While the Generated Answer provides more detail and additional benefits (like improved focus and parseability), it doesn't contradict the Correct Answer and includes the key point about helping to identify critical legal issues that could save the company from significant risks. The essence of both answers is the same - role prompting with XML tags improves Claude's ability to analyze legal contracts by providing important context that leads to better identification of crucial legal issues.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  82%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f | 82/100 [07:07<01:41,  5.65s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is essentially correct. Both answers convey the same core distinction between how the two models handle missing information in tool calls: Claude 3 Opus is more likely to ask for clarification when information is missing, while Claude 3 Sonnet is more likely to try to infer or fill in missing information on its own. While the Generated Answer uses slightly different wording and adds some additional context about the models' general capabilities, the fundamental comparison regarding how they handle missing information matches the Correct Answer. There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  83%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e | 83/100 [07:14<01:45,  6.21s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful detail. Both answers emphasize:\n",
      "\n",
      "1. Implementing retry logic for error handling\n",
      "2. Conducting thorough staging/testing\n",
      "3. Load testing\n",
      "4. Error handling and logging setup\n",
      "5. Gradual rollout process\n",
      "6. Documentation and training\n",
      "7. Monitoring and alerting\n",
      "\n",
      "The Generated Answer expands on these points with more specific implementation details, but the core recommendations align perfectly with the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  84%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d | 84/100 [07:20<01:38,  6.17s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It covers all three key evaluation metrics mentioned in the Correct Answer:\n",
      "1. Accuracy (\"Accuracy: The percentage of correct predictions...\")\n",
      "2. Cost (\"Average Cost per Classification...\")\n",
      "3. Speed (\"95th Percentile Response Time...\")\n",
      "\n",
      "While the Generated Answer provides additional details and context beyond what's in the Correct Answer, it fully encompasses the core evaluation criteria specified in the Correct Answer. The extra information doesn't contradict the Correct Answer, it merely elaborates on it. Since all three essential components (accuracy, cost, and speed) are present and there are no contradictions, the Generated Answer should be considered correct.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  85%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 85/100 [07:25<01:25,  5.68s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers identify the same two recommended methods for learning prompt engineering with Claude:\n",
      "1. The GitHub prompting tutorial\n",
      "2. The Google Sheets prompting tutorial\n",
      "\n",
      "The Generated Answer provides slightly more detail by mentioning that the GitHub tutorial is \"example-filled\" and that the Google Sheets version is a \"lighter weight version,\" but these are just additional descriptive details that don't change the core substance. The fundamental recommendation of these two specific tutorials matches exactly between both answers. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  86%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 86/100 [07:32<01:24,  6.04s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the key substantive differences outlined in the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Pretrained LLMs are initially trained on unlabeled text data\n",
      "2. These base models are not inherently good at following instructions/answering questions\n",
      "3. Claude has undergone additional training/fine-tuning (including RLHF) to make it more capable at various tasks\n",
      "\n",
      "While the Generated Answer includes additional details about interpretability and adaptability that aren't mentioned in the Correct Answer, these additions don't contradict the core message. The Generated Answer maintains the essential contrast between basic pretrained models and Claude's enhanced capabilities through additional training.\n",
      "\n",
      "The substance and main points align between both answers, even though they are worded differently.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  87%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b | 87/100 [07:39<01:22,  6.36s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides a more detailed expansion of the key points mentioned in the Correct Answer. It covers all the main advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. Cost and resource efficiency (points 1 and 2)\n",
      "2. Speed and time efficiency (point 4)\n",
      "3. Less data requirements (point 5)\n",
      "4. Flexibility and rapid iteration (point 6)\n",
      "5. Preservation of general knowledge (point 9)\n",
      "6. Transparency (point 10)\n",
      "\n",
      "The Generated Answer not only includes all the core concepts from the Correct Answer but also provides additional relevant details and examples. There are no contradictions between the two answers, and the Generated Answer doesn't miss any critical information from the Correct Answer. While the Generated Answer is more verbose and detailed, the substance and main points align perfectly with the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  88%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a | 88/100 [07:43<01:07,  5.65s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information - that you need to run the command `gcloud auth application-default login` to authenticate with GCP before accessing Claude models on Vertex AI. The Generated Answer adds a bit more context about why this authentication is needed (to access resources), but this additional detail doesn't change or contradict the core instruction. The substance and critical information is identical between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  89%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589 | 89/100 [07:48<00:58,  5.32s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the core information about the Prompt Generator tool being introduced on May 10th, 2024, and its main purpose of helping users create tailored prompts for specific tasks. While the Correct Answer provides additional context about the Claude iOS app and Claude Team plan, these are supplementary details rather than critical pieces of information about the Prompt Generator capabilities. The Generated Answer accurately conveys the key functionality of the tool - that it helps users create high-quality prompts tailored to specific tasks. There are no contradictions between the two answers, and the essential information about the tool's purpose and functionality is preserved in the Generated Answer, just in a more concise form.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  90%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 90/100 [07:51<00:48,  4.83s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys exactly the same information as the Correct Answer - that both Claude 3.5 Sonnet and the Artifacts feature became available on June 20th, 2024. While the wording is slightly different (omitting \"both\" and having a slightly different sentence structure), the core information and meaning are identical. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 90/100 questions. Current Accuracy: 0.8000\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  91%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 91/100 [07:55<00:39,  4.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information - that to limit Claude's response to a single token, you should use \"max_tokens\" with a value of 1. The Generated Answer uses slightly different wording but communicates the same essential concept. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  92%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f| 92/100 [07:59<00:34,  4.35s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that temperature controls randomness in model generation. The Generated Answer simply provides more detail and elaboration about what higher and lower temperatures do specifically, but the fundamental meaning matches the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information from the Correct Answer. The additional detail in the Generated Answer supplements rather than contradicts the core concept.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  93%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e| 93/100 [08:04<00:31,  4.47s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is partially correct but misses a critical piece of information. While it correctly identifies that API parameters can be specified as additional arguments after the prompt and model (the first way), it completely fails to mention the second key way mentioned in the Correct Answer - the ability to pass in an API key for a specific cell using \"api_key\". Instead, it just describes the basic syntax of the CLAUDE function. Since one of the two main ways to specify API parameters is missing from the Generated Answer, it cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  94%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d| 94/100 [08:08<00:26,  4.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the key points from the Correct Answer:\n",
      "1. Prefilling with { makes Claude skip the preamble\n",
      "2. Results in direct JSON output\n",
      "3. Makes the response more concise\n",
      "4. Makes it easier for programs to parse\n",
      "\n",
      "While the wording is slightly different, the substance and meaning are essentially identical. Both answers convey the same core concept about how prefilling with { affects Claude's output formatting and the benefits of doing so. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  95%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 95/100 [08:13<00:22,  4.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is partially correct but contains extra information that is not verified by the Correct Answer. The first two points about the multimodal cookbook and API reference documentation match the Correct Answer's substance. However, the third point about a developer community is not mentioned in the Correct Answer and appears to be additional unverified information. While this extra information doesn't directly contradict the Correct Answer, we should be conservative in our evaluation when additional claims are made that aren't supported by the reference answer. Therefore, the Generated Answer cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  96%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 96/100 [08:18<00:19,  4.81s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same essential information as the Correct Answer. Both answers indicate that:\n",
      "\n",
      "1. You can specify the API key as a parameter when creating a new Anthropic client\n",
      "2. If no API key is provided, it defaults to using the ANTHROPIC_API_KEY environment variable\n",
      "\n",
      "The Generated Answer actually provides more detail by showing code examples in both Python and TypeScript, but the core information matches the Correct Answer. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  97%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b| 97/100 [08:22<00:13,  4.60s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the same two key benefits mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers mention identifying edge cases where prompts might fail/falter\n",
      "2. Both answers emphasize ensuring consistent performance across different test inputs/cases\n",
      "\n",
      "The Generated Answer breaks these points down in a slightly different format but conveys the same core information about using the Evaluation tool to find problematic cases and ensure reliable performance. There are no contradictions between the answers, and no critical information is missing from the Generated Answer compared to the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  98%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a| 98/100 [08:30<00:10,  5.48s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses a crucial element from the Correct Answer. The Correct Answer emphasizes that the key difference is that the pretrained model is not inherently good at following instructions or answering questions, and that specifically RLHF (reinforcement learning from human feedback) was used to create the helpful and safe Claude assistant. While the Generated Answer mentions \"finetuning and optimization,\" it does not capture this fundamental distinction about the model's base capabilities vs its trained behaviors, nor does it specifically mention RLHF. The Generated Answer instead focuses on more technical and operational aspects like API integration, versioning, and customization, which weren't mentioned in the Correct Answer as key differences. The Generated Answer therefore misses the core conceptual difference between the base model and the final assistant.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  99%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589| 99/100 [08:33<00:04,  4.67s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is exactly identical to the Correct Answer, stating that Anthropic's IPv6 address range is 2607:6bc0::/48. There are no differences in wording or substance, and all critical information is present.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [08:37<00:00,  5.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It identifies the same two methods for specifying the API key as mentioned in the Correct Answer:\n",
      "1. Using the environment variable ANTHROPIC_API_KEY\n",
      "2. Passing the API key directly when initializing the client via the api_key parameter\n",
      "\n",
      "While the Generated Answer is more concise, it captures all the essential information from the Correct Answer. There are no contradictions between the two answers, and no critical information is missing. The differences are merely in phrasing and level of detail, but the core substance is identical.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 100/100 questions. Current Accuracy: 0.7900\n",
      "Detailed results saved to evaluation_results_detailed.csv\n",
      "Average Precision: 0.4533\n",
      "Average Recall: 0.7142\n",
      "Average MRR: 0.7733\n",
      "Average F1: 0.5546\n",
      "End-to-End Accuracy: 0.7900\n",
      "Evaluation complete. Results saved to evaluation_results_level_two.json, evaluation_results_detailed_level_two.csv\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Initialize the SummaryIndexedVectorDB\n",
    "level_two_db = SummaryIndexedVectorDB(\"anthropic_docs_v2\")\n",
    "level_two_db.load_data('data/anthropic_summary_indexed_docs.json')\n",
    "\n",
    "# Run the evaluations\n",
    "avg_precision, avg_recall, avg_mrr, f1, precisions, recalls, mrrs = evaluate_retrieval(retrieve_level_two, eval_data, level_two_db)\n",
    "e2e_accuracy, e2e_results = evaluate_end_to_end(answer_query_level_two, level_two_db, eval_data)\n",
    "\n",
    "# Create a DataFrame\n",
    "df = pd.DataFrame({\n",
    "    'question': [item['question'] for item in eval_data],\n",
    "    'retrieval_precision': precisions,\n",
    "    'retrieval_recall': recalls,\n",
    "    'retrieval_mrr': mrrs,\n",
    "    'e2e_correct': e2e_results\n",
    "})\n",
    "\n",
    "# Save to CSV\n",
    "df.to_csv('evaluation/csvs/evaluation_results_detailed_level_two.csv', index=False)\n",
    "print(\"Detailed results saved to evaluation_results_detailed_level_two.csv\")\n",
    "\n",
    "# Print the results\n",
    "print(f\"Average Precision: {avg_precision:.4f}\")\n",
    "print(f\"Average Recall: {avg_recall:.4f}\")\n",
    "print(f\"Average MRR: {avg_mrr:.4f}\")\n",
    "print(f\"Average F1: {f1:.4f}\")\n",
    "print(f\"End-to-End Accuracy: {e2e_accuracy:.4f}\")\n",
    "\n",
    "# Save the results to a file\n",
    "with open('evaluation/json_results/evaluation_results_level_two.json', 'w') as f:\n",
    "    json.dump({\n",
    "        \"name\": \"Summary Indexing\",\n",
    "        \"average_precision\": avg_precision,\n",
    "        \"average_recall\": avg_recall,\n",
    "        \"average_f1\": f1,\n",
    "        \"average_mrr\": avg_mrr,\n",
    "        \"end_to_end_accuracy\": e2e_accuracy\n",
    "    }, f, indent=2)\n",
    "\n",
    "print(\"Evaluation complete. Results saved to evaluation_results_level_two.json, evaluation_results_detailed_level_two.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluating This Method vs Basic RAG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABW4AAAJOCAYAAAAnP56mAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8pXeV/AAAACXBIWXMAAA9hAAAPYQGoP6dpAADGaElEQVR4nOzdd3yN9///8efJRMyoxB5FgghCasZKrNTeq1ZLaUsXWsQqVaO+tEWpqhqt2SBWtEWpmrVXUZtqM2wSmef3h1/Ox5GEJCTnSB73283tlut9va/rel0n532OPM913pfBaDQaBQAAAAAAAACwGjaWLgAAAAAAAAAAYI7gFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAnpnRaLR0CZkej3HmwO8RAAAAKWVn6QIAID3s3btXPXv2THa9vb298uTJIzc3N3Xt2lVNmjR54v4iIyNVp04d3b9/X35+fvr6669TVIfRaNSePXu0du1aHT9+XP/995+ioqLk7OysSpUqqUmTJvL395etrW2Kz83X11f//PNPonYbGxtly5ZNrq6uqlGjht544w0VL148xftNi1u3bmnChAn6/fffFRERIVdXV23atEl2dry9ZKRHn++2trb6448/5OzsnGz/kJAQNWjQQPHx8XJ1ddXvv//+TMdfv369tmzZounTp6d4mx49emjfvn36/vvvVbt27Wc6/vMQFham1q1bq0+fPurXr5/ZuiNHjmjZsmXat2+fwsLCZG9vL1dXV9WsWVPdu3dX6dKl072+o0eP6tNPP9WSJUvSbXytXLlSI0eOVNu2bTVp0qQn9r169ar8/Pyey/MnNZ722v64RYsWqUaNGs+9jtjYWHl4eEiSTp8+neLt/vjjD3333Xf6/vvvn3tNj3J3d09x34EDB2rQoEHpUkfXrl118ODBVP8e1qxZo48//liSNGvWLDVq1Chd6suK7t27p2bNmql9+/b64IMPLF0OAAB4Cv6yBpCp5ciRQ35+fona79y5o7Nnz2rXrl3atWuXPvjgAw0YMCDZ/WzatEn379+Xo6Ojtm3bppCQELm6uj7x2FevXtUHH3ygo0ePSpJKly4tb29v2dvb68qVK/rll1/0888/69tvv9WcOXNUqFChVJ1b7dq1lT9/ftOy0WhUZGSk/vrrLy1btkzr16/X999/r0qVKqVqv6nx2Wefae3atcqfP78aNmyovHnzEtpaWFxcnH799Vd17tw52T7BwcGKj49/Lsf7888/NXjwYFWtWvW57M9SAgIClDt3bvXu3dusfebMmZo5c6ZsbGzk6ekpT09P3b9/XxcuXNCPP/6o5cuXa/To0U98vJ+HTp06caXmI5J7bX/cSy+9lAHVpMy1a9f0xhtvPPW943lq1KiRsmfP/sQ+qQl5M0pgYKAcHR0VFRWlpUuXEtw+Rzlz5tSQIUM0fPhw1atXT9WqVbN0SQAA4An46xpAppYvXz5NnTo1yXXx8fFasGCBJk+erBkzZqh169bJhqeBgYGSpN69e+ubb77RihUrnniFUkhIiLp166aQkBD5+Pho+PDhKlOmjFmfq1ev6tNPP9Vvv/2md955RytWrEhV6DlgwIAkr2CKjY1VQECA1qxZo5EjR2rt2rUp3mdqHTlyRJI0ffr0dLmqDamTO3du3blzR5s2bXpqcGtvb6+YmJhnPmZaA+DJkycrMjJShQsXfuYanlVwcLC2b9+uWbNmyd7e3tS+e/duzZgxQ4ULF9bChQvNrmA3Go1av369Pv74Y40dO1aVK1dWuXLl0q1GQltzT3ptt1bP68OS1Bg+fLiKFi2a4cd9FleuXNGff/4pHx8fXb9+XTt37tSVK1dUrFgxS5eWabRu3VoLFizQmDFjtGbNGj5wBQDAijHHLYAsy8bGRq+//roq
Vqyo2NhY7dixI8l+ly9f1v79++Xu7q7u3bvLxsZGK1euVFxcXLL7Hjt2rEJCQtS4cWPNnTs3UWgrSUWLFtWsWbPk4eGhEydO6Jdffnku52VnZ6cRI0bI1tZWp0+f1uXLl5/LfpOSEPwVLFgw3Y6BlCtTpoxKlCihffv26ebNm0n2+eeff3T48GHVrVs3g6szV7hwYZUuXfqpVwOmt9jYWE2bNk0lS5ZMdFVfUFCQJOmdd95JNO2IwWBQy5Yt1a1bN8XHx2v58uUZVjOQma1atUpGo1E+Pj7y9/eX0WhkfD1nBoNBffv21d9//63Vq1dbuhwAAPAEBLcAsrwiRYpIejhfa1ICAwNlNBrVrFkz07yWISEh2rp1a5L9L1++rK1btypHjhwaN27cE+evtbW11fvvv6/OnTs/1/AzT548ypMnjyTp+vXrZusuXryoYcOGqV69eqpYsaLq1aungICAJOfN9fX1lbe3t86cOaN27dqpYsWKatiwoXr16iV3d3fTNk2aNJG7u7v27t1r2vbs2bP66KOPVLduXVWsWFE+Pj4aOnSozp49m+g4PXr0kLu7u86cOaOePXvK09NTPj4+2rhxo/bu3St3d3dNnjxZZ8+e1TvvvKPq1auratWq6tWrl44dOyZJ2r9/v3r06CEvLy/Vq1dPw4YN040bNxId68KFCxo9erSaNm2qKlWqqFKlSmrUqJE++eQThYSEmPVdtWqV3N3dtWDBAh05ckR9+/bVK6+8oipVqqhr167JPgfCwsI0efJkNW3aVJUqVVLDhg314Ycf6ty5c4n6RkVF6dtvv1WrVq1UuXJlVatWTT179kx230/j7++v2NhYbd68Ocn1GzZskCS1aNEi2X2ktKZhw4aZ5hs9ePCg3N3d1aNHD0ky/d4mTJigxYsXq3bt2qpcubIp6Ez4ne/atctsn9HR0VqwYIHatWsnLy8v1apVSz179kxyHtW1a9fqtddeU+3atVWpUiU1btxYn3zyif79998UP16bNm3S5cuX1aFDh0TrEsaOwWBIdvtWrVqpVatWcnNzS7TueYyBb7/91uyr7B4eHom+2n7nzh1Nnz5dzZo1k6enp2rUqKH+/ftr//79SdZ89+5dTZs2TY0bN1alSpXUvHlzrVy5MtlzfJqrV6/qvffek7e3t6pWrarevXvrjz/+MOszYMAAubu7JxvAzZ07V+7u7po5c2aa63iaYcOGyd3dXX/99ZeCgoLUvn17ValSRdWrV9egQYP0999/J7ldcHCwunTpoqpVq6pmzZoaPXq0bt++napjz5gxwzS1Q0hIiNzd3eXr62vWJzXPl/SS1te848eP6+2331atWrXk5eWlfv366cyZM6k+fnx8vNasWSODwaCmTZuqZcuWMhgMCgwMVHR0dLLb3bt3TzNnzlSLFi1UpUoV1a1bVwMGDNDhw4cT9U3pa0xyr1HS/x6nIUOGmNqe9ponpe7959H9vv322/Lx8ZGXl5datmyp2bNnKyIiQtLDx97d3V316tVL8qruqKgovfLKK/Ly8jJtIz18386bN6/mzZvHFf0AAFgxglsAWdr9+/d14MABSVLZsmUTrU/4I9LGxkatW7eWJLVt21aStGzZsiT3uW7dOklSw4YNn3iDqAT16tXTuHHjnuscodevXzeFlo9+FX337t1q27atVq9erTx58sjX11d58uTRTz/9pHbt2un48eOJ9hUTE6M333xTd+7cUf369WUwGOTr66uWLVsqR44ckiQ/Pz+1bNnSNJ/k1q1b1a5dOwUFBSlfvnzy8/OTs7Oz1q5dq/bt2+u3335Lsu5Bgwbp/Pnzql+/vuzs7FSxYkXTuhMnTqhjx446ceKEqlevrgIFCmjPnj3q2bOnVq5cqZ49e+rWrVuqU6eOoqKitHr1avXt29fsD9L9+/erbdu2Wr58uZycnFSvXj15eXkpPDxcS5YsUZcuXXTv3r1Ede3Zs0fdu3fXuXPnVL16dZUoUUIHDx7UW2+9
pZ9//tms75kzZ9S2bVvNnz9fsbGxatCggfLly6cNGzaoffv2pqBZehg2vPbaa5o6darCwsJUs2ZNVapUybTvGTNmPPV3/Th/f39JDwPJpAQHB6tIkSKqXLlykutTU5OXl5fpxmLOzs5q2bJlohuN/f7775owYYLKlCmjKlWqqESJErKxSfq/H/fv31ePHj00ceJEXb16VbVq1VK5cuV04MAB9evXTz/++KOp79dff62hQ4fqxIkTqlChgurXr6+4uDgtWbJEHTp0UFhYWIoer4SrzZKaLzVh6oMvvvhCv//+e5KhSKVKlfT555+ra9euZu3PawxUrlxZLVu2NK1v0aKF2fJ///2nDh06aM6cOYqMjFTdunVVtmxZ/f777+rRo0eiQPb27dvq3r27vvnmG0VFRalBgwbKli2bRo4cqQULFqToMXvU/fv31bVrV/3xxx+qXr26PDw8tGfPHr3xxhtmv6+EYHzNmjVJ7mf16tUyGAym19f0NGvWLH300UeKjY1V3bp1lT17dv3yyy/q0qWLrly5Ytb3yy+/1Pvvv6/jx4+ratWqqly5soKCglJ1gzTp4TyyCVd0Z8+eXS1btjS7wjutz5f0kprXvO3bt6tr167asmWLihcvrrp16+rkyZPq2rWrrl27lqrj7tq1S9euXVONGjVUqFAhFSpUSLVq1dKNGzeS/VZKSEiIOnTooBkzZujWrVuqW7euihUrpt9++y1R2Jya15i0Su41Ly3vP/PmzVOvXr3022+/qUSJEqpTp45u3rypL774Qn379lV0dLQqVqyocuXKKSQkRHv27ElUz5YtW3Tnzh35+/ub3rMlycHBQT4+Prp48aLp/0EAAMAKGQEgE9qzZ4/Rzc3N2LBhw0Tr4uLijLdu3TLu2rXL2LlzZ6Obm5uxbdu2xtjY2ER9t2/fbnRzczP27t3b1PbgwQOjt7e30d3d3Xj58uVE2/Tv39/o5uZmXLZs2fM9qf+vYcOGRjc3N+OePXuSXB8REWGqoXv37qb2GzduGKtXr24sX768cePGjWbbLFu2zOjm5mb08/MzRkVFJTpW27ZtTe1xcXGJ1l+8eNHUFhoaaqxSpYrR3d3duGrVKrPjrFy50uju7m708vIy/vfff6b21157zejm5masX7++8ebNm2bHSfhdurm5GT/44ANjdHS00Wg0GqOioozt27c3rZs1a5ZZDd7e3kY3Nzfj0aNHTe0tWrQwurm5JTr/0NBQ07kEBQWZ2gMDA037/+yzz0zHNhqNxokTJxrd3NyM7dq1M7XFxcUZ27RpY3RzczNOmTLF7LH64YcfjG5ubsYWLVqY2j7++GOjm5ub8cMPPzTev3/f1H7hwgVTPTt37jQ+TcJj1KVLF6PRaDQ2bdrUWKFCBeONGzfM+p0/f97o5uZmnDp1qvHKlStGNzc3Y926dc36pLamx4/9eLubm5vx22+/NXuMjMb//c4f3df48eONbm5uxp49exrv3Lljaj9y5IixcuXKxgoVKhjDw8ONUVFRxsqVKxurV69uDAkJMfWLiYkxDhw40Ojm5mb86quvnvq4RUZGGj09PY21atVKcn1ISIixbt26pvOoVauW8cMPPzQuXbrUePbs2WT3+7zHgNFoNNUQExNjtr+E7SZNmmT2/Dx8+LDR29vb6OHhYfz7779N7ePGjTO6ubkZ33rrLeODBw9M7StWrDAd4+OPP37Co/ZQwvPHzc3N2Lx5c2NYWJhp3Y4dO4weHh7GihUrGq9cuWI0Gh/+bmrVqmV0c3NL9Lp55MgRo5ubm7FXr15PPe6TXtufJuG5Xb58eeOGDRtM7Q8ePDB26dLF9Dg+Wpe7u7uxevXqxtOnT5vaL1++bGzQoIHp/FMquTGXlufL0yTUlvD4p1RqX/Pu3btnrFOnjtHd3d24bt06U/v9+/eNr7/+umlfyb1fPe79999P9Dq8fv36RO9njxowYIDpNevR96/Nmzcby5UrZ6xevbrpPFL6GmM0Jv0a
9fjjNHjwYFPb017zUvv+c/ToUWO5cuWMVatWNe7fv9/UHhERYapt/vz5RqPRaFy4cKHRzc3NOHTo0ES19u3b1+jm5mb8888/E61bunSp0c3NzTht2rSkHloAAGAFuOIWQKb2zz//yN3d3exf+fLlVb16dfXu3VuHDh1S/fr19e233yY5pcGqVaskSe3atTO1OTo6qkWLFjIajUledZvwNe3k7hz+2WefaciQIYn+ffbZZ6k6tzlz5phtP3jwYL3++uvy8fHRb7/9pvz582v8+PGm/itXrtStW7fUrVs301WZCTp37qyGDRvqypUr+vXXXxMdq1OnTnJwcJCkZK+WTLB8+XJFRESobdu2ia6e69Chg9q2bav79+9r6dKlibZt2bKl8ubNm+RxDAaDRo4cabp5lIODg5o2bSpJKlSokPr372/qW6BAAXl5eUmSaY7f+/fvq2LFimrfvn2i8y9QoIDp6rerV68mqit//vwaOnSo2Y2rEq64e/Tr1YcOHdLJkydVtmxZDRkyxOwcunfvrurVqytnzpy6ceOGQkJCtHbtWhUoUEDjx483uxKqZMmSGjZsmCTpu+++S1TP0yQ3XcLGjRslSc2bN09yu/SoydbWVt26dTMtJ/f8iY6OVmBgoOzs7DRlyhTlypXLtK5SpUrq3r273NzcdObMGd29e1eRkZHKnj278uXLZ+pnZ2enwYMHa+zYsWrYsOFTazt8+LCioqKSvamYi4uLli5dqvr160t6eCX7+vXrNWbMGL366qtq2LChpk+fnugqufQaA487cuSI9u3bp3LlyiV6flauXFlvv/22YmJitGjRIkkPH+NVq1bJ3t5en376qRwdHU39O3bsmKLHLCmjRo0yXW0vST4+PurSpYvpdyo9/N0kfGvh8atuE656fvR19mmSem1//N/bb7+d5La+vr569dVXTcuOjo6mm/k9Op6XL18uo9God955x2wqjGLFimn48OEprvVpnuX58jR+fn5PfIy8vb2T3C6lr3mbN29WWFiYGjVqZDb9So4cOTRp0iSz7Z/m9u3b2rx5s3LlyqUmTZqY2hs3bqy8efPqzz//TDRtRMK0RXnz5tWECRNM71MJ5/7qq6+qePHiunjxYqpeY55FUq95aXn/Wb58ueLj4zVgwABVq1bN1J49e3YNGzZMxYsXN32zoFWrVnJwcNCvv/5qNh1CWFiYdu7cqZIlSyb5u0547Xt0miMAAGBduIUogEwtR44cpq9AG41G/ffff6Z5H5s3b653331XJUuWTHLbW7duacuWLcqdO7fZH5HSwz+mlyxZolWrVum9994z+2Mx4evUxmTmjNu8eXOS88kWKVJEI0aMSPG5PT7vnq2trZycnFSiRAn5+PioR48eKlCggGl9wh9mNWrUSHJ/devW1W+//aa9e/cmCvaSC7aS8ueff0qSKVR93KuvvqpVq1Zp3759idY96TjFixdPNPVEwnLZsmUTBe8Jf5RHRUVJkpycnDRx4sRE+w0JCdFff/2lU6dOSVKS8yh6eHgkuuu2i4uLaf/x8fGysbExnVODBg2SnBd18eLFpp83bNiguLg4eXp6mgWkCerUqSMbGxsdOHBAcXFxT5wr+XH+/v76+uuv9fPPP6tjx46m9uDgYJUuXVrlypVLMqDev3//c6+pePHiSe7rcceOHVNERIQqV66c5IceQ4cONVt++eWXdf78ebVv314tW7ZUvXr15O7urpIlSyY7ph+X8CFLwjzXSSlSpIjmzp2rS5cuaevWrdqzZ48OHDigu3fv6tq1a5ozZ46CgoK0ePFi013v02sMPC5hTL/yyitJhrx169bVpEmTTMdJeIy9vLySnMalUaNGqf5KvouLS5KvKb6+vlq8eLHZOXbo0EHz589XUFCQBg0aJOnheNu4caNy5syZ6HX2SR59bU/Oo1OtPCqpaUISxnNkZKSpLeH3WK9evUT9GzRoIDs7O8XGxqa45uQ8y/PlaRo1avTEGwAmNzZT
+pr3pMeoQIECqly5crJzLT9u/fr1io6OVtu2bZUtWzZTu4ODg1q2bKnFixdr2bJlGjlypGldwhioXbu22TYJ/u///s/084EDB1L1GpNWSb3mpeX9J+H3/fhcyNLD38+jH7LmzZtXfn5+Cg4O1i+//KI2bdpIejgPeFxcXLJTkBQtWlTSwylXAACAdSK4BZCp5cuXT1OnTjVrO3DggN58801t2LBBbm5uGjBgQJLbrlu3TtHR0cqWLZv69u2baL2NjY1u3Lihn3/+2WzOSRcXF505c0bh4eFJ7vfxm7tcunQpVYFFgkWLFiUbwiYlIaQaOHDgE/sl9Qdcwo3OUiI0NFRS8mFYwh+KSc1B+qTjJLUuIRx90rrHHTx4UCtWrNCJEyd0+fJlPXjwwKx/UoF77ty5E7U9GmokhBgJ51SoUKFkzyNBwtyPW7duTXSzqUdFRkbq9u3bKZovOYGbm5vKlCmjPXv26Pbt28qTJ49Onz6tv//+2xSYZVRNCVePPk1qHjvp4byzgwYN0unTp3X69GlNnTpVBQoUkK+vrzp16pRsaPeohJuP5cyZ86l9S5QooT59+qhPnz6Kj4/XiRMntGnTJi1ZskT//vuvPvzwQ9N8suk1Bh6X8PtavHix2YcCj0sY0wl1JfdtgIS6UiO5c0y42eKjN1wqXbq0vLy8dOjQIe3fv1/e3t767bffdOvWLXXu3DnJ4C05Sb22p1RSj3HChxCPzmP8pMfLwcFBLi4uZnO4Ll++3BRkPqpLly7JXtn66HFS+nz55ZdfkpzvtUmTJoneS4YPH56m32tKX/NS8pxKaXCbcHX2n3/+abrJYYKEOduDgoI0ePBgUxidmteN1L7GpNWTXvNS8/6TUO+j89Q/SYcOHRQcHKw1a9aYgtuEOfoTlh+X8AHn4zcxBQAA1oPgFkCWU61aNU2ePFnvvPOOpk+frmLFiiX51fGEPyLv3LnzxCudli1bZhbcli9fXn/88YcOHjxodrWjpcXFxUl6eNO0JwVVZcqUSdT2tK9sPyq5K40TJAQjj16lnJLjPH71V1p88sknWrJkiWxsbFSuXDn5+/urdOnSqly5snbu3Kk5c+YkuV1yIfDjUnP1XcLjUKZMGZUvXz7F26WUv7+/ZsyYoc2bN6t9+/ZPnSYhvWpK6WOX8PxMKXd3d23cuFE7d+7Ub7/9pt27d+vixYtavny5VqxYoREjRjz1BlIJv6+kbjoWERGhs2fPytbWVh4eHmbrbGxs5OnpKU9PT/n7+6tz5846evSoLly4oFKlSqXbGEhuP56enk+8yjjhd/C030Vaxtij0y2kZJ/t27fXoUOHtHbtWnl7eysoKEhS6qZJeFYpfU4+rd/jV5wfOnTIdHPKR9WuXfuJwW1qny+nT59O8jglSpRI04eASUmvxyg5p06d0okTJyRJ58+f1/nz55Psd+fOHW3YsMF0s7vUvG6k9jUmrftK7jFJ7ftPaq/mrl27tgoXLqy9e/cqJCREN27c0JkzZ+Tj42P6IOVxCc+t5/nYAACA54vgFkCW1KhRI3Xo0EE//fSTxo4dq1deecX0NVBJ+uuvv/TXX3/J1dVV27ZtSzJMCQsLU/369bV//36dPXvWFHi2bt1a3377rTZv3qzhw4cneeWSJbi4uOjixYvq2bOnateuna7HuXDhgv755x+VLVs20fqEu7bnz58/3WpIyr59+7RkyRIVKlRI8+bNSxRQJ3fH8tRIeA49epXho3bv3q3w8HBVr17dNI1F+fLl03zl4JMkBLebNm1S+/btFRwcLA8PD5UqVSrZbdK7pidJOHZyj92FCxd04MABeXp6mq4GtrOzU/369U1z0F67dk2LFi3S999/r+nTp6tLly5JhqMJEq68vHnzZqJ1p0+fVpcuXVS6dGlT6J2UhDu6Hz9+XLdv35aUcWMg4TGrU6eOPvjgg6f2T7gq8tGrRB+VcPVkaiS3TcJ0MI9fLfjqq6/qs88+0+bNm/XRRx/pjz/+
0Msvv6wqVaqk+tjpLeE189q1aypdurTZuvj4+ETfqpg0aZImTZqUpuOk5vkyaNCgJ145n5ESnlNJTf8jpfw5lfBB6ZtvvqnBgwcn2ee7777TlClTtGzZMlNw+7TXjWPHjuncuXOqWrVqql9jEgLYpELNu3fvpui8EqTl/adAgQL6559/9N9//yX5ur1s2TK5uLiYplKwsbFR27ZtNWvWLG3evNn02Ldv3z7ZuhKuZLaW/6cAAIDEuDkZgCzr448/VoECBXTnzp1Ec88l/BH56quvJnsFXIECBeTj4yNJZjeNKVu2rJo0aaI7d+4oICAgyav5HpVwlVF6e+WVVyRJ27dvT3L9lClT1KZNG61YseK5HOfnn39Ocn1wcLAkqXr16s90nNQ6fPiwpIdfJ378j+a4uDjt2bNH0tOvfnuSqlWrSpJ+//33JNdPnz5dQ4YM0Y0bN0yP059//mk2r2aCY8eOqUmTJho0aFCaaipdurTc3Ny0e/du7dmzR5cuXTK7IVNS0lJTSq/MexoPDw85ODjo+PHjSX5tNzAwUAEBAdq9e7d2794tf39/jRo1yqxP4cKFNWzYMOXOnVsRERG6devWE4+ZcJVqUkGOm5ubnJycdO7cOe3evTvZfURFRenatWuyt7c37S+jxkDCcXbs2JHk68yvv/4qf39/jR07VtLDkDl37tw6ceJEkuHttm3bUl3DhQsXkpwvOSGIevwcnZyc1KxZM12/fl1ffPGFoqKiMvRq29RI+IArqVBt7969SY6RJ0lurFjra2ZK1KpVS5KSvKnl3bt3deDAgafuIzo62nQF8aM3OHtcy5YtZWtrq2PHjpneNxNec3fv3p3k/OTz58/Xxx9/rDNnzqTqNUb63/y/SfVNeD9JqbS8/yScW1Lv2efOndOYMWP05ZdfmrW3a9dOBoNBv/76qzZv3qw8efKYbnyWlIRw90kf6AEAAMsiuAWQZeXOnVsff/yxJGnjxo2mm32l9I9ISaZ544KCgsz+iJ8wYYKKFy+uX375Ra+99pqOHTuWaNsrV65o9OjRpquLUjOHaVp07txZOXLk0A8//KANGzaYrdu6dasWLVqkU6dOydPT85mO06lTJ+XIkUOrV6823S0+QWBgoIKCgpQjR45kb5aSXvLlyyfp4R/4j/6uIiMjNWrUKNOd0hNuZpYWNWvWVOnSpfXXX39p5syZZn+EL1myREeOHJGbm5vKly+vYsWKyc/PT//9958CAgJ07949U9/r168rICBAly5dUqFChdIcjvr7+ysmJkaffPKJDAbDE6dJkJSmmhK+Kv9o37RwcnJS27ZtFRMToxEjRpj9jo4dO6YffvhB2bJlU9OmTeXu7q7Lly8rKCgoUTC0bds23blzR4ULFza7OV9SKlWqJDs7Ox07dizRVXVOTk56/fXXJUnvvfee1q9fnygcvXHjhimIb9++vWluy/QYAwmP86NX+tWoUUPly5fXiRMnNGXKFLPg6tKlS/r00091/vx5Uyhjb2+vbt26KS4uTh999JHZ7+znn39O8uv3T2M0GjVs2LBE+woMDFSuXLmSnC4m4QrAH3/8Uba2tmrdunWqj5sRunfvLnt7e82dO9dsntbQ0FCNGzcu1ftL+B1GRESYPZes9TUzJXx9fVW8eHHt2rVLCxYsMLVHR0dr5MiRioiIeOo+tm7dqps3b8rNze2Jc2u7uLioTp06kv73YWnCzTivX7+u8ePHm00v8Ntvv2nTpk3Knz+/6tSpk6rXGOl/NwpctmyZ2djatGlTkkH1k6Tl/ad79+4yGAz6+uuvdfLkSVP7/fv3Tc+/Vq1amR2naNGiqlmzpvbu3auzZ8+qefPmT/zWwaFDhyT9LyQGAADWh6kSAGRpLVu2VGBgoHbv3q1PPvlE69at0+bNm3Xr1i2VLFnyqTc48vPzU+7cuXXnzh2tX7/eFFLkzp1bK1eu1KhRo/TLL7+oQ4cOKlasmEqVKiVHR0dduXLFdBdpR0dHdevW
Te+++266nqurq6smT56sDz/8UB9++KFmzZqll19+Wf/++6+OHz8uSRoxYsQzz2366HGGDRumBQsWqFSpUrpw4YJOnTql7Nmza8qUKcneiCe9+Pv7a+bMmTpz5owaNWqkKlWqKDo6WocOHdLdu3dVtmxZ/f3338neVC4lbGxsNG3aNPXu3VszZszQ+vXr5ebmpsuXL+uvv/6Sk5OTpk+fbuo/fvx4Xbp0SRs2bNDOnTvl6ekpg8Gg/fv3KyIiQlWrVk3RV+CfdM5ffvmlzp8/r6pVq6bopjypralo0aKytbXVmTNn1KtXL7m7u2vEiBFpqvejjz7S8ePHtW3bNvn6+srb21u3b9/W/v37FRcXp8mTJ5vOYejQoZo4caK6d++uKlWqyMXFRSEhITp8+LBsbW01evTopwbeTk5Oql69unbt2qWTJ08m+tDinXfeUXh4uJYuXarBgwdrwoQJ8vDwUM6cORUaGqqjR48qJiZG9evXV0BAgGm79BgDJUqU0JkzZ9SzZ0+VLFlSkydPVo4cOTR9+nT16tVL33//vTZs2CAPDw89ePBA+/fvV0xMjJo2barXXnvNtJ+3335bBw8e1L59+9SoUSO98sorCg8P18GDB003DkuNUqVK6e+//1bjxo3l7e2tsLAwHTp0SPb29poyZUqS00F4e3ubHo8GDRqYTVOTUjdv3tSQIUOe2u+VV15R586dU71/6eFczyNGjNC4cePUs2dPvfLKK3JyctKePXv00ksvydnZ2fRV85RwdnY2vV906dJFxYsX19SpU9P1NXPixImmG3k9qa60jllHR0dNnTpVffv21cSJE7VmzRoVL15cR48e1Y0bN1ShQgWz0DEpCd9wedoHS5LUtm1b/f7779qwYYOGDRumnDlzasKECerevbtWrFihP/74Q56engoNDdWhQ4dkZ2enadOmmR6D1LzGdOrUST/++KMOHTqkJk2aqFKlSrpy5YpOnjyptm3bJgrZnyQt7z9eXl5677339MUXX6hjx47y9vaWk5OTDh8+rOvXr6tOnTrq06dPomN16NDBdNXw065mT7iZXsJ0CwAAwPoQ3ALI8saMGaNWrVrp4sWL+vbbb03BxdOutpUe/tHq7++v5cuXa9myZWZXl+XNm1czZszQ0aNHtXbtWu3fv1/Hjh3TvXv3lC9fPtWpU0e1a9dWu3bt0v1q2wRNmjRRYGCg5s2bpz179mjbtm3Knz+/GjZsqD59+qhGjRrP7Tg//fSTvv32W+3du1fnzp1TgQIF1KFDB73++uuJ5ovMCDlz5tSKFSv05Zdfas+ePdq+fbucnJxUoUIFdenSRTVr1lTt2rX1xx9/KCYmRvb29mk6Trly5bR69WrNmTNHv//+u7Zu3apcuXKpRYsWGjhwoNlXUvPnz68VK1Zo4cKFCg4O1p9//ikHBweVKlVKrVu3VufOnZUtW7Y0n3OpUqVUvnx5/fXXXykKRdJSU/78+TVhwgTNnDlTBw4c0LVr19IcAuXMmVM//vijFixYoA0bNmjbtm2ys7NT9erV1bdvX9PUJJLUu3dvubi4aOnSpTp16pSOHTumfPny6dVXX1Xfvn0T3VAsOR07dtSuXbv0888/JwpuDQaDxo4dq1atWmn16tXav3+/jhw5osjISOXNm1d169ZV69at1axZs0T7fd5jYMKECRo7dqz+/vtvhYaG6sqVK3J3d1epUqW0Zs0azZs3T1u2bNHOnTvl5OSkihUrqlOnTmrVqpXZDaIcHR313XffacGCBVq9erW2b98uFxcXDRkyRBUrVlTv3r1TVVfBggU1a9YsTZo0SX/88YdsbGzUsGFDDRo06Im/g6pVq+rChQtpniYhIiIiRVcI29nZpTm4laRu3bqpZMmSmjt3ro4fPy6DwaD69etr2LBh6tq1a6r2ZWNjo6lTp2ry5Mk6efKkrly5otu3bytPnjzp9pq5efPmp/YpUqRImsesJFWuXFkrVqzQrFmzTDcJ9PDw0LRp
07RixYonBrchISHauXOnpJQFt49+WBoUFKTu3burYMGCCgwM1Ny5c7V582Zt3bpV2bNnV8OGDfX222+rUqVKpu1T8xpTuHBhLVu2TF9++aX27t2r7du3q2zZspo+fbrc3d1TFdym9f3nrbfeUoUKFbRw4UIdO3ZMkZGRKlq0qF577TX17ds3yamcqlWrJunhdC9P+gbNvXv3tGfPHpUpU8a0DQAAsD4G47NM5gcAAPACi4+PV8uWLXXz5k1t27btiV8rxvMRHR2tevXqydbWVtu2bUvzhyQAEluwYIEmTpyokSNHqkePHsn2++GHHzR+/HhNnz79qfOfAwAAy2GOWwAAkGXZ2Nho4MCBun79ujZu3GjpcjKt+Ph4RUdHKzY2VlOnTtXNmzfVpUsXQlvgOXjw4IEk6cyZM/r222+VM2fOJ86JHB8fryVLlsjNzS3JbwwAAADrwVQJAAAgS/P399eaNWv01Vdfyd/f33QTKTw/sbGx8vLyksFgUExMjFxdXVM9LQOApH399ddasGCB6eZmH330kXLmzJls/5UrV+rixYtasmRJktMtAAAA62E179TR0dFq0aKF9u7dm2yfkydPqmPHjqpcubLat29vupkOAADAs/jss88UFRWl+fPnW7qUTMnBwUHlypWTwWCQl5eX5s2bp1y5clm6LCBTKF++vGxtbeXs7Kx33nlHr7/+erJ97927py+//FIDBgxQlSpVMq5IAACQJlYxx21UVJQGDx6sX3/9VYsWLUry5jgRERFq0qSJWrZsqQ4dOmjp0qUKDg7Wr7/+qhw5cligagAAAAAAAABIHxa/4vbs2bPq1KmTLl++/MR+GzdulKOjoz766COVLl1aAQEBcnJy0qZNmzKoUgAAAAAAAADIGBYPbvft26caNWpo+fLlT+x35MgRVatWTQaDQZJkMBhUtWpVHT58OAOqBAAAAAAAAICMY/Gbk3Xr1i1F/cLCwlSmTBmztvz58+vvv/9O8bHi4+MVGxsrGxsbUwAMAAAAAACQ2RmNRsXHx8vOzo6bEwIvCIsHtykVGRkpBwcHszYHBwdFR0eneB+xsbE6duzY8y4NAAAAAADgheDp6ZkoXwFgnV6Y4NbR0TFRSBsdHa1s2bKleB8JnyhVqFBBtra2z7U+vHji4uJ08uRJng9AKjF2gLRh7ABpw9gB0oaxg8clPCe42hZ4cbwwwa2rq6vCw8PN2sLDw+Xi4pLifSRMj+Dg4MAbFxQXFyeJ5wOQWowdIG0YO0DaMHaAtGHs4HEJzwmmjgReHC/MxyyVK1fWoUOHZDQaJT2cm+XgwYOqXLmyhSsDAAAAAAAAgOfLqoPbsLAwPXjwQJLUrFkz3blzRxMmTNDZs2c1YcIERUZGyt/f38JVAgAAAAAAAMDzZdXBrY+PjzZu3ChJypkzp7755hsdOHBA7dq105EjRzR37lzlyJHDwlUCAAAAAAAAwPNlVXPcnj59+onLlSpV0urVqzOyJAAAAAAAALzg4uPjE930HrAEBweHFN8k0KqCWwAAAAAAAOB5io6O1oULFxQfH2/pUgDZ2NioVKlScnBweGpfglsAAAAAAABkSkajUf/++69sbW1VrFixFF/pCKSH+Ph4Xbt2Tf/++6+KFy8ug8HwxP4EtwAAAAAAAMiUYmNjFRERocKFC3OfJFiFAgUK6Nq1a4qNjZW9vf0T+/IxAwAAAAAAADKluLg4SUrR19KBjJDwXEx4bj4JwS0AAAAAAAAytad9JR3IKKl5LhLcAgAAAAAAAICVIbgFAAAAAAAAXmDu7u5yd3fXtWvXEq1bunSp3N3dNWPGjBTt6/r16woODjbb9969e59brb6+vlq1atVz219mRnALAAAAAAAAvODs7e21devWRO2bN29O1dfzp06dqu3btz/P0pBGBLcAAAAAAADAC87b2ztRcHvv3j0dOnRIFSpUSPF+jEbj8y4NaURwCwAAAAAAALzg/Pz8tG/fPt27d8/Utm3bNnl7
e8vJycms77Jly+Tr6ysvLy/16NFDp0+fliTNmDFDq1ev1urVq+Xr62vqv3//frVs2VKenp567bXX9M8//5jWnTt3Tm+88YaqVq2qunXraubMmYqPjzc7VoMGDVS1alV9/fXXZnWcOnVKXbp0UeXKlU3b4n8IbgEAAAAAAIAXnJubm1xdXfX777+b2n799Vc1atTIrN/WrVs1c+ZMjRo1SqtXr1a1atXUs2dP3b59W6+//rr8/f3l7++vn376ybTNypUrNXLkSP3000+6ffu2pk6dKkm6ceOGunXrJhcXF61cuVJjxozRDz/8oEWLFkmSduzYoQkTJuj999/X8uXLdezYMbPQ96OPPlL58uW1fv16TZgwQfPmzWOahkcQ3AIAAAAAAACZgJ+fn2m6hOjoaO3cuVN+fn5mfebNm6f+/furYcOGKlmypN5//30VKVJEa9eulZOTk7Jly6Zs2bLJ2dnZtM1bb72lGjVqyN3dXR06dNCpU6ckSevXr1f27Nk1fvx4lS5dWo0aNdJ7772nefPmSXoY+LZs2VJt2rRR2bJl9dlnn8nR0dG033/++Ud58+ZVkSJFVK9ePX3//fepmtYhsyO4BQAAAAAAADIBPz8/7dixQ7Gxsdq9e7fc3NyUP39+sz7nzp3T559/Li8vL9O/U6dO6eLFi8nut3jx4qafc+XKpaioKNO+PDw8ZGdnZ1rv5eWlsLAw3blzR+fOnVP58uVN6/Lly6dixYqZlvv376/Zs2fLx8dHI0aMUHR0tAoUKPCsD0OmYff0LgAAAAAAAACsXbVq1SRJBw4c0ObNm9W4ceNEfeLi4jRixAjVqlXLrD1nzpzJ7tfGJulrPx+9ejZBwvy2cXFxkhLf7Mze3t7085tvvil/f39t3rxZW7duVa9evTR+/Hh17Ngx2VqyEq64BQAAAAAAADIBOzs71a9fX1u3btVvv/2WaH5bSSpVqpT+++8/lShRwvRvzpw5Onz4sCTJYDCk+HilSpXSiRMnFBMTY2o7dOiQnJ2dlTdvXpUtW1bHjh0zrbt3754uXbokSYqKitKnn34qBwcH9enTR4sXL1anTp30888/p/HsMx+CWwAAAAAAACCT8PPz08qVK5U/f36zaQkS9OnTRwsXLtSaNWt0+fJlff755woODlbp0qUlSdmzZ9c///yjkJCQpx6rZcuWio6O1ujRo3Xu3Dlt3rxZM2bMUNeuXWUwGPTaa68pODhYK1as0Llz5zR69Gg9ePBA0sOrdQ8ePKjx48fr/PnzOnbsmPbv388ct49gqgQAAAAAAAAgk/Dx8VFsbGySV9tK0quvvqrw8HB99dVXCg8PV5kyZTR79myVLFlSktS6dWu98847atWqlfbs2fPEY+XMmVPz5s3ThAkT1KZNGzk7O6tXr17q37+/JMnb21sTJ07UF198oRs3bqh9+/Zmc95Onz5d48aNU4cOHWRnZ6dmzZrp7bfffj4PRCZgMD4+0UQmFhcXp8OHD6tKlSqytbW1dDmwMJ4PQNowdoC0YewAacPYAdKGsYPHZdXnxIMHD3ThwgWVKlVK2bJls3Q5QKqek0yVAAAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAACwIu7u7mb/atasqZEjR+r+/fvPvO+9e/fK3d091dtdvXo1UV0eHh7y8fHR+PHjFR0dnWibGTNmyN3dXbt3705yn7Gxsfruu+/UqlUrValSRd7e3urbt68OHDiQ6voyIztLFwAAAAAAAABkpHijUTYGg1Ufb8aMGfLy8lJ8fLz+/fdfjR49WlOmTNEnn3zyTLV4eXnpjz/+SPP2K1euVKFChSRJUVFR2rdvn8aMGaN8+fJp4MCBZn3Xr1+v4sWLa82aNapVq5bZuvj4ePXv319//fWX
Pv74Y1WtWlUREREKCgpS7969tWjRInl5eaW5zsyA4BYAAAAAAABZio3BoLUX7+r6g9h0P1b+bHZqVTJXqrfLkyePChQoIElydXVV//799cknnzxzcOvg4GDab1o4OzubbV+0aFEdPHhQmzdvNgtuT5w4ocuXL2vChAkaP368Ro8eLScnJ9P6pUuX6sCBA1q3bp2KFStmav/oo490+/ZtffPNN5ozZ06a68wMmCoBAAAAAAAAWc71B7EKiYxL93/PKxzOnj272XJISIjeffddvfLKK6pYsaLatm1rNsXAokWL1LBhQ3l6eqpdu3bav3+/pMRTJVy6dElvvPGGvLy81KBBAy1atCjVtTk4OMjW1tasbf369SpXrpyaNm2qmJgY/fLLL2brAwMD1a5dO7PQNsHgwYM1derUVNeR2RDcAgAAAAAAAFbsxo0bWrx4sVq1amVqGzJkiOLi4rRs2TKtWbNGrq6uGjt2rCTp5MmTmjJlisaMGaPg4GB5e3vr/fffV3x8vNl+o6Ki9Prrr8vJyUkrVqzQ6NGjNX36dP32228pqstoNGrv3r1at26dmjZtatYeHByshg0bysnJSbVq1dLq1atN66Ojo3Xy5El5e3snuV9nZ2flzJkzpQ9PpsVUCQAAAAAAAICV6devn2xtbWU0GhUZGam8efOaglmj0ahGjRqpadOmKliwoCSpe/fuevPNNyVJ//zzjwwGgwoXLqyiRYvq/fffV8OGDRMFt3/88Ydu3Lihzz77TDlz5lTZsmU1cuRI2dgkf61nixYtZPj/8/VGR0fL2dlZPXv21BtvvGHqc+DAAf37779q1KiRJKlJkyYaNWqU/vnnHxUpUkS3bt2S0WhUnjx5TNtcuHBB7dq1MzvWoUOH0vjoZQ4EtwAAAAAAAICV+fTTT1W5cmUZjUbdvHlTP/zwg7p27ap169Ypf/786tq1qzZu3KiDBw/qwoULOn78uCmY9fHxkZubm1q2bKkKFSrIz89PHTt2lJ2deRR44cIFlSpVyuzq1vbt2z+xrrlz58rV1VXXrl3TuHHjVK5cOQ0YMMBsqoQNGzaoSJEiqlChgiTJz89Po0ePVlBQkN5++21TYHvnzh3TNkWLFtWaNWskSUeOHNHQoUPT/uBlEkyVAAAAAAAAAFgZV1dXlShRQiVLlpSXl5cmTpyoyMhIBQcHKz4+Xq+//rrmz5+vwoUL64033tCUKVNM22bPnl0rV67UwoULVb16da1atUrt2rVTSEiI2TEeD3JTonDhwipRooRq1aqlb775Rtu2bdPkyZNN6+Pi4rRp0yZdu3ZNFSpUUIUKFeTj46P4+HgFBQVJkhwdHeXu7m52Ra29vb1KlCihEiVKyNXVNdV1ZUYEtwAAAAAAAICVs7GxkdFoVFxcnM6ePas///xTCxYs0IABA9SgQQOFhoZKejiNwqFDh/TNN9+oZs2aGj58uDZt2qSoqCizm5dJUsmSJXXp0iVFRkaa2iZPnqxPP/00RTUVL15cgwYN0g8//KAjR45Iknbv3q0bN27oq6++0po1a0z/hg0bposXL+rgwYOSpM6dO2vVqlX6999/E+338YA5qyK4BQAAAAAAAKzM7du3FRYWprCwMF28eFHjxo1TXFycfH19lTt3btnY2GjDhg36559/tGnTJs2YMUPSw3lns2XLplmzZmnlypW6evWqNmzYoIiICLm7u5sdw8fHRy+99JJGjx6tc+fOacuWLVq2bJl8fHxSXGfPnj1VunRpjRs3TvHx8dqwYYPKli2rJk2ayM3NzfSvW7duyps3r2k6hK5du6pGjRrq0qWLVq9erUuXLunUqVP6/PPPNWLECFWrVu25PZYvKua4BQAAAAAAQJaTP1vGxGJpPc6gQYNMP2fPnl0VK1bUt99+q2LFikmSxo4dq1mzZmnatGkqVaqURo4cqY8//lgnT56Ul5eXJkyYoK+//lrjxo1T4cKF9fnnn6t06dIKDw837dfOzs7Up23btnrp
pZf00UcfqUGDBimu087OTiNHjlTv3r21YsUK/frrrxo4cGCifo6OjmrXrp1++uknBQQEyNHRUTNnztSKFSu0ZMkSjRs3TgaDQeXLl9f48ePVqlWrND1umYnBaDQaLV1ERomLi9Phw4dVpUoVswmTkTXxfADShrEDpA1jB0gbxg6QNowdPC6rPicePHhgugFXtmzZTO3xRqNsDIYMqyOjjwfrldxzMilMlQAAAAAAAIAsJaNDVEJbpAXBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAYEViYmI0Y8YM+fn5qWLFimrQoIEmTpyoe/fuWbq0dDdjxgz16NEjzdv36NFDM2bMeOY6fH19tWrVqmfez7Ows+jRAQAAAAAAgAwWb4yXjSHjrmdM7fGmTp2qXbt26dNPP1WxYsV05coVTZgwQZcuXdKcOXPSsVIk+Omnn5QjRw6L1kBwCwAAAAAAgCzFxmCjTfc36UbcjXQ/lrOts5o5NUvVNqtXr9Znn32mWrVqSZKKFi2qsWPHqnv37goNDZWLi0t6lIpHODs7W7oEpkoAAAAAAABA1nMj7obC4sLS/V9awmGDwaA9e/YoPj7e1Obl5aUNGzYoX758khJ/lX/v3r1yd3eXJF29elXu7u7atm2bfH195eXlpU8//VRnzpxRu3btVKVKFfXv39809cKwYcP0+eef6/3331flypX16quv6uTJk5o+fbq8vb1Vr149BQcHm4514MABde3aVZUrV1aVKlXUr18/hYaGSpJWrVqlLl266J133lG1atU0e/ZsVahQQTdu/O9xOH78uCpXrvzUqR8SzuOXX35Ro0aN5Onpqf79++vWrVumPr/++quaNm2qKlWqaNy4cYqLizPbx7Jly0yPQY8ePXT69GlJ0rlz51SxYkWtWbNGkhQdHa2mTZvqs88+S/T49ujRQ7Nnz9Ybb7yhSpUqqWnTptqxY4fpGDdv3tTAgQPl5eUlPz8/LV261PS7eBYEtwAAAAAAAIAV6dmzpxYvXixfX1+NGTNGP//8sx48eKAyZcrI3t4+xfuZO3euvv76a40fP16LFy/WwIEDNXjwYH333Xc6fPiwfvrpJ1PfhQsXqnr16lq7dq3y5s2rXr166fr161q+fLmpjvj4eN29e1f9+/dXnTp1tH79en333Xe6fPmy5s6da9rXoUOHVKZMGa1YsUKdO3eWq6urfv31V9P64OBg1a9fXzlz5kzRecyZM0fTpk3TDz/8oGPHjun777+XJJ09e1bvv/++unbtqsDAQMXGxurAgQOm7bZu3aqZM2dq1KhRWr16tapVq6aePXvq9u3bKl26tN58801NnTpV9+7d06xZsxQfH68PPvgg2RqaN2+u9evXq1y5cho1apQpWP/www9148YNLV26VKNHj9asWbNS/Dt6EoJbAAAAAAAAwIq88847+vzzz1WwYEGtWLFC7777rurWravAwMBU7eftt99WuXLl1KJFC+XPn1/NmzdXnTp1VK1aNdWqVUvnz5839a1YsaK6deumEiVKqEWLFoqMjNTIkSNVunRp9ejRQ7dv31Z4eLgePHigt99+W++8846KFSumatWqqUmTJvr7779N+zIYDHrrrbdUunRpOTs769VXX9WmTZtM6zdt2qTmzZun+DzeffddVapUSZUrV1bLli117NgxSVJgYKC8vb3Vu3dvlS5dWqNGjTKbRmLevHnq37+/GjZsqJIlS+r9999XkSJFtHbtWknSgAEDlCtXLgUEBOi7777ThAkTlD179iRrqF+/vtq1a6fixYvrrbfe0r///quwsDBduHBBu3bt0uTJk1WuXDnVr19fAwcOTPG5PQlz3AIAAAAAAABWplWrVmrVqpVu3rypP/74Qz/88IMCAgLk7u6uihUrpmgfxYoV
M/2cLVs2FSlSxGw5OjratFy0aFGzdS+99JKyZcsmSXJ0dJT0cDqBokWLqk2bNlqwYIH++usvnT17VqdPn1bVqlVN2+fPn9+0rSS1aNFCCxYs0M2bN3XlyhXdvHlTDRo0SPFjUaJECdPPOXPmVExMjKSH0x2UL1/etM7e3t5s+dy5c/r88881bdo0U1tUVJQuXrwoSXJwcNAnn3yiHj16qH379qpevXqyNZQsWdKsBkmKjY3V6dOnlTdvXrPHukqVKik+tychuAUAAAAAAACsxKlTp7RmzRoNGzZMkpQvXz61bNlSTZs2VZMmTbRnz54kg9vH53aVJFtbW7NlG5vkv3xvZ2ceEybXNyQkRO3bt5eHh4dq166tTp06adu2bTpy5IipT0LQm6B8+fIqXry4Nm/erIsXL8rPzy9Rnyd50vQQRqMx2b5xcXEaMWKE6SZvCR6douHUqVOytbXVoUOHFB0dLQcHhxTXYDQaZWdnl6iG54WpEgAAAAAAAAArERcXp++//14nT540a3dwcFC2bNnk7Ows6WGQeP/+fdP6K1euZEh9v/76q/LkyaNvvvlGvXr1kre3t65cufLU8LJFixb67bfftH379lRNk/AkZcuWNU2bIEnx8fE6deqUablUqVL677//VKJECdO/OXPm6PDhw5Kk//77T1988YUmTZqkmJgYzZkzJ9U1lC5dWrdv3zZ7/I8fP572k3oEwS0AAAAAAABgJTw8PNSgQQO9/fbbWrduna5evarDhw9rzJgxio6OVpMmTSRJnp6e+umnn3TmzBnt3btX8+fPz5D68ubNq2vXrmn37t26cuWK5s6dq19++cVs2oWktGjRQn/88YfCwsJUp06d51JLp06ddPz4cc2ePVvnz5/X5MmTde3aNdP6Pn36aOHChVqzZo0uX76szz//XMHBwSpdurQk6ZNPPpGXl5datWqlESNGaO7cuTp79myqaihVqpR8fHw0YsQInTp1Sjt37tRXX331XM6PqRIAAAAAAACQ5TjbOlvtcb744gvNmTNHM2fO1LVr15QjRw75+Pjohx9+MH3N//3339fw4cPVrl07vfzyy3rvvff0wQcfPO/yE/H399eff/6pd999VwaDQZ6envr44481Y8aMJ4a3JUqUUJkyZVShQoUnTn2QGiVKlNDs2bM1ceJEzZ49W40aNVL9+vVN61999VWFh4frq6++Unh4uMqUKaPZs2erZMmS+vnnn7Vjxw6tW7dOkuTr66s6depo1KhRWrJkSarqmDhxokaNGqVOnTrJ1dVV7dq107x58575/AzG9JqEwQrFxcXp8OHDqlKlSqI5PpD18HwA0oaxA6QNYwdIG8YOkDaMHTwuqz4nHjx4oAsXLqhUqVJmN8uKN8bLxpBxX0TP6ONZo/j4eDVs2FCTJ09WzZo1LV3OcxMZGaldu3apXr16pkA6ODhYn3/+ubZu3Zqof3LPyaRwxS0AAAAAAACylIwOUbN6aLtt2zb98ccfypYtm6pXr27pcp4rR0dHjRgxQl27dlX79u0VHh6uWbNmqWnTps+8b4JbAAAAAAAAAOnmu+++04ULF/TFF1/IxiZzhdg2NjaaNWuWpkyZou+//145c+ZUq1atnsu0FQS3AAAAAAAAANLN4sWLLV1CuvL29taKFSue+34zV8QNAAAAAAAAAJkAwS0AAAAAAAAAWBmCWwAAAAAAAGRqRqPR0iUAklL3XGSOWwAAAAAAAGRK9vb2MhgMCgsLU4ECBWQwGCxdErIwo9GosLAwGQwG2dvbP7U/wS0AAAAAAAAyJVtbWxUtWlRXr17VxYsXLV0OIIPBoKJFi8rW1vapfQluAQAAAAAAkGnlzJlTZcuWVUxMjKVLAWRvb5+i0FYiuAUAAAAAAOkgODhYM2fOVExMjFq1aqWBAwea1oWEhOjNN980Ld+/f18hISHau3evIiIiNHLkSF29elVOTk4aNmyYvLy8LHEKyERsbW1THJYB1sLiNyeLiorS
iBEj5O3tLR8fH82fPz/Zvr/++qv8/f3l5eWlrl276sSJExlYKQAAAAAASImwsDBNmTJFixcv1oYNG7R//37t2LHDtN7V1VVBQUEKCgrSmjVrVKJECY0cOVI5cuTQpEmTVK5cOa1fv15Tp07V0KFD9eDBAwueDQBYhsWD2ylTpuj48eNauHChxowZo5kzZ2rTpk2J+v39998aPHiw+vfvr6CgIJUvX179+/dXZGSkBaoGAAAAAADJ2blzp2rWrClnZ2fZ29urTZs22rhxY5J9161bp9jYWHXu3FmS9Ndff6l58+aSpGLFiilv3rw6dOhQhtUOANbCosFtRESEVq5cqYCAAHl4eKhx48bq27evfvzxx0R9d+7cqTJlyqhNmzYqXry4PvzwQ4WFhens2bMWqBwAAAAAACQnNDRULi4upmUXFxeFhIQk6hcfH69Zs2ZpyJAhprYKFSpo/fr1MhqNOnPmjM6ePavw8PAMqRsArIlFg9tTp04pNjbWbK6aatWq6ciRI4qPjzfrmzdvXp09e1YHDhxQfHy8Vq1apZw5c6p48eIZXTYAAAAAAHiCx/+mlx7eSf1xu3btkouLizw9PU1tw4cP1+XLl9WqVSstWrRINWrUkL29fbrWCwDWyKI3JwsLC1O+fPnk4OBganvppZcUFRWlW7duydnZ2dT+6quvauvWrerWrZtsbW1lY2Ojb775Rnny5En1cePi4p5L/XixJTwPeD4AqcPYAdKGsQOkDWMHSBtLjx0XFxf9+eefpuOHhITI1dU1UT2//vqrXn31VbP2e/fuaezYscqZM6ckqU2bNipSpAivA8+Ixw948Vg0uI2MjDQLbSWZlqOjo83ab968qbCwMI0ePVqVK1fW0qVLNXz4cK1evVr58+dP1XGPHTv2bIUjU+H5AKQNYwdIG8YOkDaMHSBtLDV2cufOrd9//13bt2+Xk5OTfvzxRzVq1EiHDx8267dz50698sorZu0//PCDcuXKpdatW+vo0aO6e/euHjx4kGhbAMjsLBrcOjo6JgpoE5azZctm1j516lS5ubmpe/fukqTx48fL399fgYGBevPNN1N1XE9PT9na2j5D5cgM4uLidOzYMZ4PQCoxdoC0YewAaWMNY2fTpk2aNWuWYmJi1LJlS73zzjumdaGhoerfv79pOSIiQiEhIdq1a5c++OADhYaGSpJprs7vvvtOtWrVyvBzQNZjDWMnKipK06ZNU3R0tHx9fdWvXz+NGjVKDRs2lK+vryQpPDxcvr6+cnR0NG1XsmRJDRkyRKNHj1bOnDn1zTffqEyZMhY5h8wk4TkB4MVh0eDW1dVVN2/eVGxsrOzsHpYSFhambNmyKXfu3GZ9T5w4oR49epiWbWxsVK5cOV27di3Vx7W1teUPJpjwfADShrEDpA1jB0gbS42dsLAwTZ06VYGBgcqVK5f69eunXbt2qW7dupKkQoUKae3atZIehrN9+/ZVv379lCtXLs2bN8+0nzlz5sjNzU0+Pj4Zfg7I2iz5vtO8eXM1b97crO2zzz4zW07qKtr8+fPr+++/T8/SAOCFYNGbk5UvX152dnZmL9QHDhyQp6enbGzMS3NxcdG5c+fM2i5cuKCiRYtmRKkAAAAAsqCdO3eqZs2acnZ2lr29vdq0aaONGzcm2XfdunWKjY1V586dzdqvXLmiJUuWaOTIkRlRMmDCDb0A4MVm0eA2e/bsatOmjcaOHaujR49q8+bNmj9/vnr27Cnp4afbDx48kCR16tRJK1as0Jo1a3Tp0iVNnTpV165dU9u2bS15CgAAAAAysdDQULm4uJiWXVxcFBISkqhffHy8Zs2apSFDhiRaN3v2bPXu3Vv58uVL11qBx3lU9OBbHs9JvDHe0iUAyIIsOlWCJA0fPlxjx45Vr169lDNnTg0aNEhNmjSRJPn4+GjixIlq166dXn31Vd2/f1/ffPON/vvvP5UvX14LFy5M9Y3JAAAAACCl4uMThzUGgyFR265du+Ti
4iJPT0+z9nv37mnLli0KCAhItxqB5Nja2GrT/U26EXfD0qW80JxtndXMqZmlywCQBVk8uM2ePbsmT56syZMnJ1p3+vRps+WOHTuqY8eOGVUaAAAAgCyuYMGC2rdvn2k5NDRUBQsWTNRv8+bNatGiRaL233//XT4+PnJyckrXOoHk3Ii7obC4MEuXAQBIA4tOlQAAAAAA1qxWrVras2ePwsPDFRMTo7Vr16pBgwaJ+h04cEDVq1dP1H7w4MEk2wEAAJ6G4BYAAAAAkuHq6qqhQ4eqT58+atGihdzd3dW4cWMFBARoy5Ytpn5XrlxR4cKFE21/+fJlFSpUKCNLBgAAmYTFp0oAAAAAAGvm7+8vf39/s7YJEyaYLR8+fDjJbefOnZteZQEAgEyOK24BAAAAWDV7e3tLlwAAAJDhCG4BAAAAWDWPih6ytbW1dBmZQrwx3tIlAACAFGKqBAAAAABWzdbGVpvub9KNuBuWLuWF5mzrrGZOzSxdBgAASCGCWwAAAABW70bcDYXFhVm6DAAAgAzDVAkAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAJlUcHCwmjdvriZNmmjmzJlm60JCQtS6dWvTv0aNGsnT01MRERGmPvfu3VOjRo20d+/ejC4dAAAgy2OOWwAAACATCgsL05QpUxQYGKhcuXKpX79+2rFjh+rWrStJcnV1VVBQkCTJaDSqb9++6tevn3LkyGHax/jx43Xnzh2L1A8AAJDVccUtAAAAkAnt3LlTNWvWlLOzs+zt7dWmTRtt3Lgxyb7r1q1TbGysOnfubGrbuHGjnJyc5O7unlElAwAA4BEEtwAAAEAmFBoaKhcXF9Oyi4uLQkJCEvWLj4/XrFmzNGTIEFPbtWvXtHDhQn300UcZUisAAAASI7gFAAAAMqH4+PhEbQaDIVHbrl275OLiIk9PT9N2AQEBGjVqlLJly5budQIAACBpBLcAAABAJlSwYEGFhYWZlkNDQ1WwYMFE/TZv3qwWLVqYls+fP6/z588rICBArVu31vHjxzVy5Ejt2rUrQ+oGAADAQwS3AAAAQCZUq1Yt7dmzR+Hh4YqJidHatWvVoEGDRP0OHDig6tWrm5bLlCmj7du3KygoSEFBQapYsaI+/fRT1a5dOwOrBwAAAMEtAAAAkAm5urpq6NCh6tOnj1q0aCF3d3c1btxYAQEB2rJli6nflStXVLhwYQtWCgAAgKTYWboAAAAAAOnD399f/v7+Zm0TJkwwWz58+PAT97F48eLnXRYAAABSgOAWAPDCCQ4O1syZMxUTE6NWrVpp4MCBpnUhISF68803Tcv3799XSEiI9u7dqxw5ckh6eCOeb775RgsXLszw2gEAAAAASAmCWwDACyUsLExTpkxRYGCgcuXKpX79+mnHjh2qW7eupIdfDQ4KCpIkGY1G9e3bV/369VOOHDkUFxenBQsWaO7cuXJzc7PkaQDIIuKNRtkYDJYu44Vma2tr6RIAAAAsguAWAPBC2blzp2rWrClnZ2dJUps2bbRx40ZTcPuodevWKTY2Vp07d5Yk/f3337pw4YLGjx/PV38BZAgbg0FrL97V9Qexli7lhfVybgfVL+xk6TKQxfDtHgCANSC4BQC8UEJDQ+Xi4mJadnFxUUhISKJ+8fHxmjVrlqZOnWpqK1eunD799FPt3bs3Q2oFAEm6/iBWIZFxli7jhZXfkccOGYtv9wAArIWNpQsAACA14uPjE7UZkvga8q5du+Ti4iJPT8+MKAsAAGQSj367x97e3vTtnqQ86ds9AAA8K4JbAMALpWDBggoLCzMth4aGqmDBgon6bd68WS1atMjI0gAAQCaQ2m/3DBkyxNSW8O2ePHnyZEitAIDMjeAWAPBCqVWrlvbs2aPw8HDFxMRo7dq1atCgQaJ+Bw4cUPXq1TO+QAAA8ELj2z0AAGtBcAsAeKG4urpq6NCh6tOnj1q0aCF3d3c1btxYAQEB2rJli6nflStXVLhwYQtWCgAA
XkR8uwcAYC24ORkA4IXj7+8vf39/s7YJEyaYLR8+fDjZ7WvUqKEaNWqkR2kAAOAFV6tWLX311VcKDw9Xnjx5tHbtWnXt2jVRvwMHDqhXr14WqBAAkFVwxS0AINXs7e0tXQIAAEC64Ns9AABrwRW3AIBU86joIVsbW0uXkSnEG+NlY+BzVAAArAnf7gEAWAOCWwBAqtna2GrT/U26EXfD0qW80JxtndXMqZmlywAAAAAAWCGCWwBAmtyIu6GwuLCndwQAAEileKNRNgaDpct4odna8u0oAHjREdwCAAAAAKyKjcGgtRfv6vqDWEuX8sJ6ObeD6hd2snQZAIBnQHALAAAAALA61x/EKiQyztJlvLDyO/LYAcCLjruhAAAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCQAYKDg5W8+bN1aRJE82cOTPR+tDQUL355ptq3bq1unTpoqtXr0qS7t27p8GDB6t169Zq06aNTpw4kdGlAwAAAACADERwCwAZJCwsTFOmTNHixYu1YcMG7d+/Xzt27DDr89FHH6lhw4YKCgpS69atNWXKFEnSxIkTVahQIQUFBenDDz/U6NGjLXEKAAAAAAAgg9hZugAAyCp27typmjVrytnZWZLUpk0bbdy4UXXr1pUk3bhxQ6dOndL3338vSWrfvr1q1aolo9GoX375RVu2bJEk1atXTwULFrTMSQAAAAAAgAzBFbcAkEFCQ0Pl4uJiWnZxcVFISIhp+cqVKypcuLAmTZqkVq1aadCgQbK3t9f169fl4OCgJUuWqE2bNurRo4fi4+MtcQoAAAAAACCDENwCQAZJKmw1GAymn2NjY3XixAm98sorWrt2rRo1aqRhw4YpLi5O4eHhypEjh9asWaMBAwbonXfeycjSAQAAAABABiO4BYAMUrBgQYWFhZmWQ0NDzaY8KFCggHLkyKFGjRpJklq0aKGjR48qX758srOzU4sWLSRJderUUUREhK5fv56xJwAAAAAAADIMwS0AZJBatWppz549Cg8PV0xMjNauXasGDRqY1hcvXlyFChXS1q1bJUnbt29XhQoV5ODgoNq1a2vDhg2SpKNHjyp79uzKly+fJU4DAAAAAABkAG5OBgAZxNXVVUOHDlWfPn0UHR0tX19fNW7cWAEBAfL19ZWfn59mzpypMWPGaNq0aXJyctKkSZMkSRMmTNDo0aO1fPly2dra6v/+7/9kY8NnbwAAAAAAZFYEtwCQgfz9/eXv72/WNmHCBNPPL7/8shYvXpxoOxcXF82ZMyfd6wMAAAAAANaBy7UAAAAAAAAAwMoQ3ALIMuKNRkuXkCnY2tpaugQAAAAAADI9pkoAkGXYGAxae/Gurj+ItXQpL7SXczuofmEnS5cBAAAAAECmRnALIEu5/iBWIZFxli7jhZbfkccPAAAAAID0xlQJAAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZiwe3UVFRGjFihLy9veXj46P58+cn2/f06dPq2rWrKlWqpJYtW2rPnj0ZWCkAAAAAAAAAZAyLB7dTpkzR8ePHtXDhQo0ZM0YzZ87Upk2bEvW7e/euXn/9dZUpU0br1q1T48aNNXDgQF2/ft0CVQMAAAAAAABA+rFocBsREaGVK1cqICBAHh4eaty4sfr27asff/wxUd/Vq1crR44cGjt2rEqUKKF3331XJUqU0PHjxy1QOQAAAAAAAACkHztLHvzUqVOKjY2Vl5eX
qa1atWqaM2eO4uPjZWPzv1x537598vPzk62traktMDAwQ+sFAAAAAAAAgIxg0Stuw8LClC9fPjk4OJjaXnrpJUVFRenWrVtmfa9cuSJnZ2eNGjVKderUUadOnXTgwIEMrhgAAAAAAAAA0p9Fr7iNjIw0C20lmZajo6PN2iMiIjR37lz17NlT3377rTZs2KA33nhDwcHBKlSoUKqOGxcX92yFI1NIeB7wfMg6Hr1iH7AmvA5lDbzvZE2898AavQivQ4wdWKMXYew8yYteP5AVWTS4dXR0TBTQJixny5bNrN3W1lbly5fXu+++K0mqUKGCdu7cqaCgIA0YMCBVxz127NgzVI3MhudD1pA9e3ZVqFDB0mUASTp9+rQiIyMtXQYyCO87WQfvPbBW1v6+w9iBtbL2sQMg87FocOvq6qqbN28qNjZWdnYPSwkLC1O2bNmUO3dus74FChTQyy+/bNZWsmRJ/fvvv6k+rqenJ5/gQnFxcTp27BjPBwAW5+7ubukSkAF43wFgLXjfAdLmRR87Cf8XAfDisGhwW758ednZ2enw4cPy9vaWJB04cECenp5mNyaTpCpVqujPP/80azt//rxatGiR6uPa2tryBxNMeD4AsDReg7IW3ncAWBqvQUDaMHYAZDSL3pwse/bsatOmjcaOHaujR49q8+bNmj9/vnr27Cnp4dW3Dx48kCR16dJFp0+f1owZM3Tp0iV9+eWXunLlilq3bm3JUwAAAAAAAACA586iwa0kDR8+XB4eHurVq5c++eQTDRo0SE2aNJEk+fj4aOPGjZKkIkWKaN68efrtt9/UokUL/fbbb5o7d65cXV0tWT4AAAAAAAAAPHcWnSpBenjV7eTJkzV58uRE606fPm22XK1aNa1atSqjSgMAAAAAAAAAi7D4FbcAAAAAAAAAAHMEtwAAAAAAAABgZSw+VQIAAAAyRnBwsGbOnKmYmBi1atVKAwcONFu/Y8cODRkyRAULFpQkVahQQRMnTlRkZKRq166t4sWLm/quWrWKu2sDAAAA6YjgFgAAIAsICwvTlClTFBgYqFy5cqlfv37asWOH6tata+pz9OhRvfXWW+rdu7fZtidOnFDNmjU1e/bsDK4aAAAAyLqYKgEAACAL2Llzp2rWrClnZ2fZ29urTZs22rhxo1mfY8eOadu2bWrTpo3eeust/ffff6b2kJAQdezYUV26dNH+/fstcQoAAABAlkJwCwAAkAWEhobKxcXFtOzi4qKQkBCzPnny5NHrr7+uNWvWqG7duho8eLAkyWAwqFmzZlqxYoVGjRql999/Xzdv3szQ+gEAAICshqkSAAAAsoD4+PhEbQaDwWx58uTJpp+7deumadOm6e7du2ZTJ3h4eMjT01MHDx6Un59futULAAAAZHVccQsAAJAFFCxYUGFhYabl0NBQ003IJCkqKkrffPON2TZGo1F2dnZauXKl/v3330TtAAAAANIPwS0AAEAWUKtWLe3Zs0fh4eGKiYnR2rVr1aBBA9N6R0dHrVq1Slu3bpUkBQYGqkqVKsqePbuOHTumRYsWSZLOnj2rkydPqlq1apY4DQAAACDL4FIJAACALMDV1VVDhw5Vnz59FB0dLV9fXzVu3FgBAQHy9fWVn5+fpk2bprFjx+r//u//lD9/ftPUCR988IGGDx+u5s2by8bGRlOmTFHOnDktfEYAAABA5kZwCwAAkEX4+/vL39/frG3ChAmmnz08PLRy5cpE2+XLl09z5sxJ9/oAAAAA/A9TJQAAAGQQe3t7S5cAAAAA4AWRpuB23759Onz4sCTp2rVrGjBggFq2bKlZs2Y9z9oAAAAyFY+KHrK1tbV0GZlCvDHe0iUAAAAA6SrVUyWsWbNGw4cP1+uvv64qVapo9OjROnDggOrUqaM5c+bI3t5eb775ZnrUCgAA8EKztbHVpvubdCPuhqVLeaE52zqrmVMzS5cBAAAApKtUB7cLFixQ27ZtNXToUIWFhWnXrl0aPHiw3njj
Dc2fP1/Lly8nuAUAAEjGjbgbCosLs3QZAAAAAKxcqqdKOH/+vNq0aSNJ2r59u4xGo/z8/CRJnp6e+vfff59rgQAAAAAAAACQ1aQ6uM2dO7fu3bsnSdqxY4cKFy6skiVLSpIuX76sfPnyPdcCAQAAAAAAACCrSfVUCTVq1NDMmTN19uxZbdmyRX369JEk/fzzz/ryyy/l4+Pz3IsEAABZV3BwsGbOnKmYmBi1atVKAwcONFu/Y8cODRkyRAULFpQkVahQQRMnTtTNmzcVEBCgq1evymg0asCAAWrevLklTgEAAAAAUi3VwW1AQICGDh2qmTNnqlatWurfv78kaeLEiSpcuLAGDx783IsEAABZU1hYmKZMmaLAwEDlypVL/fr1044dO1S3bl1Tn6NHj+qtt95S7969zbb96quvVKFCBX399dcKCwtT27ZtVaNGDb300ksZfBYAAAAAkHqpDm6dnZ313XffJWpfsmSJChcu/FyKAgAAkKSdO3eqZs2acnZ2liS1adNGGzduNAtujx07pgcPHmjNmjUqVKiQxowZo4IFC6pevXqqWLGiJKlAgQLKmzevwsPDCW4BAAAAvBBSPcdtgnPnzmnRokWaOnWqQkJCdO3aNdPctwAAAM9DaGioXFxcTMsuLi4KCQkx65MnTx69/vrrWrNmjerWrWv69k/Dhg1VoEABSdKGDRsUHR2tMmXKZFzxAAAAAPAMUn3FbXx8vEaPHq3AwEAZjUYZDAb5+/vr66+/1qVLl/Tjjz+a5pgDAAB4FvHx8YnaDAaD2fLkyZNNP3fr1k3Tpk3T3bt3lStXLklSUFCQPv/8c82bN092dqn+rw8AAAAAWESqr7j9+uuvtW7dOn366afauXOnjEajJGno0KEyGo2aPn36cy8SAABkTQULFlRYWJhpOTQ01OwD4qioKH3zzTdm2xiNRlNAO3fuXH355ZdauHChypUrlzFFAwAAAMBzkOrgNjAwUO+++67at2+vvHnzmtrLly+vd999Vzt37nye9QEAgCysVq1a2rNnj8LDwxUTE6O1a9eqQYMGpvWOjo5atWqVtm7dKunh/1OqVKmi7Nmza9WqVVq9erWWL1+u0qVLW+gMAAAAACBtUv19wfDwcJUvXz7Jda6urrpz584zFwUAACA9/L/F0KFD1adPH0VHR8vX11eNGzdWQECAfH195efnp2nTpmns2LH6v//7P+XPn980dcL06dNlMBjUt29f0/7GjRunypUrW+p0AAAAACDFUh3clihRQtu3b1ft2rUTrdu3b59KlCjxXAoDAACQJH9/f/n7+5u1TZgwwfSzh4eHVq5cmWi7HTt2pHttAAAAAJBeUh3c9urVS6NHj1ZMTIwaNmwog8GgS5cuae/evZo/f76GDRuWHnUCAAAAAAAAQJaR6uC2Y8eOunHjhmbPnq2lS5fKaDTqww8/lL29vfr27auuXbumR50AAMBC4o1G2RgMli7jhWdra2vpEgAAAAC8QFId3EpS//791b17dx08eFC3b99W7ty5VblyZbOblQEAgMzBxmDQ2ot3df1BrKVLeaG9nNtB9Qs7WboMAAAAAC+INAW3kpQzZ07Vq1fvedYCAACs1PUHsQqJjLN0GS+0/I48fgAAAABSLtXBbc+ePZ/aZ9GiRWkqBgAAAAAAAACQhuDWaDQmaouIiNC5c+eUI0cONWnS5LkUBgAAAAAAAABZVaqD28WLFyfZfvv2bfXr108vv/zyMxcFAAAAAAAAAFmZzfPaUZ48efTmm29qwYIFz2uXAAAAAAAAAJAlPbfgNsH169ef9y4BAAAAAAAAIEtJ9VQJf/75Z6K2uLg4/ffff/r666/l4eHxXAoDAAAAAAAAgKwq1cFtjx49ZDAYErUbjUYVKlRII0aMeC6FAQAAAAAAAEBWlergdtGiRYnaDAaDcubMKXd3d9nYPPfZFwAAAAAAAAAgS0l1cFu9evX0qAMAAAAAAAAA8P+lKLgdPnx4indoMBj0
2WefpbkgAAAAAAAAAMjqUhTc7t27N8U7TGr+WwAAAAAAAABAyqUouN26dWt61wEAAAAAAAAA+P+e653EIiIi9Pvvvz/PXQIAAAAAAABAlpPqm5P9888/Gjt2rPbt26fo6Ogk+/z111/PXBgAAAAAAAAAZFWpDm4nTpyogwcPqmPHjjp48KCyZ8+uKlWqaOfOnTpz5oxmzJiRHnUCmVJwcLBmzpypmJgYtWrVSgMHDkyy38mTJ9WpUycdP35ckhQZGanatWurePHipj6rVq2Sra1thtQNAAAAAACA9JXqqRL+/PNPffDBBxo5cqTatWsnR0dHDR06VIGBgXrllVe0ZcuW9KgTyHTCwsI0ZcoULV68WBs2bND+/fu1Y8eORP0iIyM1btw4xcTEmNpOnDihmjVrKigoyPSP0BYAAAAAACDzSHVwe//+fbm7u0uSXn75ZZ08eVKSZGtrq27dumnPnj3Pt0Igk9q5c6dq1qwpZ2dn2dvbq02bNtq4cWOifpMmTVLv3r3N2o4dO6aQkBB17NhRXbp00f79+zOoagAAAAAAAGSEVAe3Li4uCg8PlySVKFFCt2/fVlhYmCQpb968un79+vOtEMikQkND5eLiYlp2cXFRSEiIWZ8tW7bowYMHatasmVm7wWBQs2bNtGLFCo0aNUrvv/++bt68mSF1AwAAAAAAIP2lOritX7++vvjiCx06dEhFihRRwYIFNX/+fN27d0+BgYFydXVNjzqBTCc+Pj5Rm8FgMP0cFham2bNna9SoUYn69e7dW2+++aYMBoM8PDzk6empgwcPpmu9AAAAAAAAyDgpCm579OihtWvXKioqSu+++65y586tL7/8UpL0wQcfaOHChXrllVe0bt069enTJ10LBjKLggULmq5Wlx5egVuwYEHT8rZt23Tr1i11795drVu3liS1bt1ad+7c0cqVK/Xvv/+a+hqNRtnZpfpegwAAAAAAALBSKUp6bt26pY8++kjjx49XixYtNGbMGNOVta1atVLhwoV1+PBhVapUSdWrV0/XgoHMolatWvrqq68UHh6uPHnyaO3ateratatpfceOHdWxY0fTsru7u4KCgiQ9nOP2/Pnz+vjjj3X27FmdPHlS1apVy/BzAAAAAAAAQPpIUXC7bt06nThxQqtXr9bGjRu1bNkyubu7q2PHjmrZsqW8vb3l7e2d3rUCmYqrq6uGDh2qPn36KDo6Wr6+vmrcuLECAgLk6+srPz+/ZLf94IMPNHz4cDVv3lw2NjaaMmWKcubMmYHVAwAAAAAAID2l+LvVHh4e8vDw0LBhw7R9+3atWbNGkyZN0pQpU9S4cWN16NBBNWvWTM9agUzH399f/v7+Zm0TJkxIsu/p06dNP+fLl09z5sxJ19oAAAAAAABgOameFNPOzk5+fn7y8/PT7du3tX79eq1du1a9e/dWsWLF1L59ew0YMCA9agWeO3t7e0uXAAAAAAAAACSSopuTJSdPnjzq3r27li9frsWLF8vW1tZ00zJkXsHBwWrevLmaNGmimTNnJtvv5MmTqlixomn52rVr6tmzp1q1aqWOHTvqr7/+yohyn8ijoodsbW0tXUamEG+Mt3QJAAAAAAAAmcYz3YY+LCxMGzZs0Pr163XixAkVKlRIb7/99vOqDVYoLCxMU6ZMUWBgoHLlyqV+/fppx44dqlu3rlm/yMhIjRs3TjExMaa2SZMmqWXLlurYsaN+//13ffLJJ1q2bFlGn4IZWxtbbbq/STfibli0jheds62zmjk1s3QZAAAAAAAAmUaqg9v79+/rl19+0bp167R3717Z2tqqUaNG+uCDD1S7dm0ZDIb0qBNWYufOnapZs6acnZ0lSW3atNHGjRsTBbeTJk1S7969dejQIVPbF198Yfr56tWryp07d4bU/DQ34m4oLC7M0mUAAAAAAAAAJikKbmNjY7V9+3atW7dO27Zt04MHD1S+fHkNHz5cLVu2VJ48edK7TliJ0NBQubi4mJZdXFwU
EhJi1mfLli168OCBmjUzvwLTxubhzBxNmjTRtWvXNHv27PQvGAAAAAAAAHgBpSi4rVOnju7cuaPcuXOrffv2at++vSpUqJDetcEKxccnnsf00ausw8LCNHv2bC1YsCDZffzyyy86ceKE3njjDW3atEl58+ZNh0oBAAAAAACAF1eKglsPDw+1b99ejRs3loODQ3rXBCtWsGBB7du3z7QcGhqqggULmpa3bdumW7duqXv37qa21q1ba/Hixdq3b598fHyULVs2eXh4qEiRIrpy5QrBLQAAAAAAAPCYFAW38+fPT+868IKoVauWvvrqK4WHhytPnjxau3atunbtalrfsWNHdezY0bTs7u6uoKAgSdLKlSsVEhKi7t2768yZM7p+/bpKly6d4ecAAAAAAAAAWLtU35wMWZurq6uGDh2qPn36KDo6Wr6+vmrcuLECAgLk6+srPz+/ZLcdM2aMRowYoRUrVsjR0VHTpk1Tjhw5MrB6AAAAAAAA4MVAcItU8/f3l7+/v1nbhAkTkux7+vRp08+FCxd+4ty3AAAAAAAAAB6ysXQBAAAAAAAAAABzBLcvoHij0dIlZAq2traWLgEAAAAAAABIElMlvIBsDAatvXhX1x/EWrqUF9rLuR1Uv7CTpcsAAAAAAAAAEiG4fUFdfxCrkMg4S5fxQsvvyOMHAAAAAAAA68RUCQAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJWxeHAbFRWlESNGyNvbWz4+Ppo/f/5Tt7l69aq8vLy0d+/eDKgQAAAAAAAAADKWxW9ONmXKFB0/flwLFy7UtWvX9PHHH6tw4cJq1qxZstuMHTtWERERGVglAAAAAAAAAGQciwa3ERERWrlypb799lt5eHjIw8NDf//9t3788cdkg9u1a9fq/v37GVwpAAAAAAAAAGQci06VcOrUKcXGxsrLy8vUVq1aNR05ckTx8fGJ+t+8eVOff/65xo0bl5FlAgAAAAAAAECGsmhwGxYWpnz58snBwcHU9tJLLykqKkq3bt1K1H/SpElq27atypYtm4FVAgAAAAAAAEDGsuhUCZGRkWahrSTTcnR0tFn7rl27dODAAa1fv/6ZjxsXF/fM+7AkW1tbS5cAJMnaxxZjB9aKsQOkjbWPHYnxA+vE2AHS5kUYO0/yotcPZEUWDW4dHR0TBbQJy9myZTO1PXjwQKNHj9aYMWPM2tPq2LFjz7wPS8mePbsqVKhg6TKAJJ0+fVqRkZGWLiNJjB1YM8YOkDbWPHYkxg+sF2MHSBtrHzsAMh+LBreurq66efOmYmNjZWf3sJSwsDBly5ZNuXPnNvU7evSorly5onfffdds+379+qlNmzapnvPW09OTT3CBdODu7m7pEoAXEmMHSBvGDpA2jB0gbV70sRMXF/dCX8gGZEUWDW7Lly8vOzs7HT58WN7e3pKkAwcOyNPTUzY2/5t+t1KlSvrll1/Mtm3SpIk+/fRT1alTJ9XHtbW1JbgF0gHjCkgbxg6QNowdIG0YO0DaMHYAZDSLBrfZs2dXmzZtNHbsWH322WcKDQ3V/PnzNXHiREkPr77NlSuXsmXLphIlSiTa3tXVVfnz58/osgEAAAAAAAAgXdk8vUv6Gj58uDw8PNSrVy998sknGjRokJo0aSJJ8vHx0caNGy1cIQAAAAAAAABkLItecSs9vOp28uTJmjx5cqJ1p0+fTna7J60DAAAAAAAAgBeZxa+4BQAAAAAAAACYI7gFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3AL
AAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyFg9uo6KiNGLECHl7e8vHx0fz589Ptu+2bdvUunVreXl5qWXLltqyZUsGVgoAAAAAAAAAGcPiwe2UKVN0/PhxLVy4UGPGjNHMmTO1adOmRP1OnTqlgQMHqn379lqzZo26dOmi9957T6dOnbJA1QAAAAAAAACQfuwsefCIiAitXLlS3377rTw8POTh4aG///5bP/74o5o1a2bWd/369apZs6Z69uwpSSpRooS2bt2q4OBglStXzhLlAwAAAAAAAEC6sGhwe+rUKcXGxsrLy8vUVq1aNc2ZM0fx8fGysfnfBcFt27ZVTExMon3cvXs3Q2oFAAAAAAAAgIxi0eA2LCxM+fLlk4ODg6ntpZdeUlRUlG7duiVnZ2dTe+nSpc22/fvvv7V792516dIl1ceNi4tLe9FWwNbW1tIlAEmy9rHF2IG1YuwAaWPtY0di/MA6MXaAtHkRxs6TvOj1A1mRRYPbyMhIs9BWkmk5Ojo62e1u3LihQYMGqWrVqvLz80v1cY8dO5bqbaxF9uzZVaFCBUuXASTp9OnTioyMtHQZSWLswJoxdoC0seaxIzF+YL0YO0DaWPvYAZD5WDS4dXR0TBTQJixny5YtyW3Cw8PVp08fGY1GffXVV2bTKaSUp6cnn+AC6cDd3d3SJQAvJMYOkDaMHSBtGDtA2rzoYycuLu6FvpANyIosGty6urrq5s2bio2NlZ3dw1LCwsKULVs25c6dO1H/kJAQ083JFi1aZDaVQmrY2toS3ALpgHEFpA1jB0gbxg6QNowdIG0YOwAyWuovV32OypcvLzs7Ox0+fNjUduDAAXl6eia6kjYiIkJ9+/aVjY2NfvjhB7m6umZwtQAAAAAAAACQMSwa3GbPnl1t2rTR2LFjdfToUW3evFnz5883XVUbFhamBw8eSJK++eYbXb58WZMnTzatCwsL0927dy1WPwAAAAAAAACkB4tOlSBJw4cP19ixY9WrVy/lzJlTgwYNUpMmTSRJPj4+mjhxotq1a6eff/5ZDx48UMeOHc22b9u2rSZNmmSJ0gEAAAAAAAAgXVg8uM2ePbsmT55supL2UadPnzb9vGnTpowsCwAAAAAAAAAsxqJTJQAAAAAAAAAAEiO4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICV
IbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmCWwAAAAAAAACwMgS3AAAAAAAAAGBlCG4BAAAAAAAAwMoQ3AIAAAAAAACAlSG4BQAAAAAAAAArQ3ALAAAAAAAAAFaG4BYAAAAAAAAArAzBLQAAAAAAAABYGYJbAAAAAAAAALAyBLcAAAAAAAAAYGUIbgEAAAAAAADAyhDcAgAAAAAAAICVIbgFAAAAAAAAACtDcAsAAAAAAAAAVobgFgAAAAAAAACsDMEtAAAAAAAAAFgZglsAAAAAAAAAsDIEtwAAAAAAAABgZQhuAQAAAAAAAMDKENwCAAAAAAAAgJUhuAUAAAAAAAAAK0NwCwAAAAAAAABWhuAWAAAAAAAAAKwMwS0AAAAAAAAAWBmLB7dRUVEaMWKEvL295ePjo/nz5yfb9+TJk+rYsaMqV66s9u3b6/jx4xlYKQAAAAAAAABkDIsHt1OmTNHx48e1cOFCjRkzRjNnztSmTZsS9YuIiNCbb74pb29vrVq1Sl5eXurfv78iIiIsUDUAAAAAAAAApB+LBrcRERFauXKlAgIC5OHhocaNG6tv37768ccfE/XduHGjHB0d9dFHH6l06dIKCAiQk5NTkiEvAAAAAAAAALzILBrcnjp1SrGxsfLy8jK1VatWTUeOHFF8fLxZ3yNHjqhatWoyGAySJIPBoKpVq+rw4cMZWTIAAAAAAAAApDuLBrdhYWHKly+fHBwcTG0vvfSSoqKidOvWrUR9XVxczNry58+v//77LyNKBQAAAAAAAIAMY2fJg0dGRpqFtpJMy9HR0Snq+3i/JzEajaZ929rapqVkq2Bra6uXHAyyMRosXcoLLY+9FBcXp/zKLxvLT/f8QsunfIqLi1Pc/2vvzuOqrvI/jr/uZVPADUwR13ICNFxQJxtRIddyKccty36maWrikiHuaTqJIm4oLiiCJaiEFu6aa26p5UimqOOCDmGZC4aIslzu7w+HO1DWNE110ft+zmMej7z3e7+c64Pj95z393w/x2SydlN+lvrOb0N957ejvmNb1Hd+Ow9L3wH1n9+C+s5vR33Htqjv/HYepr7zcwrbX5iNiEjJZ9Xg1snJ6UfBa+GfS5Uq9YuO/eFxP6ew/EJKSsqvaW6J4vmv/8v/IBuSr0LFf/1P/jfJJFu7Cb+I+s5vQH3nN6W+Y0PUd35TD0vfAfWf/5n6zm9KfceGqO/8ph6mvvOf/LA0pYiUXFYNbitXrkxGRgb5+fnY299vyrVr1yhVqhRly5b90bHXr18v9tr169d/VD7h59jb21OvXj2MRqOlVq6IiIiIiIiIyKPObDZTUFBgyV9EpOSzam+tU6cO9vb2JCcn06RJEwCOHTtmCVeLatCgAcuWLcNsNmMwGDCbzfz9739n8ODBv/jnGY3GH5VbEBERERERERERESlprFropnTp0nTp0oV3332XEydOsHPnTmJiYujTpw9wf/XtvXv3AHjuuefIzMxk2rRpnD9/nmnT
pnH37l2ef/55a34FERERERERERERkd+cwWzlqtR3797l3Xff5ZNPPsHV1ZX+/fvTt29fALy9vZk+fTpdu3YF4MSJE0yePJkLFy7g7e3NlClTqFu3rhVbLyIiIiIiIiIiIvLbs3pwKyIiIiIiIiIiIiLFWbVUgoiIiIiIiIiIiIj8mIJbERERERERERERkRJGwa2IiIiIiIiIiIhICaPgVkRERERERERERKSEUXArIiIiIiIiIiIiUsIouBUREZYtW0ZycrK1myEiIjasoKDA2k0QeSiYzeaf/bOIiDw6FNyKiNi469evs2LFCpYvX86pU6es3RyRh8Znn33GrVu3rN0MkYfejh07ADAaNTUR+SUMBgOAZdxW+GcREXn0aHQkj4yfWqWhO9AiP2379u1UrFiRNWvW8M9//pPFixcrvBX5BY4ePcqECRPYsWMHmZmZ1m6OyEMrOjqaiRMncvLkSWs3ReShcujQIaZPn265BmnOIyLyaFJwK48Es9lsWaWxfft21q5dy5dffkl+fj4Gg0EDGZEHOHz4MCNGjCAqKorq1asTGRnJ5cuXFd6K/AJPPvkkmZmZrFixgu3bt3P79m1rN0nkobN161bOnj1LaGgovr6+1m6OyEPlySef5OLFiyQkJABadSsi8qhScCsPvYKCAstAJTQ0lIkTJzJr1iymTJnC0qVLyc3NVXgr8gDPPPMMoaGhzJs3jyVLllC9enUWLVqk8FbkZ9y9exeA/Px8XF1deeyxx1iyZAnbtm1TeCvyX0hLSyMmJoaNGzdy6NAh8vLyAK0aFHmQB/WLxx57jPHjx7N3714uXLhghVaJiMgfQcGtPPQKV9qmpqaSmprKypUr2bBhAy1atODIkSNER0crvBX5gcLSIl27duW9995TeCvyC4SEhHDo0CHy8/M5d+4c5cuXZ8WKFXTo0IGlS5cqvBX5hebOncu8efNYtmwZ7dq144svvuDAgQOWJ6VEpLjCfrFx40ZWrFhheb1evXoYjUbOnj0LaIM/EZFHkYJbeSRs27aN1157jbt37+Lh4UGlSpUYNGgQDRs25PDhwwpvRYooKCgotgFMt27dmDp16gPD2yVLlii8FQHCwsLYt28frVu3xt7eHh8fH5577jnMZjPBwcG0a9dO4a3IL7B7926OHDlC27ZtKV++PLNmzaJixYpERUVx5MgR8vPzAa28FYHiQWxubi7bt29n3bp1dOrUiV27dlGpUiW6du1KaGgoN2/e1AZ/IiKPIP3LLg+lwsG82WymoKCAWrVq8eSTT1ruNgM4OzszePBgGjVqxOeff868efO0kkNsXtHQ9osvvmDXrl1kZWXRs2fPB668TUtLY8aMGVy8eNHKLRexLicnJ5566imys7P56KOPcHNzY9CgQZZrSkhIiMJbkf/g1KlTxMXFcfHiRf70pz8B4OjoyOLFi3FxcSEyMpKjR49qvCZC8TFbcnIyZ86cYdCgQSxfvpz69euzfPlyXn31VRwdHWnUqBHbtm0DdNNDRORRYzDrX3Z5yBQdxOTm5mJnZ4ednR2XL1/m7bffJicnh/Xr12NnZwfcr0c4e/ZsACZMmKCJgNgss9ls+f2fOXMma9euxcHBAYDY2Fi8vLxYu3YtEydO5K233mLw4MGkpqayePFiZsyYoVUcYtM++eQTli9fDsCXX37JwYMHcXd3t9xALLzmhIeHs3PnTnr37k337t1xdna2ZrNFSpTCFYNz586ldu3aREREWPpIbm4uQUFB/POf/yQ8PJz69etbubUiJUNYWBg7d+7kzp07tG7dmkGDBlGtWjWSk5M5dOgQ77//PgUFBfzpT39i9erVQPExn4iIPNwU3MpDpeggZNmyZRw5coTs7Gz+9Kc/MWTIEG7fvs348eO5d+8e69evtwRNOTk5ODo6WkolaCAjtiwmJoaYmBjCwsKoV68effv25d69e8ybNw8vLy/WrVvHO++8w+uvv86oUaMsn/thiQURW/Pmm2+yb98+
OnbsSEhICI899pjlPZPJZAlvJ02aRGZmJnPnztX1RgTYsGED9+7d4/HHH+fPf/4zW7Zs4f333+fxxx9n8uTJlC5dGrgf3s6ePZvRo0db+pOILduxYwdTpkxh2bJlVKpUiVu3blG7dm3g39edM2fOcOTIEWJjY3nllVcYOHCglVstIiK/JQW38lBatGgRcXFx9O3bF0dHR1atWoWbmxujRo2iQoUKlrBp3bp1xYImhbZi6/Ly8ggKCqJ58+b06dOH48eP8/bbb1O2bFm+++473n//fby8vEhISODjjz9m9erV6jNi80wmEwaDgfDwcMqVK8eePXto0KABL7/8Mo8//rjluKI3NwqvN7ruiK2bO3cuK1eupHz58pQtW5YuXbrQt29fNm/ezKpVq6hRowaTJk2yhLeFit4MEbFViYmJbNq0iZiYmGL94cCBA5w8eZIBAwZgb2+P2Wxm7dq1fP755/ztb3+zLFgREZGHn5ZOSYm3ceNGy3+bzWZu3rzJvn37ePfddxk4cKBl8G80GgkPD8fDw4P58+dz7do1xo0bV+xcGsCIrfnh7sIGg4F79+5RqlQp0tPTWbVqFS+//DJJSUlUrFiRkSNHsmnTJl566SXWrFmjDf1EAKPRiNFoZMyYMQwePJiXXnqJ48ePs2bNGlJTU4sdV9jnFNqK3Hfp0iVWrFjBihUr6NChAxs3biQmJoaOHTvyyiuv8PXXXxMcHExubm6xzym0FVvzwzEbwPfff8+ZM2cs/SEvLw+Azz//nL179xa75jg7O3P06FFu3bqla4+IyCNEwa2UaJ9++inLly8vNigpKCjgu+++s6zMyMnJwcHBgZiYGC5fvszq1aupXr06CQkJhIaGWrP5IlZVdPXfhQsX+Mc//oHJZOKNN96gXr16nDx5kjt37tC8eXPMZjOVK1fm2rVrfPLJJ8XCWg3+xdYV9oHCa1HXrl155ZVX+Pvf/05CQsKPwtsffk7EFiUlJbFw4UK+/vprSpcuTY0aNejSpQtt2rRh8+bNxMbG0rFjR1588UUqVaqEvb29tZssYjVFx2xpaWlcunQJgJdffhlPT0+Cg4Mtcx6Apk2bAvf38gC4desWly9f5u7du+pLIiKPGP2rLiVaQEAALVq0wGg0kpycTMOGDSlbtiz29vbs2rWLFi1a4OTkRF5eHo6Ojnh7e3Pv3j0AqlatCuhRO7FdhROA2bNns23bNsxmM76+vrz33nu4urry4YcfUq5cOerWrQuAg4MDCxYs4Omnn9ZqQZEHMBqNln7x17/+FYPBwOrVq8nMzGTYsGFUqVLF2k0UKRFmzZpFQkIClStX5vz58+zdu5cqVapQqVIlunXrhsFgYOvWrdy5c4ehQ4fSvXt3QLXUxTaZzWbL7/3cuXPZtGkTAE2aNCEsLIw333yT2NhYRowYwYQJE7h79y4ffPABbm5ulC1bFoAyZcoQEBBAx44dcXd3t9p3ERGR356CWynxjEYjZ86coVevXpad7kePHs2ECRNwc3Nj+PDhODg4YDabuXv3rmUAU0ihrdiyDRs28PHHHzNv3jzKlStHbm4urq6uAJQtW5Zt27bx/vvvs2PHDm7fvk2TJk0sK9s1eRb5saI3Nbp06UJ2djYpKSlUrlzZ2k0TKRG++eYbMjMziYmJwcvLi5kzZ7J7925cXFwsq2u7du3KnTt3+Oabb4rdJNR1R2xR4e///PnzSUhIYNSoUZQpU4bg4GDKly/PuHHjKFeuHBEREXTt2hVPT09Kly7NypUrLWM2Ozs7nnrqKSt/ExER+T1oczIpkR4UGiUmJjJlyhTefvttXn31VZKSkggLC8PPzw9PT09SU1PJyMggKSlJjwiJ/Mvs2bNJS0tj3rx5xV7fsGEDtWrV4uOPP+bChQuULVuWuXPn4uDgoFXqIr9A0bBJG5GJ3O8Hhw4don///nh4eLBgwQLq1atHQUEBU6dO5R//+AedOnXixRdfxMXFhYyMDMqXL6++Izar6O/9
1atXGTRoECNHjiQgIIDDhw8zZMgQcnNzee6555g1axYAJ0+exNnZmVq1amE0GsnPz9e8R0TkEad/5aXEKRraHjhwgG+++Ya6devSo0cP7O3tGT9+PEajkb59++Lr68vy5cvJy8vDx8eHMWPGYG9vr+BJ5F/s7e25ceMGt2/fpkyZMpbXP/vsM9auXcsHH3xAVlaWZRWuJgBii35uhflPvWcwGCz9xWAwkJubi6Oj4+/dVJESy2Aw4O/vz9ChQ4mMjOTs2bN4eXnh5OTEpEmT+Nvf/sbWrVvJzs7m1VdfpUKFCgAKbcUmFf29//bbb8nJySEvLw83NzcuXrxIXFwcw4cPp2HDhvTq1QtnZ2fefvttnnrqqWJ11zVmExF59OlfeilxCifIM2fOZP369bi7u9OoUSOefPJJ/vrXv2I2m5kwYQK5ubkMHDiQ2bNnF/u8giexRUXDpaL/3bhxYxITE9m9ezft27enVKlSADz11FPcvn0bwBLams1m9R2xOUX7y5YtW7h69So5OTk888wzNGzY8CcD3aL9JTExEbi/aZluGoqtKrxpPnToULKzs5kyZQouLi60bt0aR0dH3nnnHUJCQvj6669xcnKyfE6hrdiaoqHtvHnzOHPmDDNmzKBly5aUKlWKAwcOULZsWVq3bk2pUqWoWLEiH374IQUFBbz33nuW86i0iIiIbdAMXUqEQ4cO0axZM8ufd+7cyYYNG1i0aBENGjTg5s2b3Lhxg5s3b/L888/j7OzMqFGjuHPnDiNHjix2LgVPYmuKbmqxYsUKzp07xzfffENAQADdu3enT58+hIaGkpubS/369alWrRp79+61rHYqpMmz2KLCvhMeHk5SUhKBgYGkp6ezefNmOnbsyODBg4EHl0YASEhIYPLkyURGRiq0FZtmZ2dnuREyevRozGYzY8aMISwszBLezpo1S6VFxOYV/t7v3r2bw4cP8/rrr1O+fHlGjRqFnZ0doaGhtGzZkurVq/P999/TuHFj3njjDerUqWPllouIiDUo4RKr++ijj1i5ciUfffSRZSBz/fp1vLy8aNCgAUeOHOHDDz/k8OHD5Obm0rlzZ8aOHcv48ePZtGmTBv5i8wp//+fOnUtCQgLDhg3D09OTNWvW8OmnnxITE0N+fj6rVq1i3rx5PPbYYwAsXrwY0GOqIikpKezYsYPIyEj8/PzYvHkzY8eOxdfXlytXruDp6fnA0HbNmjWEh4ezYMEC2rRpY82vIFIiGI1GS3g7ZswYDAYD48ePZ/LkyXTs2BEHBwdtgCk2q+j148yZM7z//vukpqZSu3Zty/sAt2/f5quvviIlJYU5c+aQlZVF3bp1MRqNKgcnImKDtDmZlAiFg5Bz587x5JNPcvz4cV5++WUaNmzIV199RYsWLWjbti2lSpViwoQJrFmzBh8fH63aEJtW9Pc/IyODN998k6CgIFq2bMmnn37KiBEjCA0NxcvLi9q1a5Oens6VK1fIy8vjmWeewc7OTqVFxCb9MDTav38/U6ZMYefOnezcuZMxY8YQEhKCv78/0dHRvPHGG1SrVq3Y5wpD29DQUNq3b2+tryJSIhXtK5MmTeLSpUt88MEHVm6VSMlw4cIFateuzYYNG5g7dy5eXl5ERERYylklJycTFBSEm5sbLi4urFy5EgcHB93wEBGxUQpuxaqKBk9ffPEF//d//8eMGTPo0qULe/fuZceOHQQEBBAQEICTkxM5OTm8/PLLTJ48mQYNGhQ7h4gtKfp7n5GRgYODA61bt2bDhg2cP3+eYcOGMWrUKF544QWmTJnCs88+S4cOHYqdQ6s2xBYV7TspKSnUrVuX69evExISgp+fH7GxsYwbN46ePXuSnp5O+/btmTt3Lm3btrWcQ6GtyH9WNGTSWE3kvr179zJ//nwGDBhAhw4d2LhxI3FxcTzxxBO8++67lvrPt2/f5vr169SsWROj0agb7SIiNkz/+ovVFB3QGwwG/vznP/P222/zzjvv4ODg
QMeOHQkMDOTKlSvs27ePypUrM3/+fIxGI/Xq1bOcRxMBsTVF+054eDhXrlxh1qxZBAQEMGXKFA4dOsQ777xDt27dMJvNnD17Fg8Pjx8FtwptxdYU7TunTp1i1KhRDB8+nOeff54yZcoQFRVFv3796NmzJwBOTk54eXnh5uZmOceWLVsICwtjxowZCm1F+PEK9kJFwyaDwUBubi6Ojo5WaKGI9fzwpkWNGjXw8vJi3bp1GAwGOnfuTEFBAQkJCUyZMoXJkyfj5OREmTJlKFOmDHC/jym0FRGxXboCiFUUHeRv2LCBixcv4u7uTpcuXTAajYSEhGA2m+nUqROpqalMmDCBGjVq4OzszOrVq4vVUBOxNYW/919++SWnT58mODjYckNjxYoVtGzZkm7dugGQl5dHmTJlqFKlijWbLGJ1RTfxW7p0KcnJyVy6dImIiAicnZ157733uHbtGhcuXGDhwoXUrVuXuLg4ABo2bGg5T61atYiMjMTf398aX0PEqrZs2cLVq1fJycnhmWeeoWHDhj85FjObzZawKTExEYCuXbvqpqHYlB8uMHniiScYMmQIS5YsISEhAYAXX3wRgLVr1xIcHMycOXOK3eTQfEdExLYpuBWrKByAzJo1i/Xr1+Pr60uZMmXw8vJiwIAB3Lt3j9GjR2M0GunQoQNJSUnY29vz2GOPYTAY9LiQ2DSz2czhw4fp168fnp6elCpVCoPBwKuvvsr169f54osv6NOnD/Xq1eP48eNkZmZaVhCK2KrCyXN0dDQxMTFMnjyZwMBATpw4QWRkJEFBQSxcuJCoqCg2b97M3r17cXd3JyEhATs7O0wmE0ajkbp161r5m4hYR3h4OElJSQQGBpKens7mzZvp2LEjgwcPBoqvLCz63wkJCUyePJnIyEiFtmKTYmJiOH78OAsWLADur7odNGgQUVFRxMfH4+joyIsvvsi9e/c4ffq05jgiIlKMatyK1Zw5c4ahQ4cSFhZG48aNyczMpGzZsgDcvHmT7du3M23aNCZPnkyPHj0sn9NKW5H7IiMjiYyMZPLkyXTp0oXSpUtjNpvZuXMnn332GdevX8fDw4PRo0djb2+vmrYiQEhICDVr1mTo0KEAXLlyhU2bNrFt2zZGjBhBQEAA+fn5ZGZmUqFCBd0sFOF+Pei33nqLsLAw/Pz82Lx5M2PHjmXx4sU88cQTeHp6Wo4tGtoW1oOeMWNGsTrRIo+yH85V9uzZQ3BwMO3bt2f69OmW19PT0xk2bBj5+fm8/vrrdOnS5SfPISIitktXA7Eqs9lsqR1YGNqePHmSt99+mzZt2vDaa6+RlJRU7DMaxIitM5lMAAwdOpQBAwYwbdo09uzZQ25uLgaDgbZt2zJp0iTmzZvH+PHjsbe3Jz8/X6Gt2Jyi96bNZjN5eXlcunSJq1evWl739PSkY8eOuLq6MnXqVDZt2oS9vT1ubm6WzTMV2oqtu3HjBgUFBfj5+bFz504mTZrEhAkTqFmzJlFRUXz99dfA/bDph6FtaGioQluxGUUD13PnznHq1Cl8fHxYuHAhu3fvZuzYsZZjq1atip+fH46OjqSlpRW7Zmm+IyIihXRFkD9EQUHBj14zGo189913nDlzBvh3GJWdnU1qaioZGRmEhIRYagyKyH12dnaWPjVq1Cj69OnD2LFj2bVrl6UfQfFBv4InsTVFA6T8/Hzy8vJwcHCgZ8+eHDt2jP3791uOrVq1KrVq1cLZ2ZnExET27t1reU8bYIotS0lJAaBOnTpUr16d+fPnExISwpgxY+jVqxdGo5F169Zx+vRp4N/XnaKhrTbxE1tS2AdmzpzJiBEj6NOnD/Hx8Xh6ejJ79mx27drF+PHjgftznlu3btGlSxeGDh2q642IiDyQZvLyuyt65/nixYsUFBRQoUIFvLy8CAoKYsqUKZQpU4bmzZsD4OPjg5ubG3fu3AGwrHjSYEbk34pu0Dd69GgAxo8fT05ODi+88IJWaohNK3rdiYmJITk5
GWdnZ/r06UP79u05ePAg8fHxFBQUEBAQQFZWFt999x3PPvss169f58iRIwQGBlr3S4hY2alTpxg1ahTDhw/n+eefp0yZMkRFRdGvXz9L3XQnJye8vLwsT0/B/Q3MwsLCmDFjhkJbsUnr169n/fr1LF26FHt7ewoKCqhZsyY1a9YkIiKCcePG0aJFC1xcXLCzs2PmzJma74iIyE9SjVv5w8ydO5dPPvkEg8FAdnY2PXr04M9//jO7d+9m1apVBAUFUaFCBXbs2EFGRgYffvihwieR/6BoQDVp0iQuXbrEBx98YOVWiZQMYWFhrF27lsDAQO7cucOFCxeIjo4GYPHixezduxcPDw9LmZGNGzcSGxvL9u3bWblyJQ4ODlb+BiLWsXTpUpKTk9mzZw81a9Zk3Lhx+Pn5MWjQIMqXL4+vry9169YlLi6OjIwMEhMTLeV4UlJSyMjIwN/f38rfQsQ6li5dyldffWXZjKzQwYMH+fbbb2nZsiVr167FxcWFV155RfsQiIjIz1JwK3+IDz/8kIiICGbNmsVf/vIXQkJCLCueypUrx44dO1izZg3ly5enTJkyzJ49GwcHBxXmF/kFivYTrdYQue/QoUOMHz+eqKgovL29SUxM5J133qFmzZrMnz8fb29vjh8/zrFjxyhfvjzdu3cHYNq0ady8eZPp06fj6Oho5W8h8seLjo4mOjqayZMnc/v2bU6cOMHZs2cJCgqifv36REVFsX//flxcXHB3d2fBggU4ODhgMpkwGo26BolNedBcJSwsjD179rBt2zYAcnNzsbe359133yUjI+NHga42wBQRkZ+j4FZ+Fz8cxISGhlJQUMDEiRPZuXMnY8eOZdy4cdSoUYPMzExat25NVlYWpUuXtgz6NYgRW/RzNyt+7r2i/SU3N1eBk9icH/aPLVu2sGrVKuLi4jh9+jShoaH85S9/4cqVK3z++edERETg4+NDXl4eKSkpHD58mGvXrpGUlERcXBw+Pj5W/DYi1hMSEkLNmjUZOnQoAFeuXGHTpk1s27aNESNGEBAQQH5+PpmZmVSoUEFjNrFZRa87R44c4fbt23h4eFCqVCmGDRtGq1atCAkJsRy/bt06NmzYwJIlSyhdurS1mi0iIg8ZLWWU35zZbLYMYvbv38/NmzcpU6YMbm5u7Nq1i5CQEIKDg+nWrRtfffUVoaGhZGVl4ezsjJ2dnXbxFptVdAKwZcsWYmNjWbJkCcnJycBP7zBctL8kJiayfv36YpuUiTzqil53IiIi2LBhA66urjg5OXHz5k02btxIgwYN6NOnD02bNuXy5ct06dKFTz/9FJPJxI0bN9i+fTtZWVkKbcVmmc1m8vLyuHTpElevXrW87unpSceOHXF1dWXq1Kls2rQJe3t73NzcNGYTm1Z43QkLC2P8+PGEhoaSkJDArVu3ePnllzl8+DBTpkwhOzub9PR0PvnkEypVqqTQVkRE/isaZclvquhj2vHx8SxfvpzVq1fj4eFBaGgoeXl5TJ48mR49egBQvnx5PDw8cHJyKhZK6TE7sUWFfSA8PJykpCQCAwNJT09n8+bNdOzYkcGDBwPF+1nR/05ISGDy5MlERkaqTprYlMI+sG3bNtavX8/kyZNp2bIltWvXxmg0cuTIEYKCgnB1daVy5cp06NCBZ555Bn9/f+zt7WnVqhUBAQEKoMRm5efnU1BQgKOjIz179iQ2Npb9+/fTokULAKpWrUqtWrUs9WxdXV0tG/hpzCa2bO/evWzcuJGYmBgqVqxIdnY21apVo0mTJri6urJq1SpatWrFY489hqOjI5GRkYBKW4mIyC+n2Yn8pgoHIFu2bGHz5s0MGjSIypUr06NHDy5fvkxMTAzVqlXj66+/pnz58mzZsgV3d3dtACPyLykpKezYsYPIyEj8/PzYvHkzY8eOxdfXlytXruDp6fnA0HbNmjWEh4ezYMEC2rRpY82vIGIVu3btIj4+nvr16xMQEADcXyl48uRJzp8/T/Xq
1TGbzcTExODi4kLPnj2Bf5cZ0c0OsVUxMTEkJyfj7OxMnz59aN++vWUfgoKCAgICAsjKyuK7777j2Wef5fr16xw5csQS3IrYstu3b1OlShWqVq2Ki4sLbm5uABw9epSUlBRWrVrF559/jqurK76+vtjZ2am0iIiI/Fd0xZDf3PXr10lOTiY5ObnYjsKjRo0iMzOTsWPHYjAYKFu2LEajkcTEREB3nsU2/bAu540bNygoKMDPz4+dO3cyadIkJkyYQM2aNYmKiuKNN96gWrVqxT5XGNqGhobStm1ba30VkT/UD68Zjo6OODg4cPDgQXbv3k2rVq0wGAz4+PjQpk0bOnfuzJNPPgnARx99ZDmHJs9iy8LCwli7di2BgYFkZmYycuRIoqOjCQ4OZvHixYwbNw4PDw9yc3MxGAwsWbKE2NhYtm/fTl5enm68i0150F4D33//PWlpaZQqVQr49z4DZ8+e5fjx4wDF5kMmk0nXHRER+a/oqiH/sx8OYipWrMjAgQPJzs4mKiqKp556yrIqY+rUqfz9738nMzMTk8lEYGCg7jyLzSpalzMlJYW6detSp04dqlevzvz584mNjWXcuHH07NmT9PR01q1bR/PmzalWrdoDQ9v27dtb8+uI/GGKXndyc3Oxs7OjRYsWeHp6Eh4eTnx8PKVKlaJZs2Y4ODgwYcIE2rdvT3Z2Np07d9Z1RwQ4dOgQW7duJS4uDm9vbxITE9m9ezcDBgxg/vz5hIaGcvz4cY4dO0b58uXp3r07cH+zsqpVq6L9jcWWFL3uXLx4kdzcXHx8fHjllVf48MMPGThwIMuXL7dsDuvt7Q3cX5Hr7u5uOY+e7hARkf+WZizyPykaPCUlJfHNN99gMBjo0qULwcHBODk5MWXKFIxGIy1btgSgUaNGxc6hO89ii4pOAE6dOsWoUaMYPnw4zz//PGXKlCEqKop+/fpZHud2cnLCy8vL8gge3C9JEhYWxowZMxTais0oet2Jjo7m9OnTpKam0rp1azp16sTo0aOZM2cOcXFxmM1m/P39cXNzo127dpZz6LojArdu3aJatWp4e3tz+vRpNmzYwPDhw7ly5QrDhw8nIiICPz8/fH19SUlJISoqimvXrpGUlERcXJwloBJ51BW97syZM4eNGzdy584dmjVrxsyZM5kwYQLTp0/n5ZdfZsqUKdy7d4/Y2Fjc3d2LjdtERER+DYNZt8vlVyoaPM2aNYt169ZRp04dbt26xcWLFwkPD+dPf/oTcXFx7Nmzh6lTp9K8eXOVRBCbV7QPLF26lOTkZPbs2UPNmjUZN24cfn5+DBo0iPLly+Pr60vdunWJi4uzbApTuFojJSWFjIyMYo/gidiKiIgIVq1axZgxY7h79y5r164F4OOPP+bLL78kOjoas9lM9+7dVYtTpIiIiAgef/xxypcvz/vvv094eDjR0dEYjUYGDx7Mnj17CAkJASAqKoqmTZty6NAhIiMj8fLyom/fvvj4+Fj5W4j88SIjI4mPj7eUEBk0aBBdu3YlODiYr7/+mqlTp5KWlkbZsmUpW7YsK1aswMHB4YElFkRERH4pBbfyP7t8+TLz58/npZde4umnnwZg9uzZrFy5kqVLl1K3bl3mzJlDYmIicXFxNGjQwMotFikZoqOjiY6OZvLkydy+fZsTJ05w9uxZgoKCqF+/PlFRUezfvx8XFxfc3d1ZsGABDg4OmEwmjEajboCIzcrKyuLNN9+kX79+tGrVigMHDjB06FBCQ0OpVasWNWrUIDU1lZkzZ9KwYUOCg4Ot3WSREmHbtm3MnDmTyZMnExAQQHp6Oi4uLvTv35+goCBatWrF0aNHWbNmDc888wxdu3a1rE43mUyqCy02yWw2c+3aNd58802GDh3Ks88+y/HjxxkwYAB5eXkEBgYSERGBwWDg0qVLGI1GS1krleUREZH/la4i8quZzWYOHTpE//79i+3QDRAcHEx2djbjxo1j/fr1vPbaa9SoUQNfX18rtlik
ZDl79iyvvvoqzz//PADNmzdn06ZNzJ8/nxEjRjBu3DhCQkLIzMykQoUKGAwGTQDEJhVdpZ6dnU1BQQHnzp3D09OTzz77jGHDhhESEkLr1q2ZNGkSjRo14qWXXmLcuHFaGSjyL7t27SI+Pp769esTEBAAgKenJydPnuT8+fNUr14ds9lMTExMsXFd4XVHtTnFVhkMBuzs7CgoKMDe3p5Lly4RGxtLcHAw/v7+dO7cmYkTJ9KnTx9q165tGacVHi8iIvK/0DMb8qsZDAb8/f0ZPnw4d+7c4fLly8D9FRkAPXr0wGw2880331CzZk369u2LnZ2d5X0RW1L04Qaz2UxeXh6XLl3i6tWrltc9PT3p2LEjrq6uTJ06lU2bNmFvb4+bmxsGg0ErncRmFYa28fHxbN68mbJly9KuXTvCw8MZPHgwEydOpHfv3jg5OZGenk5KSgoAdevWxWg0UlBQYM3mi5QIjo6OODg4cPDgQXbv3g3c71s+Pj60adOGzp0788ILL5Cens6MGTMAdN0R+RdnZ2fatm2Lp6cnR44cwcXFhaZNm1K+fHkee+wx1q1bR2JiYrH+ovIIIiLyW9DVRH61wgB2yJAh9O/fn9DQUD777DPLigwXFxdKlSpFfn5+sc9pxYbYmoKCAkvwlJ+fT15eHg4ODvTs2ZNjx46xf/9+y7FVq1alVq1aODs7k5iYyN69ey3vqTSC2Lpz586xcOFCTCYT/v7+pKen06xZM8vmfLm5uRgMBmrWrFnsc5o8iy3Lzc3FZDLRokULJkyYQOPGjYmPj+fQoUMAODg4MH78eCIiIujfvz8ff/wxDg4O5Ofn67ojwv1xXOnSpRk4cCC1a9dmz549VKpUidq1a+Po6Ei9evWIi4tj3Lhx1m6qiIg8gnQLXX61wkeGjEYjISEhmEwm3nzzTf7v//6PqlWrsnv3blxdXfHy8rJ2U0WspuiGFDExMSQnJ+Ps7EyfPn1o3749Bw8eJD4+noKCAgICAsjKyuK7777j2Wef5fr16xw5ckQbK4lNetBmLsOHD+fixYskJibSq1cv0tLS2Lt3r2WzpPPnz5OVlUWfPn2s1GqRkiU6OprTp09z6dIlWrVqRefOnQkJCWHu3LnExcUB0KxZM9zd3S03QOD+zXmttBW5r/BaZG9vT35+PtnZ2Vy4cIEDBw4QGxtLZmYmjRo1wmg0YjKZtEhFRER+U9qcTP5nRSfXc+fOJSoqioYNG/L000/z1ltvaRAjAoSFhbF27VoCAwO5c+cOFy5cIDo6GoDFixezd+9ePDw8LCsGN27cSGxsLNu3b2flypU4ODhY+RuIWMeFCxeoUKECbm5u5OTkMGfOnGL958CBAyQnJ3P58mU8PT0ZNmwY9vb2uu6IzYuIiGDVqlWMGTOGu3fvsnbtWgA+/vhjvvzyS6KjozGbzXTv3l03CMWm/fOf/6RGjRr/8bjCeusXL17ktdde47HHHsPZ2ZnY2FgcHBweeMNRRETkf6Vb6fI/K6wfaDQaGTlyJPb29ixdupQ+ffpgNBoxm82aPItNO3ToEFu3biUuLg5vb28SExPZvXs3AwYMYP78+YSGhnL8+HGOHTtG+fLl6d69OwBXrlyhatWq6P6a2JKiE9+DBw8SHBzM008/TdeuXQkMDCQoKIhOnTqxdOlSBg4cSPPmzWnevHmxc2gTP7F1WVlZfPHFF0yfPp1WrVpx4MABUlNTCQ0NJSUlhdq1azNw4EBmzpzJsWPHFNyKzVqyZAmff/45I0eO/I+bKBsMBgoKCnjiiSfYtm0bWVlZVKpUSZvHiojI70pXF/lNFA1vhw0bRnZ2NhMnTiQ3N5fOnTsruBWb8sMVF7du3aJatWp4e3tz+vRpNmzYwPDhw7ly5QrDhw8nIiICPz8/fH19SUlJISoqimvXrpGUlERcXByOjo5W/DYif5yifefo0aO4u7vz0ksvUbp0aYYOHUqXLl1o06YNY8aM4cCBA1y5coUqVar8qA6nJs9i
y7KzsykoKODcuXN4enry2WefMWzYMEJCQmjdujWTJk2iUaNGvPTSS4wbNw4fHx9rN1nEamrUqEFycjKxsbG89tpr1K9f/2ePL1yU4uLigouLi+V1zXVEROT3omc55IF+bgfun3rPaDSSl5cHwJgxY+jQoQOzZ8/m7t27v0sbRUois9lsCZ4iIiLYsGEDrq6uODk5cfPmTTZu3EiDBg3o06cPTZs25fLly3Tp0oVPP/0Uk8nEjRs32L59O1lZWcTFxWlCLTajaN8JCwtj8ODBDB06lOPHj+Pr60tSUhImk4lFixYxe/Zsjh8/zvnz57V5kkgR8fHxbN68mbJly9KuXTvCw8MZPHgwEydOpHfv3jg5OZGenk5KSgoAdevWtdx8F7FFHTp04NVXX8VkMrFixQrOnDnzX33+3Llz3LlzR9ciERH53Si4lR8puuJpy5YtxMbGsmTJEpKTk4Gf3p3bbDZb6nB+9NFHtG3blrVr1+Lq6vqHtFukJCgcuG/bto3169dTrlw5WrZsydSpUzEajRw5coRGjRrh6upK5cqV6dChA1OnTsXf359SpUrRqlUrEhMTee+99xTaik0p7DvLli0jKSmJ5cuXM2/ePJycnJg+fTq3bt1i+vTpzJ49mz//+c+kpaVZanaqnIjIfefOnWPhwoWYTCb8/f1JT0+nWbNmlo3HCuuo16xZs9jnVJdTbE3R60ZmZia5ubl88sknLFq0iFOnTv3s5wqvV/Hx8YwZM4YbN2787u0VERHbpVGa/Ejh4D08PJxp06Zx/vx5Dh8+zDvvvMOSJUssxxUd8BQdxCQkJDB+/Hjy8vKoXLnyH9t4kRJg165dxMfHU79+fQICAgDw9PQkLS2N8+fPU716dcxmMzExMRgMBnr27GnZqRjuP26nR73FFplMJr766itGjhyJn58fZrOZM2fOUKVKFcLCwti7dy/Vq1cnLCyM1atXM2/ePACtdBL5l+HDh1OjRg0SExNp37493bt35/bt2/Tt25eJEyfSp08fMjIy6NOnj7WbKmJVhdeNiIgIQkNDefrpp+nZsyfffvsty5cv58SJEz/6TNH5zpo1a5gzZw4DBgz4RRubiYiI/FoKbuWBUlJS2LFjB5GRkUybNo0ePXpw6dIlfH19uXLlCvDvAc8PBzEzZ85kwYIFtGnTxmrtF/kj/XC1n6OjIw4ODhw8eJDdu3cD9/uLj48Pbdq0oXPnzrzwwgukp6czY8YMyzkU1oqtu3v3LqdOnaKgoICMjAzef/99Xn/9dUaPHo3JZGLmzJmsWLECgPr162M0GjGZTNZttIiVXbhwgZs3bwLg4uJCnTp12LlzJwADBgxg8ODBBAYGkpOTQ9OmTUlKSsLe3l59R2xeTk4OJ06cYNiwYfTt25dJkyYxfPhwHB0diY2N5fTp05ZjTSZTsflOeHg406dPp0OHDtZqvoiI2AgFtwL8uG7tjRs3KCgowM/Pj507dzJp0iQmTJhAzZo1iYqK4uuvv7Z87oeDmNDQUNq2bfuHfwcRayjaB3JzczGZTLRo0YIJEybQuHFj4uPjOXToEAAODg5MmDCB+fPn079/f5KSknBwcCA/P18rBkUAV1dXwsLC8PHx4dixY2RnZ9OkSRO8vLzw8PDAycmJffv2FbtZog1hxJYdPHiQ3r178+6777J3716cnJwICgriH//4B0uXLgWgefPmDB06lPDwcEaOHGl5wkN9R2xdbm4uFy9eJDMz0/Ja8+bN6dy5MykpKcVKxRX2l4SEBMt8p127dtZotoiI2BgFt1JsQ5jCzSrq1KlD9erVmT9/PiEhIYwZM4ZevXphNBpZt26d5Q504eeKhraFddREHnVF+050dDTjxo2jR48eLFy4EHt7e0aPHk3p0qWJi4vj4MGDALi5udGuXTu6dOmCnZ0dJpNJK21FimjcuDENGzZk69atODg4UK9ePe7du0dOTg79+vWzlBhRXVuxdUePHsXd3Z2XXnqJunXrMnToUCZO
nMjf//53xowZQ2pqKleuXHlgX9F1R2zNDxepmM1mypQpwwsvvMC2bdsscyAAf39/PDw8OHnyJAcOHLC8vnr1aqZPn8706dM13xERkT+MglsbV3S14KlTpwgODmbr1q1UrFiRMmXKEBUVRe/evenZsycATk5OeHl54ebmZjnHli1bCAsLU2grNqdofbRly5bh7+9Pt27d2LlzJ2+99RZPPPEEb7zxBnZ2dqxevZq9e/f+6Bxa8SRSXGG/atSoEWfOnGHx4sUMGTKE27dv07FjR6B4iR4RWxQWFsbgwYMZOnQox48fx9fXl6SkJEwmE4sWLWL27NkcP36c8+fPq6+IzSt6o33lypVMmTKFyMhIbt68SdeuXXFzc2PZsmWcOXMGgKysLEqXLs3LL79MUFAQAP/85z/55JNPCAsL00pbERH5QxnMWrJis4pOfJcuXUpycjJ79uyhZs2ajBs3Dj8/PwYNGkT58uXx9fWlbt26xMXFkZGRQWJioiVwSklJISMjA39/f2t+HRGryMrK4s0336Rfv360atWKAwcOMHToUEJDQ6lVqxY1atQgNTWVmTNn0rBhQ4KDg63dZJGHws2bN1m6dCnHjx+nUqVKzJkzBwcHB0wmk254iE1btmwZMTExLFq0CAcHByIiIrhy5QpTpkyhSZMmpKWlERkZyaZNm2jdujXz58/XzQ6xWQUFBZbQdu7cuaxevZo6deqQkZFBqVKlWLZsGRcuXGDp0qWcOHECHx8fS83odevWFbveXL16VRsvi4jIH07BrRAdHU10dDSTJ0/m9u3bnDhxgrNnzxIUFET9+vWJiopi//79uLi44O7uzoIFCyyTZ6PRqImA2JSik9/s7Gzy8/Np164dK1asICMjgyFDhjBq1Ci6d+/OpEmTaNSoES+99BIpKSn4+PhYJg8i8svk5OTg6OiIwWAgPz9fj3iLTTOZTIwcOZLmzZvTs2dPvvrqK4YMGYK3tzfff/89QUFBBAYGAnDixAl8fX113REB0tLSiIqKomfPntSvX58vv/yShQsXcuvWLaKioihVqhQ7d+7k4sWLlC9fnt69e1s28dN8R0RErEnBrRASEkLNmjUZOnQoAFeuXGHTpk1s27aNESNGEBAQQH5+PpmZmVSoUEGTZxEgPj4eR0dHevTowaRJk0hPT+eLL75g0qRJdOvWDYBXX32V2rVrM2XKFMvniq78EJFfTisGRe4/5fHiiy/yxhtv0L59e6ZNm8ZTTz2Fv78/Y8eO5d69e/Ts2ZO+fftaPqNV6mLLzGYzhw4don///pYFKI0aNQIgOTmZhQsXkpGRQWRkJB4eHsU+q/mOiIiUBEoPbEzRnN5sNpOXl8elS5e4evWq5XVPT086duyIq6srU6dOZdOmTdjb2+Pm5mbZEEaDGLF1586dY+HChZhMJvz9/UlPT6dZs2aWOs+5ubkYDAZq1qxZ7HMKbUV+HYW2IuDq6kpYWBg+Pj4cO3aM7OxsmjRpgpeXFx4eHjg5ObFv375i4z2FtmJriv7+GwwG/P39GTJkCDdu3ODs2bPk5uYC0LBhQ4YOHUqlSpXo2bMnN27cKHYezXdERKQk0NXIhhRd6Zefn09BQQGOjo707NmT2NhY9u/fT4sWLQCoWrUqtWrVstSzdXV1tTx6p8mz2JoHrZIdPnw4Fy9eJDExkV69epGWlsbevXvp27cvPj4+nD9/nqysLPr06WOlVouIyKOocePGGAwGgoODcXBwoF69ety7d4+cnBz69evHCy+8AGiVutimomO2jIwM7t27R5UqVRg+fDgA06ZNo1y5crRt2xYHBwcaNGjA66+/zp49eyhfvrwVWy4iIvJgCm5tRNFBTExMDMnJyTg7O9OnTx/at2/PwYMHiY+Pp6CggICAALKysvjuu+949tlnuX79OkeOHLEEtyK2prDvXLhwgQoVKuDm5oaLiwt16tRh586d9OrViwEDBuDj40NycjKXL1+madOmDBs2zFIfTSueRETk
t1AYxjZq1IgPPviAxYsX8/nnn5OVlUXHjh0BhbZim8xms2XMFhkZyaeffsqtW7dwd3enb9++DB8+nLy8PMaMGYPBYKBt27bY29vTpEkTmjRpAqi0iIiIlDyqcWtjwsLCWLt2LYGBgdy5c4cLFy4QHR0NwOLFi9m7dy8eHh6Wx7w3btxIbGws27dvZ+XKlTg4OFj5G4j8cYre8Dh48CDBwcE8/fTTdO3alcDAQDIzM+nUqROvvvoqAwcOfOA5VB9NRER+Dzdv3mTp0qUcP36cSpUqMWfOHMvmsQqexJbFxsaybNky3n33Xf7yl7/Qp08f7t27x/Lly/H09GT27NnExcUxefJkOnfurP4iIiIlmtIEG3Lo0CG2bt1KXFwc3t7eJCYmsnv3bgYMGMD8+fMJDQ3l+PHjHDt2jPLly9O9e3fg/mZlVatWRRm/2JKioe3Ro0dxd3fnpZdeonTp0gwdOpQuXbrQpk0bxowZw4EDB7hy5QpVqlT50QonhbYiIvJ7cHNzY+zYseTk5ODo6KjNY8XmFe7fkZyczKBBg2jXrh2HDh3i8uXLTJ06lbS0NLKzswkODiYjI4OPPvqILl26WLvZIiIiP0sju0fYD+ty3rp1i2rVquHt7c3p06fZsGEDw4cP58qVKwwfPpyIiAj8/Pzw9fUlJSWFqKgorl27RlJSEnFxcTg6Olrx24j8cYo+ahcWFkZCQgJubm54enoycOBAkpKSWL58OYsWLeL69es4Ojpy/vx5PD09rdxyERGxNU5OTgDaPFZsnsFgwNHREXt7e8qVK8fu3bsJDg5m1KhRdOrUiQkTJvDdd9+xbNky3nvvPS1KERGRh4K2N39EFQ2eIiIi2LBhA66urjg5OXHz5k02btxIgwYN6NOnD02bNuXy5ct06dKFTz/9FJPJxI0bN9i+fTtZWVnExcXh4+Nj5W8k8scpXDW7bNkyS0g7b948nJycmD59Ordu3WL69OnMnj2bP//5z6SlpbF27VoATQJERMQqVNNW5L6KFSsyf/58QkJCGDt2LL179wagcuXKxcq+GQwGjdtERKTE0235R1Th4H3btm2sX7+eyZMn07JlS2rXro3RaOTIkSMEBQXh6upK5cqV6dChA8888wz+/v7Y29vTqlUrAgICtHpDbJbJZOKrr75i5MiR+Pn58dVXX3HmzBm8vb0JCwsjKCiIwMBAwsLC6N27N76+voAmziIiIiLWNHbsWFJSUsjLy7Nsuly6dGm++OILatWqVexYjdtERKSk04rbR9iuXbuIj4+nfv36BAQEAODp6UlaWhrnz5+nevXqmM1mYmJiMBgM9OzZE3t7e/Lz8wGws7NTaCs26+7du5w6dYqCggIyMjJ4//33ef311xk9ejQmk4mZM2eyYsUKAOrXr4/RaMRkMlm30SIiIiI2zGQyYTAYWLhwIW5ubvTv35+XX36ZXr16kZGRwaRJkwA9ISUiIg8PpXKPELPZXOyusaOjIw4ODhw8eJDdu3fTqlUrDAYDPj4+tGnThs6dO/Pkk08C8NFHH1nOobBWBFxdXQkLC8Pe3p5jx46RnZ1NkyZN8PLywsPDg2+++YZ9+/bx2muvWfqddiUWERERsR47OzsKCgooW7Ys69evZ+PGjWRkZODo6Ej37t0ti1Q03xERkYeFwazbjY+EohuR5ebmYmdnh52dHRcuXCA8PJy8vDz69+9Ps2bNALh58yZffPEF2dnZdO7cGTs7Ow1iRH6g8GZIcHAw+fn5REREcO/ePYKCgnjxxRd54YUXih0nIiIiItZnMpkeeENd8x0REXnYKLh9BBQNjaKjozl9+jSpqam0bt2aTp06YTKZmDNnDgUFBfTu3Rt/f/8fneOnBjciAvHx8XzwwQd06dKFzz//nKysLFavXo2dnZ1CWxEREZHfWdFFKr/k9UKa44iIyMNOwe0jJCIiglWrVjFmzBju3r1r2eX+448/5ssvvyQ6Ohqz2Uz37t0JDAy0bmNFHiI3b95k
6dKlHD9+nEqVKjFnzhwcHBw0GRARERH5nRUNZ1NTUy37cHh6egI//eRT0ddPnjyJh4cHFStW/OMaLiIi8htQcPuIyMrK4s0336Rfv360atWKAwcOMHToUEJDQ6lVqxY1atQgNTWVmTNn0rBhQ4KDg63dZJGHTk5ODo6OjhgMBj1qJyIiIvIHmjNnDnv27OHmzZu4ubkREBDAqFGjgB+Ht0X/HBcXx8qVK1m0aBG1a9e2SttFRER+LaUOD6mig5Hs7GwKCgo4d+4cnp6efPbZZwwbNoyQkBBat27NpEmTaNSoES+99BLjxo3Dx8fHyq0XeTg5OTkB2sRPRERE5I8UHx/PRx99xOzZsykoKODq1atMnTqV77//nr/97W8YDAbL/KjoPGnNmjXMmzePqVOnKrQVEZGHkpKHh1ThYCQ+Ph5HR0d69OhBu3btCA8P54svvmDSpEl069YNgPT0dEqVKgVA3bp1gf9cD0pEfppq2oqIiIj8PtLS0ihXrhxly5a1zFnOnj1L165dadq0qeW4KlWqMHjwYJ544gn69ev3wNA2PDyc0NBQ2rdvb62vIyIi8j9RcveQO3fuHAsXLsRkMuHv7096ejrNmjWzDE5yc3MxGAzUrFmz2OcU2oqIiIiISEmSl5fHtm3b2L9/PwAXL14E4MyZM1y7ds1ynMlkomnTpvTo0YPDhw+Tk5ODyWSyhLYJCQkKbUVE5JGgFbcPkQetkh0+fDgXL14kMTGRXr16kZaWxt69e+nbty8+Pj6cP3+erKws+vTpY6VWi4iIiIiI/GcODg5cvnyZdevWsX79evLz84mJieHFF19k5cqV7Nu3j5YtW1o2h3V1dSU/P99Szgruh7bTpk1j1qxZtGvXzlpfRURE5DehZZcPkcLQ9sKFC9y8eRMAFxcX6tSpw86dOwEYMGAAgwcPJjAwkJycHJo2bUpSUhL29vaYTCartV1EREREROQ/ee+993B1dWXfvn0888wzADRr1gxvb29WrVrFnj17APj+++/58ssvqV69OnB/D4KrV69y4MABhbYiIvLIMJjNZrO1GyE/r+hK24MHDxIcHMzTTz9N165dCQwMJDMzk06dOvHqq68ycODAB54jPz9fmymJiIiIiEiJlZeXR15eHu+88w75+fmcP3+ewYMH07lzZ06ePMnq1avZtWsX7u7uGI1GDAYD69atw8HBwXKOmzdv4ubmZsVvISIi8ttRcFvCFQ1tjx49StmyZdm6dSulS5cmMjKSLl260KZNG+7cucOBAwcYNmwYVapU0eZJIiIiIiLyUHvnnXc4evQow4YNo1OnTty5c4fz589z4sQJypUrR4cOHbC3tyc/Px87OzvNgURE5JGj4LYEK7oralhYGAkJCbi5ueHp6cnAgQPx8PBg+fLlXLhwgevXr+Po6Mj48eNp2bKllVsuIiIiIiLy6xR9WnDSpEl8/vnnBAUF8Ze//IVr167h4+NjOdZkMllq3oqIiDxqFNw+BJYtW0ZMTAyLFi3CwcGBiIgIrly5wpQpU2jSpAlpaWlERkayadMmWrduzfz584uFviIiIiIiIg+TooHspEmT+OyzzzCZTDz55JMsWbJEcx0REbEJCm5LOJPJxMiRI2nevDk9e/bkq6++YsiQIXh7e/P9998TFBREYGAgACdOnMDX19dSWkFERERERKQk2bRpE88++ywuLi7/8dii4W1SUhI3btzgtdde094dIiJiM5TwlXB3797l1KlTFBQUkJGRwfvvv8/rr7/O6NGjMZlMzJw5kxUrVgBQv359jEYjJpPJuo0WERERERH5gRMnTjBq1CiWLVvG3bt3/+PxdnZ2lrlNly5d6N+/v6WmrYiIiC1QcFvCubq6EhYWho+PD8eOHSM7O5smTZrg5eWFh4cHTk5O7Nu3j6ILp1XjSURERERESpr69esTGRnJ0qVLiYqKIisr6z9+pmh4W0grbkVExFYouH0ING7cmIYNG7J161YcHByoV68e9+7d
Iycnh379+hETE4PBYEBVL0REREREpCQqXCXbpk0bpk2bRlRUFKtWrfqP4a3ZbLYsTNm0aRP/+Mc/fve2ioiIlBQKbh8ChYX3GzVqxJkzZ1i8eDFDhgzh9u3bdOzYEUCbkYmIiIiISIlVuEo2LCyMkydP4urqypw5c4iOjubOnTsP/EzROU5CQgKjRo3i6tWrf1ibRURErE3PmDxEnn/+edLS0ti7dy+VKlUiKirK8uiQyiOIiIiIiEhJtmXLFtatW8fixYvp1q0bqampTJw4EXt7e/r161dsw7Kioe2aNWuYNWsW8+fPp0WLFtZqvoiIyB9Owe1DxM3NjbFjx5KTk4OjoyMGg4H8/HzVeBIRERERkRIvLS2NunXr0rhxYwDq1q1LmTJlGDFiBAUFBbz++uu4urr+KLQNDw8nNDSUdu3aWbP5IiIifziVSngIOTk5WWraKrQVEREREZGSpqCg4Eevubm5cefOHdLT0y3HtGzZkoEDB7J06VKWLVtGVlZWsfIIhaFt+/bt/9D2i4iIlAQKbh9iqmkrIiIiIiIlTUFBAUbj/anm5cuXuX37NgB+fn5cv36ddevWkZmZaTnG3d2dxx9/nLNnz1rKJcTHx/Pee+8xffp0hbYiImKztFxTREREREREfhNms9kSyM6ePZvt27eTk5PDq6++ymuvvcbEiRMZMWIE9+7do2nTpvzpT39i+/btdO3alb59+1oWp7i4uDBz5kyVRxAREZtmMJvNZms3QkRERERERB5uRWvT7tq1i8mTJzNhwgT+/ve/8/nnn+Pv78/bb7/NwYMHiYqKIjU1lYoVK2Jvb09CQgIODg7aeFlERKQIBbciIiIiIiLym9myZQs7d+6kbt26DBgwAIDVq1eTmJjIM888Q1BQEEajkW+//Zbc3FyefPJJjEajNl4WERH5AV0VRURERERE5FcrutI2KyuLY8eOsXPnTipWrGg55uWXXwZg3bp12NnZ0a1bNx5//HHL+wUFBQptRUREfkBXRhEREREREflVim5ElpaWhpOTEwEBAZQrV46YmBhatGhBixYtgPvhrdFoJCoqiipVqlCrVi3LeQrPISIiIv+mUgkiIiIiIiLyP5k7dy779+/nzp07lC1blieeeILy5cuzY8cO3nvvPZo1a2Y5dufOnTz77LOqZSsiIvIf6LamiIiIiIiI/GorV65kzZo1TJgwgRUrVuDr68v69evx9/enTZs2vPPOOxw6dMhyfJs2bbCzs8NkMlmx1SIiIiWfglsRERERERH51S5dukSvXr1o3Lgxp06dYuPGjcyYMQNXV1cqVqxI586dGTx4MCdPniz2Oa24FRER+XkKbkVEREREROS/ZjabMZvNfPPNN7i4uHDy5ElCQkIYOXIkXbp04cSJExw5coSAgACGDRtGnTp1rN1kERGRh4o2JxMREREREZH/msFgAODFF19kwoQJzJkzh5kzZ/LCCy8AkJubS25uLn5+fvj5+QFgMpm00lZEROQX0opbERERERER+dVatGhBt27dqFGjBuXKlQPg+++/59ixY3h4eBQ7VqGtiIjIL2cwm81mazdCREREREREHl43btwgOjqa1atXU7VqVcxmMw4ODqxduxYHBwfMZrNlha6IiIj8MgpuRURERERE5H+Wn5/P2bNnOX36NGXKlKFNmzbY2dmRn5+Pvb2q9ImIiPy3FNyKiIiIiIjI70I1bUVERH49BbciIiIiIiIiIiIiJYw2JxMREREREREREREpYRTcioiIiIiIiIiIiJQwCm5FREREREREREREShgFtyIiIiIiIiIiIiIljIJbERERERERERERkRJGwa2IiIiIiIiIiIhICaPgVkREROR3Zjabrd0EERERERF5yCi4FREREQH+7//+D29vb3r16vWTx4wcORJvb2/Gjh37i8977NgxBg4c+B+PW7BgAd7e3r/4vCIiIiIi8mizt3YDREREREoKo9FIcnIy3377LR4eHsXey87OZs+e
Pf/1ORMTE7lw4cJ/PK5Hjx60aNHivz6/iIiIiIg8mrTiVkRERORf6tati5OTE9u2bfvRe3v27KF06dJUrlz5d/nZHh4eNGzY8Hc5t4iIiIiIPHwU3IqIiIj8i7OzMwEBAQ8Mbrds2UL79u2xt//3A0sFBQUsXbqUtm3b4uvrS/v27Vm5cqXl/bFjx/Lxxx+Tnp6Ot7c3H330EV9//TXe3t7Exsby3HPP0aBBA9atW/fAUglJSUn89a9/pUGDBgQGBjJ79mxyc3MBuHfvHu+++y4tW7bE19eX5557juXLl/9OfzMiIiIiIvJHU3ArIiIiUkSHDh0s5RIKZWVlsW/fPjp16lTs2HfffZf58+fzwgsvsGTJEp577jlCQ0NZuHAhAEOGDCEgIIDHHnuMhIQEAgMDLZ9dsGABb7zxBjNnzsTf3/9H7YiPj2fMmDE89dRTREZGMnDgQFauXMl7770HQGhoKPv27WPMmDEsX76c1q1bM3PmTNatW/c7/K2IiIiIiMgfTTVuRURERIoIDAykdOnSbNu2jb59+wKwY8cO3N3dady4seW41NRUPvzwQ95++23L5mPNmzfHYDAQFRXFK6+8Qo0aNXBzc8PR0dFSBiE7OxuA559/nm7duj2wDQUFBSxcuJA2bdpYglqAu3fvsnnzZvLy8jh69Cj+/v507NgRgKZNm+Ls7Iy7u/tv/VciIiIiIiJWoBW3IiIiIkWUKlWKVq1aFSuXsHnzZp5//nkMBoPltcOHD2M2m2nVqhX5+fmW/7dq1YqcnByOHTv2sz+nTp06P/leamoqN27coG3btsVe79+/Px999BEODg40bdqUDz/8kDfeeIO4uDjS0tIICgoqtqpXREREREQeXlpxKyIiIvIDzz//PEOHDuXbb7/FycmJzz77jLfeeqvYMbdu3QKwrHj9oatXr/7sz3B2dv7J9wrP/XOrZydMmICHhwcbNmzgb3/7G3/729/w8/Pj3XffxcfH52d/toiIiIiIlHwKbkVERER+oGXLlri4uLBt2zacnZ2pVq0avr6+xY4pW7YsAO+//z4uLi4/Ooenp+ev/vmF575582ax1zMyMkhJScHPzw9nZ2fefPNN3nzzTa5cucKePXtYtGgRwcHBbN68+Vf/bBERERERKRlUKkFERETkBxwdHWnTpg3bt29n69atD1xV26RJE+B+mFqvXj3L/2/evElERIRl1azR+N8Pt5544gkqVKjAnj17ir2+fv16Bg4cSFZWFu3btycmJga4HxL37t2bjh07cuXKlf/654mIiIiISMmjFbciIiIiD9ChQwcGDRqE0Whk4sSJP3rf29ubF154gXfeeYf09HR8fX1JTU1l7ty5VKtWjVq1agH3V89ev36dTz/99Gfr2hZlZ2fHsGHDmDp1Ku7u7rRq1YrU1FTmz59P7969qVSpEk899RSRkZE4ODjg7e1NamoqH3/8Me3bt/8t/xpERERERMRKFNyKiIiIPECzZs0oW7YsVapUoXbt2g88Zvr06URFRbFmzRq+/fZb3N3d6dChA2+99RZ2dnYAdO3alU8//ZSgoCCGDx9Ohw4dftHP7927N87OzixfvpyEhAQ8PDx44403eOONNwCYOnUq8+bNIyYmhmvXruHu7k737t0ZMWLEb/MXICIiIiIiVmUwm81mazdCRERERERERERERP5NNW5FREREREREREREShgFtyIiIiIiIiIiIiIljIJbERERERERERERkRJGwa2IiIiIiIiIiIhICaPgVkRERERERERERKSEUXArIiIiIiIiIiIiUsIouBUREREREREREREpYRTcioiIiIiIiIiIiJQwCm5FREREREREREREShgFtyIiIiIiIiIiIiIljIJbERERERERERERkRJGwa2IiIiIiIiIiIhICfP/WtVVzDPJrHQAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 1400x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#visualizing our performance\n",
    "plot_performance('evaluation/json_results', ['Basic RAG', 'Summary Indexing'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Level 3 - Re-Ranking with Claude\n",
    "In this final enhancement to our retrieval system, we introduce a reranking step to further improve the relevance of the retrieved documents. This approach leverages Claude's power to better understand the context and nuances of both the query and the retrieved documents.\n",
    "\n",
    "The `rerank_results` function uses Claude to reassess and reorder the initially retrieved documents:\n",
    "1. It presents Claude with the query and summaries of all retrieved documents.\n",
    "2. Claude is asked to select and rank the most relevant documents.\n",
    "3. The function parses Claude's response to get the reranked document indices.\n",
    "4. It includes fallback mechanisms in case of errors or insufficient results.\n",
    "5. Finally, it assigns descending relevance scores to the reranked results.\n",
    "\n",
    "The `retrieve_advanced` function implements the new retrieval pipeline:\n",
    "1. We initially retrieve more documents than needed (default 20, configurable via `initial_k`) from the vector database.\n",
    "2. We then use the `rerank_results` function to refine this larger set down to the most relevant documents (default 3, configurable via `k`).\n",
    "3. Finally, it generates a new context string from these reranked documents.\n",
    "\n",
    "This process casts a wider net initially and then uses AI to focus on the most pertinent information. By combining vector-based retrieval with LLM reranking, this approach aims to provide more accurate and contextually appropriate responses to user queries.\n",
    "\n",
    "Our evaluations show significant improvements:\n",
    "- Accuracy increased from 78% in our previous system to 85%.\n",
    "- Precision was improved by using our re-ranker to reduce the number of documents shown to the LLM.\n",
    "- MRR (Mean Reciprocal Rank) was likely improved by asking Claude to rank the relevance of each document in order.\n",
    "\n",
    "These improvements demonstrate the effectiveness of incorporating AI-powered reranking in our retrieval process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import List, Dict\n",
    "\n",
    "def rerank_results(query: str, results: List[Dict], k: int = 5) -> List[Dict]:\n",
    "    # Prepare the summaries with their indices\n",
    "    summaries = []\n",
    "    print(len(results))\n",
    "\n",
    "    for i, result in enumerate(results):\n",
    "        summary = f\"[{i}] Document Summary: {result['metadata']['summary']}\"\n",
    "        summaries.append(summary)\n",
    "    joined_summaries = \"\\n\\n\".join(summaries)\n",
    "    \n",
    "    prompt = f\"\"\"\n",
    "    Query: {query}\n",
    "    You are about to be given a group of documents, each preceded by its index number in square brackets. Your task is to select the only {k} most relevant documents from the list to help us answer the query.\n",
    "    \n",
    "    <documents>\n",
    "    {joined_summaries}\n",
    "    </documents>\n",
    "\n",
    "    Output only the indices of {k} most relevant documents in order of relevance, separated by commas, enclosed in XML tags here:\n",
    "    <relevant_indices>put the numbers of your indices here, seeparted by commas</relevant_indices>\n",
    "    \"\"\"\n",
    "    try:\n",
    "        response = client.messages.create(\n",
    "            model=\"claude-3-haiku-20240307\",\n",
    "            max_tokens=50,\n",
    "            messages=[{\"role\": \"user\", \"content\": prompt}, {\"role\": \"assistant\", \"content\": \"<relevant_indices>\"}],\n",
    "            temperature=0,\n",
    "            stop_sequences=[\"</relevant_indices>\"]\n",
    "        )\n",
    "        \n",
    "        # Extract the indices from the response\n",
    "        response_text = response.content[0].text.strip()\n",
    "        indices_str = response_text\n",
    "        relevant_indices = []\n",
    "        for idx in indices_str.split(','):\n",
    "            try:\n",
    "                relevant_indices.append(int(idx.strip()))\n",
    "            except ValueError:\n",
    "                continue  # Skip invalid indices\n",
    "        print(indices_str)\n",
    "        print(relevant_indices)\n",
    "        # If we didn't get enough valid indices, fall back to the top k by original order\n",
    "        if len(relevant_indices) == 0:\n",
    "            relevant_indices = list(range(min(k, len(results))))\n",
    "        \n",
    "        # Ensure we don't have out-of-range indices\n",
    "        relevant_indices = [idx for idx in relevant_indices if idx < len(results)]\n",
    "        \n",
    "        # Return the reranked results\n",
    "        reranked_results = [results[idx] for idx in relevant_indices[:k]]\n",
    "        # Assign descending relevance scores\n",
    "        for i, result in enumerate(reranked_results):\n",
    "            result['relevance_score'] = 100 - i  # Highest score is 100, decreasing by 1 for each rank\n",
    "        \n",
    "        return reranked_results\n",
    "    \n",
    "    except Exception as e:\n",
    "        print(f\"An error occurred during reranking: {str(e)}\")\n",
    "        # Fall back to returning the top k results without reranking\n",
    "        return results[:k]\n",
    "\n",
    "def retrieve_advanced(query: str, db: SummaryIndexedVectorDB, k: int = 3, initial_k: int = 20) -> Tuple[List[Dict], str]:\n",
    "    # Step 1: Get initial results\n",
    "    initial_results = db.search(query, k=initial_k)\n",
    "\n",
    "    # Step 2: Re-rank results\n",
    "    reranked_results = rerank_results(query, initial_results, k=k)\n",
    "    \n",
    "    # Step 3: Generate new context string from re-ranked results\n",
    "    new_context = \"\"\n",
    "    for result in reranked_results:\n",
    "        chunk = result['metadata']\n",
    "        new_context += f\"\\n <document> \\n {chunk['chunk_heading']}\\n\\n{chunk['text']} \\n </document> \\n\"\n",
    "     \n",
    "    return reranked_results, new_context\n",
    "\n",
    "# The answer_query_advanced function remains unchanged\n",
    "def answer_query_advanced(query: str, db: SummaryIndexedVectorDB):\n",
    "    documents, context = retrieve_advanced(query, db)\n",
    "    prompt = f\"\"\"\n",
    "    You have been tasked with helping us to answer the following query: \n",
    "    <query>\n",
    "    {query}\n",
    "    </query>\n",
    "    You have access to the following documents which are meant to provide context as you answer the query:\n",
    "    <documents>\n",
    "    {context}\n",
    "    </documents>\n",
    "    Please remain faithful to the underlying context, and only deviate from it if you are 100% sure that you know the answer already. \n",
    "    Answer the question now, and avoid providing preamble such as 'Here is the answer', etc\n",
    "    \"\"\"\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-haiku-20240307\",\n",
    "        max_tokens=2500,\n",
    "        messages=[{\"role\": \"user\", \"content\": prompt}],\n",
    "        temperature=0\n",
    "    )\n",
    "    return response.content[0].text"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading vector database from disk.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   0%|          | 0/100 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "18\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   1%|          | 1/100 [00:00<01:31,  1.09it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,2,7\n",
      "[0, 2, 7]\n",
      "15\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   2%|\u258f         | 2/100 [00:01<01:30,  1.09it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   3%|\u258e         | 3/100 [00:02<01:21,  1.19it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,13,15\n",
      "[1, 13, 15]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   4%|\u258d         | 4/100 [00:03<01:18,  1.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,6\n",
      "[0, 1, 6]\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   5%|\u258c         | 5/100 [00:04<01:21,  1.17it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "11\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   6%|\u258c         | 6/100 [00:05<01:21,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   7%|\u258b         | 7/100 [00:06<01:20,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,5,11\n",
      "[0, 5, 11]\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   8%|\u258a         | 8/100 [00:06<01:21,  1.13it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,7\n",
      "[0, 1, 7]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:   9%|\u2589         | 9/100 [00:07<01:19,  1.15it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,19,10\n",
      "[1, 19, 10]\n",
      "10\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  10%|\u2588         | 10/100 [00:08<01:18,  1.14it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,0,1\n",
      "[2, 0, 1]\n",
      "Processed 10/100 items. Current Avg Precision: 0.5000, Avg Recall: 0.8000, Avg MRR: 1.0000\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  11%|\u2588         | 11/100 [00:09<01:16,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,4,11\n",
      "[0, 4, 11]\n",
      "8\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  12%|\u2588\u258f        | 12/100 [00:10<01:20,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,3,2\n",
      "[0, 3, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  13%|\u2588\u258e        | 13/100 [00:11<01:17,  1.12it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4,3,6\n",
      "[4, 3, 6]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  14%|\u2588\u258d        | 14/100 [00:12<01:16,  1.12it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "9,5,0\n",
      "[9, 5, 0]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  15%|\u2588\u258c        | 15/100 [00:13<01:15,  1.12it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,7,13\n",
      "[2, 7, 13]\n",
      "7\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  16%|\u2588\u258c        | 16/100 [00:13<01:13,  1.15it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,0\n",
      "[1, 2, 0]\n",
      "19\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  17%|\u2588\u258b        | 17/100 [00:14<01:10,  1.17it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,4,3\n",
      "[1, 4, 3]\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  18%|\u2588\u258a        | 18/100 [00:15<01:06,  1.23it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,5,3\n",
      "[1, 5, 3]\n",
      "5\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  19%|\u2588\u2589        | 19/100 [00:16<01:04,  1.26it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  20%|\u2588\u2588        | 20/100 [00:17<01:10,  1.13it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,17,18\n",
      "[0, 17, 18]\n",
      "Processed 20/100 items. Current Avg Precision: 0.4333, Avg Recall: 0.7250, Avg MRR: 0.9667\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  21%|\u2588\u2588        | 21/100 [00:18<01:06,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,5,6\n",
      "[0, 5, 6]\n",
      "17\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  22%|\u2588\u2588\u258f       | 22/100 [00:19<01:09,  1.13it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,9,3\n",
      "[1, 9, 3]\n",
      "16\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  23%|\u2588\u2588\u258e       | 23/100 [00:20<01:11,  1.08it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  24%|\u2588\u2588\u258d       | 24/100 [00:21<01:16,  1.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,11,14\n",
      "[0, 11, 14]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  25%|\u2588\u2588\u258c       | 25/100 [00:22<01:12,  1.03it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,14,16\n",
      "[0, 14, 16]\n",
      "15\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  26%|\u2588\u2588\u258c       | 26/100 [00:22<01:07,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,4\n",
      "[0, 1, 4]\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  27%|\u2588\u2588\u258b       | 27/100 [00:23<01:03,  1.15it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  28%|\u2588\u2588\u258a       | 28/100 [00:24<00:59,  1.21it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,1,3\n",
      "[2, 1, 3]\n",
      "18\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  29%|\u2588\u2588\u2589       | 29/100 [00:25<00:58,  1.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,11\n",
      "[1, 2, 11]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  30%|\u2588\u2588\u2588       | 30/100 [00:26<00:59,  1.17it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0, 4, 7\n",
      "[0, 4, 7]\n",
      "Processed 30/100 items. Current Avg Precision: 0.4556, Avg Recall: 0.7389, Avg MRR: 0.9611\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  31%|\u2588\u2588\u2588       | 31/100 [00:26<00:56,  1.23it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,3,4\n",
      "[0, 3, 4]\n",
      "9\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  32%|\u2588\u2588\u2588\u258f      | 32/100 [00:27<00:55,  1.23it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,0\n",
      "[1, 2, 0]\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  33%|\u2588\u2588\u2588\u258e      | 33/100 [00:28<00:54,  1.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,0,4\n",
      "[1, 0, 4]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  34%|\u2588\u2588\u2588\u258d      | 34/100 [00:29<00:55,  1.20it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  35%|\u2588\u2588\u2588\u258c      | 35/100 [00:30<00:52,  1.25it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,7\n",
      "[0, 1, 7]\n",
      "16\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  36%|\u2588\u2588\u2588\u258c      | 36/100 [00:31<00:52,  1.21it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "10\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  37%|\u2588\u2588\u2588\u258b      | 37/100 [00:31<00:53,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5,6,8\n",
      "[5, 6, 8]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  38%|\u2588\u2588\u2588\u258a      | 38/100 [00:32<00:53,  1.17it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4,11,3\n",
      "[4, 11, 3]\n",
      "2\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  39%|\u2588\u2588\u2588\u2589      | 39/100 [00:33<00:52,  1.15it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 0, 0\n",
      "[1, 0, 0]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  40%|\u2588\u2588\u2588\u2588      | 40/100 [00:34<00:50,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,6,16\n",
      "[2, 6, 16]\n",
      "Processed 40/100 items. Current Avg Precision: 0.4583, Avg Recall: 0.7167, Avg MRR: 0.9042\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  41%|\u2588\u2588\u2588\u2588      | 41/100 [00:35<00:49,  1.19it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,5\n",
      "[0, 1, 5]\n",
      "11\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  42%|\u2588\u2588\u2588\u2588\u258f     | 42/100 [00:36<00:46,  1.24it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,8,2\n",
      "[0, 8, 2]\n",
      "12\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  43%|\u2588\u2588\u2588\u2588\u258e     | 43/100 [00:36<00:45,  1.26it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,9,6\n",
      "[1, 9, 6]\n",
      "4\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  44%|\u2588\u2588\u2588\u2588\u258d     | 44/100 [00:37<00:44,  1.25it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  45%|\u2588\u2588\u2588\u2588\u258c     | 45/100 [00:38<00:43,  1.25it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 3, 18\n",
      "[1, 3, 18]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  46%|\u2588\u2588\u2588\u2588\u258c     | 46/100 [00:39<00:42,  1.26it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,4,5\n",
      "[0, 4, 5]\n",
      "7\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  47%|\u2588\u2588\u2588\u2588\u258b     | 47/100 [00:40<00:42,  1.24it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,5\n",
      "[0, 1, 5]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  48%|\u2588\u2588\u2588\u2588\u258a     | 48/100 [00:40<00:43,  1.21it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,0,3\n",
      "[1, 0, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  49%|\u2588\u2588\u2588\u2588\u2589     | 49/100 [00:41<00:43,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,1,12\n",
      "[2, 1, 12]\n",
      "4\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  50%|\u2588\u2588\u2588\u2588\u2588     | 50/100 [00:42<00:42,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "Processed 50/100 items. Current Avg Precision: 0.4400, Avg Recall: 0.7033, Avg MRR: 0.8800\n",
      "8\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  51%|\u2588\u2588\u2588\u2588\u2588     | 51/100 [00:43<00:44,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "4\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  52%|\u2588\u2588\u2588\u2588\u2588\u258f    | 52/100 [00:44<00:40,  1.19it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,3,1\n",
      "[0, 3, 1]\n",
      "17\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  53%|\u2588\u2588\u2588\u2588\u2588\u258e    | 53/100 [00:45<00:39,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 2, 3\n",
      "[1, 2, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  54%|\u2588\u2588\u2588\u2588\u2588\u258d    | 54/100 [00:46<00:37,  1.24it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 4, 5\n",
      "[1, 4, 5]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  55%|\u2588\u2588\u2588\u2588\u2588\u258c    | 55/100 [00:46<00:34,  1.29it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,8\n",
      "[0, 1, 8]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  56%|\u2588\u2588\u2588\u2588\u2588\u258c    | 56/100 [00:47<00:34,  1.29it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,2,6\n",
      "[0, 2, 6]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  57%|\u2588\u2588\u2588\u2588\u2588\u258b    | 57/100 [00:48<00:35,  1.20it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,14,4\n",
      "[0, 14, 4]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  58%|\u2588\u2588\u2588\u2588\u2588\u258a    | 58/100 [00:49<00:36,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "7\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  59%|\u2588\u2588\u2588\u2588\u2588\u2589    | 59/100 [00:50<00:34,  1.19it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  60%|\u2588\u2588\u2588\u2588\u2588\u2588    | 60/100 [00:51<00:33,  1.18it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 5, 15\n",
      "[1, 5, 15]\n",
      "Processed 60/100 items. Current Avg Precision: 0.4444, Avg Recall: 0.7194, Avg MRR: 0.8889\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  61%|\u2588\u2588\u2588\u2588\u2588\u2588    | 61/100 [00:52<00:34,  1.13it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,4,1\n",
      "[2, 4, 1]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  62%|\u2588\u2588\u2588\u2588\u2588\u2588\u258f   | 62/100 [00:53<00:37,  1.01it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,7,11\n",
      "[1, 7, 11]\n",
      "5\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  63%|\u2588\u2588\u2588\u2588\u2588\u2588\u258e   | 63/100 [00:54<00:40,  1.09s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  64%|\u2588\u2588\u2588\u2588\u2588\u2588\u258d   | 64/100 [00:55<00:35,  1.02it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,4,11\n",
      "[1, 4, 11]\n",
      "7\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  65%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 65/100 [00:56<00:33,  1.04it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,3,4\n",
      "[2, 3, 4]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  66%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 66/100 [00:57<00:31,  1.09it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,15,12\n",
      "[2, 15, 12]\n",
      "16\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  67%|\u2588\u2588\u2588\u2588\u2588\u2588\u258b   | 67/100 [00:57<00:29,  1.12it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,3,4\n",
      "[1, 3, 4]\n",
      "5\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  68%|\u2588\u2588\u2588\u2588\u2588\u2588\u258a   | 68/100 [00:58<00:28,  1.14it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0, 2, 3\n",
      "[0, 2, 3]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  69%|\u2588\u2588\u2588\u2588\u2588\u2588\u2589   | 69/100 [00:59<00:26,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,3,5\n",
      "[2, 3, 5]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  70%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 70/100 [01:00<00:26,  1.12it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,14\n",
      "[0, 1, 14]\n",
      "Processed 70/100 items. Current Avg Precision: 0.4333, Avg Recall: 0.7024, Avg MRR: 0.8667\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  71%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 71/100 [01:01<00:24,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,0,2\n",
      "[1, 0, 2]\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  72%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f  | 72/100 [01:01<00:22,  1.24it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "17\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  73%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e  | 73/100 [01:02<00:22,  1.20it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,3,8\n",
      "[0, 3, 8]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  74%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d  | 74/100 [01:04<00:27,  1.04s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3,1,16\n",
      "[3, 1, 16]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  75%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 75/100 [01:05<00:24,  1.03it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0, 3, 4\n",
      "[0, 3, 4]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  76%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 76/100 [01:06<00:22,  1.07it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3,0,2\n",
      "[3, 0, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  77%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b  | 77/100 [01:06<00:20,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,16\n",
      "[0, 1, 16]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  78%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a  | 78/100 [01:07<00:19,  1.15it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,4,13\n",
      "[0, 4, 13]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  79%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589  | 79/100 [01:08<00:17,  1.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10,19,1\n",
      "[10, 19, 1]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  80%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 80/100 [01:11<00:28,  1.44s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,10\n",
      "[0, 1, 10]\n",
      "Processed 80/100 items. Current Avg Precision: 0.4375, Avg Recall: 0.7083, Avg MRR: 0.8583\n",
      "12\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  81%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 81/100 [01:12<00:23,  1.25s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2,3,6\n",
      "[2, 3, 6]\n",
      "11\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  82%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f | 82/100 [01:12<00:20,  1.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0, 3, 9\n",
      "[0, 3, 9]\n",
      "13\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  83%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e | 83/100 [01:13<00:17,  1.05s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0, 2, 6\n",
      "[0, 2, 6]\n",
      "12\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  84%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d | 84/100 [01:14<00:16,  1.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,10\n",
      "[0, 1, 10]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  85%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 85/100 [01:15<00:15,  1.01s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,4,10\n",
      "[0, 4, 10]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  86%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 86/100 [01:17<00:15,  1.13s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3,1,16\n",
      "[3, 1, 16]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  87%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b | 87/100 [01:18<00:14,  1.13s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,5,11\n",
      "[0, 5, 11]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  88%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a | 88/100 [01:19<00:12,  1.03s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,12,15\n",
      "[1, 12, 15]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  89%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589 | 89/100 [01:20<00:11,  1.04s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,11\n",
      "[0, 1, 11]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  90%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 90/100 [01:20<00:09,  1.06it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,3\n",
      "[0, 1, 3]\n",
      "Processed 90/100 items. Current Avg Precision: 0.4333, Avg Recall: 0.7000, Avg MRR: 0.8685\n",
      "6\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  91%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 91/100 [01:21<00:08,  1.11it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,4\n",
      "[1, 2, 4]\n",
      "7\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  92%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f| 92/100 [01:22<00:06,  1.16it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,5\n",
      "[0, 1, 5]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  93%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e| 93/100 [01:23<00:05,  1.21it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 2, 11\n",
      "[1, 2, 11]\n",
      "5\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  94%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d| 94/100 [01:23<00:05,  1.20it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,1,2\n",
      "[0, 1, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  95%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 95/100 [01:24<00:04,  1.23it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1, 13, 14\n",
      "[1, 13, 14]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  96%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 96/100 [01:25<00:03,  1.06it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,0,2\n",
      "[1, 0, 2]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  97%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b| 97/100 [01:26<00:02,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,4,15\n",
      "[1, 4, 15]\n",
      "20\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  98%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a| 98/100 [01:27<00:01,  1.10it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4,14,7\n",
      "[4, 14, 7]\n",
      "1\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval:  99%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589| 99/100 [01:28<00:00,  1.29it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "[0]\n",
      "17\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating Retrieval: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [01:29<00:00,  1.12it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0,3,8\n",
      "[0, 3, 8]\n",
      "Processed 100/100 items. Current Avg Precision: 0.4367, Avg Recall: 0.6933, Avg MRR: 0.8650\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   0%|          | 0/100 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "18\n",
      "0,2,7\n",
      "[0, 2, 7]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   1%|          | 1/100 [00:04<08:08,  4.93s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key elements from the Correct Answer - namely that you can create multiple test cases by clicking the 'Add Test Case' button and filling in values for variables in your prompt, then repeating this process for additional test cases. The Generated Answer actually provides more detail than the Correct Answer by mentioning you can re-run the evaluation suite, but this additional information doesn't contradict the core information. The essential steps and process described are the same in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "15\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   2%|\u258f         | 2/100 [00:10<08:48,  5.39s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key points from the Correct Answer:\n",
      "1. It correctly identifies Voyage AI as Anthropic's recommended embeddings provider\n",
      "2. It mentions that Voyage AI offers customized/domain-specific models (including specific examples for finance and healthcare)\n",
      "3. It notes that Voyage AI provides bespoke fine-tuned models for individual customers\n",
      "\n",
      "While the Generated Answer provides more specific details about Voyage AI's model offerings than the Correct Answer, this additional information doesn't contradict anything in the Correct Answer - it simply elaborates further. The core substance and main points are aligned between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,13,15\n",
      "[1, 13, 15]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   3%|\u258e         | 3/100 [00:18<10:36,  6.56s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful details. Both answers mention the same core success metrics: accuracy, F1 score, consistency, structure, speed, and bias/fairness. Both answers also emphasize the importance of choosing the right model to balance performance and latency based on specific use case requirements. The Generated Answer expands on these concepts by providing more detailed explanations of each metric and discussing the trade-offs between model size and speed, but this additional detail doesn't contradict the Correct Answer - it merely elaborates on it. There are no critical omissions or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,6\n",
      "[0, 1, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   4%|\u258d         | 4/100 [00:24<10:19,  6.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the same two key advantages of Claude for Sheets mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers highlight the ability to test prompts in parallel across evaluation suites, noting this is more efficient than sequential chained prompts.\n",
      "\n",
      "2. Both answers mention that Claude for Sheets is better suited for office tasks like survey analysis and data processing compared to chained prompts.\n",
      "\n",
      "The Generated Answer expresses these same core concepts, just with slightly different wording and structure. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "9\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   5%|\u258c         | 5/100 [00:30<09:28,  5.98s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that missing \"\\n\\nHuman:\" and \"\\n\\nAssistant:\" turns in the Text Completions API prompt will result in an API error. The Generated Answer provides some additional context about formatting and ordering, but this extra detail doesn't contradict or detract from the main point. The essential message about what happens (an API error occurs) is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "11\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   6%|\u258c         | 6/100 [00:37<10:12,  6.52s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key points from the Correct Answer and even provides more detailed information while maintaining the same core message. Both answers emphasize that:\n",
      "\n",
      "1. Tool use requests are priced based on total input and output tokens, just like regular requests\n",
      "2. There are additional tokens required for tool use, including:\n",
      "   - The tools parameter\n",
      "   - Tool use content blocks\n",
      "   - Tool result content blocks\n",
      "   - Special system prompt\n",
      "\n",
      "The Generated Answer expands on these points with more detail but doesn't contradict or omit any critical information from the Correct Answer. The fundamental message about how tool use affects pricing (by adding more tokens that are counted in the same way as regular requests) is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,5,11\n",
      "[0, 5, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   7%|\u258b         | 7/100 [00:42<09:12,  5.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer - specifically the release date (June 27th, 2024) and what features will be available (API usage, billing details, and rate limits). While the Correct Answer provides slightly more detail by mentioning the specific tabs (Usage, Cost, and Rate Limits), this is a minor detail that doesn't change the core meaning. Both answers convey the same fundamental information about what will be available and when.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "9\n",
      "0,1,7\n",
      "[0, 1, 7]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   8%|\u258a         | 8/100 [00:48<09:19,  6.09s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures both key elements from the Correct Answer:\n",
      "\n",
      "1. It mentions considering whether the task requires in-depth thinking/complex reasoning (matching the Correct Answer's point about \"tasks requires in-depth thinking that a human would need to work through\")\n",
      "\n",
      "2. It acknowledges that CoT increases output length which impacts latency (matching the Correct Answer's point about \"increased output length from CoT may impact latency\")\n",
      "\n",
      "The Generated Answer actually provides more detail and examples, but the core substance aligns perfectly with the Correct Answer. There are no contradictions or missing critical pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,19,10\n",
      "[1, 19, 10]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:   9%|\u2589         | 9/100 [00:55<09:34,  6.32s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides much more detailed steps and technical implementation details, at its core it conveys the same fundamental concept as the Correct Answer - that Claude can be used to process and summarize PDF documents to make them easier to understand. The Generated Answer expands on this basic concept by providing specific implementation details and examples, but it doesn't contradict or omit any critical information from the Correct Answer. Both answers focus on the key functionality of being able to upload PDFs and have Claude help digest their contents through summarization.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "10\n",
      "2,0,1\n",
      "[2, 0, 1]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  10%|\u2588         | 10/100 [00:59<08:28,  5.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers indicate that you can view the API rate limits in a Rate Limits tab within Anthropic's console interface. The only difference is minor wording variation (\"Developer Console\" vs \"Claude Console\") and the Generated Answer's inclusion of the word \"new,\" but these don't change the core substance of the answer. Both answers convey the same essential information about where to find the rate limits.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 10/100 questions. Current Accuracy: 1.0000\n",
      "20\n",
      "0,4,11\n",
      "[0, 4, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  11%|\u2588         | 11/100 [01:08<09:39,  6.51s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides many additional metrics beyond what is mentioned in the Correct Answer, it does include the two key metrics specified in the Correct Answer: the 95th percentile response time (mentioned under \"Speed\") and cost per classification (mentioned under \"Cost\"). The Generated Answer expands on these core metrics with additional useful measurements, but crucially does not contradict or omit any of the essential elements from the Correct Answer. The additional metrics it suggests (like F1 score, consistency, etc.) are supplementary but don't invalidate the core correct elements.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "8\n",
      "0,3,2\n",
      "[0, 3, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  12%|\u2588\u258f        | 12/100 [01:14<09:29,  6.47s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key distinction between how system prompts are specified in both APIs:\n",
      "\n",
      "1. For Text Completions API: Both answers indicate that the system prompt goes before the first \"\\n\\nHuman:\" turn in the prompt text\n",
      "2. For Messages API: Both answers specify that the system prompt is set using a dedicated \"system\" parameter in the API request\n",
      "\n",
      "The Generated Answer actually provides helpful concrete examples to illustrate these concepts, which goes beyond but doesn't contradict the Correct Answer. The substance and core information about how to specify system prompts in both APIs is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "4,3,6\n",
      "[4, 3, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 13, column 2\n",
      "Evaluating End-to-End:  13%|\u2588\u258e        | 13/100 [01:23<10:29,  7.23s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>\n",
      "The Generated Answer is essentially correct and aligns with the core message of the Correct Answer. Both answers emphasize:\n",
      "\n",
      "1. The use of XML tags (like <thinking> and <answer>) to structure prompts\n",
      "2. The combination of these tags with chain of thought reasoning\n",
      "3. The goal of creating high-performance, structured prompts for Claude\n",
      "\n",
      "While the Generated Answer goes into more detail about specific implementation aspects (like nesting tags and maintaining consistency), and the Correct Answer provides a more specific example of how to prompt Claude, the fundamental concept being conveyed is the same. The Generated Answer doesn't contradict any information in the Correct Answer, and it captures the essential relationship between XML tags and chain of thought reasoning.\n",
      "\n",
      "The additional details provided in the Generated Answer serve to expand on the core concept rather than contradict or omit any critical information from the Correct Answer.\n",
      "</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "9,5,0\n",
      "[9, 5, 0]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  14%|\u2588\u258d        | 14/100 [01:29<09:51,  6.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the same key information as the Correct Answer, just presented in a slightly different format:\n",
      "\n",
      "1. Both answers identify the same three metrics being measured\n",
      "2. Both provide the exact same values for each metric:\n",
      "   - 89.01% accuracy\n",
      "   - 1.61 seconds for 95th percentile response time\n",
      "   - $0.0004 for average cost per request/classification\n",
      "\n",
      "The only difference is in minor wording choices (e.g., \"Classification\" vs \"request routing\") which doesn't affect the substantive meaning. The Generated Answer captures all the critical information from the Correct Answer without any contradictions or omissions.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "2,7,13\n",
      "[2, 7, 13]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  15%|\u2588\u258c        | 15/100 [01:34<09:04,  6.41s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key elements from the Correct Answer:\n",
      "1. Having clear success criteria\n",
      "2. Having ways to empirically test against those criteria\n",
      "3. Having a first draft prompt to improve\n",
      "\n",
      "The Generated Answer actually provides slightly more detail by mentioning specific documentation sections, but the core substance perfectly matches the Correct Answer. There are no contradictions or missing critical pieces of information. The minor differences in phrasing (like listing the items with numbers vs. combining them in a sentence) don't affect the correctness of the answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "7\n",
      "1,2,0\n",
      "[1, 2, 0]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  16%|\u2588\u258c        | 16/100 [01:42<09:33,  6.83s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides more detail and context than the Correct Answer, it contains the key information about how mid-response prompting works in both APIs:\n",
      "\n",
      "1. For the Text Completions API, it correctly states that you can \"pre-fill part of Claude's response by including it in the prompt\"\n",
      "\n",
      "2. For the Messages API, it correctly explains that you can \"make the last input message have the assistant role, and the response will continue from that content\"\n",
      "\n",
      "These points align perfectly with the substance of the Correct Answer. The additional information about streaming, input/output formats, and other differences between the APIs doesn't contradict the core information, it just provides extra context. Since there are no missing critical pieces of information and no contradictions with the Correct Answer, the Generated Answer should be considered correct.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "19\n",
      "1,4,3\n",
      "[1, 4, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  17%|\u2588\u258b        | 17/100 [01:51<10:18,  7.45s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message that when given a role through a system prompt (specifically as CFO), Claude provides more detailed, structured, and actionable financial analysis compared to not having a specific role. The Generated Answer actually provides more detail and examples than the Correct Answer, but crucially does not contradict it. The key points about the role leading to more insightful, structured, and actionable analysis are present in both answers. While the Generated Answer includes additional examples about legal contract analysis, this extra information doesn't detract from or contradict the core message about how role-based prompting affects Claude's responses in the financial analysis context.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "9\n",
      "1,5,3\n",
      "[1, 5, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  18%|\u2588\u258a        | 18/100 [01:59<10:14,  7.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the essential elements from the Correct Answer and even provides additional relevant details. Both answers mention:\n",
      "\n",
      "1. Key quantitative metrics including F1 score, accuracy, precision, and recall\n",
      "2. The importance of using industry benchmarks, prior experiments, AI research, and expert knowledge to determine specific targets\n",
      "3. The concept of improvement over baseline\n",
      "\n",
      "The Generated Answer actually goes into more detail by providing specific examples (like the F1-score target of 0.85) and additional metrics like perplexity and operational metrics. While these details aren't in the Correct Answer, they don't contradict it and only serve to enhance the core message. The substance and main points of both answers align completely.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "5\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 9, column 182\n",
      "Evaluating End-to-End:  19%|\u2588\u2589        | 19/100 [02:03<08:55,  6.61s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key elements from the Correct Answer:\n",
      "1. The core concept of combining XML tags with other prompt engineering techniques\n",
      "2. Specifically mentions multishot prompting using <examples> tags\n",
      "3. Mentions chain of thought using <thinking> and <answer> tags\n",
      "4. Notes that this creates \"super-structured, high-performance prompts\"\n",
      "\n",
      "While the wording is slightly different, the substance and meaning are identical. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,17,18\n",
      "[0, 17, 18]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  20%|\u2588\u2588        | 20/100 [02:11<09:06,  6.83s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the essential elements from the Correct Answer and even provides additional helpful implementation details. Both answers emphasize:\n",
      "\n",
      "1. The need to provide a detailed rubric\n",
      "2. Having the LLM evaluate the output against the rubric\n",
      "3. Getting a \"correct\" or \"incorrect\" result as the final output\n",
      "\n",
      "The Generated Answer goes into more specific implementation details about functions and steps, but this additional information doesn't contradict the core concept presented in the Correct Answer. The substance of how to use an LLM for grading (providing it with both the rubric and content to grade, then getting a binary evaluation) is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 20/100 questions. Current Accuracy: 0.9000\n",
      "9\n",
      "0,5,6\n",
      "[0, 5, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  21%|\u2588\u2588        | 21/100 [02:20<09:51,  7.48s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer contains all the essential steps from the correct answer and actually provides more detailed information. The core steps are the same:\n",
      "1. Subscribe to the model package on AWS Marketplace\n",
      "2. Select and agree to terms\n",
      "3. Get the Product ARN for your region\n",
      "4. Create a JupyterLab space in SageMaker Studio\n",
      "5. Upload and follow Voyage's notebook for deployment\n",
      "\n",
      "While the generated answer includes additional information about alternative methods (HTTP API and Python package), this extra information doesn't contradict the correct answer - it just provides additional deployment options. The fundamental process for AWS Marketplace deployment matches the correct answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "17\n",
      "1,9,3\n",
      "[1, 9, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  22%|\u2588\u2588\u258f       | 22/100 [02:27<09:37,  7.40s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it misses some critical elements from the Correct Answer. Specifically:\n",
      "\n",
      "1. It doesn't mention that you should provide a SINGLE tool (exclusivity)\n",
      "2. It doesn't mention setting the tool_choice parameter to explicitly instruct the model to use that tool\n",
      "3. It doesn't mention that tool names and descriptions should be written from the model's perspective\n",
      "\n",
      "While the Generated Answer does discuss JSON formatting and tool usage in general terms, it misses these specific key implementation details that are crucial for properly using tools to generate JSON output. The Generated Answer focuses more on general JSON formatting guidance rather than the specific tool setup requirements outlined in the Correct Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "16\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  23%|\u2588\u2588\u258e       | 23/100 [02:34<09:32,  7.43s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more detailed information than the Correct Answer while maintaining all the key points. Both answers agree on the fundamental differences:\n",
      "\n",
      "1. Both mention that Claude 3 Haiku has vision capabilities\n",
      "2. Both indicate that Claude 3 Haiku is faster and more performant\n",
      "3. Both note that Claude 3 Haiku has more recent/up-to-date training data\n",
      "\n",
      "The Generated Answer expands on these points with additional details about context windows, pricing, and language capabilities, but these additions don't contradict the Correct Answer - they simply provide more specific information. The core message about Claude 3 Haiku being more capable, faster, and more up-to-date is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,11,14\n",
      "[0, 11, 14]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  24%|\u2588\u2588\u258d       | 24/100 [02:40<08:43,  6.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers emphasize the same key point - that using examples helps reduce misinterpretation of instructions and leads to more accurate outputs from Claude. While the Generated Answer includes additional benefits (enforcing uniform structure/style and helping with complex tasks), it fully encompasses the core benefit mentioned in the Correct Answer. There are no contradictions between the two answers, and the Generated Answer doesn't miss any critical information from the Correct Answer - it actually provides more detail while maintaining the same fundamental point about reducing misinterpretation.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,14,16\n",
      "[0, 14, 16]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  25%|\u2588\u2588\u258c       | 25/100 [02:46<08:06,  6.49s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it focuses on different advantages than what is specified in the Correct Answer. The Correct Answer emphasizes the ability to adapt models through providing domain-specific context in prompts without retraining, while the Generated Answer focuses on resource efficiency and cost-effectiveness. While the Generated Answer may state valid benefits of prompt engineering, it misses the key advantage specified in the Correct Answer about being able to easily adapt models through contextual prompts. The answers are discussing different aspects and advantages, making them substantively different.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "15\n",
      "0,1,4\n",
      "[0, 1, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  26%|\u2588\u2588\u258c       | 26/100 [02:51<07:44,  6.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides more detailed steps than the Correct Answer, the core information about using Anthropic's pre-made template by making a copy of the Claude for Sheets workbook template is present and accurate. The additional information about installation and API key setup doesn't contradict the core message, it simply provides extra context. The essential point about using a pre-made template by making a copy is preserved in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "6\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  27%|\u2588\u2588\u258b       | 27/100 [02:57<07:24,  6.09s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that the \"index\" field identifies which content block the text/changes apply to in the streamed response. While the Correct Answer specifically mentions that multiple deltas can have the same index consecutively, and the Generated Answer focuses more on the updating aspect, they are essentially describing the same functionality - that the index field is used to track and identify which content block is being modified in the streaming process. The Generated Answer doesn't contradict anything in the Correct Answer, and captures the main purpose and relationship between the index field and the streamed content.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "9\n",
      "2,1,3\n",
      "[2, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  28%|\u2588\u2588\u258a       | 28/100 [03:04<07:32,  6.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it contains all the essential information from the Correct Answer and does not contradict it. Both answers specify:\n",
      "\n",
      "1. Images must be base64-encoded\n",
      "2. Images are included as part of the API request\n",
      "3. The same four supported image formats are listed (JPEG, PNG, GIF, and WebP)\n",
      "\n",
      "The Generated Answer actually provides additional helpful implementation details, but these extra details don't contradict or omit any of the core information from the Correct Answer. The substance of how to include images and what formats are supported is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "18\n",
      "1,2,11\n",
      "[1, 2, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  29%|\u2588\u2588\u2589       | 29/100 [03:11<07:50,  6.62s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key relationship between TTFT and latency, accurately explaining that TTFT is a specific component of overall latency that measures the time to generate the first token. The Generated Answer maintains the core concept from the Correct Answer that TTFT is particularly important for interactive applications/responsiveness. While the Generated Answer provides additional details about factors affecting latency, this extra information doesn't contradict the Correct Answer and simply provides supplementary context. The fundamental relationship between TTFT and latency is accurately represented in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0, 4, 7\n",
      "[0, 4, 7]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  30%|\u2588\u2588\u2588       | 30/100 [03:19<08:04,  6.92s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message that providing examples of edge cases to Claude can improve its performance in routing support tickets. The Generated Answer actually expands on the Correct Answer by providing more detailed explanations of how examples can help with specific types of edge cases (implicit requests, emotional prioritization, intent vs. routing, and issue prioritization). While it provides more detail, it doesn't contradict the Correct Answer, and covers all the key points mentioned in the Correct Answer about improving Claude's ability to handle edge cases in ticket routing. The substance and main point about examples improving performance in edge case scenarios is consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 30/100 questions. Current Accuracy: 0.8667\n",
      "9\n",
      "0,3,4\n",
      "[0, 3, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  31%|\u2588\u2588\u2588       | 31/100 [03:28<08:40,  7.54s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the essential elements from the Correct Answer regarding how the \"tool_use\" stop_reason functions in Claude's tool use workflow:\n",
      "\n",
      "1. It explains that Claude determines when a tool can help with a query\n",
      "2. It mentions that Claude constructs a tool use request\n",
      "3. It notes that the API response will have a \"tool_use\" stop_reason\n",
      "4. It describes how the tool input needs to be extracted and executed client-side\n",
      "5. It explains that the results need to be sent back to Claude\n",
      "\n",
      "While the Generated Answer provides more detail and breaks down the workflow into more specific steps, it doesn't contradict the Correct Answer in any way. Instead, it elaborates on the same core concepts. All critical information from the Correct Answer is present in the Generated Answer, and both answers describe the same fundamental process of how the \"tool_use\" stop_reason facilitates the tool use workflow.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "9\n",
      "1,2,0\n",
      "[1, 2, 0]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  32%|\u2588\u2588\u2588\u258f      | 32/100 [03:33<07:47,  6.88s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key information from the Correct Answer:\n",
      "1. It correctly identifies the \"overloaded_error\" event as the error that may be sent during high usage periods\n",
      "2. It correctly states this corresponds to HTTP 529 error code in non-streaming contexts\n",
      "3. It correctly specifies this is for streaming responses\n",
      "\n",
      "The Generated Answer is essentially a restatement of the Correct Answer with very minor wording differences that don't affect the substance of the response. There are no missing critical pieces of information and no contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "6\n",
      "1,0,4\n",
      "[1, 0, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  33%|\u2588\u2588\u2588\u258e      | 33/100 [03:38<07:00,  6.28s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it correctly identifies \"text delta\" as one type, it fails to specifically identify \"input_json_delta\" as the second type, instead vaguely referring to \"other delta types (not specified)\". The Correct Answer clearly states that the two specific types are \"text_delta\" and \"input_json_delta\". By being vague and non-specific about the second type, the Generated Answer misses a critical piece of information that is present in the Correct Answer. This makes the Generated Answer incomplete and therefore incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  34%|\u2588\u2588\u2588\u258d      | 34/100 [03:42<06:17,  5.73s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incorrect. While it correctly states that Claude 3.5 Sonnet became generally available on June 20th, 2024, it fails to mention the separate date for tool use availability (May 30th, 2024). The correct answer clearly indicates these were two separate events with different dates. This is a critical piece of missing information, as the question specifically asked about both Claude 3.5 Sonnet AND tool use availability dates. The generated answer only addresses one of these two important elements.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,7\n",
      "[0, 1, 7]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  35%|\u2588\u2588\u2588\u258c      | 35/100 [03:47<05:58,  5.51s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys the same essential information as the Correct Answer - that Anthropic launched Claude.ai and the Claude iOS app first in Europe in May 2024, followed by Canada in June 2024. While the Generated Answer includes specific dates (May 13th and June 5th) that aren't in the Correct Answer, and mentions the API for Canada's launch, these are just additional details that don't contradict the core sequence of events. The fundamental ordering and timing (Europe in May, Canada in June) matches perfectly between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "16\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  36%|\u2588\u2588\u2588\u258c      | 36/100 [03:55<06:31,  6.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the essential elements from the Correct Answer:\n",
      "\n",
      "1. It correctly explains that a \"tool_use\" stop_reason indicates Claude has decided to use a tool\n",
      "2. It outlines the same key steps needed to continue the conversation:\n",
      "   - Extracting the tool name and input from Claude's request\n",
      "   - Executing the tool code client-side\n",
      "   - Sending back a tool_result content block to Claude\n",
      "\n",
      "The Generated Answer even provides some additional helpful context about Claude analyzing the results to formulate a final response, but this doesn't contradict anything in the Correct Answer. The core substance and key procedural steps are identical between both answers, just expressed with slightly different wording.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "10\n",
      "5,6,8\n",
      "[5, 6, 8]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  37%|\u2588\u2588\u2588\u258b      | 37/100 [04:02<06:38,  6.32s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect for a few reasons:\n",
      "\n",
      "1. The Generated Answer lists additional libraries (`time` and `typing`) that are not mentioned in the Correct Answer and may not actually be part of the example code being referenced.\n",
      "\n",
      "2. The Generated Answer includes a statement that contradicts the purpose described in the Correct Answer - it states \"The code does not directly evaluate tone or style\" while the Correct Answer specifically refers to \"evaluating tone and style in a customer service chatbot.\"\n",
      "\n",
      "While both answers do mention the anthropic library, the Generated Answer includes potentially incorrect additional information and contradicts the core purpose stated in the Correct Answer. These differences are substantial enough to make the Generated Answer incorrect.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "4,11,3\n",
      "[4, 11, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  38%|\u2588\u2588\u2588\u258a      | 38/100 [04:09<06:49,  6.60s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. It describes authentication methods for the standard Claude API, not for accessing Claude models through Amazon Bedrock. The correct authentication methods involve AWS credentials (either direct credentials or using AWS credential providers), while the Generated Answer talks about using ANTHROPIC_API_KEY. These are fundamentally different authentication approaches since Bedrock requires AWS-specific credentials. The Generated Answer shows no awareness of AWS authentication requirements and instead provides completely different, incorrect authentication methods.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "2\n",
      "1, 0, 0\n",
      "[1, 0, 0]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  39%|\u2588\u2588\u2588\u2589      | 39/100 [04:15<06:38,  6.53s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides more detail and context than the Correct Answer, it captures the same core concept of balancing two key factors: (1) the risk/potential of prompt leaks and (2) the impact on model performance/degradation due to added complexity. The Generated Answer expands on these points and provides additional context about implementation considerations, but the fundamental trade-off described matches the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "2,6,16\n",
      "[2, 6, 16]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  40%|\u2588\u2588\u2588\u2588      | 40/100 [04:24<07:18,  7.31s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Selecting the right Claude model based on specific requirements is a key way to reduce latency\n",
      "2. Different Claude models have different capabilities and performance characteristics\n",
      "3. The goal is to find the optimal balance between speed and capabilities for your specific use case\n",
      "\n",
      "While the Generated Answer goes into more specific detail about individual models (Opus, Sonnet, Haiku) and their characteristics, this additional detail doesn't contradict the Correct Answer - it merely expands upon it. The fundamental point about choosing the right model to optimize for speed while meeting your needs is preserved in both answers.\n",
      "\n",
      "There are no critical omissions or contradictions between the two answers. The Generated Answer effectively communicates the same key concept about model selection for latency optimization.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 40/100 questions. Current Accuracy: 0.8000\n",
      "20\n",
      "0,1,5\n",
      "[0, 1, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  41%|\u2588\u2588\u2588\u2588      | 41/100 [04:31<06:50,  6.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the essential information from the Correct Answer and even provides more detailed implementation context. Both answers specify using the client.messages.stream() method and iterating over the stream.text_stream attribute. The Generated Answer includes a practical code example that demonstrates exactly how to implement this functionality, but this additional detail doesn't change the core correctness of the answer. There are no contradictions between the two answers, and no critical information is missing from the Generated Answer when compared to the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "11\n",
      "0,8,2\n",
      "[0, 8, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  42%|\u2588\u2588\u2588\u2588\u258f     | 42/100 [04:37<06:38,  6.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information:\n",
      "\n",
      "1. Both explain that you can guide/shape Claude's response by pre-filling part of it in the messages list (though they word it slightly differently)\n",
      "\n",
      "2. Both identify \"max_tokens\" as the parameter used to generate short responses\n",
      "\n",
      "While the Correct Answer specifies \"last position\" of the messages list and gives \"1\" as a specific example value for max_tokens, these are minor details that don't change the core substance of the answer. The Generated Answer accurately captures the main concepts about pre-filling responses and using max_tokens to control response length.\n",
      "\n",
      "There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "12\n",
      "1,9,6\n",
      "[1, 9, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  43%|\u2588\u2588\u2588\u2588\u258e     | 43/100 [04:45<06:53,  7.26s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message: that having a larger volume of test cases with automated grading is preferable to having fewer test cases with high-quality human grading. While the Generated Answer provides more detailed explanations and supporting points, the fundamental conclusion matches the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer. The additional detail in the Generated Answer simply elaborates on why this approach is better, but doesn't change the core message.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "4\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  44%|\u2588\u2588\u2588\u2588\u258d     | 44/100 [04:51<06:24,  6.86s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. According to the Correct Answer, the two required fields are \"index\" and \"delta\", where \"delta\" contains the \"type\" and \"text\". The Generated Answer incorrectly states that the two required fields are \"type\" and \"text\". This is a substantive difference, not just a wording variation, as it misidentifies the top-level required fields in the event structure. The Generated Answer is missing the critical \"index\" field requirement and incorrectly elevates \"type\" and \"text\" (which are actually nested within the \"delta\" field) to be the main required fields.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1, 3, 18\n",
      "[1, 3, 18]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  45%|\u2588\u2588\u2588\u2588\u258c     | 45/100 [04:58<06:20,  6.91s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the two key interactive ways to learn Claude's capabilities that were mentioned in the Correct Answer:\n",
      "\n",
      "1. The Claude Cookbooks with their interactive Jupyter notebooks\n",
      "2. The Developer Console with its prompt generator tool\n",
      "\n",
      "The Generated Answer actually provides slightly more detail than the Correct Answer, but the core substance is the same. The mention of VoyageAI and additional details about the Developer Console don't contradict the Correct Answer - they're just supplementary information. Both answers focus on the same two main interactive learning methods, and there are no critical omissions or contradictions between them.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,4,5\n",
      "[0, 4, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  46%|\u2588\u2588\u2588\u2588\u258c     | 46/100 [05:05<06:06,  6.79s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. The core explanation given in the Correct Answer - that breaking tasks into subtasks improves accuracy because each subtask gets Claude's full attention and reduces errors compared to doing everything at once - is fully captured in the Generated Answer's first point about accuracy. While the Generated Answer goes on to list additional benefits (clarity, traceability, mitigation of hallucinations), these are supplementary points that don't contradict the core explanation. The Generated Answer simply provides more detail while maintaining the fundamental reasoning presented in the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes all critical information from the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "7\n",
      "0,1,5\n",
      "[0, 1, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  47%|\u2588\u2588\u2588\u2588\u258b     | 47/100 [05:12<06:05,  6.90s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more detailed information while maintaining the core concept from the Correct Answer. Both answers emphasize the key point that Messages streaming responses can contain multiple content blocks of varying types, making them more complex than Text Completions streaming. The Generated Answer expands on this by providing specific examples of the different event types and explaining the structural differences between the two formats, but this additional detail doesn't contradict the core message - it simply provides more context. The substance of both answers aligns, with the Generated Answer effectively capturing and expanding upon the main point about the increased complexity and multiple content block nature of Messages streaming responses.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,0,3\n",
      "[1, 0, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  48%|\u2588\u2588\u2588\u2588\u258a     | 48/100 [05:17<05:33,  6.41s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it correctly identifies claude.ai and the web Console as one way to experiment with Claude (matching the Correct Answer), it adds a second method about following the Quickstart guide for API calls that is not mentioned in the Correct Answer. Since the Generated Answer includes additional information that is not validated by the Correct Answer, and we cannot verify if this is accurate based on the provided Correct Answer, we must mark this as incorrect. The Correct Answer specifically only mentions claude.ai and the web Console as the two ways to experiment with Claude.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "2,1,12\n",
      "[2, 1, 12]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  49%|\u2588\u2588\u2588\u2588\u2589     | 49/100 [05:25<05:48,  6.84s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that chain prompts help reduce errors by breaking complex tasks into smaller, more manageable subtasks that Claude can focus on individually. The Generated Answer expands on this basic principle with additional details and examples, but does not contradict or omit any critical information from the Correct Answer. The fundamental mechanism described (breaking tasks into subtasks to improve accuracy and consistency) is the same in both answers. While the Generated Answer provides more detail about specific implementations and benefits, these additions complement rather than contradict the core message of the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "4\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  50%|\u2588\u2588\u2588\u2588\u2588     | 50/100 [05:30<05:13,  6.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers state that an overloaded_error event corresponds to HTTP status code 529 in a non-streaming context for the Claude API. While the Correct Answer uses slightly more formal language (\"would normally correspond to\"), the core information - the 529 status code - is identical in both answers. The difference in phrasing does not change the fundamental meaning or accuracy of the response.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 50/100 questions. Current Accuracy: 0.8000\n",
      "8\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  51%|\u2588\u2588\u2588\u2588\u2588     | 51/100 [05:37<05:18,  6.50s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the exact same two ways to specify the embedding format as mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers indicate that leaving the format unspecified will return embeddings as lists of floating-point numbers\n",
      "2. Both answers state that setting the format to \"base64\" will return the embeddings as Base64 encodings\n",
      "\n",
      "The Generated Answer simply presents the information in a more structured bullet-point format, but conveys the same essential information as the Correct Answer. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "4\n",
      "0,3,1\n",
      "[0, 3, 1]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  52%|\u2588\u2588\u2588\u2588\u2588\u258f    | 52/100 [05:44<05:19,  6.66s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the essential elements from the Correct Answer:\n",
      "\n",
      "1. It correctly explains that tool_use content blocks are sent as partial JSON strings\n",
      "2. It mentions that these are sent as content_block_delta events\n",
      "3. It notes that the client needs to accumulate these deltas\n",
      "4. It mentions that parsing happens after receiving a content_block_stop event\n",
      "5. It references both Pydantic and SDK helpers as parsing options\n",
      "\n",
      "While the wording and structure differ slightly, the Generated Answer conveys the same key information and technical details as the Correct Answer. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "17\n",
      "1, 2, 3\n",
      "[1, 2, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  53%|\u2588\u2588\u2588\u2588\u2588\u258e    | 53/100 [05:49<04:49,  6.15s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies and describes both tutorials:\n",
      "1. The GitHub tutorial which covers prompt engineering concepts with examples\n",
      "2. The Google Sheets tutorial which is described as a lighter-weight version\n",
      "\n",
      "The Generated Answer captures the key distinctions between the two tutorials and their delivery methods. While the exact wording differs slightly from the Correct Answer, the substance and meaning are essentially identical. The Generated Answer doesn't miss any critical information or make any contradictory claims compared to the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1, 4, 5\n",
      "[1, 4, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  54%|\u2588\u2588\u2588\u2588\u2588\u258d    | 54/100 [05:59<05:28,  7.14s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more comprehensive detail than the Correct Answer while covering all the key points mentioned in it. Specifically, it addresses:\n",
      "\n",
      "1. The 200K token context window (explicitly mentioned in both answers)\n",
      "2. Tool use capabilities for integration with specialized applications (mentioned in both)\n",
      "3. Multimodal input capabilities (mentioned in both)\n",
      "4. Enterprise-grade security and data handling for sensitive information (mentioned in both)\n",
      "\n",
      "The Generated Answer expands on these core points with additional relevant details about security certifications, reliability features, and global capabilities, but these additions don't contradict the Correct Answer - they simply provide more context and depth. The fundamental capabilities that make Claude suitable for enterprise use cases are consistently represented in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,8\n",
      "[0, 1, 8]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  55%|\u2588\u2588\u2588\u2588\u2588\u258c    | 55/100 [06:02<04:38,  6.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect because it omits a key region where Claude.ai API and iOS app are available - the United States. While the Generated Answer correctly mentions Canada and Europe, leaving out the United States represents a significant omission of information. The availability in all three regions (United States, Canada, and Europe) is a critical part of the complete and accurate answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,2,6\n",
      "[0, 2, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  56%|\u2588\u2588\u2588\u2588\u2588\u258c    | 56/100 [06:10<04:51,  6.63s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both main approaches (push-based using webhooks and pull-based) and accurately describes their key differences in terms of:\n",
      "\n",
      "1. Scalability - Both answers note that the push-based approach is more scalable\n",
      "2. Implementation complexity - Both answers indicate that pull-based is easier to implement\n",
      "3. Key trade-offs - Both answers mention the security implications of exposing public endpoints for webhooks in the push-based approach, and the inefficiency of unnecessary calls in the pull-based approach\n",
      "\n",
      "The Generated Answer actually provides more detail than the Correct Answer, but all the core concepts align perfectly. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,14,4\n",
      "[0, 14, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  57%|\u2588\u2588\u2588\u2588\u2588\u258b    | 57/100 [06:14<04:13,  5.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is completely correct. It contains all the key information from the Correct Answer: the release date (May 10th, 2024), what was released (a prompt generator tool), and where it's available (through the Developer Console). The wording is slightly different but conveys exactly the same information and meaning. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  58%|\u2588\u2588\u2588\u2588\u2588\u258a    | 58/100 [06:19<03:54,  5.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core message that Claude 3 Sonnet provides the optimal balance between intelligence and speed for high-throughput tasks, specifically mentioning sales forecasting and targeted marketing as examples. The Generated Answer actually provides slightly more detail by directly quoting from the documentation, but the fundamental meaning is identical to the Correct Answer. There are no contradictions or missing critical information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "7\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  59%|\u2588\u2588\u2588\u2588\u2588\u2589    | 59/100 [06:27<04:13,  6.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It contains all the key information from the Correct Answer and even provides additional helpful context and explanation. The core points that:\n",
      "1. Similarity can be calculated using dot product\n",
      "2. This is equivalent to cosine similarity\n",
      "3. This equivalence is due to Voyage embeddings being normalized to length 1\n",
      "\n",
      "are all present in the Generated Answer. The additional mathematical explanation and code example don't contradict anything in the Correct Answer - they just provide supplementary detail.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1, 5, 15\n",
      "[1, 5, 15]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  60%|\u2588\u2588\u2588\u2588\u2588\u2588    | 60/100 [06:34<04:21,  6.55s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the key substance of the Correct Answer, just with more detail and elaboration. Both answers emphasize that examples help improve Claude's performance on complex tasks by:\n",
      "\n",
      "1. Providing better guidance and context (reducing misinterpretation as mentioned in the Correct Answer)\n",
      "2. Helping maintain consistent structure/format (mentioned in both answers)\n",
      "3. Leading to more accurate and desired outputs (both answers touch on this)\n",
      "\n",
      "While the Generated Answer goes into more specific detail about things like reducing hallucinations and breaking down complex tasks, it doesn't contradict the Correct Answer. Rather, it expands upon the core concepts presented in the Correct Answer. The fundamental message about examples improving performance through better guidance, consistency, and accuracy is preserved across both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 60/100 questions. Current Accuracy: 0.8167\n",
      "6\n",
      "2,4,1\n",
      "[2, 4, 1]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  61%|\u2588\u2588\u2588\u2588\u2588\u2588    | 61/100 [06:40<04:04,  6.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately identifies and describes both types of content block deltas:\n",
      "\n",
      "1. Text deltas containing the \"text\" field for text content updates\n",
      "2. Input JSON deltas containing the \"partial_json\" field for JSON input updates\n",
      "\n",
      "While the wording is slightly different from the Correct Answer, the substance and key information is the same. The Generated Answer effectively communicates that these deltas represent partial/incremental updates to their respective content types (text and JSON input). There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,7,11\n",
      "[1, 7, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  62%|\u2588\u2588\u2588\u2588\u2588\u2588\u258f   | 62/100 [06:45<03:44,  5.92s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the same key capabilities mentioned in the Correct Answer, just with slightly more detail and different phrasing. Both answers highlight:\n",
      "\n",
      "1. Question answering/interactive capabilities for building systems like chatbots\n",
      "2. Text analysis capabilities for personalization through understanding sentiment and preferences\n",
      "\n",
      "The Generated Answer expands on these points with more specific examples (like customer support chatbots and educational AI tutors), but the core capabilities described are the same. There are no contradictions or missing critical pieces of information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "5\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  63%|\u2588\u2588\u2588\u2588\u2588\u2588\u258e   | 63/100 [06:52<03:48,  6.17s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures all the key elements from the Correct Answer and presents them in essentially the same order:\n",
      "\n",
      "1. Both answers mention the message_start event coming first\n",
      "2. Both describe the content blocks structure with start, delta, and stop events\n",
      "3. Both mention message_delta events\n",
      "4. Both include the message_stop event at the end\n",
      "5. Both note that ping events may be dispersed throughout\n",
      "\n",
      "The Generated Answer actually provides slightly more detail in its structure, but the core information matches perfectly with the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,4,11\n",
      "[1, 4, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  64%|\u2588\u2588\u2588\u2588\u2588\u2588\u258d   | 64/100 [06:56<03:27,  5.76s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same key information: that the Claude API allows up to 20 images per request, while the claude.ai interface has a limit of 5 images. While the Correct Answer provides slightly more context by mentioning \"Messages API\" and \"per turn,\" the core numerical limits are identical and accurately stated in the Generated Answer. The substance and critical information about the image limits are preserved, even if expressed more concisely.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "7\n",
      "2,3,4\n",
      "[2, 3, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  65%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 65/100 [07:01<03:11,  5.48s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key substance of the Correct Answer, which is that when Claude's response contains an incomplete tool use block due to hitting the max_tokens limit, you should retry with a higher max_tokens value. The Generated Answer conveys the same essential instruction and solution as the Correct Answer, just with slightly different wording. There are no missing critical pieces of information or contradictions between the two answers. Both answers communicate the same core concept and recommended action.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "2,15,12\n",
      "[2, 15, 12]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  66%|\u2588\u2588\u2588\u2588\u2588\u2588\u258c   | 66/100 [07:06<02:59,  5.27s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While both answers agree on the first step (developing test cases), they differ on the second step. The Correct Answer states that the second step is to \"take a look at Anthropic's guide to developing test cases\", while the Generated Answer states it is to \"build a strong input prompt\". These are substantively different steps. The Generated Answer misses the critical guidance about consulting Anthropic's documentation on test case development, which is specified in the Correct Answer. This represents a meaningful difference in the substance of what needs to be done before running a classification evaluation.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "16\n",
      "1,3,4\n",
      "[1, 3, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  67%|\u2588\u2588\u2588\u2588\u2588\u2588\u258b   | 67/100 [07:12<02:59,  5.44s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept - that you can use the content parameter with an \"assistant\" role message to pre-fill or influence Claude's response. The Generated Answer provides more detail and an example, but the fundamental meaning matches the Correct Answer. Both answers explain that this technique allows you to shape or guide Claude's output. There are no contradictions between the answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "5\n",
      "0, 2, 3\n",
      "[0, 2, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  68%|\u2588\u2588\u2588\u2588\u2588\u2588\u258a   | 68/100 [07:18<02:56,  5.51s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both key advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. It correctly states that prompt engineering preserves general knowledge while fine-tuning risks catastrophic forgetting\n",
      "2. It accurately notes that prompt engineering is more effective at helping models understand and utilize external content/retrieved documents\n",
      "\n",
      "The Generated Answer essentially restates the same two main points from the Correct Answer, just with slightly different wording. There are no missing critical pieces of information and no contradictions between the two answers. The substance and meaning are identical.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "2,3,5\n",
      "[2, 3, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  69%|\u2588\u2588\u2588\u2588\u2588\u2588\u2589   | 69/100 [07:23<02:51,  5.52s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it describes some technical steps involved in using the API, it misses one of the key initial setup requirements mentioned in the Correct Answer - specifically installing and configuring the AWS CLI. The Generated Answer jumps straight to authentication and client creation, while the Correct Answer emphasizes the more foundational setup steps of: 1) Installing/configuring AWS CLI and 2) Installing an SDK. The Generated Answer's steps are more about the implementation details that would come after these initial setup requirements. Since it's missing this critical setup information, it cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,14\n",
      "[0, 1, 14]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:XML parsing error: mismatched tag: line 3, column 601\n",
      "Evaluating End-to-End:  70%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 70/100 [07:29<02:49,  5.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is completely correct. It provides the exact same AWS CLI command as the Correct Answer (`aws bedrock list-foundation-models --region=<region> --by-provider anthropic --query \"modelSummaries[*].modelId\"`), explains that you need to replace `<region>` with your desired region (giving the same example of `us-west-2`), and correctly states that this will list the available Claude models in that region. The substance and technical details are identical between both answers, with only minor differences in phrasing that don't affect the accuracy of the information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 70/100 questions. Current Accuracy: 0.8000\n",
      "6\n",
      "1,0,2\n",
      "[1, 0, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  71%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588   | 71/100 [07:34<02:41,  5.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that the `input_type` argument can be passed to specify whether the input text is a \"query\" or \"document\". The Generated Answer actually provides additional detail about how the input_type affects processing, but this extra information doesn't contradict the Correct Answer. The essential point about the existence and purpose of the `input_type` parameter is accurately conveyed in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "6\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  72%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f  | 72/100 [07:39<02:30,  5.37s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is incomplete compared to the correct answer. While it correctly explains that tool_use deltas contain partial JSON strings and text deltas contain direct text updates, it misses a critical piece of information: that tool_use deltas may have delays between streaming events as the model emits one complete key-value pair at a time. This timing/delay characteristic is an important distinction between the two formats that was specified in the correct answer but omitted from the generated answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "17\n",
      "0,3,8\n",
      "[0, 3, 8]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  73%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e  | 73/100 [07:46<02:33,  5.68s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It accurately states the key file size limits from the Correct Answer: 5MB per image for the API and 10MB per image for claude.ai. While the Generated Answer includes additional information about image quantity limits (20 images per API request and 5 images per turn on claude.ai), this extra information doesn't contradict the core information about file size limits. The substance of the file size limits matches exactly with the Correct Answer, and the additional details don't detract from or contradict this core information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "3,1,16\n",
      "[3, 1, 16]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  74%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d  | 74/100 [07:50<02:20,  5.39s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize the importance of choosing a model that balances speed and output quality based on specific requirements. While the Generated Answer provides additional details about Chain of Thought and its impact on latency, this extra information doesn't contradict the core message. The essential point about balancing speed and quality requirements is present in both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0, 3, 4\n",
      "[0, 3, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  75%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 75/100 [07:57<02:22,  5.71s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the key points from the Correct Answer:\n",
      "1. It correctly identifies the voyage-code-2 model as the recommended model\n",
      "2. It correctly mentions the 17% performance improvement\n",
      "\n",
      "However, the Generated Answer misses one important detail from the Correct Answer: it doesn't mention that the model achieves state-of-the-art results on general-purpose corpora. This is a significant piece of information about the model's capabilities that was included in the Correct Answer.\n",
      "\n",
      "Additionally, there's a small error in attributing this to \"Anthropic's Voyage AI\" - the Correct Answer simply mentions \"Voyage AI\" as a separate entity.\n",
      "\n",
      "Since there is a critical piece of information missing (the state-of-the-art performance on general-purpose corpora), this should be marked as incorrect, even though much of the basic information is accurate.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "3,0,2\n",
      "[3, 0, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  76%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c  | 76/100 [08:02<02:15,  5.63s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is partially correct but not fully aligned with the Correct Answer. While both answers mention interactive Jupyter notebooks and working with PDFs, the Generated Answer diverges by discussing extending Claude's capabilities and VoyageAI, which aren't mentioned in the Correct Answer. The Correct Answer specifically mentions \"embeddings\" as a key feature, but the Generated Answer only mentions embeddings in the context of VoyageAI, which isn't part of the official answer. Since the Generated Answer misses the direct focus on embeddings as a core feature and includes potentially incorrect information about VoyageAI, it cannot be considered fully correct.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,16\n",
      "[0, 1, 16]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  77%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b  | 77/100 [08:09<02:13,  5.78s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures the core concept presented in the Correct Answer - that a larger context window allows the model to incorporate more retrieved information during RAG, which improves the quality of the generated output. The Generated Answer actually provides additional details about latency considerations and trade-offs, but these additions don't contradict the core message of the Correct Answer. Both answers emphasize the fundamental relationship between context window size and the model's ability to effectively utilize retrieved information. The Generated Answer maintains the same essential meaning while expanding on the implications.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,4,13\n",
      "[0, 4, 13]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  78%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a  | 78/100 [08:15<02:12,  6.03s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the key elements from the Correct Answer and even expands on them in a helpful way. Both answers emphasize:\n",
      "\n",
      "1. The tool's ability to identify edge cases where prompts might not work well\n",
      "2. The capability to rate/evaluate individual results to assess performance\n",
      "3. The importance of reviewing results across test cases to ensure consistency\n",
      "4. The ultimate goal of refining prompts to build more robust applications\n",
      "\n",
      "The Generated Answer adds some additional context about the beta status and feedback process, but this doesn't contradict the Correct Answer - it simply provides extra information. The core substance and main points about how the Evaluation tool helps improve prompts are consistent between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "10,19,1\n",
      "[10, 19, 1]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  79%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589  | 79/100 [08:20<01:56,  5.54s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers identify Claude 3 Haiku as having the fastest comparative latency. The Generated Answer provides additional context about classification tasks and compares it to other models, but the core claim about Haiku being the fastest matches exactly with the Correct Answer. There are no contradictions or missing critical information between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,10\n",
      "[0, 1, 10]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  80%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 80/100 [08:28<02:09,  6.46s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys the same core concept as the Correct Answer - that to have a multi-turn conversation using the Anthropic Messages API, you need to send the full conversation history with each request because the API is stateless. The Generated Answer actually provides more detail and a concrete code example, but the fundamental principle matches exactly with the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. The full conversation history must be included with each request\n",
      "2. The API is stateless\n",
      "3. Previous messages (both user and assistant) need to be included\n",
      "\n",
      "There are no contradictions between the answers, and the Generated Answer doesn't miss any critical information from the Correct Answer. The additional implementation details provided in the Generated Answer don't change the core correctness of the response.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 80/100 questions. Current Accuracy: 0.7875\n",
      "12\n",
      "2,3,6\n",
      "[2, 3, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  81%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588  | 81/100 [08:36<02:09,  6.82s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it conveys the same core message as the Correct Answer. Both answers emphasize that:\n",
      "\n",
      "1. Using XML tags to provide a specific role (like General Counsel) improves Claude's analysis of legal contracts\n",
      "2. The role context helps Claude identify critical legal issues and risks that might be missed without it\n",
      "3. The analysis is more thorough and valuable with the role context\n",
      "\n",
      "The Generated Answer actually provides more detail and examples, but the fundamental point about role prompting improving contract analysis capability remains the same. There are no contradictions between the two answers, and no critical information from the Correct Answer is missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "11\n",
      "0, 3, 9\n",
      "[0, 3, 9]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  82%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f | 82/100 [08:44<02:07,  7.10s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect for several reasons:\n",
      "\n",
      "1. The Generated Answer makes claims about \"chain of thought\" differences between the models that are not mentioned in the Correct Answer and may not be accurate according to the documentation.\n",
      "\n",
      "2. The Generated Answer mentions Claude 3 Haiku, which is not relevant to the question about Opus vs Sonnet.\n",
      "\n",
      "3. Most importantly, the Generated Answer fails to capture the key distinction provided in the Correct Answer: that Opus is more likely to ask users for missing information while Sonnet tends to infer reasonable values on its own. Instead, it makes different claims about transparency and handling ambiguous queries that don't align with the core difference described in the Correct Answer.\n",
      "\n",
      "While both answers discuss how the models handle missing information differently, the specific nature of those differences as described in the Generated Answer contradicts or misses the key point made in the Correct Answer.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "13\n",
      "0, 2, 6\n",
      "[0, 2, 6]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  83%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e | 83/100 [08:50<01:59,  7.02s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it covers all the key points mentioned in the Correct Answer and even provides additional helpful detail. Both answers emphasize:\n",
      "\n",
      "1. Implementing retry logic for error handling\n",
      "2. Conducting thorough staging/testing\n",
      "3. Load testing\n",
      "4. Error handling and logging setup\n",
      "5. Gradual rollout process\n",
      "6. Documentation and training\n",
      "7. Monitoring and alerting\n",
      "\n",
      "The Generated Answer expands on these points with more specific implementation details, but the core recommendations align perfectly with the Correct Answer. There are no contradictions between the two answers, and no critical pieces of information from the Correct Answer are missing from the Generated Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "12\n",
      "0,1,10\n",
      "[0, 1, 10]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  84%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d | 84/100 [08:58<01:54,  7.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. While it provides more detail and additional metrics (like F1-score and consistency), it includes all three key elements from the Correct Answer:\n",
      "\n",
      "1. Accuracy (explicitly mentioned)\n",
      "2. Cost (explicitly mentioned)\n",
      "3. Speed (explicitly mentioned as response time)\n",
      "\n",
      "The Generated Answer expands upon these core concepts but doesn't contradict them. The additional information provided (like F1-score, consistency, and interpretability) are supplementary details that don't detract from or contradict the main points in the Correct Answer. Since all three critical elements from the Correct Answer are present in the Generated Answer, and there are no contradictions, the Generated Answer should be considered correct.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,4,10\n",
      "[0, 4, 10]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  85%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 85/100 [09:03<01:37,  6.47s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers identify the same two recommended methods for learning prompt engineering with Claude:\n",
      "1. The GitHub prompting tutorial\n",
      "2. The Google Sheets prompting tutorial\n",
      "\n",
      "The Generated Answer provides slightly more detail by mentioning that the GitHub tutorial is \"example-filled\" and that the Google Sheets version is a \"lighter weight version,\" but these are just additional descriptive details that don't change the core substance. The fundamental information about the two recommended learning methods matches between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "3,1,16\n",
      "[3, 1, 16]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  86%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c | 86/100 [09:11<01:36,  6.89s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct as it captures all the key points from the Correct Answer and expands on them without contradicting anything. Both answers emphasize:\n",
      "\n",
      "1. The fundamental difference in training - pretrained LLMs are trained on raw text data for next-word prediction, while Claude undergoes additional RLHF training\n",
      "\n",
      "2. The difference in capabilities - pretrained LLMs require more prompt engineering to be useful, while Claude is better at directly following instructions and being helpful\n",
      "\n",
      "3. The purpose of the additional training - to make Claude more helpful, honest, and capable at various tasks\n",
      "\n",
      "While the Generated Answer provides more detail and structure, it maintains complete alignment with the core message of the Correct Answer. There are no contradictions or missing critical pieces of information.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,5,11\n",
      "[0, 5, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  87%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b | 87/100 [09:19<01:34,  7.25s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides a more detailed expansion of the key points mentioned in the Correct Answer. It covers all the main advantages mentioned in the Correct Answer:\n",
      "\n",
      "1. Speed and efficiency (mentioned in points 1 and 4)\n",
      "2. Cost-effectiveness (point 2)\n",
      "3. Less data and compute requirements (points 1 and 5)\n",
      "4. Preservation of general knowledge (point 9)\n",
      "5. Flexibility and rapid iteration (point 6)\n",
      "6. Transparency (point 10)\n",
      "\n",
      "The Generated Answer elaborates on these points with more specific examples and explanations, but fundamentally conveys the same core advantages. There are no contradictions between the two answers, and the Generated Answer doesn't miss any critical information from the Correct Answer. In fact, it provides additional valuable context while maintaining alignment with the core concepts presented in the Correct Answer.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,12,15\n",
      "[1, 12, 15]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  88%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a | 88/100 [09:24<01:20,  6.75s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. The core instruction about running `gcloud auth application-default login` to authenticate with GCP is present and matches exactly with the Correct Answer. While the Generated Answer provides additional context about using the SDK and making requests afterward, this extra information doesn't contradict the core authentication step specified in the Correct Answer. The substance of how to authenticate is identical between both answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,11\n",
      "[0, 1, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  89%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589 | 89/100 [09:31<01:14,  6.74s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures the core information about the Prompt Generator tool being introduced on May 10th, 2024, and its main purpose of helping users create tailored prompts for specific tasks. While the Correct Answer provides additional context about the Claude iOS app and Claude Team plan, these are supplementary details rather than critical pieces of information about the Prompt Generator capabilities themselves. The Generated Answer accurately conveys the essential functionality and purpose of the new tool, even if it's more concise. There are no contradictions between the two answers, and the key functionality of helping users create customized prompts is preserved in both versions.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "0,1,3\n",
      "[0, 1, 3]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  90%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 90/100 [09:35<00:59,  6.00s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It conveys exactly the same information as the Correct Answer - that both Claude 3.5 Sonnet and the Artifacts feature became available on June 20th, 2024. While the wording is slightly different (omitting \"both\" and having a slightly different sentence structure), the core information and meaning are identical. There are no missing critical details or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 90/100 questions. Current Accuracy: 0.8000\n",
      "6\n",
      "1,2,4\n",
      "[1, 2, 4]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  91%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 | 91/100 [09:40<00:50,  5.58s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core information - that to limit Claude's response to a single token, you should use the \"max_tokens\" parameter set to 1 in the request. The Generated Answer uses slightly different wording by mentioning \"request body\" instead of just \"request,\" but this is a minor detail that doesn't change the fundamental meaning. Both answers accurately describe how to achieve the desired single-token limitation using the same parameter with the same value.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "7\n",
      "0,1,5\n",
      "[0, 1, 5]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  92%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f| 92/100 [09:45<00:43,  5.42s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. Both answers convey the same core concept that temperature controls randomness in the model's output generation. The Generated Answer simply provides more detail and elaboration about what higher and lower temperatures do specifically, but the fundamental meaning matches the Correct Answer. There are no contradictions between the two answers, and the Generated Answer includes the key concept about randomness control that is present in the Correct Answer. The additional details in the Generated Answer serve to explain the concept further rather than change its meaning.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1, 2, 11\n",
      "[1, 2, 11]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  93%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e| 93/100 [09:51<00:39,  5.64s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is incorrect. While it correctly identifies one way to specify API parameters (adding them as additional arguments after the prompt and model), it misses the second key method mentioned in the Correct Answer - the ability to pass in an API key for a specific cell. Instead, it incorrectly states that CLAUDEMESSAGES is the second method. The CLAUDEMESSAGES function is not mentioned in the Correct Answer at all, making this a significant deviation from the correct information. Since one of the two main methods is completely different from what's specified in the Correct Answer, this constitutes a critical error.</explanation>\n",
      "<is_correct>false</is_correct>\n",
      "</content>\n",
      "\n",
      "5\n",
      "0,1,2\n",
      "[0, 1, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  94%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d| 94/100 [09:57<00:33,  5.59s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer captures all the key points from the Correct Answer:\n",
      "1. Prefilling with { causes Claude to skip the preamble\n",
      "2. Results in direct JSON object output\n",
      "3. Makes the response more concise\n",
      "4. Makes it easier for programs to parse\n",
      "\n",
      "While the wording is slightly different, the substance and meaning are identical. The Generated Answer effectively communicates the same information about how prefilling with a curly brace affects Claude's output behavior. There are no missing critical pieces of information or contradictions between the two answers.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1, 13, 14\n",
      "[1, 13, 14]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  95%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 95/100 [10:03<00:29,  5.81s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The generated answer is partially correct but contains extra information that is not verified by the correct answer. The first two points about the multimodal cookbook and API reference documentation match the correct answer's substance. However, the third point about the developer community is not mentioned in the correct answer and appears to be additional unverified information. Since this addition doesn't contradict the correct information but rather adds to it, and the core resources (cookbook and API reference) are accurately captured, the generated answer can be considered substantially correct in terms of the key information provided.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,0,2\n",
      "[1, 0, 2]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  96%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c| 96/100 [10:10<00:24,  6.11s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct and actually provides more detailed information than the Correct Answer while maintaining the same core information. Both answers convey that:\n",
      "\n",
      "1. The API key can be specified as a parameter when creating a new Anthropic client\n",
      "2. If not provided explicitly, the SDK will default to using the ANTHROPIC_API_KEY environment variable\n",
      "\n",
      "The Generated Answer goes further by providing specific code examples in both Python and TypeScript, but this additional detail doesn't contradict or omit any of the key information from the Correct Answer. The substance of both answers is essentially the same.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "1,4,15\n",
      "[1, 4, 15]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  97%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b| 97/100 [10:15<00:17,  5.96s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the same two key benefits mentioned in the Correct Answer:\n",
      "\n",
      "1. Both answers mention identifying edge cases where prompts might fail/falter\n",
      "2. Both answers discuss ensuring consistent performance across test inputs/cases\n",
      "\n",
      "The Generated Answer breaks these points out more explicitly with numbering, but the core substance is identical to the Correct Answer. The slight differences in wording (e.g., \"rate individual results\" vs \"test case inputs\") don't change the fundamental meaning. Both answers emphasize the tool's ability to help identify problems and ensure reliability across different scenarios.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "20\n",
      "4,14,7\n",
      "[4, 14, 7]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  98%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a| 98/100 [10:23<00:12,  6.44s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures the key points from the Correct Answer:\n",
      "\n",
      "1. It acknowledges that the pretrained model is not inherently good at answering questions or following instructions (matching the Correct Answer)\n",
      "\n",
      "2. It explains that the final version of Claude went through fine-tuning and RLHF to become more helpful and capable (matching the Correct Answer)\n",
      "\n",
      "While the Generated Answer provides additional details about biases and capabilities, these don't contradict the Correct Answer - they merely expand upon it. The core message about the transformation from pretrained model to final API version through fine-tuning and RLHF is consistent between both answers.\n",
      "\n",
      "There are no critical pieces of information from the Correct Answer that are missing from the Generated Answer, nor are there any contradictions between the two.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "1\n",
      "0\n",
      "[0]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End:  99%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589| 99/100 [10:26<00:05,  5.56s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is exactly identical to the Correct Answer, stating that Anthropic's IPv6 address range is 2607:6bc0::/48. There are no differences in wording or substance, and all critical information is present.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "17\n",
      "0,3,8\n",
      "[0, 3, 8]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluating End-to-End: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 100/100 [10:31<00:00,  6.32s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "<content>\n",
      "<explanation>The Generated Answer is correct. It captures both key methods for specifying the API key that are mentioned in the Correct Answer:\n",
      "1. Using the ANTHROPIC_API_KEY environment variable\n",
      "2. Passing the API key directly when initializing the client\n",
      "\n",
      "While the Generated Answer is more concise, it contains the same essential information as the Correct Answer. The additional details in the Correct Answer (like mentioning that the environment variable is used \"by default\") are supplementary and don't change the core correctness of the Generated Answer. There are no contradictions between the two answers, and no critical information is missing.</explanation>\n",
      "<is_correct>true</is_correct>\n",
      "</content>\n",
      "\n",
      "Processed 100/100 questions. Current Accuracy: 0.8100\n",
      "Detailed results saved to evaluation_results_detailed_level_three.csv\n",
      "Average Precision: 0.4367\n",
      "Average Recall: 0.6933\n",
      "Average F1: 0.5359\n",
      "Average Mean Reciprocal Rank: 0.865000\n",
      "End-to-End Accuracy: 0.8100\n",
      "Evaluation complete. Results saved to evaluation_results_level_three.json, evaluation_results_detailed_level_three.csv, and evaluation_results_level_three.png\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "# Initialize the SummaryIndexedVectorDB\n",
    "level_three_db = SummaryIndexedVectorDB(\"anthropic_docs_v3\")\n",
    "level_three_db.load_data('data/anthropic_summary_indexed_docs.json')\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Run the evaluations\n",
    "avg_precision, avg_recall, avg_mrr, f1, precisions, recalls, mrrs = evaluate_retrieval(retrieve_advanced, eval_data, level_three_db)\n",
    "e2e_accuracy, e2e_results = evaluate_end_to_end(answer_query_advanced, level_three_db, eval_data)\n",
    "\n",
    "# Create a DataFrame\n",
    "df = pd.DataFrame({\n",
    "    'question': [item['question'] for item in eval_data],\n",
    "    'retrieval_precision': precisions,\n",
    "    'retrieval_recall': recalls,\n",
    "    'retrieval_mrr': mrrs,\n",
    "    'e2e_correct': e2e_results\n",
    "})\n",
    "\n",
    "# Save to CSV\n",
    "df.to_csv('evaluation/csvs/evaluation_results_detailed_level_three.csv', index=False)\n",
    "print(\"Detailed results saved to evaluation_results_detailed_level_three.csv\")\n",
    "\n",
    "# Print the results\n",
    "print(f\"Average Precision: {avg_precision:.4f}\")\n",
    "print(f\"Average Recall: {avg_recall:.4f}\")\n",
    "print(f\"Average F1: {f1:.4f}\")\n",
    "print(f\"Average Mean Reciprocal Rank: {avg_mrr:.4f}\")\n",
    "print(f\"End-to-End Accuracy: {e2e_accuracy:.4f}\")\n",
    "\n",
    "# Save the results to a file\n",
    "with open('evaluation/json_results/evaluation_results_level_three.json', 'w') as f:\n",
    "    json.dump({\n",
    "        \"name\": \"Summary Indexing + Re-Ranking\",\n",
    "        \"average_precision\": avg_precision,\n",
    "        \"average_recall\": avg_recall,\n",
    "        \"average_f1\": f1,\n",
    "        \"average_mrr\": avg_mrr,\n",
    "        \"end_to_end_accuracy\": e2e_accuracy\n",
    "    }, f, indent=2)\n",
    "\n",
    "print(\"Evaluation complete. Results saved to evaluation_results_level_three.json, evaluation_results_detailed_level_three.csv, and evaluation_results_level_three.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABWkAAAJOCAYAAADF44XfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8pXeV/AAAACXBIWXMAAA9hAAAPYQGoP6dpAADfdUlEQVR4nOzdeVgN/f8/8Odp3yzVrbLGjaJEkSjZKktUkp072227iZvbrqzJEh/um+y77O6iLNn3kiUi+0641Un20nbO7w+/5us4lfYTPR/X5bqamffMvObMnDnO67zn9RZJpVIpiIiIiIiIiIiIiEghlBQdABEREREREREREVFpxiQtERERERERERERkQIxSUtERERERERERESkQEzSEhERERERERERESkQk7RERERERERERERECsQkLREREREREREREZECMUlLREREREREREREpEBM0hIREREREREREREpEJO0REREJZBUKlV0CD89vsY/B55HIiIiIvoZqCg6ACL6MVy4cAF9+/bNdrmqqirKlSsHExMT9OrVC23bts1xe8nJyWjWrBk+ffoER0dHLF++PFdxSKVSREZGIjQ0FDdu3MCrV6+QkpICPT091K9fH23btoWzszOUlZVzfWwODg548eKF3HwlJSVoaGjA0NAQTZo0we+//45q1arlerv58fbtW/j5+eHMmTNISkqCoaEhDh06BBUV3q6L09fXu7KyMs6dOwc9Pb1s28fFxaFVq1aQSCQwNDTEmTNnCrT//fv34/jx41i8eHGu1/H09MTFixexYcMG2NnZFWj/hUEsFqNTp04YMGAABg8eLLPs2rVr2LFjBy5evAixWAxVVVUYGhqiadOm6NOnD2rWrFnk8V2/fh2zZ8/Gtm3biuz9tXv3bvj4+KBz586YN29ejm2fP38OR0fHQrl+8uJ79/Zvbd68GU2aNCn0ONLT02Fubg4AuHv3bq7XO3fuHNatW4cNGzYUekxfMzU1zXVbLy8vjBw5skji6NWrF65cuZLn87B3715MnDgRALBs2TI4OTkVSXyl0cePH9G+fXt06dIFY8aMUXQ4RERE9APjt34iyhMtLS04OjrKzX///j0ePHiAiIgIREREYMyYMRg2bFi22zl06BA+ffoEdXV1nDp1CnFxcTA0NMxx38+fP8eYMWNw/fp1AEDNmjVhbW0NVVVVxMbG4siRIzh8+DDWrFmDlStXomLFink6Njs7O+jr6wvTUqkUycnJuH37Nnbs2IH9+/djw4YNqF+/fp62mxdz5sxBaGgo9PX10bp1a5QvX54JWgXLyMjA0aNH0aNHj2zbhIWFQSKRFMr+Ll26hLFjx6Jhw4aFsj1F8fb2RtmyZdG/f3+Z+QEBAQgICICSkhIsLCxgYWGBT58+4fHjx9i6dSt27tyJadOm5fh6F4bu3buzB+ZXsru3f+uXX34phmhy5+XLl/j999+/+9lRmJycnKCpqZljm7wkdItLUFAQ1NXVkZKSgu3btzNJW4h0dHQwbtw4TJ48GS1atECjRo0UHRIRERH9oPjNn4jyRFdXFwsXLsxymUQiwcaNGzF//nwsXboUnTp1yjZRGhQUBADo378/Vq1ahV27duXY8yguLg69e/dGXFwc7O3tMXnyZNSqVUumzfPnzzF79mycPHkSI0aMwK5du/KU4Bw2bFiWPZPS09Ph7e2NvXv3wsfHB6GhobneZl5du3YNALB48eIi6a1GeVO2bFm8f/8ehw4d+m6SVlVVFWlpaQXeZ36TvfPnz0dycjIqVapU4BgKKiwsDKdPn8ayZcugqqoqzD9//jyWLl2KSpUqYdOmTTI906VSKfbv34+JEydixowZaNCgAerUqVNkMTJBKyune3tJVVg/jOTF5MmTUaVKlWLfb0HExsbi0qVLsLe3x+vXrxEeHo7Y2FhUrVpV0aH9NDp16oSNGzdi+vTp2Lt3L39cJSIionxhTVoiKjRKSkoYOHAg
6tWrh/T0dJw9ezbLds+ePcPly5dhamqKPn36QElJCbt370ZGRka2254xYwbi4uLQpk0brF69Wi5BCwBVqlTBsmXLYG5ujps3b+LIkSOFclwqKiqYMmUKlJWVcffuXTx79qxQtpuVzCSfkZFRke2Dcq9WrVowNjbGxYsX8ebNmyzbvHjxAtHR0WjevHkxRyerUqVKqFmz5nd7+RW19PR0LFq0CNWrV5frrRcSEgIAGDFihFzpEJFIBFdXV/Tu3RsSiQQ7d+4stpiJfmbBwcGQSqWwt7eHs7MzpFIp31+FTCQSYdCgQbh//z727Nmj6HCIiIjoB8UkLREVusqVKwP4Ul81K0FBQZBKpWjfvr1QhzIuLg4nTpzIsv2zZ89w4sQJaGlpYdasWTnWm1VWVsbo0aPRo0ePQk10litXDuXKlQMAvH79WmbZkydPMGnSJLRo0QL16tVDixYt4O3tnWWdWwcHB1hbW+PevXvw8PBAvXr10Lp1a/Tr1w+mpqbCOm3btoWpqSkuXLggrPvgwQNMmDABzZs3R7169WBvb4/x48fjwYMHcvvx9PSEqakp7t27h759+8LCwgL29vY4ePAgLly4AFNTU8yfPx8PHjzAiBEjYGNjg4YNG6Jfv36IiYkBAFy+fBmenp6wsrJCixYtMGnSJCQmJsrt6/Hjx5g2bRratWsHS0tL1K9fH05OTpg5cybi4uJk2gYHB8PU1BQbN27EtWvXMGjQIDRu3BiWlpbo1atXtteAWCzG/Pnz0a5dO9SvXx+tW7fGX3/9hYcPH8q1TUlJwZo1a+Dm5oYGDRqgUaNG6Nu3b7bb/h5nZ2ekp6fj2LFjWS4/cOAAAMDFxSXbbeQ2pkmTJgn1Qa9cuQJTU1N4enoCgHDe/Pz8EBgYCDs7OzRo0EBIamae84iICJltpqamYuPGjfDw8ICVlRVsbW3Rt2/fLOuehoaG4rfffoOdnR3q16+PNm3aYObMmfjvv/9y/XodOnQIz549Q9euXeWWZb53RCJRtuu7ubnBzc0NJiYmcssK4z2wZs0amcfRzc3N5R5Pf//+PRYvXoz27dvDwsICTZo0wdChQ3H58uUsY/7w4QMWLVqENm3aoH79+ujYsSN2796d7TF+z/Pnz/Hnn3/C2toaDRs2RP/+/XHu3DmZNsOGDYOpqWm2ybbVq1fD1NQUAQEB+Y7jeyZNmgRTU1Pcvn0bISEh6NKlCywtLWFjY4ORI0fi/v37Wa4XFhaGnj17omHDhmjatCmmTZuGd+/e5WnfS5cuFcozxMXFwdTUFA4ODjJt8nK9FJX83vNu3LiB4cOHw9bWFlZWVhg8eDDu3buX5/1LJBLs3bsXIpEI7dq1g6urK0QiEYKCgpCamprteh8/fkRAQABcXFxgaWmJ5s2bY9iwYYiOjpZrm9t7THb3KOD/Xqdx48YJ8753zwPy9vnz9XaHDx8Oe3t7WFlZwdXVFStWrEBSUhKAL6+9qakpWrRokWVv7ZSUFDRu3BhWVlbCOsCXz+3y5ctj7dq17KlPRERE+cIkLREVqk+fPiEqKgoAULt2bbnlmV8YlZSU0KlTJwBA586dAQA7duzIcpv79u0DALRu3TrHwZsytWjRArNmzSrUmp6vX78WEpRfP05+/vx5dO7cGXv27EG5cuXg4OCAcuXK4d9//4WHhwdu3Lght620tDQMGTIE79+/R8uWLSESieDg4ABXV1doaWkBABwdHeHq6irUfzxx4gQ8PDwQEhICXV1dODo6Qk9PD6GhoejSpQtOnjyZZdwjR47Eo0eP0LJlS6ioqKBevXrCsps3b6Jbt264efMmbGxsUKFCBURGRqJv377YvXs3+vbti7dv36JZs2ZISUnBnj17MGjQIJkvn5cvX0bnzp2xc+dOaGtro0WLFrCyskJCQgK2bduGnj174uPHj3JxRUZGok+fPnj48CFsbGxgbGyMK1eu4I8//sDhw4dl2t67dw+dO3fG+vXrkZ6ejlatWkFXVxcH
DhxAly5dhKQy8CWx8Ntvv2HhwoUQi8Vo2rQp6tevL2x76dKl3z3X33J2dgbwJfmYlbCwMFSuXBkNGjTIcnleYrKyshIG/dLT04Orq6vcIGBnzpyBn58fatWqBUtLSxgbG0NJKeuP80+fPsHT0xNz587F8+fPYWtrizp16iAqKgqDBw/G1q1bhbbLly/H+PHjcfPmTZiZmaFly5bIyMjAtm3b0LVrV4jF4ly9Xpm9yLKqb5pZvuDvv//GmTNnskyA1K9fHwsWLECvXr1k5hfWe6BBgwZwdXUVlru4uMhMv3r1Cl27dsXKlSuRnJyM5s2bo3bt2jhz5gw8PT3lkq/v3r1Dnz59sGrVKqSkpKBVq1bQ0NCAj48PNm7cmKvX7GufPn1Cr169cO7cOdjY2MDc3ByRkZH4/fffZc5XZhJ87969WW5nz549EIlEwv21KC1btgwTJkxAeno6mjdvDk1NTRw5cgQ9e/ZEbGysTNt//vkHo0ePxo0bN9CwYUM0aNAAISEheRq8DPhS9zWzp7ampiZcXV1lem7n93opKnm5550+fRq9evXC8ePHUa1aNTRv3hy3bt1Cr1698PLlyzztNyIiAi9fvkSTJk1QsWJFVKxYEba2tkhMTMz2aZO4uDh07doVS5cuxdu3b9G8eXNUrVoVJ0+elEss5+Uek1/Z3fPy8/mzdu1a9OvXDydPnoSxsTGaNWuGN2/e4O+//8agQYOQmpqKevXqoU6dOoiLi0NkZKRcPMePH8f79+/h7OwsfGYDgJqaGuzt7fHkyRPh/0FEREREeSIlIsqFyMhIqYmJibR169ZyyzIyMqRv376VRkRESHv06CE1MTGRdu7cWZqeni7X9vTp01ITExNp//79hXmfP3+WWltbS01NTaXPnj2TW2fo0KFSExMT6Y4dOwr3oP6/1q1bS01MTKSRkZFZLk9KShJi6NOnjzA/MTFRamNjI61bt6704MGDMuvs2LFDamJiInV0dJSmpKTI7atz587C/IyMDLnlT548EebFx8dLLS0tpaamptLg4GCZ/ezevVtqamoqtbKykr569UqY/9tvv0lNTEykLVu2lL5580ZmP5nn0sTERDpmzBhpamqqVCqVSlNSUqRdunQRli1btkwmBmtra6mJiYn0+vXrwnwXFxepiYmJ3PHHx8cLxxISEiLMDwoKErY/Z84cYd9SqVQ6d+5cqYmJidTDw0OYl5GRIXV3d5eamJhI/f39ZV6rLVu2SE1MTKQuLi7CvIkTJ0pNTEykf/31l/TTp0/C/MePHwvxhIeHS78n8zXq2bOnVCqVStu1ayc1MzOTJiYmyrR79OiR1MTERLpw4UJpbGys1MTERNq8eXOZNnmN6dt9fzvfxMREumbNGpnXSCr9v3P+9bZ8fX2lJiYm0r59+0rfv38vzL927Zq0QYMGUjMzM2lCQoI0JSVF2qBBA6mNjY00Li5OaJeWlib18vKSmpiYSJcsWfLd1y05OVlqYWEhtbW1zXJ5XFyctHnz5sJx2NraSv/66y/p9u3bpQ8ePMh2u4X9HpBKpUIMaWlpMtvLXG/evHky12d0dLTU2tpaam5uLr1//74wf9asWVITExPpH3/8If38+bMwf9euXcI+Jk6cmMOr9kXm9WNiYiLt2LGjVCwWC8vOnj0rNTc3l9arV08aGxsrlUq/nBtbW1upiYmJ3H3z2rVrUhMTE2m/fv2+u9+c7u3fk3lt161bV3rgwAFh/ufPn6U9e/YUXsev4zI1NZXa2NhI7969K8x/9uyZtFWrVsLx51Z277n8XC/fkxlb5uufW3m95338+FHarFkzqampqXTfvn3C/E+fPkkHDhwobCu7z6tvjR49Wu4+vH//frnPs68NGzZMuGd9/fl17NgxaZ06daQ2NjbCceT2HiOVZn2P+vZ1Gjt2rDDve/e8vH7+XL9+XVqnTh1pw4YNpZcvXxbmJyUlCbGtX79eKpVKpZs2bZKamJhIx48fLxfroEGDpCYm
JtJLly7JLdu+fbvUxMREumjRoqxeWiIiIqIcsSctEeXJixcvYGpqKvOvbt26sLGxQf/+/XH16lW0bNkSa9asybIsQXBwMADAw8NDmKeurg4XFxdIpdIse9NmPmqd3Qjec+bMwbhx4+T+zZkzJ0/HtnLlSpn1x44di4EDB8Le3h4nT56Evr4+fH19hfa7d+/G27dv0bt3b6G3ZaYePXqgdevWiI2NxdGjR+X21b17d6ipqQFAtr0gM+3cuRNJSUno3LmzXK+4rl27onPnzvj06RO2b98ut66rqyvKly+f5X5EIhF8fHyEgZ3U1NTQrl07AEDFihUxdOhQoW2FChVgZWUFAEJN3k+fPqFevXro0qWL3PFXqFBB6NX2/Plzubj09fUxfvx4mUGlMnvSff2I9NWrV3Hr1i3Url0b48aNkzmGPn36wMbGBjo6OkhMTERcXBxCQ0NRoUIF+Pr6yvRwql69OiZNmgQAWLdunVw835NdyYODBw8CADp27JjlekURk7KyMnr37i1MZ3f9pKamIigoCCoqKvD390eZMmWEZfXr10efPn1gYmKCe/fu4cOHD0hOToampiZ0dXWFdioqKhg7dixmzJiB1q1bfze26OhopKSkZDvgl4GBAbZv346WLVsC+NJDff/+/Zg+fTo6dOiA1q1bY/HixXK934rqPfCta9eu4eLFi6hTp47c9dmgQQMMHz4caWlp2Lx5M4Avr3FwcDBUVVUxe/ZsqKurC+27deuWq9csK1OnThV60QOAvb09evbsKZxT4Mu5yXwa4dvetJm9mb++z35PVvf2b/8NHz48y3UdHBzQoUMHYVpdXV0YaO/r9/POnTshlUoxYsQImXIWVatWxeTJk3Md6/cU5Hr5HkdHxxxfI2tr6yzXy+0979ixYxCLxXBycpIpoaKlpYV58+bJrP897969w7Fjx1CmTBm0bdtWmN+mTRuUL18ely5dkiv9kFl6qHz58vDz8xM+pzKPvUOHDqhWrRqePHmSp3tMQWR1z8vP58/OnTshkUgwbNgwNGrUSJivqamJSZMmoVq1asITA25ublBTU8PRo0dlShqIxWKEh4ejevXqWZ7rzHvf16WKiIiIiHKLQ48SUZ5oaWkJjzFLpVK8evVKqNPYsWNHjBo1CtWrV89y3bdv3+L48eMoW7aszBdG4MsX523btiE4OBh//vmnzBfDzEeipdnUeDt27FiW9V8rV66MKVOm5PrYvq2Tp6ysDG1tbRgbG8Pe3h6enp6oUKGCsDzzS1iTJk2y3F7z5s1x8uRJXLhwQS6Jl5dR6y9dugQAQgL1Wx06dEBwcDAuXrwotyyn/VSrVk2ufETmdO3ateWS7JlfwFNSUgAA2tramDt3rtx24+LicPv2bdy5cwcAsqx7aG5uLjf6tYGBgbB9iUQCJSUl4ZhatWqVZR3TwMBA4e8DBw4gIyMDFhYWMsnQTM2aNYOSkhKioqKQkZGRY23jbzk7O2P58uU4fPgwunXrJswPCwtDzZo1UadOnSyT0ZcvXy70mKpVq5bltr4VExODpKQkNGjQIMsfOMaPHy8z/euvv+LRo0fo0qULXF1d0aJFC5iamqJ69erZvqe/lfmDSmZd6qxUrlwZq1evxtOnT3HixAlERkYiKioKHz58wMuXL7Fy5UqEhIQgMDBQGH2+qN4D38p8Tzdu3DjLhG7z5s0xb948YT+Zr7GVlVWWpVicnJzy/Fi9gYFBlvcUBwcHBAYGyhxj165dsX79eoSEhGDkyJEAvrzfDh48CB0dHbn7bE6+vrdn5+tyKV/LqtRH5vs5OTlZmJd5Hlu0aCHXvlWrVlBRUUF6enquY85OQa6X73FycspxcL7s3pu5vefl9BpVqFABDRo0yLY28rf279+P1NRUdO7cGRoaGsJ8NTU1uLq6IjAwEDt27ICPj4+wLPM9YGdnJ7NOpv/973/C31FRUXm6x+RXVve8/Hz+ZJ7vb2sXA1/Oz9c/qJYvXx6Ojo4ICwvD
kSNH4O7uDuBL3e6MjIxsy4hUqVIFwJeyKURERER5xSQtEeWJrq4uFi5cKDMvKioKQ4YMwYEDB2BiYoJhw4Zlue6+ffuQmpoKDQ0NDBo0SG65kpISEhMTcfjwYZkakQYGBrh37x4SEhKy3O63A688ffo0T8mJTJs3b8424ZqVzISUl5dXju2y+rKWOQhZbsTHxwPIPvGV+aUwq5qhOe0nq2WZidCcln3rypUr2LVrF27evIlnz57h8+fPMu2zSq6XLVtWbt7XCYzMhEXmMVWsWDHb48iUWavxxIkTcgNBfS05ORnv3r3LVX3jTCYmJqhVqxYiIyPx7t07lCtXDnfv3sX9+/eF5FhxxZTZK/R78vLaAV/qxI4cORJ3797F3bt3sXDhQlSoUAEODg7o3r17tgm6r2UODKajo/PdtsbGxhgwYAAGDBgAiUSCmzdv4tChQ9i2bRv+++8//PXXX0L916J6D3wr83wFBgbK/ADwrcz3dGZc2fXyz4wrL7I7xsyBEL8eDKlmzZqwsrLC1atXcfnyZVhbW+PkyZN4+/YtevTokWWSLTtZ3dtzK6vXOPMHh6/rDuf0eqmpqcHAwECm5urOnTuFpOXXevbsmW2P1a/3k9vr5ciRI1nWZ23btq3cZ8nkyZPzdV5ze8/LzTWV2yRtZq/rS5cuCQMQZsqssR4SEoKxY8cKiee83Dfyeo/Jr5zueXn5/MmM9+u68jnp2rUrwsLCsHfvXiFJm1lTP3P6W5k/Zn47wCgRERFRbjBJS0QF1qhRI8yfPx8jRozA4sWLUbVq1Swf/878wvj+/fscezDt2LFDJklbt25dnDt3DleuXJHpxahoGRkZAL4MaJZTUqpWrVpy87732PXXsutBnCkzCfJ17+Pc7OfbXl35MXPmTGzbtg1KSkqoU6cOnJ2dUbNmTTRo0ADh4eFYuXJllutll/D9Vl561WW+DrVq1ULdunVzvV5uOTs7Y+nSpTh27Bi6dOny3VIHRRVTbl+7zOszt0xNTXHw4EGEh4fj5MmTOH/+PJ48eYKdO3di165dmDJlyncHd8o8X1kNCJaUlIQHDx5AWVkZ5ubmMsuUlJRgYWEBCwsLODs7o0ePHrh+/ToeP36MGjVqFNl7ILvtWFhY5Nh7OPMcfO9c5Oc99nXJhNxss0uXLrh69SpCQ0NhbW2NkJAQAHkrdVBQub0mv9fu257kV69eFQaO/JqdnV2OSdq8Xi93797Ncj/Gxsb5+sEvK0X1GmXnzp07uHnzJgDg0aNHePToUZbt3r9/jwMHDggD0eXlvpHXe0x+t5Xda5LXz5+89tK2s7NDpUqVcOHCBcTFxSExMRH37t2Dvb298KPJtzKvrcJ8bYiIiKj0YJKWiAqFk5MTunbtin///RczZsxA48aNhUc5AeD27du4ffs2DA0NcerUqSwTJ2KxGC1btsTly5fx4MEDIbnZqVMnrFmzBseOHcPkyZOz7JGkCAYGBnjy5An69u0LOzu7It3P48eP8eLFC9SuXVtueebo6fr6+kUWQ1YuXryIbdu2oWLFili7dq1cMjq7kcPzIvMa+rr34NfOnz+PhIQE2NjYCKUo6tatm+8egTnJTNIeOnQIXbp0QVhYGMzNzVGjRo1s1ynqmHKSue/sXrvHjx8jKioKFhYWQi9fFRUVtGzZUqgZ+/LlS2zevBkbNmzA4sWL0bNnzywToZkye1S+efNGbtndu3fRs2dP1KxZU0hwZyVzZPUbN27g3bt3AIrvPZD5mjVr1gxjxoz5bvvM3o5f9/78WmavyLzIbp3Mki7f9gLs0KED5syZg2PHjmHChAk4d+4cfv31V1haWuZ530Ut85758uVL1KxZU2aZRCKRe1pi3rx5mDdvXr72k5frZeTIkTn2iC9OmddUViV8gNxfU5k/ig4ZMgRjx47Nss26devg7++PHTt2CEna7903YmJi8PDhQzRs2DDP95jMZGtWCcwPHz7k6rgy5efzp0KFCnjx4gVevXqV
5X17x44dMDAwEMohKCkpoXPnzli2bBmOHTsmvPZdunTJNq7MHsol5f8pRERE9GPhwGFEVGgmTpyIChUq4P3793K14jK/MHbo0CHbnm0VKlSAvb09AMgM6FK7dm20bdsW79+/h7e3d5a99L6W2XuoqDVu3BgAcPr06SyX+/v7w93dHbt27SqU/Rw+fDjL5WFhYQAAGxubAu0nr6KjowF8eST42y/IGRkZiIyMBPD9Xm05adiwIQDgzJkzWS5fvHgxxo0bh8TEROF1unTpkkwdzEwxMTFo27YtRo4cma+YatasCRMTE5w/fx6RkZF4+vSpzGBJWclPTLntcfc95ubmUFNTw40bN7J89DYoKAje3t44f/48zp8/D2dnZ0ydOlWmTaVKlTBp0iSULVsWSUlJePv2bY77zOx9mlXSxsTEBNra2nj48CHOnz+f7TZSUlLw8uVLqKqqCtsrrvdA5n7Onj2b5X3m6NGjcHZ2xowZMwB8SSiXLVsWN2/ezDJRe+rUqTzH8Pjx4yzrG2cmnb49Rm1tbbRv3x6vX7/G33//jZSUlGLtRZsXmT9mZZVAu3DhQpbvkZxk914pqffM3LC1tQWALAec/PDhA6Kior67jdTUVKFn8NeDj33L1dUVysrKiImJET43M++558+fz7Ke+Pr16zFx4kTcu3cvT/cY4P/q9WbVNvPzJLfy8/mTeWxZfWY/fPgQ06dPxz///CMz38PDAyKRCEePHsWxY8dQrlw5YVCyrGQmcnP68Y6IiIgoO0zSElGhKVu2LCZOnAjgy6j3mQNx5fYLIwChzltISIjMF3Y/Pz9Uq1YNR44cwW+//YaYmBi5dWNjYzFt2jSh11Beao7mR48ePaClpYUtW7bgwIEDMstOnDiBzZs3486dO7CwsCjQfrp37w4tLS3s2bNHGLU9U1BQEEJCQqClpZXtQCZFRVdXF8CXL/Nfn6vk5GRMnTpVGLE8c6Cx/GjatClq1qyJ27dvIyAgQOYL97Zt23Dt2jWYmJigbt26qFq1KhwdHfHq1St4e3vj48ePQtvXr1/D29sbT58+RcWKFfOdCHV2dkZaWhpmzpwJkUiUY6kDAPmKKfNx96/b5oe2tjY6d+6MtLQ0TJkyReYcxcTEYMuWLdDQ0EC7du1gamqKZ8+eISQkRC4JdOrUKbx//x6VKlWSGTgvK/Xr14eKigpiYmLkestpa2tj4MCBAIA///wT+/fvl0uEJiYmCkn3Ll26CLUoi+I9kPk6f92Dr0mTJqhbty5u3rwJf39/mSTV06dPMXv2bDx69EhIwKiqqqJ3797IyMjAhAkTZM7Z4cOHs3yE/nukUikmTZokt62goCCUKVMmy5IvmT37tm7dCmVlZXTq1CnP+y0Offr0gaqqKlavXi1TVzU+Ph6zZs3K8/Yyz2FSUpLMtVRS75m54eDggGrVqiEiIgIbN24U5qempsLHxwdJSUnf3caJEyfw5s0bmJiY5FgL28DAAM2aNQPwfz+MZg6U+fr1a/j6+sqUCDh58iQOHToEfX19NGvWLE/3GOD/BvHbsWOHzHvr0KFDWSalc5Kfz58+ffpAJBJh+fLluHXrljD/06dPwvXn5uYms58qVaqgadOmuHDhAh48eICOHTvm+DTB1atXAfxfQpiIiIgoL1jugIgKlaurK4KCgnD+/HnMnDkT+/btw7Fjx/D27VtUr179u4MPOTo6omzZsnj//j32798vJCTKli2L3bt3Y+rUqThy5Ai6du2KqlWrokaNGlBXV0dsbKwwmrO6ujp69+6NUaNGFemxGhoaYv78+fjrr7/w119/YdmyZfj111/x33//4caNGwCAKVOmFLgW6df7mTRpEjZu3IgaNWrg8ePHuHPnDjQ1NeHv75/tIDlFxdnZGQEBAbh37x6cnJxgaWmJ1NRUXL16FR8+fEDt2rVx//79bAd8yw0lJSUsWrQI/fv3x9KlS7F//36YmJjg2bNnuH37NrS1tbF48WKhva+vL54+
fYoDBw4gPDwcFhYWEIlEuHz5MpKSktCwYcNcPcae0zH/888/ePToERo2bJirAXPyGlOVKlWgrKyMe/fuoV+/fjA1NcWUKVPyFe+ECRNw48YNnDp1Cg4ODrC2tsa7d+9w+fJlZGRkYP78+cIxjB8/HnPnzkWfPn1gaWkJAwMDxMXFITo6GsrKypg2bdp3k9va2tqwsbFBREQEbt26JfcDxYgRI5CQkIDt27dj7Nix8PPzg7m5OXR0dBAfH4/r168jLS0NLVu2hLe3t7BeUbwHjI2Nce/ePfTt2xfVq1fH/PnzoaWlhcWLF6Nfv37YsGEDDhw4AHNzc3z+/BmXL19GWloa2rVrh99++03YzvDhw3HlyhVcvHgRTk5OaNy4MRISEnDlyhVhUK+8qFGjBu7fv482bdrA2toaYrEYV69ehaqqKvz9/bMs6WBtbS28Hq1atZIpNZNbb968wbhx477brnHjxujRo0eetw98qc08ZcoUzJo1C3379kXjxo2hra2NyMhI/PLLL9DT0xMeF88NPT094fOiZ8+eqFatGhYuXFik98y5c+cKg2zlFFd+37Pq6upYuHAhBg0ahLlz52Lv3r2oVq0arl+/jsTERJiZmckkGLOS+eTK935EAoDOnTvjzJkzOHDgACZNmgQdHR34+fmhT58+2LVrF86dOwcLCwvEx8fj6tWrUFFRwaJFi4TXIC/3mO7du2Pr1q24evUq2rZti/r16yM2Nha3bt1C586d5RLqOcnP54+VlRX+/PNP/P333+jWrRusra2hra2N6OhovH79Gs2aNcOAAQPk9tW1a1ehN/D3eqlnDnSXWTKBiIiIKC+YpCWiQjd9+nS4ubnhyZMnWLNmjZCk+F4vWuDLF1RnZ2fs3LkTO3bskOk1Vr58eSxduhTXr19HaGgoLl++jJiYGHz8+BG6urpo1qwZ7Ozs4OHhUeS9aDO1bdsWQUFBWLt2LSIjI3Hq1Cno6+ujdevWGDBgAJo0aVJo+/n333+xZs0aXLhwAQ8fPkSFChXQtWtXDBw4UK6+Y3HQ0dHBrl278M8//yAyMhKnT5+GtrY2zMzM0LNnTzRt2hR2dnY4d+4c0tLSoKqqmq/91KlTB3v27MHKlStx5swZnDhxAmXKlIGLiwu8vLxkHivV19fHrl27sGnTJoSFheHSpUtQU1NDjRo10KlTpzyPeP+tGjVqoG7durh9+3auEiD5iUlfXx9+fn4ICAhAVFQUXr58me+Ej46ODrZu3YqNGzfiwIEDOHXqFFRUVGBjY4NBgwYJ5UUAoH///jAwMMD27dtx584dxMTEQFdXFx06dMCgQYPkBvvKTrdu3RAREYHDhw/LJWlFIhFmzJgBNzc37NmzB5cvX8a1a9eQnJyM8uXLo3nz5ujUqRPat28vt93Cfg/4+flhxowZuH//PuLj4xEbGwtTU1PUqFEDe/fuxdq1a3H8+HGEh4dDW1sb9erVQ/fu3eHm5iYzeJO6ujrWrVuHjRs3Ys+ePTh9+jQMDAwwbtw41KtXD/37989TXEZGRli2bBnmzZuHc+fOQUlJCa1bt8bIkSNzPAcNGzbE48eP813qICkpKVc9f1VUVPKdpAWA3r17o3r16li9ejVu3LgBkUiEli1bYtKkSejVq1eetqWkpISFCxdi/vz5uHXrFmJjY/Hu3TuUK1euyO6Zx44d+26bypUr5/s9CwANGjTArl27sGzZMmEAP3NzcyxatAi7du3KMUkbFxeH8PBwALlL0n79w2hISAj69OkDIyMjBAUFYfXq1Th27BhOnDgBTU1NtG7dGsOHD0f9+vWF9fNyj6lUqRJ27NiBf/75BxcuXMDp06dRu3ZtLF68GKampnlK0ub38+ePP/6AmZkZNm3ahJiYGCQnJ6NKlSr47bffMGjQoCzLMTVq1AjAl5ItOT0Z8/HjR0RGRqJWrVrCOkRERER5IZIWpFggERER0VckEglcXV3x5s0bnDp1KsdHg6lwpKamokWLFlBW
VsapU6fy/YMIEcnbuHEj5s6dCx8fH3h6embbbsuWLfD19cXixYu/W6+ciIiIKCusSUtERESFRklJCV5eXnj9+jUOHjyo6HB+WhKJBKmpqUhPT8fChQvx5s0b9OzZkwlaokLw+fNnAMC9e/ewZs0a6Ojo5FjDWCKRYNu2bTAxMcnySQAiIiKi3GC5AyIiIipUzs7O2Lt3L5YsWQJnZ2dhgCcqPOnp6bCysoJIJEJaWhoMDQ3zXFqBiLK2fPlybNy4URh4bMKECdDR0cm2/e7du/HkyRNs27Yty5IJRERERLlRYv4XkZqaChcXF1y4cCHbNrdu3UK3bt3QoEEDdOnSRRiYh4iIiEqWOXPmICUlBevXr1d0KD8lNTU11KlTByKRCFZWVli7di3KlCmj6LCIfgp169aFsrIy9PT0MGLECAwcODDbth8/fsQ///yDYcOGwdLSsviCJCIiop9OiahJm5KSgrFjx+Lo0aPYvHlzlgPtJCUloW3btnB1dUXXrl2xfft2hIWF4ejRo9DS0lJA1EREREREREREREQFp/CetA8ePED37t3x7NmzHNsdPHgQ6urqmDBhAmrWrAlvb29oa2vj0KFDxRQpERERERERERERUeFTeJL24sWLaNKkCXbu3Jlju2vXrqFRo0YQiUQAAJFIhIYNGyI6OroYoiQiIiIiIiIiIiIqGgofOKx37965aicWi1GrVi2Zefr6+rh//36u9yWRSJCeng4lJSUh2UtEREREREREiiWVSiGRSKCiosJB+IioVFJ4kja3kpOToaamJjNPTU0Nqampud5Geno6YmJiCjs0IiIiIiIiIioEFhYWct/9iYhKgx8mSauuri6XkE1NTYWGhkaut5H5a5yZmRmUlZULNT4qGTIyMnDr1i2eYypVeN1TacVrn0ojXvdUGvG6Lx0yzzN70RJRafXDJGkNDQ2RkJAgMy8hIQEGBga53kZmiQM1NTV+uP+kMjIyAPAcU+nC655KK177VBrxuqfSiNd96ZB5nlmakIhKqx/mJ6oGDRrg6tWrkEqlAL7Uq7ly5QoaNGig4MiIiIiIiIiIiIiI8q9EJ2nFYjE+f/4MAGjfvj3ev38PPz8/PHjwAH5+fkhOToazs7OCoyQiIiIiIiIiIiLKvxKdpLW3t8fBgwcBADo6Oli1ahWioqLg4eGBa9euYfXq1dDS0lJwlERERERERERERET5V6Jq0t69ezfH6fr162PPnj3FGRIRERERERERUaGRSCRyA6MT0c9JVVU11/XUS1SSloiIiIiIiIjoZ5WamorHjx9DIpEoOhQiKibly5eHkZHRdwdGZJKWiIiIiIiIiKiISaVS/Pfff1BWVkbVqlWhpFSiK1ASUQFJpVIkJSUhPj4eAFCxYsUc2zNJS0RERERERERUxNLT05GUlIRKlSpxfB2iUkJTUxMAEB8fDwMDgxxLH/BnGyIiIiIiIiKiIpaRkQEAUFNTU3AkRFScMn+USUtLy7Edk7RERERERERERMXke3Upiejnktv3PJO0RERERERERERERArEJC0REREREREREZVapqamMDU1xcuXL+WWbd++Haampli6dGmutvX69WuEhYXJbPvChQuFFquDgwOCg4MLbXtUcjBJS0REREREREREpZqqqipOnDghN//YsWN5KlGxcOFCnD59ujBDo1KCSVoiIiIiIiIiIirVrK2t5ZK0Hz9+xNWrV2FmZpbr7Uil0sIOjUoJJmmJiIiIiIiIiKhUc3R0xMWLF/Hx40dh3qlTp2BtbQ1tbW2Ztjt27ICDgwOsrKzg6emJu3fvAgCWLl2KPXv2YM+ePXBwcBDaX758Ga6urrCwsMBvv/2GFy9eCMsePnyI33//HQ0bNkTz5s0REBAAiUQis69WrVqhYcOGWL58uUwcd+7cQc+ePdGgQQNhXfpxMUlLRERERERERESlmomJCQwNDXHmzBlh3tGjR+Hk5CTT7sSJEwgICMDUqVOxZ88eNGrUCH37
9sW7d+8wcOBAODs7w9nZGf/++6+wzu7du+Hj44N///0X7969w8KFCwEAiYmJ6N27NwwMDLB7925Mnz4dW7ZswebNmwEAZ8+ehZ+fH0aPHo2dO3ciJiZGJsE7YcIE1K1bF/v374efnx/Wrl3LUgs/MCZpiYiIiIiIiIio1HN0dBRKHqSmpiI8PByOjo4ybdauXYuhQ4eidevWqF69OkaPHo3KlSsjNDQU2tra0NDQgIaGBvT09IR1/vjjDzRp0gSmpqbo2rUr7ty5AwDYv38/NDU14evri5o1a8LJyQl//vkn1q5dC+BLctfV1RXu7u6oXbs25syZA3V1dWG7L168QPny5VG5cmW0aNECGzZsyFNpBipZmKQlIiIiIiIiIqJSz9HREWfPnkV6ejrOnz8PExMT6Ovry7R5+PAhFixYACsrK+HfnTt38OTJk2y3W61aNeHvMmXKICUlRdiWubk5VFRUhOVWVlYQi8V4//49Hj58iLp16wrLdHV1UbVqVWF66NChWLFiBezt7TFlyhSkpqaiQoUKBX0ZSEFUvt+EiIiIiIiIiIjo59aoUSMAQFRUFI4dO4Y2bdrItcnIyMCUKVNga2srM19HRyfb7SopZd1H8utesZky69FmZGQAkB+ITFVVVfh7yJAhcHZ2xrFjx3DixAn069cPvr6+6NatW7axUMnFnrRERERERERERFTqqaiooGXLljhx4gROnjwpV48WAGrUqIFXr17B2NhY+Ldy5UpER0cDAEQiUa73V6NGDdy8eRNpaWnCvKtXr0JPTw/ly5dH7dq1ERMTIyz7+PEjnj59CgBISUnB7NmzoaamhgEDBiAwMBDdu3fH4cOH83n0pGhM0hIREREREREREeFLyYPdu3dDX19fprRApgEDBmDTpk3Yu3cvnj17hgULFiAsLAw1a9YEAGhqauLFixeIi4v77r5cXV2RmpqKadOm4eHDhzh27BiWLl2KXr16QSQS4bfffkNYWBh27dqFhw8fYtq0afj8+TOAL71wr1y5Al9fXzx69AgxMTG4fPkya9L+wFjugIiIiIiIiIiICIC9vT3S09Oz7EULAB06dEBCQgKWLFmChIQE1KpVCytWrED16tUBAJ06dcKIESPg5uaGyMjIHPelo6ODtWvXws/PD+7u7tDT00O/fv0wdOhQAIC1tTXmzp2Lv//+G4mJiejSpYtMjdrFixdj1qxZ6Nq1K1RUVNC+fXsMHz68cF4IKnYi6bfFLX5iGRkZiI6OhqWlJZSVlRUdDhUBnmMqjXjdU2nFa59KI173VBrxui8dSsN5/vz5Mx4/fowaNWpAQ0ND0eEQUTHJ7Xuf5Q6IiIiIiIiIiIiIFIhJWiIiIiIiIiIiIiIFYpKWiIiIiIiIiIiISIGYpCUiIiIiIiIiIiJSICZpiYiIiIiIiIiIiBSISVoiIiIiIiIiIiIiBWKSloiIiIiIiIiIiEiBmKQlIiIiIiIiIiIiUiAmaYmIiIiIiIiIiIgUiElaIiIiIiIiIiLKkqmpqcy/pk2bwsfHB58+fSrwti9cuABTU9M8r/f8+XO5uMzNzWFvbw9fX1+kpqbKrbN06VKYmpri/PnzWW4zPT0d69atg5ubGywtLWFtbY1BgwYhKioqz/ER5YeKogMgIiIiIiIiIiqtJFIplESiEr2/pUuXwsrKChKJBP/99x+mTZsGf39/zJw5s0CxWFlZ4dy5c/lef/fu3ahYsSIAICUlBRcvXsT06dOhq6sLLy8vmbb79+9HtWrVsHfvXtja2sosk0gkGDp0KG7fvo2JEyeiYcOGSEpKQkhICPr374/NmzfDysoq33ES5QaTtERERERERERECqIkEiH0yQe8/pxe5PvS11CBW/UyeV6vXLlyqFChAgDA0NAQQ4cOxcyZMwucpFVTUxO2mx96enoy61epUgVXrlzBsWPHZJK0N2/exLNnz+Dn5wdfX19MmzYN2trawvLt27cjKioK+/btQ9WqVYX5EyZMwLt3
77Bq1SqsXLky33ES5QbLHRARERERERERKdDrz+mIS84o8n+FlQjW1NSUmY6Li8OoUaPQuHFj1KtXD507d5YpE7B582a0bt0aFhYW8PDwwOXLlwHIlzt4+vQpfv/9d1hZWaFVq1bYvHlznmNTU1ODsrKyzLz9+/ejTp06aNeuHdLS0nDkyBGZ5UFBQfDw8JBJ0GYaO3YsFi5cmOc4iPKKSVoiIiIiIiIiIsqVxMREBAYGws3NTZg3btw4ZGRkYMeOHdi7dy8MDQ0xY8YMAMCtW7fg7++P6dOnIywsDNbW1hg9ejQkEonMdlNSUjBw4EBoa2tj165dmDZtGhYvXoyTJ0/mKi6pVIoLFy5g3759aNeuncz8sLAwtG7dGtra2rC1tcWePXuE5ampqbh16xasra2z3K6enh50dHRy+/IQ5RvLHRARERERERERUbYGDx4MZWVlSKVSJCcno3z58kISViqVwsnJCe3atYORkREAoE+fPhgyZAgA4MWLFxCJRKhUqRKqVKmC0aNHo3Xr1nJJ2nPnziExMRFz5syBjo4OateuDR8fHygpZd+/0MXFBaL/X183NTUVenp66Nu3L37//XehTVRUFP777z84OTkBANq2bYupU6fixYsXqFy5Mt6+fQupVIpy5coJ6zx+/BgeHh4y+7p69Wo+Xz2i3GGSloiIiIiIiIiIsjV79mw0aNAAUqkUb968wZYtW9CrVy/s27cP+vr66NWrFw4ePIgrV67g8ePHuHHjhpCEtbe3h4mJCVxdXWFmZgZHR0d069YNKiqyKanHjx+jRo0aMr1Wu3TpkmNcq1evhqGhIV6+fIlZs2ahTp06GDZsmEy5gwMHDqBy5cowMzMDADg6OmLatGkICQnB8OHDheTs+/fvhXWqVKmCvXv3AgCuXbuG8ePH5//FI8olljsgIiIiIiIiIqJsGRoawtjYGNWrV4eVlRXmzp2L5ORkhIWFQSKRYODAgVi/fj0qVaqE33//Hf7+/sK6mpqa2L17NzZt2gQbGxsEBwfDw8MDcXFxMvv4NmmbG5UqVYKxsTFsbW2xatUqnDp1CvPnzxeWZ2Rk4NChQ3j58iXMzMxgZmYGe3t7SCQShISEAADU1dVhamoq01NWVVUVxsbGMDY2hqGhYZ7jIsoPJmmJiIiIiIiIiCjXlJSUIJVKkZGRgQcPHuDSpUvYuHEjhg0bhlatWiE+Ph7Al1IIV69exapVq9C0aVNMnjwZhw4dQkpKiszAYgBQvXp1PH36FMnJycK8+fPnY/bs2bmKqVq1ahg5ciS2bNmCa9euAQDOnz+PxMRELFmyBHv37hX+TZo0CU+ePMGVK1cAAD169EBwcDD+++8/ue1+m0wmKipM0hIRERERERERUbbevXsHsVgMsViMJ0+eYNasWcjIyICDgwPKli0LJSUlHDhwAC9evMChQ4ewdOlSAF/qxGpoaGDZsmXYvXs3nj9/jgMHDiApKQmmpqYy+7C3t8cvv/yCadOm4eHDhzh+/Dh27NgBe3v7XMfZt29f1KxZE7NmzYJEIsGBAwdQu3ZttG3bFiYmJsK/3r17o3z58kJJg169eqFJkybo2bMn9uzZg6dPn+LOnTtYsGABpkyZgkaNGhXaa0mUHdakJSIiIiIiIiJSIH2N4knP5Hc/I0eOFP7W1NREvXr1sGbNGlStWhUAMGPGDCxbtgyLFi1CjRo14OPjg4kTJ+LWrVuwsrKCn58fli9fjlmzZqFSpUpYsGABatasiYSEBGG7KioqQpvOnTvjl19+wYQJE9CqVatcx6miogIfHx/0798fu3btwtGjR+Hl5SXXTl1dHR4eHvj333/h7e0NdXV1BAQEYNeuXdi2bRtmzZoFkUiEunXrwtfXF25ubvl63YjyQiSVSqWKDqK4ZGRkIDo6GpaWljJFpOnnwXNMpRGveyqteO1TacTrnkojXvelQ2k4z58/fxYGx9LQ0BDmS6RSKIlExRZHce+PqLTL7r3/LfakJSIi
IiKiQhUWFoaAgACkpaXBzc1NrhfT8+fPMXHiRHz8+BFlypTB/PnzUblyZQwePFioYyiRSHDv3j1s3LgRtra2ijgMIqJiUdwJUyZoiUom1qQlIiIiIqJCIxaL4e/vj8DAQBw4cACXL1/G2bNnZdr8888/6NChA0JCQtCuXTssXrwYALBmzRqEhIQgJCQEHTt2hIuLCxO0REREVCowSUtERERERIUmPDwcTZs2hZ6eHlRVVeHu7o6DBw/KtJFIJPj06ROAL48AfvvoX2xsLLZt2wYfH59ii5uIiIhIkVjugIiIiIiICk18fDwMDAyEaQMDA8TFxcm0+fPPP9GzZ08EBgYiPT0dO3bskFm+YsUK9O/fH7q6usUSMxEREZGisSctEREREREVGolEIjdP9E39w4kTJ2LWrFk4e/YsZsyYAS8vL2SOZ/zx40ccP34cPXr0KJZ4iYiIiEoCJmmJiIiIiKjQGBkZQSwWC9Px8fEwMjISphMTE/Ho0SM4OTkBANq1awexWIw3b94AAM6cOQN7e3toa2sXb+BERERECsQkLRERERERFRpbW1tERkYiISEBaWlpCA0NRatWrYTlurq6UFdXx4ULFwAAUVFR0NLSEkobXLlyBTY2NooInYiIiEhhWJOWiIiIiIgKjaGhIcaPH48BAwYgNTUVDg4OaNOmDby9veHg4ABHR0cEBATA19cXnz9/hra2NpYsWSKURHj27BlatGih4KMgIiIiKl5M0hIRERERUaFydnaGs7OzzDw/Pz/h7/r162P37t1Zrrt69eoijY2IiIioJGK5AyIiIiIiIiIiylJaWhqWLl0KR0dH1KtXD61atcLcuXPx8eNHRYdW5JYuXQpPT898r+/p6YmlS5cWOA4HBwcEBwcXeDtUsrEnLRERERERyZBKJBAp/Rz9OX6mYyGin5NEKoGSqPjuU3nd38KFCxEREYHZs2ejatWqiI2NhZ+fH54+fYqVK1cWYaSU6d9//4WWlpaiw6AixiQtERERERHJECkpIT14K6TiOEWHUiCiCoZQ8eij6DCIiHKkJFLCoU+HkJiRWOT70lPWQ3vt9nlaZ8+ePZgzZw5sbW0BAFWqVMGMGTPQp08fxMfHw8DAoChCpa/o6ekpOgQqBvxJmYiIiIiI5EjFccCrFz/0vx89yUxEpUdiRiLEGeIi/5efRLBIJEJkZCQkEokwz8rKCgcOHICuri4A+cfxL1y4AFNTUwDA8+fPYWpqilOnTsHBwQFWVlaYPXs27t27Bw8PD1haWmLo0KFC+YRJkyZhwYIFGD16NBo0aIAOHTrg1q1bWLx4MaytrdGiRQuEhYUJ+4qKikKvXr3QoEEDWFpaYvDgwYiPjwcABAcHo2fPnhgxYgQaNWqEFStWwMzMDImJ//c63LhxAw0aNPhu+YbM4zhy5AicnJxgYWGBoUOH4u3bt0Kbo0ePol27drC0tMSsWbOQkZEhs40dO3YIr4Gnpyfu3r0LAHj48CHq1auHvXv3AgBSU1PRrl07zJkzR+719fT0xIoVK/D777+jfv36aNeuHc6ePSvs482bN/Dy8oKVlRUcHR2xfft24VxQycYkLRERERERERERZalv374IDAyEg4MDpk+fjsOHD+Pz58+oVasWVFVVc72d1atXY/ny5fD19UVgYCC8vLwwduxYrFu3DtHR0fj333+Ftps2bYKNjQ1CQ0NRvnx59OvXD69fv8bOnTuFOCQSCT58+IChQ4eiWbNm2L9/P9atW4dnz57JDEJ59epV1KpVC7t27UKPHj1gaGiIo0ePCsvDwsLQsmVL6Ojo5Oo4Vq5ciUWLFmHLli2IiYnBhg0bAAAPHjzA6NGj0atXLwQFBSE9PR1RUVHCeidOnEBAQACmTp2KPXv2oFGjRujbty/evXuHmjVrYsiQIVi4cCE+fvyIZcuWQSKRYMyYMdnG0LFjR+zfvx916tTB1KlThST6X3/9hcTERGzfvh3Tpk3DsmXLcn2OSLGYpCUi
IiIiIiIioiyNGDECCxYsgJGREXbt2oVRo0ahefPmCAoKytN2hg8fjjp16sDFxQX6+vro2LEjmjVrhkaNGsHW1haPHj0S2tarVw+9e/eGsbExXFxckJycDB8fH9SsWROenp549+4dEhIS8PnzZwwfPhwjRoxA1apV0ahRI7Rt2xb3798XtiUSifDHH3+gZs2a0NPTQ4cOHXDo0CFh+aFDh9CxY8dcH8eoUaNQv359NGjQAK6uroiJiQEABAUFwdraGv3790fNmjUxdepUmVIQa9euxdChQ9G6dWtUr14do0ePRuXKlREaGgoAGDZsGMqUKQNvb2+sW7cOfn5+0NTUzDKGli1bwsPDA9WqVcMff/yB//77D2KxGI8fP0ZERATmz5+POnXqoGXLlvDy8sr1sZFisSYtERERERERERFly83NDW5ubnjz5g3OnTuHLVu2wNvbG6ampqhXr16utlG1alXhbw0NDVSuXFlmOjU1VZiuUqWKzLJffvkFGhoaAAB1dXUAX0oCVKlSBe7u7ti4cSNu376NBw8e4O7du2jYsKGwvr6+vrAuALi4uGDjxo148+YNYmNj8ebNG7Rq1SrXr4WxsbHwt46ODtLS0gB8KVlQt25dYZmqqqrM9MOHD7FgwQIsWrRImJeSkoInT54AANTU1DBz5kx4enqiS5cusLGxyTaG6tWry8QAAOnp6bh79y7Kly8v81pbWlrm+thIsZikJSIiIiIiIiIiOXfu3MHevXsxadIkAICuri5cXV3Rrl07tG3bFpGRkVkmab+txQoAysrKMtNKStk/3K2iIpuuyq5tXFwcunTpAnNzc9jZ2aF79+44deoUrl27JrTJTOpmqlu3LqpVq4Zjx47hyZMncHR0lGuTk5xKPEil0mzbZmRkYMqUKcIAbJm+LrNw584dKCsr4+rVq0hNTYWamlquY5BKpVBRUZGLgX4cLHdARERERERERERyMjIysGHDBty6dUtmvpqaGjQ0NKCnpwfgS9Lw06dPwvLY2Nhiie/o0aMoV64cVq1ahX79+sHa2hqxsbHfTVS6uLjg5MmTOH36dJ5KHeSkdu3aQukDAJBIJLhz544wXaNGDbx69QrGxsbCv5UrVyI6OhoA8OrVK/z999+YN28e0tLSsHLlyjzHULNmTbx7907m9b9x40b+D4qKFZO0REREREREREQkx9zcHK1atcLw4cOxb98+PH/+HNHR0Zg+fTpSU1PRtm1bAICFhQX+/fdf3Lt3DxcuXMD69euLJb7y5cvj5cuXOH/+PGJjY7F69WocOXJEpnRCVlxcXHDu3DmIxWI0a9asUGLp3r07bty4gRUrVuDRo0eYP38+Xr58KSwfMGAANm3ahL179+LZs2dYsGABwsLCULNmTQDAzJkzYWVlBTc3N0yZMgWrV6/GgwcP8hRDjRo1YG9vjylTpuDOnTsIDw/HkiVLCuX4qOix3AERERERERERkQLpKeuV2P38/fffWLlyJQICAvDy5UtoaWnB3t4eW7ZsER7VHz16NCZPngwPDw/8+uuv+PPPPzFmzJjCDl+Os7MzLl26hFGjRkEkEsHCwgITJ07E0qVLc0zUGhsbo1atWjAzM8uxfEFeGBsbY8WKFZg7dy5WrFgBJycntGzZUljeoUMHJCQkYMmSJUhISECtWrWwYsUKVK9eHYcPH8bZs2exb98+AICDgwOaNWuGqVOnYtu2bXmKY+7cuZg6dSq6d+8OQ0NDeHh4YO3atYVyjFS0RNJSVKwiIyMD0dHRsLS0lKuFQj8HnmMqjXjdU2nFa59Ko+K87tNWLQJevSjSfRQ5o8pQHfqXoqOgAuL9vnQoDef58+fPePz4MWrUqCEzkJVEKoGSqPgedC7u/ZVEEokErVu3xvz589G0aVNFh1NokpOTERERgRYtWgjJ57CwMCxYsAAnTpxQcHSlV3bv/W+xJy0RERERERERkYIUd8K0tCdoT506hXPnzkFDQwM2NjaKDqdQqaurY8qU
KejVqxe6dOmChIQELFu2DO3atVN0aJQLpfudSURERERERFQIwsLC0LFjR7Rt2xYBAQFyy58/f44+ffqgU6dO+O233/DihWxP9YiICPTr16+4wiUqtdatW4dDhw7Bz88PSko/V1pMSUkJy5YtQ0REBFxcXODl5YXmzZsXS+kJKjj2pCUiIiIiIiIqALFYDH9/fwQFBaFMmTIYPHgwzp49i+bNmwtt/vnnH3To0AF9+vRBYGAgFi9ejIULFyIjIwMbN27E6tWrYWJiosCjICodAgMDFR1CkbK2tsauXbsUHQblw8/1kwERERERERFRMQsPD0fTpk2hp6cHVVVVuLu74+DBgzJtJBIJPn36BOBLfcLMuoT379/H48eP4evrW+xxExFRycGetEREREREREQFEB8fDwMDA2HawMAAcXFxMm3+/PNP9OzZE4GBgUhPT8eOHTsAAHXq1MHs2bNx4cKFYo2ZiIhKFvakJSIiIiIiIioAiUQiN08kEslMT5w4EbNmzcLZs2cxY8YMeHl5QSqVFleIRERUwjFJS0RERERERFQARkZGEIvFwnR8fDyMjIyE6cTERDx69AhOTk4AgHbt2kEsFuPNmzfFHisREZVMTNISERERERERFYCtrS0iIyORkJCAtLQ0hIaGolWrVsJyXV1dqKurCyUNoqKioKWlBV1dXQVFTEREJQ1r0hIREREREREVgKGhIcaPH48BAwYgNTUVDg4OaNOmDby9veHg4ABHR0cEBATA19cXnz9/hra2NpYsWSJXEoGIiEovJmmJiIiIiIhIIcLCwhAQEIC0tDS4ubnBy8tLWBYXF4chQ4ZAKpXi8+fPkEgkiIuLw4ULF5CUlAQfHx88f/4c2tramDRpEqysrBR4JICzszOcnZ1l5vn5+Ql/169fH7t37852/SZNmqBJkyZFFh9RfqWlpWHlypXYu3cv4uLi8Msvv6Bdu3YYOXIkdHR0FB1ekVq6dCkuXryIwMDAfK3v6ekJGxsbjBw5skBxODg4wMvLCx4eHgXaTkFduHABffv2lZmnqqoKAwMDdO7cuUDH6enpiYsXL8rM09bWRr169eDj4wMTE5N8bztTcHAwAgICcOLECbllBT3XhUHhSdqUlBTMnDkTR44cgYaGBgYOHIiBAwdm2fbo0aNYtGgRXr16hTp16sDHxwfm5ubFHDEREREREREVlFgshr+/P4KCglCmTBkMHjwYZ8+eRfPmzQF86Z0aEhKCjIwMXL16FcuXL8fgwYOhpaWFadOmoU6dOli5ciViY2MxYMAA7N+/HxoaGgo+KqK8k0okECkVXzXKvO5v4cKFiIiIwOzZs1G1alXExsbCz88PT58+xcqVK4swUsr077//QktLS9FhCM6dOyf8nZycjOPHj2P+/PmoWrUq3N3d873dr3OCUqlUuNa8vLxw6NAhKBXh+2TgwIHw9PQssu3nhsKTtP7+/rhx4wY2bdqEly9fYuLEiahUqRLat28v0+7+/fsYO3YsZs2ahYYNG2Ljxo0YOnQojh49Ck1NTQVFT0RERERERPkRHh6Opk2bQk9PDwDg7u6OgwcPCknab9ump6ejR48eAIDbt29j6NChAICqVauifPnyuHr1KmxtbfMdT3EnyorSz3QspYFISQnpwVshFccV/b4qGELFo0+e1tmzZw/mzJkjvL+qVKmCGTNmoE+fPoiPj4eBgUFRhEpfybxPFiVPT0907tw5V711K1SoIDM9YMAAnDlzBkePHi1QklZLS0tm2wYGBvD29kbv3r1x79491KlTJ9/b/h5tbe0i23ZuKTRJm5SUhN27d2PNmjUwNzeHubk57t+/j61bt8olacPDw1GrVi3hZP/111/YunUrHjx4AAsLCwVET0RERERERPn1bXLHwMAAcXHySSqJRILg4GAsWbJEmGdmZob9+/dj9OjRuH//Ph48eICEhIQCxVOcibKilJ8kHCmeVBwHvHpR9PvJxzoikQiRkZFwcHAQejJaWVnh
wIEDwuB33z6On/lY/N27d/H8+XM4Ojpi1apVmDVrFt68eYMuXbqge/fumDRpEh49eoQmTZrgf//7H3R0dDBp0iTo6+vjxYsXOHnyJCpXroyFCxfi8OHD2Lp1K7S0tDB58mShvEhUVBQWLlyIW7duQSQSoXHjxvDz84OBgQGCg4Oxa9cu6OvrIzIyEoMGDcLSpUtx7tw5IfF548YN9OnTB+Hh4TmWb8g8jqVLl8Lf3x9xcXGws7PD/PnzUb58eQBfngBfuHAh4uLi4OHhgYyMDJlt7NixA6tXr8abN2+Ex/hNTU3x8OFDdOrUCbNnz4a7uztSU1Ph6uqKli1bYsqUKTKvr6enJ+zs7HD58mVcunQJFStWhI+Pj/AD15s3bzB16lSEh4dDT08PgwYNwowZM3D37t18nP3cU1NTg7KysjB9+fJlzJkzBw8ePICxsTG8vLzQrl27fG0XgLDtuLg4+Pn54fz580hOTkbt2rXh4+ODRo0a5eocZZJIJBg9ejSePn2KwMBAbNq0SSh3EBwcjD179qBx48bYunUrMjIy0KVLF0yaNEmoJb5x40asW7cOnz59goeHB+7evZvrJHd2FPrT2p07d5Ceni5TO6hRo0a4du0aJBKJTNvy5cvjwYMHiIqKEj6kdXR0UK1ateIOm4iIiIiIiAro2+98ALIcSOv8+fPQ1dVFvXr1hHmTJ0/Gs2fP4Obmhs2bN6NJkyZQVVUtcExCouwH/vejJ5mp5Onbty8CAwPh4OCA6dOn4/Dhw/j8+TNq1aqVp/fd6tWrsXz5cvj6+iIwMBBeXl4YO3Ys1q1bh+joaPz7779C202bNsHGxgahoaEoX748+vXrh9evX2Pnzp1CHBKJBB8+fMDQoUPRrFkz7N+/H+vWrcOzZ8+wevVqYVtXr15FrVq1sGvXLvTo0QOGhoY4evSosDwsLAwtW7bMdX3dlStXYtGiRdiyZQtiYmKwYcMGAMCDBw8wevRo9OrVC0FBQUhPT0dUVJSw3okTJxAQEICpU6diz549aNSoEfr27Yt3796hZs2aGDJkCBYuXIiPHz9i2bJlkEgkGDNmTLYxdOzYEfv370edOnUwdepU4Z76119/ITExEdu3b8e0adOwbNmyXJ+j/MjIyMDhw4cRHh4udLgUi8UYOnQoPDw8sG/fPgwaNAiTJk3C5cuX87Tt+Ph4/P3336hduzZ+/fVXAMC4ceOQkZGBHTt2YO/evTA0NMSMGTNk1svuHH1tzpw5uHPnDtatW4eyZcvKLb969SoeP36M7du3Y+rUqdi8eTMiIiIAAKGhoViyZAmmTJmCnTt34vnz57h06VKeji0rCu1JKxaLoaurK2TFAeCXX35BSkoK3r59K9Odu0OHDjhx4gR69+4NZWVlKCkpYdWqVShXrlye9/vtLxn088g8tzzHVJrwuqfSitc+lUbFdd1/3RPmZ8D7RMlkYGCAS5cuCecnLi4OhoaGcufr2LFjsLOzk5n/8eNHzJgxQ0iquLu7o3LlygU617zuFetHi7c0GTFiBKpWrYpt27Zh165d2LFjB7S1teHt7Y0uXbrkejvDhw9HnTp1UKdOHcyZMwcdO3ZEs2bNAAC2trZ49OiR0LZevXro3bs3AMDFxQVz5syBj48PNDQ04Onpie3btyMhIQEikQjDhw/HgAEDIBKJULVqVbRt2xbXr18XtiUSifDHH38INas7dOiAQ4cOCeVTDh06hAkTJuT6OEaNGoX69esDAFxdXRETEwMACAoKgrW1Nfr37w8AmDp1Kk6ePCmst3btWgwdOhStW7cGAIwePRpnzpxBaGgoPD09MWzYMISFhcHb2xvHjx/H+vXrsy3v2bJlS6HH5h9//IFOnTpBLBYjKSkJEREROHbsGKpWrYo6derAy8sL06dPz3I7K1euxKpVqwAAnz9/RnR0NHx9fQF8SVJm5+vOlikpKahUqRImT56MDh06AAC2bt0KOzs7/PbbbwAAY2Nj3L59G5s2bYK1
tXW22121ahXWr18P4P/uCXZ2dli1ahWUlZUhlUrh5OSEdu3awcjICADQp08fDBkyRGY72Z2jTGvWrMGhQ4ewfft2/PLLL1nGkpGRAV9fX+jo6ODXX3/Fxo0bERMTg2bNmmHbtm3o16+f0Jt7/vz5aNmyZbbHlVsKTdImJyfLJGiB/+vGnJqaKjP/zZs3EIvFmDZtGho0aIDt27dj8uTJ2LNnD/T19fO0329PDv18eI6pNOJ1T6UVr30qjYryutfU1ISZmVmRbV8R7t69i+TkZEWHQd8oW7Yszpw5g9OnT0NbWxtbt26Fk5MToqOjZdpFRERg9OjRMtf9li1bUKZMGXTq1AnXr1/Hhw8fhARDfvC6J8qZm5sb3Nzc8ObNG5w7dw5btmyBt7c3TE1NZXq556Rq1arC3xoaGqhcubLM9Nd5oCpVqsgs++WXX4Qkq7q6OoAveaMqVarA3d0dGzduxO3bt/HgwQPcvXsXDRs2FNbX19eXGVTQxcUFGzduxJs3bxAbG4s3b96gVatWuX4tjI2Nhb91dHSQlpYGAHj48CHq1q0rLFNVVZWZfvjwIRYsWIBFixYJ81JSUvDkyRMAX/JhM2fOhKenJ7p06QIbG5tsY6hevbpMDACQnp6Ou3fvonz58jKvtaWlZbbb6dmzp5BoHDduHNq2bYu2bdvmcPRf7N27Vzim6dOnw9HREX36/F+ZlUePHuHkyZMyydy0tDTUqFEDAOSeqF+7dq0Qj6enJ1JTU7Fp0yZERERgzJgxwrUiEonQq1cvHDx4EFeuXMHjx49x48YNuSczsjtHwJfeuYsXL4aRkZFcbd2v6evry/Su1tHRQXp6OoAv99evE8PlypUTjq0gFJqkVVdXl0vGZk5/OyrnwoULYWJiIpx0X19fODs7IygoSC5j/j0WFhY/3a+k9EVGRgZiYmJ4jqlU4XVPpRWvffrZHDp0CMuWLUNaWhpcXV0xYsQIYVl8fLwwSNLnz5+RkZGB+Ph44ctLfHw8gC8jId+7dw/r1q0r0ABKPxtTU1NFh0DZSElJwaJFi5CamgoHBwcMHjwYU6dORevWreHg4AAASEhIgL6+vsz9vnr16hg3bhymTZsGHR0drFq1CrVq1VLkoZQ4P9p1n/m5TiXLnTt3sHfvXkyaNAkAoKurC1dXV7Rr1w5t27ZFZGRklknarHpGf/v/NaUcBrdTUZFNV2XXNi4uDl26dIG5uTns7OzQvXt3nDp1CteuXRPaZCZ1M9WtWxfVqlXDsWPH8OTJEzg6Osq1yUlOJR6kUtmqv1+3zcjIwJQpU+Q+n79OBN65cwfKysq4evUqUlNT5To25hSDVCqFioqKXAw5KV++vFCrVUNDA/r6+jIJzuxktjE2Nkb58uXRp08fGBkZYcCAAQC+JIxdXV0xbNgwmfUyz2tmkjdzv5nKlSsnbNvX1xeDBw/G0KFDsW/fPpQpUwYSiQQDBw7E+/fv0aFDBzg4OCAtLQ1eXl4y+8npHIlEIqxbtw5TpkzBihUrsi0pkdVrn/naZvbqzWpZQSg0SWtoaIg3b94gPT1dOFFisRgaGhpy9SBu3rwJT09PYVpJSQl16tTBy5cv87xfZWVlfpn7yfEcU2nE655KK1779DMQi8VYuHAhgoKCUKZMGQwePBgRERHCICAVK1ZEaGgoMjIycPXqVSxfvhxDhgxBmTJlhN4nwJfHFk1MTGBvb6+oQymReI8ouTp27IiOHTvKzJszZ47MdFRUFKKjo2Xu9/r6+lnWGKT/w+ueCkNGRgY2bNgANzc3md7mampq0NDQEMpUqqqq4tOnT8Ly2NjYYonv6NGjKFeunPDIPgAEBgZ+N2Hm4uKCkydP4tmzZxg3blyhxFK7dm2ZEgESiQR37txBnTp1AAA1atTAq1evZJKgkydPhpOTExwdHfHq1Sv8/fffmDdvHpYsWYKVK1di1KhReYqhZs2aePfuHWJjY4XetDdu3CiEo8tew4YN0bt3b/z9999o27Yt
KleujBo1auDq1asyx7p+/XqkpqZi2LBhuUoEi0QizJo1Cx07dsT//vc/zJgxAw8ePMClS5dw/vx54drbunUrgNwnSStUqABbW1uMHz8eEydOhIeHR67i+VqtWrVw8+ZNODo6AvhSgufp06d52kZWFDpwWN26daGioiLzSEpUVBQsLCzkfiUxMDDAw4cPZeY9fvxYpgs8EREREdGPKDw8HE2bNoWenh5UVVXh7u6OgwcPZts2PT1dqKWXKTY2Ftu2bYOPj09xhExUrLKry0hERcvc3BytWrXC8OHDsW/fPjx//hzR0dGYPn06UlNThUfjLSws8O+//+LevXu4cOGCUFe0qJUvXx4vX77E+fPnERsbi9WrV+PIkSNyT21/y8XFBefOnYNYLBbq4hZU9+7dcePGDaxYsQKPHj3C/PnzZToWDhgwAJs2bcLevXvx7NkzLFiwAGFhYahZsyYAYObMmbCysoKbmxumTJmC1atX48GDB3mKoUaNGrC3t8eUKVNw584dhIeHY8mSJblaNzAwUKhzm1d//vkntLS0MG/ePABA7969cePGDSxevBhPnjzBvn37sGjRIlSqVClP261UqRKGDh2KnTt34vbt2yhbtiyUlJRw4MABvHjxAocOHcLSpUsByJdN/Z4OHTrA0tJSqMGbF56enti8eTOOHDmChw8fYsqUKUhKSspy8Mu8UGhPWk1NTbi7u2PGjBmYM2cO4uPjsX79esydOxfAlx4FZcqUgYaGBrp3745JkyahXr16sLKywu7du/Hy5Ut07txZkYdARERERFRg8fHxMDAwEKYNDAwQFyc/QrtEIkFwcHCWX7hWrFiB/v37Q1dXt0hjJfqWRCqBkqjo+v8oKyv/dPViib4lqmCIgj8snbv95NXff/+NlStXIiAgAC9fvoSWlhbs7e2xZcsW4VH90aNHY/LkyfDw8MCvv/6KP//8M9vHyAuTs7MzLl26hFGjRkEkEsHCwgITJ07E0qVLc0zaGRsbo1atWjAzM8vx0fi8MDY2xooVKzB37lysWLECTk5OMoNJdejQAQkJCViyZAkSEhJQq1YtrFixAtWrV8fhw4dx9uxZ7Nu3DwDg4OCAZs2aYerUqdi2bVue4pg7dy6mTp2K7t27w9DQEB4eHjJP3RSFsmXL4q+//oKPjw8iIiJgZ2eHlStXYuHChVi3bh0MDQ0xadIkuLm55XnbAwcORFBQEHx9fbFt2zbMmDEDy5Ytw6JFi1CjRg34+Phg4sSJuHXrVo41ZrPi7e0NDw8PHDlyJE/rdezYEU+fPsX06dORkpKCHj16oHLlygW+lkTSwiiaUADJycmYMWMGjhw5Ah0dHfz+++/CSHimpqaYO3eukMnfvXs31q9fj1evXqFu3brw9vaGubl5rveVkZGB6OhoWFpa8tGPnxTPMZVGvO6ptOK1Tz+TlStXIjk5WfhCGxERgXXr1mHdunUy7c6cOYNFixYhKChI5rr/+PEjHB0dceLECWhraxdKTGmrFgGvXhTKthTGqDJUh/6l6ChKhUOfDiExI1HRYeRbddXqsNO043WvQKXhc/3z5894/PgxatSoIVOHUyqRQJRDfdbCVtz7K4kkEglat26N+fPno2nTpooOp9AkJycjIiICLVq0EBKGYWFhWLBgAU6cOKHg6H4eFy9eRNWqVVGxYkUAX2rwNm3aFMuWLUOTJk3k2mf33v+WQnvSAl96086fPx/z58+XW3b37l2Z6W7duqFbt27FFRoRERERUbEwMjLCxYsXhen4+HgYGRnJtTt+/Djs7Ozk5p85cwb29vaFlqAlyqvEjESIM8SKDiPfdJXYA50Up7gTpqU9QXvq1CmcO3cOGhoasLGxUXQ4hUpdXR1TpkxBr1690KVLFyQkJGDZsmVo166dokP7qRw7dgxXr17FzJkzoa2tjc2bN0NHRweWlpYF2m7pfmcSEREREZUAtra2iIyMREJCAtLS0hAaGopWrVrJtbty5UqWj31fuXLl
p/uiSUREVBTWrVuHQ4cOwc/PT248pB+dkpISli1bhoiICLi4uMDLywvNmzcvltITpcmoUaNQo0YNDBgwAJ06dcKjR4+wdu1aqKurF2i7Cu9JS0RERERU2hkaGmL8+PEYMGAAUlNT4eDggDZt2sDb2xsODg7C6MGxsbHQ19eXW//Zs2do0aJFcYdNRET0wwkMDFR0CEXK2toau3btUnQYPzUdHR34+/sX+naZpCUiIiIiKgGcnZ3h7OwsM8/Pz09mOioqCtHR0XLrrl69uihDIyIiIqIi9nP16yYiIiIi+slpamoqOgQiIiIiKmTsSUtEREREVAgkUgmUREXbB0JZWTnLmrRERPTjkEqlig6BiIqRRCLJVTsmaYmIiIiICoGSSAmHPh1CYkaiokMpkOqq1WGnaafoMIiIfjqqqqoQiUQQi8WoUKECRCKRokMioiIklUqRmpoKsVgMJSUlqKmp5dieSVoiIiIiokKSmJEIcYZY0WEUiK6SrqJDICL6KSkrK6NKlSp4/vw5njx5ouhwiKiYaGlpoVq1alBSyvmJKyZpiYiIiIiIiIiKgY6ODmrXro20tDRFh0JExUBZWRkqKiq56jnPJC0RERERERERUTFRVlaGsrKyosMgohKmaEc2ICIiIiIqYmFhYejYsSPatm2LgIAAmWVxcXHo1KmT8M/JyQkWFhZISkoS2nz8+BFOTk64cOFCcYdORERERASAPWmJiIiI6AcmFovh7++PoKAglClTBoMHD8bZs2fRvHlzAIChoSFCQkIAfBm8YdCgQRg8eDC0tLSEbfj6+uL9+/cKiZ+IiIiICGBPWiIiIiL6gYWHh6Np06bQ09ODqqoq3N3dcfDgwSzb7tu3D+np6ejRo4cw7+DBg9DW1oapqWlxhUxEREREJIdJWiIiIiL6YcXHx8PAwECYNjAwQFxcnFw7iUSCZcuWYdy4ccK8ly9fYtOmTZgwYUKxxEpERERElB0maYmIiIjohyWRSOTmZTV6bkREBAwMDGBhYSGs5+3tjalTp0JDQ6PI4yQiIiIiygmTtERERET0wzIyMoJYLBam4+PjYWRkJNfu2LFjcHFxEaYfPXqER48ewdvbG506dcKNGzfg4+ODiIiIYombiIiIiOhrTNISERER0Q/L1tYWkZGRSEhIQFpaGkJDQ9GqVSu5dlFRUbCxsRGma9WqhdOnTyMkJAQhISGoV68eZs+eDTs7u2KMnoiIiIjoCyZpiYiIiOiHZWhoiPHjx2PAgAFwcXGBqakp2rRpA29vbxw/flxoFxsbi0qVKikwUiIiIiKi7KkoOgAiIiIiooJwdnaGs7OzzDw/Pz+Z6ejo6By3ERgYWNhhERERERHlGpO0REQlXFhYGAICApCWlgY3Nzd4eXkJy+Li4jB48GB8/vwZGhoaSEpKQlxcHC5cuAAtLS0AXwbLWbVqFTZt2qSoQyAiIiIiIiKiHLDcARFRCSYWi+Hv74/AwEAcOHAAly9fxtmzZ4XlhoaG2LNnD+bOnYvg4GAYGxvDx8cHWlpayMjIwLp16zBmzJgsRz8nIiopJFKpokMgIiIiIlIo9qQlIirBwsPD0bRpU+jp6QEA3N3dcfDgQTRv3lyu7f79+5Geno4ePXoAAO7fv4/Hjx/D19eXj/ESUYmmJBIh9MkHvP6cruhQ8u3XsmpoWUlb0WFQKfK9J22GDBkiTH/69IlP2hAREZVwTNISEZVg8fHxMDAwEKYNDAwQFxcn104ikWD58uX43//+J8yrU6cOZs+ejQsXLhRLrEREBfH6czrikjMUHUa+6av/uLHTjyfzSZugoCCUKVMGgwcPxtmzZ4UfcQ0NDRESEgIAkEqlGDRoEAYPHiw8abNx40asXr0aJiYmijwMIiIi+grLHRARlWBZlSkQiURy827cuAEDAwNYWFgUR1hERESkQF8/aaOqqio8aZOVffv2ZfukDREREZUcTNISEZVgRkZGEIvFwnR8fDyMjIzk2l2+fBkdO3Ys
ztCIiIhIQfLypM2yZcswbtw4YV7mkzblypUrlliJiIgod5ikJSIqwWxtbREZGYmEhASkpaUhNDQUrVq1kmt39+5d2NjYFH+AREREVOxy+6RNREQEn7QhIiL6QTBJS0RUghkaGmL8+PEYMGAAXFxcYGpqijZt2sDb2xvHjx8X2sXHx6NixYoKjJSIiIiKS26ftDl27BhcXFyKMzQiIiLKJw4cRkRUwjk7O8PZ2Vlmnp+fn8z0hg0boK6unuX6TZo0QZMmTYosPiIiIipetra2WLJkCRISElCuXDmEhoaiV69ecu2ioqLQr18/BURIREREecWetEREPwFNTU1Fh5ArYWFh6NixI9q2bYuAgAC55fHx8RgyZAg6deqEnj174vnz5wCAJ0+e4LfffoOrqys8PT3x+PHj4g6diIioxMjtkzaxsbGoVKmSAiMlIiKi3GJPWiKiIiaRSqAkKrrfxJSVlWFmZlZk2/9aQY5FLBbD398fQUFBKFOmDAYPHoyzZ8+iefPmQpsJEyagXbt26NWrF7Zv3w5/f38sWbIEkydPhoeHB7p164bo6GiMHj0aISEhhXVYREREP5zcPGkTHR2d7fp80oaIiKhkYZKWiKiIKYmUcOjTISRmJCo6lALRU9ZDe+32+V4/PDwcTZs2hZ6eHgDA3d0dBw8eFJK0iYmJuHPnDjZs2AAA6NKlC2xtbQEAt2/fRseOHQEAlpaWiI+PR2xsLKpWrVqQQyIiIiIiIiIqEZikJSIqBokZiRBniL/f8CcWHx8PAwMDYdrAwABxcXHCdOYjmfPmzcP58+dRsWJFTJs2DQBgZmaG/fv3o3v37jh//jzevn0LsVjMJC0REf2wJFIplEQiRYdBREREJQSTtEREVCwkEoncPNFXX07T09Nx8+ZNDB8+HJMnT8bu3bsxadIkBAYGYt68efD19cWWLVvQokUL1KlTB6qqqsUZPhERUaFSEokQ+uQDXn9OV3QoBfJrWTW0rKSt6DCIiIh+eEzSEhFRsTAyMsLFixeF6fj4eBgZGQnTFSpUgJaWFpycnAAALi4umD17NoAvCdxly5ZBTU0NEokEu3btQpUqVYr3AIiIiArZ68/piEvOUHQYBaKv/mPHT0REVFIU3Ug2REREX7G1tUVkZCQSEhKQlpaG0NBQtGrVSlherVo1VKxYESdOnAAAnD59WhgQbfHixThy5AgAYPfu3ahXrx50dXWL/RiIiIiIiIiIigKTtEREVCwMDQ0xfvx4DBgwAC4uLjA1NUWbNm3g7e2N48ePAwACAgKwYcMGuLi4YMOGDZgzZw4AYMKECdiyZQs6duyIw4cPY+7cuYo8FCIiIiIiIqJCxXIHRERUbJydneHs7Cwzz8/PT/j7119/RWBgoNx6VatWxY4dO4o8PiIiIiIiIiJFYE9aIiIiIiIiIiIiIgVikpaIiHJFS6QFqUSi6DAKzc90LERERERERPRjY7kDIiLKFXWROkRKSkgP3gqpOE7R4RSIqIIhVDz6KDoMIiIiIiIiIgBM0hIRUR5JxXHAqxeKDqNApIoOgIiIiIiIiOgrLHdAREREREREREREpEBM0hLRTyksLAwdO3ZE27ZtERAQILc8Pj4eQ4YMQadOndCzZ088f/4cAPDx40eMHTsWnTp1gru7O27evFncoRMRERERERFRKcMkLRH9dMRiMfz9/REYGIgDBw7g8uXLOHv2rEybCRMmoHXr1ggJCUGnTp3g7+8PAJg7dy4qVqyIkJAQ/PXXX5g2bZoiDoGIiIiIiIiIShHWpCWin054eDiaNm0KPT09AIC7uzsOHjyI5s2bAwASExNx584dbNiwAQDQpUsX2NraQiqV4siRIzh+/DgAoEWLFjAyMlLMQRARERERERFRqcGetET004mPj4eBgYEwbWBggLi4OGE6NjYWlSpVwrx58+Dm5oaRI0dCVVUVr1+/hpqaGrZt2wZ3d3d4enpCIpEo4hCIiIiIiIiIqBRhkpaIfjpZJVZF
IpHwd3p6Om7evInGjRsjNDQUTk5OmDRpEjIyMpCQkAAtLS3s3bsXw4YNw4gRI4ozdCIiIiIiIiIqhZikJaKfjpGREcRisTAdHx8vU7agQoUK0NLSgpOTEwDAxcUF169fh66uLlRUVODi4gIAaNasGZKSkvD69eviPQAiIiIiIiIiKlWYpCWin46trS0iIyORkJCAtLQ0hIaGolWrVsLyatWqoWLFijhx4gQA4PTp0zAzM4Oamhrs7Oxw4MABAMD169ehqakJXV1dRRwGEREREREREZUSHDiMiH46hoaGGD9+PAYMGIDU1FQ4ODigTZs28Pb2hoODAxwdHREQEIDp06dj0aJF0NbWxrx58wAAfn5+mDZtGnbu3AllZWX873//g5ISf88iIiIiIiIioqLDJC0R/ZScnZ3h7OwsM8/Pz0/4+9dff0VgYKDcegYGBli5cmWRx0dERERERERElIndw4iIiIiIiIiIiIgUiElaIiqRJFKpokMgIiIiIiIiIioWLHdARCWSkkiE0Ccf8PpzuqJDKZBfy6qhZSVtRYdBRERERERERCUYk7REVGK9/pyOuOQMRYdRIPrqP3b8RERERERERFT0WO6AiIiIiIiIiIiISIGYpCUiIiIiIiIiIiJSICZpiYiIiIiIiIiIiBSISVoiIiIiIiIiIiIiBWKSloiIiIiIiIiIiEiBmKQlIiIiIiIiIiIiUiAmaYmIiIiIiIiIiIgUiElaIiIiIiIiIiIiIgVikpaIiIiIiIiIiIhIgZikJSIiIiIiIiIiIlIgJmmJiIiIiIiIiIiIFIhJWiIiIiIiIiIiIiIFYpKWiIiIiIiIiIiISIGYpCUiIiIiIiIiIiJSICZpiYiIiIiIiIiIiBRI4UnalJQUTJkyBdbW1rC3t8f69euzbXv37l306tUL9evXh6urKyIjI4sxUiIiIiIiIiIiIqLCp/Akrb+/P27cuIFNmzZh+vTpCAgIwKFDh+TaffjwAQMHDkStWrWwb98+tGnTBl5eXnj9+rUCoiYiIiIiIiIiIiIqHApN0iYlJWH37t3w9vaGubk52rRpg0GDBmHr1q1ybffs2QMtLS3MmDEDxsbGGDVqFIyNjXHjxg0FRE5ERERERERERERUOFQUufM7d+4gPT0dVlZWwrxGjRph5cqVkEgkUFL6vxzyxYsX4ejoCGVlZWFeUFBQscZLREREREREREREVNgU2pNWLBZDV1cXampqwrxffvkFKSkpePv2rUzb2NhY6OnpYerUqWjWrBm6d++OqKioYo6YiIiIiIiIiIiIqHAptCdtcnKyTIIWgDCdmpoqMz8pKQmrV69G3759sWbNGhw4cAC///47wsLCULFixTztNyMjo2CBU4mVeW55jn98X/eaJyoqvFf82HjP/3nwnk9FrSTeJ3jdU1Eridd9Tn60eImICptCk7Tq6upyydjMaQ0NDZn5ysrKqFu3LkaNGgUAMDMzQ3h4OEJCQjBs2LA87TcmJqYAUdOPgOf4x6apqQkzMzNFh0GlwN27d5GcnKzoMKiAeM//sfGeT8WhpN3ved1TcShp1z0REeVMoUlaQ0NDvHnzBunp6VBR+RKKWCyGhoYGypYtK9O2QoUK+PXXX2XmVa9eHf/991+e92thYcFfrn9SGRkZiImJ4TkmolwxNTVVdAhUALznE1Fu8X5PpdGPdt1nfq4TEZVWCk3S1q1bFyoqKoiOjoa1tTUAICoqChYWFjKDhgGApaUlLl26JDPv0aNHcHFxyfN+lZWV+WXuJ8dzTES5wfvEz4H3fCL6Ht4jqDTidU9E9GNR6MBhmpqacHd3x4wZM3D9+nUcO3YM69evR9++fQF86VX7+fNnAEDPnj1x9+5dLF26FE+fPsU///yD2NhYdOrUSZGHQERERERERERERFQgCk3SAsDkyZNhbm6Ofv36YebMmRg5ciTatm0LALC3t8fBgwcBAJUrV8batWtx8uRJuLi44OTJk1i9ejUMDQ0VGT4R
ERERERERERFRgSi03AHwpTft/PnzMX/+fLlld+/elZlu1KgRgoODiys0IiIiIiIiIiIioiKn8J60RERERERERERERKUZk7RERERERERERERECqTwcgdERERE3woLC0NAQADS0tLg5uYGLy8vmeVnz57FuHHjUK5cOWhoaMDc3Bxz585FcnIy7OzsUK1aNaFtcHAwR7gmIiIiIqISjUlaIiIiKlHEYjH8/f0RFBSEMmXKYPDgwTh79iyaN28utLl+/TqGDh2KBg0awNLSUkjC3rx5E02bNsWKFSsUFT4REREREVGesdwBERERlSjh4eFo2rQp9PT0oKqqCnd3dxw8eFCmTUxMDE6fPo3JkydjxIgRePXqlTA/Li4O3bp1Q8+ePXH58mVFHAIREREREVGesCctERERlSjx8fEwMDAQpg0MDBAXFyfTply5cujZsyfKlCmDO3fuYOzYsdi6dStEIhHat2+PwYMH49atWxg6dCj27dsHXV3d4j4MIiIiIiKiXGNPWiIiIipRJBKJ3DyRSCQzPX/+fKH8Qc+ePXH37l18+PAB/fv3x5AhQyASiWBubg4LCwtcuXKlWOImIiIiIiLKLyZpiYiIqEQxMjKCWCwWpuPj42FkZCRMp6SkYNWqVTLrSKVSqKioYPfu3fjvv//k5hMREREREZVkTNISERFRiWJra4vIyEgkJCQgLS0NoaGhaNWqlbBcXV0dwcHBOHnyJAAgODgYlpaW0NTURExMDDZv3gwAePDgAW7duoVGjRop4jCIiIiIiIhyjV1LiIiIqEQxNDTE+PHjMWDAAKSmpsLBwQFt2rSBt7c3HBwc4OjoiEWLFmH69OlITExElSpVMH/+fADAmDFjMHnyZHTs2BFKSkrw9/eHjo6Ogo+IiIiIiIgoZ0zSEhERUYnj7OwMZ2dnmXl+fn7C3+bm5ti5cyeio6NhaWkJZWVlAICuri5WrlxZrLF+T1hYGAICApCWlgY3Nzd4eXnJLD979izGjRsnlHQwMzPD3LlzheUfP36Eu7s7/Pz80KRJk2KNnYiIiIiIigeTtERERPTD0tTUVHQIORKLxfD390dQUBDKlCmDwYMH4+zZs8KgZwBw/fp1/PHHH+jfv3+W2/D19cX79++LKWIiIiIiIlKEfNWkvXjxIqKjowEAL1++xLBhw+Dq6oply5YVZmxERET0g5JIJUW+D2VlZZiZmQm9aItKQY4lPDwcTZs2hZ6eHlRVVeHu7o6DBw/KtImJicGpU6fg7u6OP/74A69evRKWHTx4ENra2jA1Nc13DEREREREVPLluSft3r17MXnyZAwcOBCWlpaYNm0aoqKi0KxZM6xcuRKqqqoYMmRIUcRKREREPwglkRIOfTqExIxERYdSIHrKemiv3T7f68fHx8PAwECYNjAwQFxcnEybcuXKoXfv3mjRogW2bduGsWPHYuvWrXj58iU2bdqETZs2YfDgwfmOgYiIiIiISr48J2k3btyIzp07Y/z48RCLxYiIiMDYsWPx+++/Y/369di5cyeTtERERITEjESIM8SKDkOhJBL5XrgikUhmOnPQMwDo3bs3Fi1ahA8fPsDb2xtTp06FhoZGkcdJRERERESKledyB48ePYK7uzsA4PTp05BKpXB0dAQAWFhY4L///ivUAImIiIh+VEZGRhCL/y9RHR8fLwwQBgApKSlYtWqVzDpSqRRxcXF49OgRvL290alTJ9y4cQM+Pj6IiIgottiJiIiIiKj45LknbdmyZfHx40cAX0YjrlSpEqpXrw4AePbsGXR1dQs1QCIiIqIfla2tLZYsWYKEhASUK1cOoaGh6NWrl7BcXV0dwcHBqF27NhwcHBAUFARLS0vUqlULp0+fFtp5enrCy8sLTZo0UcRhEBERERFREctzkrZJkyYICAjAgwcPcPz4cQwYMAAAcPjwYfzzzz+wt7cv9CCJiIjo+8LCwhAQEIC0tDS4ubnBy8tLZvnZs2cxbtw4oSenmZkZ5s6d
izdv3sDb2xvPnz+HVCrFsGHD0LFjR0Ucwk/H0NAQ48ePx4ABA5CamgoHBwe0adMG3t7ecHBwgKOjIxYtWoQZM2bgf//7H/T19WXKHxARERERUemQ5yStt7c3xo8fj4CAANja2mLo0KEAgLlz56JSpUoYO3ZsoQdJREREOROLxfD390dQUBDKlCmDwYMH4+zZs2jevLnQ5vr16/jjjz/Qv39/mXWXLFkCMzMzLF++HGKxGJ07d0aTJk3wyy+/FPNR/JycnZ3h7OwsM8/Pz0/429zcHLt3785xG4GBgUUSGxERERERlQx5TtLq6elh3bp1cvO3bduGSpUqFUpQRERElDfh4eFo2rQp9PT0AADu7u44ePCgTJI2JiYGnz9/xt69e1GxYkVMnz4dRkZGaNGiBerVqwcAqFChAsqXL4+EhAQmaYmIiIiIiIpJngcOy/Tw4UNs3rwZCxcuRFxcHF6+fCnUqiUiIqLiFR8fDwMDA2HawMAAcXFxMm3KlSuHgQMHYu/evWjevLnw9Evr1q1RoUIFAMCBAweQmpqKWrVqFV/wJZiWSAtSiUTRYRSan+lYiIiIiIh+JnnuSSuRSDBt2jQEBQVBKpVCJBLB2dkZy5cvx9OnT7F161aZUYuJiIio6EmySL6JRCKZ6a9rnfbu3RuLFi3Chw8fUKZMGQBASEgIFixYgLVr10JFJc//RfgpqYvUIVJSQnrwVkjFcd9foQQTVTCEikcfRYdBRERERERZyPM3sOXLl2Pfvn2YPXs2WrVqhWbNmgEAxo8fjxEjRmDx4sUc8IKIiKiYGRkZ4eLFi8J0fHy8zI+mKSkp2Lhxo1BLHgCkUqmQjF29ejV27NiBTZs2oWbNmsUX+A9CKo4DXr1QdBgFIlV0AERERERElK08lzsICgrCqFGj0KVLF5QvX16YX7duXYwaNQrh4eGFGR8RERHlgq2tLSIjI5GQkIC0tDSEhoaiVatWwnJ1dXUEBwfjxIkTAL58nltaWkJTUxPBwcHYs2cPdu7cyQQtERERERGRAuS5J21CQgLq1q2b5TJDQ0O8f/++wEERERFR3hgaGmL8+PEYMGAAUlNT4eDggDZt2sDb2xsODg5wdHTEokWLMGPGDPzvf/+Dvr6+8OTL4sWLIRKJMGjQIGF7s2bNQoMGDRR1OERERERERKVKnpO0xsbGOH36NOzs7OSWXbx4EcbGxoUSGBEREeWNs7MznJ2dZeb5+fkJf5ubm2P37t1y6509e7bIYyMiIiIiIqLs5TlJ269fP0ybNg1paWlo3bo1RCIRnj59igsXLmD9+vWYNGlSUcRJRERERERERERE9FPKc5K2W7duSExMxIoVK7B9+3ZIpVL89ddfUFVVxaBBg9CrV6+iiJOIiKhUkEilUBKJFB0GERERERERFaM8J2kBYOjQoejTpw+uXLmCd+/eoWzZsmjQoIHMQGJERESUd0oiEUKffMDrz+mKDiXffi2rhpaVtBUdBhERERER0Q8jX0laANDR0UGLFi0KMxYiIiIC8PpzOuKSMxQdRr7pq/+4sRMRERERESlCnpO0ffv2/W6bzZs35ysYIiIiIiIiIiIiotImz0laqVQqNy8pKQkPHz6ElpYW2rZtWyiBEREREREREREREZUGeU7SBgYGZjn/3bt3GDx4MH799dcCB0VERERERERERERUWigV1obKlSuHIUOGYOPGjYW1SSIiIiIiIiIiIqKfXqElaTO9fv26sDdJRERERERERERE9NPKc7mDS5cuyc3LyMjAq1evsHz5cpibmxdKYERERERERERERESlQZ6TtJ6enhCJRHLzpVIpKlasiClTphRKYERERERERERERESlQZ6TtJs3b5abJxKJoKOjA1NTUygpFXoFBSIiIiIiIiIiIqKfVp6TtDY2NkURBxEREREREREREVGplKsk7eTJk3O9QZFIhDlz5uQ7ICIiIiIiIiIiIqLSJFdJ2gsXLuR6g1nVqyUiIiIiIiIiIiKi
rOUqSXvixImijoOIiIiIiIiIiIioVCrUUb6SkpJw5syZwtwkERERERERERER0U8tzwOHvXjxAjNmzMDFixeRmpqaZZvbt28XODAiIiIiIiIiIiKi0iDPSdq5c+fiypUr6NatG65cuQJNTU1YWloiPDwc9+7dw9KlS4siTiIAQFhYGAICApCWlgY3Nzd4eXll2e7WrVvo1asXbty4AQBITk6GnZ0dqlWrJrQJDg6GsrJyscRNRERERERERESUnTyXO7h06RLGjBkDHx8feHh4QF1dHePHj0dQUBAaN26M48ePF0WcRBCLxfD390dgYCAOHDiAy5cv4+zZs3LtUlJS4Ofnh7S0NGHezZs30bRpU4SEhAj/mKAlIiIiIiIiIqKSIM9J2k+fPsHU1BQA8Ouvv+LWrVsAAGVlZfTu3RuRkZGFGyHR/xceHo6mTZtCT08PqqqqcHd3x8GDB+XabdmyBX379pWZFxMTg7i4OHTr1g09e/bE5cuXiytsIiIiIiIiIiKiHOU5SWtgYICEhAQAgLGxMd69ewexWAwAKF++PF6/fl24ERL9f/Hx8TAwMBCmDQwMEBcXJ9PmxIkTSE1NRbt27WTmi0QitG/fHrt27cLUqVMxevRovHnzpljiJiIiIiIiIiIiykmek7QtW7bE33//jatXr6Jy5cowMjLC+vXr8fHjRwQFBcHQ0LAo4iSCRCKRmycSiYS/xWIxVq1ahX79+sm169+/P4YMGQKRSARzc3NYWFjgypUrRRovERERERERERFRbuQqSevp6YnQ0FCkpKRg1KhRKFu2LP755x8AwJgxY7Bp0yY0btwY+/btw4ABA4o0YCq9jIyMhF7bwJeetUZGRsL0qVOn8PbtW/j6+qJz584AgE6dOuH9+/fYvXs3/vvvP6GtVCqFikqex80jIiIiIiIiIiIqdLnKUr19+xYTJkyAr68vXFxcMH36dKHHrJubGypVqoTo6GjUr18fNjY2RRowlV62trZYsmQJEhISUK5cOYSGhqJXr17C8m7dusHDwwPR0dGwtLSEmZkZQkJCAHypSfvo0SNMnDgRDx48wK1bt9CoUSNFHQoREREREREREZEgVz1p9+3bh6CgIHTq1AmHDx9Gt27dMHjwYGzduhXv37+HtbU1Bg0axAQtFSlDQ0OMHz8eAwYMgIuLC0xNTdGmTRt4e3vj+PHjOa47ZswYPH78GB07dsSYMWPg7+8PHR2dYoo8a2FhYejYsSPatm2LgICAbNvdunUL9erVk5v/8eNHODk54cKFC0UZJhERERERERERFbFcP+9tbm4Oc3NzTJo0CadPn8bevXsxb948+Pv7o02bNujatSuaNm1alLESwdnZGc7OzjLz/Pz8smx79+5d4W9dXV2sXLmySGPLC7FYDH9/fwQFBaFMmTIYPHgwzp49i+bNm8u0S05OxqxZs5CWlia3DV9fX7x//764QiYiIiIiIiIioiKS54HDVFRU4OjoiKVLl+LcuXOYMGECYmNj0b9/f7Rp06ZEJcKodNLU1FR0CN8VHh6Opk2bQk9PD6qqqnB3d8fBgwfl2s2bNw/9+/eXm3/w4EFoa2vD1NS0GKIlIiIiIiIiIqKilOck7dfKlSuHPn36YOfOnQgMDISysrIwoBiVHPl9rP7ly5fo27cv3Nzc0K1bN9y+fbvAsUikkgJvIyfKysowMzODsrJyke4HKNixxMfHw8DAQJg2MDBAXFycTJvjx4/j8+fPaN++vcz8ly9fYtOmTZgwYUK+909ERERERERERCVHgYa3F4vFOHDgAPbv34+bN2+iYsWKGD58eGHFRoWgII/Vz5s3D66urujWrRvOnDmDmTNnYseOHQWKR0mkhEOfDiExI7FA21E0PWU9tNdu//2G2ZBI5BO8IpFI+FssFmPFihXYuHGj3Hre3t6YOnUqNDQ08r1/IiIiIiIiIiIqOfKcpP306ROOHDmCffv24cKFC1BWVoaTkxPGjBkDOzs7mUQTKd7X
j9UDEB6r/zZJm/lY/dWrV4V5f//9t/D38+fPUbZs2UKJKTEjEeIMcaFs60dlZGSEixcvCtPx8fEwMjISpk+dOoW3b9+iT58+wrxOnTph+vTpePToEby9vQEAz549g4+PD2bOnAk7O7v/196dx8d47v8ff89kI4ktUSLW1qkEsQSnegRRa1FtaqtWj1KK1laN2Cs4FUIt0SghgkosDa19qbWKltaRKsGxn5RWLVEiZJmZ3x9+mW9S2tMFt0lezz76eDQz933nM31cuee63/d1X9fD+wAAAAAAAAC4b35XSJudna3PP/9ca9eu1c6dO3X79m1VrVpVI0aMULt27VSsWLEHXSf+pL/yWL3ZfGc2jJYtW+rChQuaPXv2gy+4gPjHP/6hmTNn6vLlyypWrJjWrFmjl19+2f5+p06d1KlTJ/vPfn5+Wr16tSTp888/t7/+z3/+U/3791f9+vUfXvEAAAAAAAC4r35XSBsUFKTr16+raNGi6tChgzp06KBq1ao96NpwH/zZx+pz++yzz3TkyBH17NlTmzZtUvHixR9ApQVL6dKlFRYWph49eigzM1NNmzZVixYtNGrUKDVt2lTNmjUzukQAAAAAAAA8JL8rpK1evbo6dOigFi1ayNXV9UHXhPvozz5Wv3jxYu3fv18NGzZUoUKFVL16dZUtW1YpKSmEtPdJ69at1bp16zyvTZgw4Z7bHj9+/J6vL168+L7XBQAAAAAAgIfrd4W0cXFxD7oOPCB/5bH6xMREXbx4UV27dtV//vMfXblyRZUrV37onwEAAAAAAADIz/7wwmFwLH/lsfrw8HCNHDlSH3/8sdzc3DRt2jS5u7s/xOofXe4md9msVpn+/7y9jiy/fA4AAAAAAABHRUhbAPzZx+p9fX1/c67agszN5CaT2azsTxJku3Txf+/wiDI9VlrO7bv+7w0BAAAAAADwwBDSAn+B7dJF6cfzRpfxp9mMLgAAAAAAAADiGWcHYLURpQEAAAAAAAD5FSNpHYDZZNKaszd05Xa20aX8JU8UdVWwr4fRZQAAAAAAAACPFEJaB3HldrYu3rIYXcZf4u3m2PUDAAAAAAAADwLTHQAAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCDDQ9qMjAyNHDlS9erVU8OGDRUXF/c/9/n+++8VGBioffv2PYQKAQAAAAAAAODBMXzhsMmTJ+vw4cNatGiRLly4oGHDhsnX11fPPvvsr+4zduxYpaenP8QqAQAAAAAAAODBMDSkTU9PV2JioubNm6fq1aurevXqOnHihBISEn41pF2zZo1u3rz5kCsFAAAAAAAAgAfD0OkOjh07puzsbAUGBtpfq1u3rr799ltZrda7tk9NTdWUKVM0fvz4h1kmAAAAAAAAADwwhoa0ly5dUokSJeTq6mp/rWTJksrIyNC1a9fu2n7SpEl68cUX9eSTTz7EKgEAAAAAAADgwTF0uoNbt27lCWgl2X/OzMzM8/revXt14MABrVu37i//XovF8peP8TA5OTkZXQLyuUfxb4J2j4eBto+CiHaPgoh2j4LoUWz3v8XR6gWA+83QkNbNze2uMDbn50KFCtlfu337tsaMGaPw8PA8r/9Z33333V8+xsNSuHBhVatWzegykM8dP35ct27dMroMO9o9HhbaPgoi2j0KIto9CqJHrd0DAH6boSFt6dKllZqaquzsbDk73ynl0qVLKlSokIoWLWrf7tChQ0pJSdHAgQPz7P/GG28oJCTkD89RW6NGDe5cA7n4+fkZXQJgCNo+CiLaPQoi2j0KIkdr9xaLxaEGVAHA/WZoSFu1alU5OzsrKSlJ9erVkyQdOHBANWrUkNn8f9Pl1qxZU5999lmefVu2bKn33ntPQUFBf/j3Ojk5EdICufD3gIKKto+CiHaPgoh2j4KIdg8AjsXQkLZw4cIKCQnR2LFjFRERoZ9++klxcXGaOHGipDujaosUKaJChQqpYsWKd+1funRpeXt7P+yy
AQAAAAAAAOC+Mf/vTR6sESNGqHr16nrttdc0btw4DRgwQC1btpQkNWzYUBs2bDC4QgAAAAAAAAB4cAwdSSvdGU0bGRmpyMjIu947fvz4r+73W+8BAAAAAAAAgKMwfCQtAAAAAAAAABRkhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGMjykzcjI0MiRI1WvXj01bNhQcXFxv7rtzp079cILLygwMFDt2rXTtm3bHmKlAAAAAAAAAHD/GR7STp48WYcPH9aiRYsUHh6u6Ohobdq06a7tjh07pv79+6tDhw5atWqVunTpokGDBunYsWMGVA0AAAAAAAAA94ezkb88PT1diYmJmjdvnqpXr67q1avrxIkTSkhI0LPPPptn23Xr1unpp59Wt27dJEkVK1bU9u3btXHjRvn7+xtRPgAAAAAAAAD8ZYaGtMeOHVN2drYCAwPtr9WtW1dz5syR1WqV2fx/A31ffPFFZWVl3XWMGzduPJRaAQAAAAAAAOBBMDSkvXTpkkqUKCFXV1f7ayVLllRGRoauXbsmLy8v++uVK1fOs++JEyf05ZdfqkuXLn/491oslj9ftAGcnJyMLgH53KP4N0G7x8NA20dBRLtHQUS7R0H0KLb73+Jo9QLA/WZoSHvr1q08Aa0k+8+ZmZm/ut/Vq1c1YMAA1alTR82aNfvDv/e77777w/sYpXDhwqpWrZrRZSCfO378uG7dumV0GXa0ezwstH0URLR7FES0exREj1q7BwD8NkNDWjc3t7vC2JyfCxUqdM99Ll++rB49eshms2nmzJl5pkT4vWrUqMGdayAXPz8/o0sADEHbR0FEu0dBRLtHQeRo7d5isTjUgCoAuN8MDWlLly6t1NRUZWdny9n5TimXLl1SoUKFVLRo0bu2v3jxon3hsI8++ijPdAh/hJOTEyEtkAt/DyioaPsoiGj3KIho9yiIaPcA4Fj++DDU+6hq1apydnZWUlKS/bUDBw6oRo0ad42QTU9PV69evWQ2mxUfH6/SpUs/5GoBAAAAAAAA4P4z
NKQtXLiwQkJCNHbsWB06dEhbt25VXFycfbTspUuXdPv2bUlSTEyM/vvf/yoyMtL+3qVLl3Tjxg3D6gcAAAAAAACAv8rQ6Q4kacSIERo7dqxee+01eXp6asCAAWrZsqUkqWHDhpo4caLat2+vzZs36/bt2+rUqVOe/V988UVNmjTJiNIBAAAAAAAA4C8zPKQtXLiwIiMj7SNkczt+/Lj9vzdt2vQwywIAAAAAAACAh8LQ6Q4AAAAAAAAAoKAjpAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIMND2oyMDI0cOVL16tVTw4YNFRcX96vbJicnq1OnTqpVq5Y6dOigw4cPP8RKAQAAAAAAAOD+MzyknTx5sg4fPqxFixYpPDxc0dHR2rRp013bpaenq3fv3qpXr54++eQTBQYGqk+fPkpPTzegagAAAAAAAAC4PwwNadPT05WYmKhRo0apevXqatGihXr16qWEhIS7tt2wYYPc3Nw0dOhQVa5cWaNGjZKHh8c9A10AAAAAAAAAcBSGhrTHjh1Tdna2AgMD7a/VrVtX3377raxWa55tv/32W9WtW1cmk0mSZDKZVKdOHSUlJT3MkgEAAAAAAADgvjI0pL106ZJKlCghV1dX+2slS5ZURkaGrl27dte2pUqVyvOat7e3fvzxx4dRKgAAAAAAAAA8EM5G/vJbt27lCWgl2X/OzMz8Xdv+crvfYrPZ7Md2cnL6MyUbwsnJSSVdTTLbTEaX8pcUc5EsFou85S2z8dMh/yXFVEwWi0XWUmVkMztOW/olU8lSslgsslgsRpdyF9r9oye/tHuJtv+g0e4fTbT7B4t2/2ii3T94+aXt0+6Nl1NvznU7ABQ0hoa0bm5ud4WsOT8XKlTod237y+1+S84UCsnJyX+mXEP5/v9/HVq6lHRRKvn//3F0SUqSKlS5868je4SnDKHdP3ryTbuXaPsPEu3+0UW7f3Bo948u2v2DlY/aPu3+0fDLqQ8BoKAwNKQtXbq0UlNTlZ2dLWfnO6VcunRJhQoVUtGiRe/a9vLly3leu3z5
8l1TIPwWZ2dn1ahRQ2az2T63LQAAAAAAMJbNZpPVarVnAwBQ0Bh69qtataqcnZ2VlJSkevXqSZIOHDhgD1Jzq1WrlubNmyebzSaTySSbzaZ///vf6tu37+/+fWaz+a4pEwAAAAAAAADASIZOGlS4cGGFhIRo7NixOnTokLZu3aq4uDh169ZN0p1Rtbdv35YkPfvss7p+/bomTJigkydPasKECbp165Zat25t5EcAAAAAAAAAgL/EZDN4Vu5bt25p7Nix+uyzz+Tp6amePXuqe/fukiQ/Pz9NnDhR7du3lyQdOnRI4eHhOnXqlPz8/DRu3DhVq1bNwOoBAAAAAAAA4K8xPKQFAAAAAAAAgILM0OkOAAAAAAAAAKCgI6QFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQAAAAAAAMBAhLQAAAAAAAAAYCBCWgB4BM2bN09JSUlGlwEAMIDVajW6BOChsNlsv/kzAAAFCSEtADxiLl++rIULF2r+/Pk6cuSI0eUAD82XX36pa9euGV0GYJgtW7ZIksxmuugoGEwmkyTZ+zs5PwMAUBDRA8Qj7ddGknCXHfnV5s2bVbJkSS1btkz//e9/NXv2bIJaFAj79+/XqFGjtGXLFl2/ft3ocoCHLjY2VqNHj9bhw4eNLgV4qPbu3auJEyfaz/308wEABRUhLR5ZNpvNPpJk8+bNWrFihb799ltlZ2fLZDLRgUO+89VXX2nQoEGKiYlR+fLlFR0drXPnzhHUokB48skndf36dS1cuFCbN2/WjRs3jC4JeGg2btyo48ePKyIiQgEBAUaXAzxUTz75pE6fPq3ly5dLYjQtAKDgIqTFI8lqtdo7aBERERo9erTef/99jRs3TnPnzlVmZiZBLfKdp59+WhEREZoxY4bmzJmj8uXL68MPPySoRb5269YtSVJ2drY8PT312GOPac6cOdq0aRNBLQqElJQUxcXFae3atdq7d6+ysrIkMZoQ+dO92vVjjz2mkSNHaufOnTp16pQBVQEA8GggpMUjKWcE7ZkzZ3TmzBktXrxYa9asUaNGjbRv3z7FxsYS1CJfyZnao3379nrvvfcIalEghIWFae/evcrOztaJEydUvHhxLVy4UG3atNHcuXMJapHvTZ8+XTNmzNC8efPUsmVLffPNN9q9e7f9qSEgv8lp12vXrtXChQvtr9eoUUNms1nHjx+XxOJ5AICCiZAWj6xNmzbptdde061bt+Tj46NSpUqpT58+ql27tr766iuCWuQbVqs1zyIxHTp00Pjx4+8Z1M6ZM4egFvlCZGSkdu3apWbNmsnZ2Vn+/v569tlnZbPZFBoaqpYtWxLUIl/bvn279u3bpxYtWqh48eJ6//33VbJkScXExGjfvn3Kzs6WxIha5A+5Q9fMzExt3rxZK1eu1HPPPadt27apVKlSat++vSIiInT16lUWzwMAFEh8++GRkXMRYrPZZLVaValSJT355JP2O+qS5O7urr59+6pOnTr6+uuvNWPGDEabwKHlDmi/+eYbbdu2TWlpaercufM9R9SmpKRo0qRJOn36tMGVA3+Nm5ubqlevrvT0dH3yySfy8vJSnz597OfzsLAwglrkW0eOHFF8fLxOnz6tv/3tb5IkV1dXzZ49Wx4eHoqOjtb+/fvp4yBfyN3XSUpK0rFjx9SnTx/Nnz9fNWvW1Pz58/Xqq6/K1dVVderU0aZNmyRxgwIAUPCYbHz74RGQu/OWmZkpJycnOTk56dy5c3rnnXeUkZGh1atXy8nJSdKdOQynTp0qSRo1ahQXMHBINpvN3nYnT56sFStWyMXFRZK0YMECValSRStWrNDo0aP19ttvq2/fvjpz5oxmz56tSZMmMcoEDu2zzz7T/PnzJUnffvut9uzZI29vb/uNupzz/ZQpU7R161Z17dpVHTt2lLu7u5FlA/dFzkjC6dOnq3LlyoqKirK37czMTPXr10///e9/NWXKFNWsWdPgaoH7IzIyUlu3btXNmzfVrFkz
9enTR+XKlVNSUpL27t2rRYsWyWq16m9/+5uWLl0qKW9fCQCA/I6QFobL3fmaN2+e9u3bp/T0dP3tb3/TW2+9pRs3bmjkyJG6ffu2Vq9ebQ+mMjIy5Orqap/ugA4cHFVcXJzi4uIUGRmpGjVqqHv37rp9+7ZmzJihKlWqaOXKlXr33Xf1+uuva8iQIfb9fjlNAuBo3nzzTe3atUtt27ZVWFiYHnvsMft7FovFHtSOGTNG169f1/Tp0znXw6GtWbNGt2/f1uOPP66///3v2rBhgxYtWqTHH39c4eHhKly4sKQ7Qe3UqVM1dOhQ+98B4Mi2bNmicePGad68eSpVqpSuXbumypUrS/q/8/2xY8e0b98+LViwQK+88op69+5tcNUAADxchLR4ZHz44YeKj49X9+7d5erqqiVLlsjLy0tDhgxRiRIl7OHUypUr8wRTBLRwZFlZWerXr58aNmyobt266eDBg3rnnXdUtGhR/fTTT1q0aJGqVKmi5cuX69NPP9XSpUtp73B4FotFJpNJU6ZMUbFixbRjxw7VqlVLL7/8sh5//HH7drlvROSc6znnw1FNnz5dixcvVvHixVW0aFGFhISoe/fuWr9+vZYsWaIKFSpozJgx9qA2R+4bFoCjSkxM1Lp16xQXF5enPe/evVuHDx9Wr1695OzsLJvNphUrVujrr7/Wv/71L/uADAAACgKGYMEQa9eutf+3zWbT1atXtWvXLo0dO1a9e/e2X7SYzWZNmTJFPj4+mjlzpi5duqQRI0bkORYdNziSX65WbDKZdPv2bRUqVEjnz5/XkiVL9PLLL2vVqlUqWbKkBg8erHXr1umll17SsmXLWCgP+YLZbJbZbNawYcPUt29fvfTSSzp48KCWLVumM2fO5Nku52+GgBaO7uzZs1q4cKEWLlyoNm3aaO3atYqLi1Pbtm31yiuv6Pvvv1doaKgyMzPz7EdAC0fzy76OJP388886duyYvT1nZWVJkr7++mvt3Lkzz7ne3d1d+/fv17Vr1zjnAwAKFEJaPHSff/655s+fn6czZrVa9dNPP9lHj2RkZMjFxUVxcXE6d+6cli5dqvLly2v58uWKiIgwsnzgT8s9KvDUqVP6z3/+I4vFojfeeEM1atTQ4cOHdfPmTTVs2FA2m02lS5fWpUuX9Nlnn+UJZrlggaPLacM53wPt27fXK6+8on//+99avnz5XUHtL/cDHMmqVas0a9Ysff/99ypcuLAqVKigkJAQNW/eXOvXr9eCBQvUtm1bvfDCCypVqpScnZ2NLhn403L3dVJSUnT27FlJ0ssvvyxfX1+Fhoba+/mSVL9+fUl31puQpGvXruncuXO6desWfwsAgAKHbz48dMHBwWrUqJHMZrOSkpJUu3ZtFS1aVM7Oztq2bZsaNWokNzc3ZWVlydXVVX5+frp9+7YkqWzZspJ49A+OKeeiZerUqdq0aZNsNpsCAgL03nvvydPTUx9//LGKFSumatWqSZJcXFz0wQcf6KmnnmIUIfIls9lsb9cvvviiTCaTli5dquvXr2vAgAEqU6aM0SUCf8n777+v5cuXq3Tp0jp58qR27typMmXKqFSpUurQoYNMJpM2btyomzdvqn///urYsaMk5hyHY7LZbPZ2O336dK1bt06SVK9ePUVGRurNN9/UggULNGjQII0aNUq3bt3SRx99JC8vLxUtWlSSVKRIEQUHB6tt27by9vY27LMAAGAEQloYwmw269ixY+rSpYt91fqhQ4dq1KhR8vLy0sCBA+Xi4iKbzaZbt27ZO245CGjhqNasWaNPP/1UM2bMULFixZSZmSlPT09JUtGiRbVp0yYtWrRIW7Zs0Y0bN1SvXj37aHMu2JEf5b4BERISovT0dCUnJ6t06dJGlwb8JT/88IOuX7+uuLg4ValSRZMnT9b27dvl4eFhHzXbvn173bx5Uz/88EOeG3Gc7+GIctrvzJkztXz5cg0ZMkRFihRRaGioihcvrhEjRqhYsWKKiopS
+/bt5evrq8KFC2vx4sX2vo6Tk5OqV69u8CcBAMAYLByGh+ZeIVNiYqLGjRund955R6+++qpWrVqlyMhIBQYGytfXV2fOnFFqaqpWrVrFI0/IF6ZOnaqUlBTNmDEjz+tr1qxRpUqV9Omnn+rUqVMqWrSopk+fLhcXF0aOo0DIHVCxSBgcmc1m0969e9WzZ0/5+Pjogw8+UI0aNWS1WjV+/Hj95z//0XPPPacXXnhBHh4eSk1NVfHixWnzcFi52+3FixfVp08fDR48WMHBwfrqq6/01ltvKTMzU88++6zef/99SdLhw4fl7u6uSpUqyWw2Kzs7m74+AKDA45sQD0XugHb37t364YcfVK1aNXXq1EnOzs4aOXKkzGazunfvroCAAM2fP19ZWVny9/fXsGHD5OzsTFCFfMHZ2VlXrlzRjRs3VKRIEfvrX375pVasWKGPPvpIaWlp9tG1XLTAEf3WyO9fe89kMtnbu8lkUmZmplxdXR90qcB9ZzKZFBQUpP79+ys6OlrHjx9XlSpV5ObmpjFjxuhf//qXNm7cqPT0dL366qsqUaKEJBHQwiHlbrc//vijMjIylJWVJS8vL50+fVrx8fEaOHCgateurS5dusjd3V3vvPOOqlevnmd+cvo6AAAQ0uIhybkgnzx5slavXi1vb2/VqVNHTz75pF588UXZbDaNGjVKmZmZ6t27t6ZOnZpnf4IqOJrcQVTu/65bt64SExO1fft2tWrVSoUKFZIkVa9eXTdu3JAke0Brs9lo93A4udv7hg0bdPHiRWVkZOjpp59W7dq1fzW8zd3eExMTJd1ZUIybc3A0OTeV+/fvr/T0dI0bN04eHh5q1qyZXF1d9e677yosLEzff/+93Nzc7PsR0MLR5A5oZ8yYoWPHjmnSpElq3LixChUqpN27d6to0aJq1qyZChUqpJIlS+rjjz+W1WrVe++9Zz8O03sAAHAHV/94YPbu3asGDRrYf966davWrFmjDz/8ULVq1dLVq1d15coVXb16Va1bt5a7u7uGDBmimzdvavDgwXmORVAFR5J74YyFCxfqxIkT+uGHHxQcHKyOHTuqW7duioiIUGZmpmrWrKly5cpp586d9tFUObhghyPKaftTpkzRqlWr1KRJE50/f17r169X27Zt1bdvX0n3nt5AkpYvX67w8HBFR0cT0MIhOTk52W9WDB06VDabTcOGDVNkZKQ9qH3//feZ1gMOL6fdbt++XV999ZVef/11FS9eXEOGDJGTk5MiIiLUuHFjlS9fXj///LPq1q2rN954Q1WrVjW4cgAAHk0kX3ggPvnkEy1evFiffPKJvQN3+fJlValSRbVq1dK+ffv08ccf66uvvlJmZqbatWun4cOHa+TIkVq3bh0XLHBoOW13+vTpWr58uQYMGCBfX18tW7ZMn3/+ueLi4pSdna0lS5ZoxowZeuyxxyRJs2fPlsQjr3B8ycnJ2rJli6KjoxUYGKj169dr+PDhCggI0IULF+Tr63vPgHbZsmWaMmWKPvjgAzVv3tzIjwD8JWaz2R7UDhs2TCaTSSNHjlR4eLjatm0rFxcXFoWEw8p93j527JgWLVqkM2fOqHLlyvb3JenGjRv67rvvlJycrGnTpiktLU3VqlWT2WxmGjMAAO6BhcPwwOR0vk6cOKEnn3xSBw8e1Msvv6zatWvru+++U6NGjdSiRQsVKlRIo0aN0rJly+Tv78/IEjis3G03NTVVb775pvr166fGjRvr888/16BBgxQREaEqVaqocuXKOn/+vC5cuKCsrCw9/fTTcnJyYmoPOKRfBk1ffPGFxo0bp61bt2rr1q0aNmyYwsLCFBQUpNjYWL3xxhsqV65cnv1yAtqIiAi1atXKqI8C3Fe52/iYMWN09uxZffTRRwZXBdwfp06dUuXKlbVmzRpNnz5dVapUUVRUlH0qp6SkJPXr109eXl7y8PDQ4sWL5eLiws0JAAB+BSEt7rvcQdU333yjf/7zn5o0aZJCQkK0
c+dObdmyRcHBwQoODpabm5syMjL08ssvKzw8XLVq1cpzDMBR5G6zqampcnFxUbNmzbRmzRqdPHlSAwYM0JAhQ/T8889r3LhxeuaZZ9SmTZs8x2BUCRxR7rafnJysatWq6fLlywoLC1NgYKAWLFigESNGqHPnzjp//rxatWql6dOnq0WLFvZjENAiP8sdSNG/QX6xc+dOzZw5U7169VKbNm20du1axcfH64knntDYsWPt8y3fuHFDly9fVsWKFWU2m7kZDQDAb+AbEvdV7gsRk8mkv//973rnnXf07rvvysXFRW3btlWTJk104cIF7dq1S6VLl9bMmTNlNptVo0YN+3G4gIEjyd3up0yZogsXLuj9999XcHCwxo0bp7179+rdd99Vhw4dZLPZdPz4cfn4+NwV0hLQwtHkbvtHjhzRkCFDNHDgQLVu3VpFihRRTEyMevTooc6dO0uS3NzcVKVKFXl5edmPsWHDBkVGRmrSpEkEtHBovzY6MHcwZTKZlJmZKVdXVwMqBP68X95gqFChgqpUqaKVK1fKZDKpXbt2slqtWr58ucaNG6fw8HC5ubmpSJEiKlKkiKQ7fyMEtAAA/Dq+JXHf5L44WbNmjU6fPi1vb2+FhITIbDYrLCxMNptNzz33nM6cOaNRo0apQoUKcnd319KlS/PM3wY4kpw2++233+ro0aMKDQ2133hYuHChGjdurA4dOkiSsrKyVKRIEZUpU8bIkoG/LPcCeXPnzlVSUpLOnj2rqKgoubu767333tOlS5d06tQpzZo1S9WqVVN8fLwkqXbt2vbjVKpUSdHR0QoKCjLiYwB/yoYNG3Tx4kVlZGTo6aefVu3atX+1/2Kz2ezBVGJioiSpffv23JiDQ/nlAIonnnhCb731lubMmaPly5dLkl544QVJ0ooVKxQaGqpp06bluSFBHx8AgN9GSIv7Jqfj9f7772v16tUKCAhQkSJFVKVKFfXq1Uu3b9/W0KFDZTab1aZNG61atUrOzs567LHHZDKZePwJDstms+mrr75Sjx495Ovrq0KFCslkMunVV1/V5cuX9c0336hbt26qUaOGDh48qOvXr9tHFgKOKueCPTY2VnFxcQoPD1eTJk106NAhRUdHq1+/fpo1a5ZiYmK0fv167dy5U97e3lq+fLmcnJxksVhkNptVrVo1gz8J8MdMmTJFq1atUpMmTXT+/HmtX79ebdu2Vd++fSXlHXGY+7+XL1+u8PBwRUdHE9DCIcXFxengwYP64IMPJN0ZTdunTx/FxMQoISFBrq6ueuGFF3T79m0dPXqUfj0AAH8Qc9Livjp27Jj69++vyMhI1a1bV9evX1fRokUlSVevXtXmzZs1YcIEhYeHq1OnTvb9GEGL/CA6OlrR0dEKDw9XSEiIChcuLJvNpq1bt+rLL7/U5cuX5ePjo6FDh8rZ2Zk5aJEvhIWFqWLFiurfv78k6cKFC1q3bp02bdqkQYMGKTg4WNnZ2bp+/bpKlCjBTTk4tOTkZL399tuKjIxUYGCg1q9fr+HDh2v27Nl64okn5Ovra982d0CbM+/ypEmT8szHDDzKftk/37Fjh0JDQ9WqVStNnDjR/vr58+c1YMAAZWdn6/XXX1dISMivHgMAAPw6vjFx39lsNvt8gzkB7eHDh/XOO++oefPmeu2117Rq1ao8+9B5gyOzWCySpP79+6tXr16aMGGCduzYoczMTJlMJrVo0UJjxozRjBkzNHLkSDk7Oys7O5uAFg4n931dm82mrKwsnT17VhcvXrS/7uvrq7Zt28rT01Pjx4/XunXr5OzsLC8vL/uikgS0cFRXrlyR1WpVYGCgtm7dqjFjxmjUqFGqWLGiYmJi9P3330u6E0z9MqCNiIggoIXDyB2unjhxQkeOHJG/v79mzZql7du3a/jw4fZty5Ytq8DAQLm6uiolJSXPdwV9fAAAfj++NfGnWa3Wu14zm8366aefdOzYMUn/F16lp6frzJkzSk1NVVhYmH1eQiA/
cHJysv89DBkyRN26ddPw4cO1bds2+9+AlPdChZAKjiZ36JSdna2srCy5uLioc+fOOnDggL744gv7tmXLllWlSpXk7u6uxMRE7dy50/4eC0PCESUnJ0uSqlatqvLly2vmzJkKCwvTsGHD1KVLF5nNZq1cuVJHjx6V9H/n+9wBLQvjwZHktOHJkydr0KBB6tatmxISEuTr66upU6dq27ZtGjlypKQ7/fxr164pJCRE/fv35zwPAMCfREqAPyX33fXTp0/LarWqRIkSqlKlivr166dx48apSJEiatiwoSTJ399fXl5eunnzpiTZR1PRiUN+kXvhu6FDh0qSRo4cqYyMDD3//POMJIFDy33Oj4uLU1JSktzd3dWtWze1atVKe/bsUUJCgqxWq4KDg5WWlqaffvpJzzzzjC5fvqx9+/apSZMmxn4I4E86cuSIhgwZooEDB6p169YqUqSIYmJi1KNHD/v84m5ubqpSpYr9SSLpzuJikZGRmjRpEgEtHNLq1au1evVqzZ07V87OzrJarapYsaIqVqyoqKgojRgxQo0aNZKHh4ecnJw0efJk+vgAAPwFzEmLv2T69On67LPPZDKZlJ6erk6dOunvf/+7tm/friVLlqhfv34qUaKEtmzZotTUVH388ceEVcjXcodZY8aM0dmzZ/XRRx8ZXBVwf0RGRmrFihVq0qSJbt68qVOnTik2NlaSNHv2bO3cuVM+Pj72qT7Wrl2rBQsWaPPmzVq8eLFcXFwM/gTAHzN37lwlJSVpx44dqlixokaMGKHAwED16dNHxYsXV0BAgKpVq6b4+HilpqYqMTHRPpVNcnKyUlNTFRQUZPCnAP6cuXPn6rvvvrMvFJZjz549+vHHH9W4cWOtWLFCHh4eeuWVV5hvHwCAv4iQFn/axx9/rKioKL3//vv6xz/+obCwMPtoqmLFimnLli1atmyZihcvriJFimjq1KlycXFhAQHke7nbOKNJkF/s3btXI0eOVExMjPz8/JSYmKh3331XFStW1MyZM+Xn56eDBw/qwIEDKl68uDp27ChJmjBhgq5evaqJEyfK1dXV4E8B/H6xsbGKjY1VeHi4bty4oUOHDun48ePq16+fatasqZiYGH3xxRfy8PCQt7e3PvjgA7m4uMhischsNnPuh0O5V/88MjJSO3bs0KZNmyRJmZmZcnZ21tixY5WamnpXeMuikAAA/DWEtPjdftl5i4iIkNVq1ejRo7V161YNHz5cI0aMUIUKFXT9+nU1a9ZMaWlpKly4sP1ihc4bHM1v3VT4rfdyt/XMzEzCKTicX7bvDRs2aMmSJYqPj9fRo0cVERGhf/zjH7pw4YK+/vprRUVFyd/fX1lZWUpOTtZXX32lS5cuadWqVYqPj5e/v7+Bnwb448LCwlSxYkX1799fknThwgWtW7dOmzZt0qBBgxQcHKzs7Gxdv35dJUqUoJ8Dh5X7fL9v3z7duHFDPj4+KlSokAYMGKCmTZsqLCzMvv3KlSu1Zs0azZkzR4ULFzaqbAAA8h2GM+J3sdls9s7bF198oatXr6pIkSLy8vLStm3bFBYWptDQUHXo0EHfffedIiIilJaWJnd3dzk5ObGiNxxS7ouWDRs2aMGCBZozZ46SkpIk/fqKxbnbemJiolavXp1nATHgUZf7nB8VFaU1a9bI09NTbm5uunr1qtauXatatWqpW7duql+/vs6dO6eQkBB9/vnnslgsunLlijZv3qy0tDQCWjgcm82mrKwsnT17VhcvXrS/7uvrq7Zt28rT01Pjx4/XunXr5OzsLC8vL/o5cGg55/vIyEiNHDlSERERWr58ua5du6aXX35ZX331lcaNG6f09HSdP39en332mUqVKkVACwDAfUZPEv9T7se1ExISNH/+fC1dulQ+Pj6KiIhQVlaWwsPD1alTJ0lS8eLF5ePjIzc3tzwhFo/9wdHktN8pU6Zo1apVatKkic6fP6/169erbdu26tu3r6S8fyO5/3v58uUKDw9X
dHQ087PBoeS04U2bNmn16tUKDw9X48aNVblyZZnNZu3bt0/9+vWTp6enSpcurTZt2ujpp59WUFCQnJ2d1bRpUwUHBxNaweFkZ2fLarXK1dVVnTt31oIFC/TFF1+oUaNGkqSyZcuqUqVK9vlnPT097Yvi0c+BI9u5c6fWrl2ruLg4lSxZUunp6SpXrpzq1asnT09PLVmyRE2bNtVjjz0mV1dXRUdHS2JaJwAA7ieunPA/5XS8NmzYoPXr16tPnz4qXbq0OnXqpHPnzikuLk7lypXT999/r+LFi2vDhg3y9vZmgRjkC8nJydqyZYuio6MVGBio9evXa/jw4QoICNCFCxfk6+t7z4B22bJlmjJlij744AM1b97cyI8A/Cnbtm1TQkKCatasqeDgYEl3RhIePnxYJ0+eVPny5WWz2RQXFycPDw/7Kvc5j3tzYwKOJi4uTklJSXJ3d1e3bt3UqlUr+1z7VqtVwcHBSktL008//aRnnnlGly9f1r59++whLeDIbty4oTJlyqhs2bLy8PCQl5eXJGn//v1KTk7WkiVL9PXXX8vT01MBAQFycnJieg8AAO4zvlXxu1y+fFlJSUlKSkrKs0rxkCFDdP36dQ0fPlwmk0lFixaV2WxWYmKiJO6uw/H8ch7OK1euyGq1KjAwUFu3btWYMWM0atQoVaxYUTExMXrjjTdUrly5PPvlBLQRERFq0aKFUR8F+EN+eb52dXWVi4uL9uzZo+3bt6tp06YymUzy9/dX8+bN1a5dOz355JOSpE8++cR+DC7Y4YgiIyO1YsUKNWnSRNevX9fgwYMVGxur0NBQzZ49WyNGjJCPj48yMzNlMpk0Z84cLViwQJs3b1ZWVhY3puFQ7jWn/s8//6yUlBQVKlRI0v/Np3/8+HEdPHhQkvJcA1gsFs73AADcZ3yz4p5+2XkrWbKkevfurfT0dMXExKh69er2kSPjx4/Xv//9b12/fl0Wi0VNmjTh7jocUu55OJOTk1WtWjVVrVpV5cuX18yZM7VgwQKNGDFCnTt31vnz57Vy5Uo1bNhQ5cqVu2dA26pVKyM/DvC75T7nZ2ZmysnJSY0aNZKvr6+mTJmihIQEFSpUSA0aNJCLi4tGjRqlVq1aKT09Xe3ateOcD4e2d+9ebdy4UfHx8fLz81NiYqK2b9+uXr16aebMmYqIiNDBgwd14MABFS9eXB07dpR0ZyGxsmXLijV44Uhyn+9Pnz6tzMxM+fv765VXXtHHH3+s3r17a/78+fYFT/38/CTdGWnr7e1tPw5PSwAAcP9xNYW75A6qVq1apR9++EEmk0khISEKDQ2Vm5ubxo0bJ7PZrMaNG0uS6tSpk+cY3F2Ho8l90XLkyBENGTJEAwcOVOvWrVWkSBHFxMSoR48e9ke63dzcVKVKFfvjgNKdKUEiIyM1adIkAlo4jNzn/NjYWB09elRnzpxRs2bN9Nxzz2no0KGaNm2a4uPjZbPZFBQUJC8vL7Vs2dJ+DM75cGTXrl1TuXLl5Ofnp6NHj2rNmjUaOHCgLly4oIEDByoqKkqBgYEKCAhQcnKyYmJidOnSJa1atUrx8fH2MAt41OU+30+bNk1r167VzZs31aBBA02ePFmjRo3SxIkT9fLLL2vcuHG6ffu2FixYIG9v7zz9HQAA8GCYbNz+Ry65g6r3339fK1euVNWqVXXt2jWdPn1aU6ZM0d/+9jfFx8drx44dGj9+vBo2bMi0BnBoudvv3LlzlZSUpB07dqhixYoaMWKEAgMD1adPHxUvXlwBAQGqVq2a4uPj7QvH5IwmSU5OVmpqap7HAQFHERUVpSVLlmjYsGG6deuWVqxYIUn69NNP9e233yo2NlY2m00dO3ZkDk7kC1FRUXr88cdVvHhxLVq0SFOmTFFsbKzMZrP69u2rHTt2KCwsTJIUExOj+vXra+/evYqOjlaVKlXUvXt3+fv7G/wpgD8uOjpaCQkJ9mk8+vTpo/bt2ys0NFTff/+9xo8fr5SUFBUt
WlRFixbVwoUL5eLics9pEgAAwP1DSIt7OnfunGbOnKmXXnpJTz31lCRp6tSpWrx4sebOnatq1app2rRpSkxMVHx8vGrVqmVwxcBfFxsbq9jYWIWHh+vGjRs6dOiQjh8/rn79+qlmzZqKiYnRF198IQ8PD3l7e+uDDz6Qi4uLLBaLzGYzNyrgsNLS0vTmm2+qR48eatq0qXbv3q3+/fsrIiJClSpVUoUKFXTmzBlNnjxZtWvXVmhoqNElA3/Jpk2bNHnyZIWHhys4OFjnz5+Xh4eHevbsqX79+qlp06bav3+/li1bpqefflrt27e3jxa3WCzMvwyHZLPZdOnSJb355pvq37+/nnnmGR08eFC9evVSVlaWmjRpoqioKJlMJp09e1Zms9k+pRNT2gAA8ODxTYs8bDab9u7dq549e+ZZrVuSQkNDlZ6erhEjRmj16tV67bXXVKFCBQUEBBhYMXD/HD9+XK+++qpat24tSWrYsKHWrVunmTNnatCgQRoxYoTCwsJ0/fp1lShRQiaTiYsWOKTco8fT09NltVp14sQJ+fr66ssvv9SAAQMUFhamZs2aacyYMapTp45eeukljRgxgpGDcHjbtm1TQkKCatasqeDgYEmSr6+vDh8+rJMnT6p8+fKy2WyKi4vL0xfKOd8zFycclclkkpOTk6xWq5ydnXX27FktWLBAoaGhCgoKUrt27TR69Gh169ZNlStXtvdvcrYHAAAPFs+rIA+TyaSgoCANHDhQN2/e1Llz5yTdGTUiSZ06dZLNZtMPP/ygihUrqnv37nJycrK/DziK3A8R2Gw2ZWVl6ezZs7p48aL9dV9fX7Vt21aenp4aP3681q1bJ2dnZ3l5eclkMjGSCg4rJ6BNSEjQ+vXrVbRoUbVs2VJTpkxR3759NXr0aHXt2lVubm46f/68kpOTJUnVqlWT2WyW1Wo1snzgL3F1dZWLi4v27Nmj7du3S7rzN+Hv76/mzZurXbt2ev7553X+/HlNmjRJkjjfI99wd3dXixYt5Ovrq3379snDw0P169dX8eLF9dhjj2nlypVKTEzM096Z4gAAgIeDb1zkkRO2vvXWW+rZs6ciIiL05Zdf2keNeHh4qFChQsrOzs6zH6NK4EisVqs9pMrOzlZWVpZcXFzUuXNnHThwQF988YV927Jly6pSpUpyd3dXYmKidu7caX+P6Q3g6E6cOKFZs2bJYrEoKChI58+fV4MGDewL32VmZspkMqlixYp59uOCHY4oMzNTFotFjRo10qhRo1S3bl0lJCRo7969kiQXFxeNHDlSUVFR6tmzpz799FO5uLgoOzub8z3yBavVqsKFC6t3796qXLmyduzYoVKlSqly5cpydXVVjRo1FB8frxEjRhhdKgAABRJDApBHziNQZrNZYWFhslgsevPNN/XPf/5TZcuW1fbt2+Xp6akqVaoYXSrwp+Re9CIuLk5JSUlyd3dXt27d1KpVK+3Zs0cJCQmyWq0KDg5WWlqafvrpJz3zzDO6fPmy9u3bx6JJcEj3WvBl4MCBOn36tBITE9WlSxelpKRo586d9gWRTp48qbS0NHXr1s2gqoH7IzY2VkePHtXZs2fVtGlTtWvXTmFhYZo+fbri4+MlSQ0aNJC3t7f9JoV05+Y1I2iRX+R8Bzg7Oys7O1vp6ek6deqUdu/erQULFuj69euqU6eOzGazLBYLgzAAAHjIWDgM95T7Yn769OmKiYlR7dq19dRTT+ntt9+m8waHFxkZqRUrVqhJkya6efOmTp06pdjYWEnS7NmztXPnTvn4+NhHEq5du1YLFizQ5s2btXjxYrm4uBj8CYA/59SpUypRooS8vLyUkZGhadOm5Wn/u3fvVlJSks6dOydfX18NGDBAzs7OnPPhsKKiorRkyRINGzZMt27d0ooVKyRJn376qb799lvFxsbKZrOpY8eO3ISDQ/vvf/+rChUq/M/tcuYlP336tF577TU99thjcnd314IFC+Ti4nLPm3oAAODB
Y2gA7ilnzkGz2azBgwfL2dlZc+fOVbdu3WQ2m2Wz2bhYh8Pau3evNm7cqPj4ePn5+SkxMVHbt29Xr169NHPmTEVEROjgwYM6cOCAihcvro4dO0qSLly4oLJly4p7W3AkuS+29+zZo9DQUD311FNq3769mjRpon79+um5557T3Llz1bt3bzVs2FANGzbMcwwWyIOjSktL0zfffKOJEyeqadOm2r17t86cOaOIiAglJyercuXK6t27tyZPnqwDBw4Q0sJhzZkzR19//bUGDx78Pxf1NZlMslqteuKJJ7Rp0yalpaWpVKlSLIgKAIDB+AbGr8od1A4YMEDp6ekaPXq0MjMz1a5dO0JaOIxfjgi5du2aypUrJz8/Px09elRr1qzRwIEDdeHCBQ0cOFBRUVEKDAxUQECAkpOTFRMTo0uXLmnVqlWKj4+Xq6urgZ8G+P1yt/39+/fL29tbL730kgoXLqz+/fsrJCREzZs317Bhw7R7925duHBBZcqUuWv+TS7Y4YjS09NltVp14sQJ+fr66ssvv9SAAQMUFhamZs2aacyYMapTp45eeukljRgxQv7+/kaXDPxpFSpUUFJSkhYsWKDXXntNNWvW/M3tcwZdeHh4yMPDw/46/XsAAIzDcywFyG+txv1r75nNZmVlZUmShg0bpjZt2mjq1Km6devWA6kRuN9sNps9pIqKitKaNWvk6ekpNzc3Xb16VWvXrlWtWrXUrVs31a9fX+fOnVNISIg+//xzWSwWXblyRZs3b1ZaWpri4+O5iIfDyN32IyMj1bdvX/Xv318HDx5UQECAVq1aJYvFog8//FBTp07VwYMHdfLkSRZIQr6QkJCg9evXq2jRomrZsqWmTJmivn37avTo0eratavc3Nx0/vx5JScnS5KqVatmvzkNOKI2bdro1VdflcVi0cKFC3Xs2LE/tP+JEyd08+ZNvgMAADAQIW0BkXs01YYNG7RgwQLNmTNHSUlJkn59pW6bzWafe/OTTz5RixYttGLFCnl6ej6UuoG/KudiY9OmTVq9erWKFSumxo0ba/z48TKbzdq3b5/q1KkjT09PlS5dWm3atNH48eMVFBSkQoUKqWnTpkpMTNR7771HQAuHktP2582bp1WrVmn+/PmaMWOG3NzcNHHiRF27dk0TJ07U1KlT9fe//10pKSn2uTqZ0gOO7sSJE5o1a5YsFouCgoJ0/vx5NWjQwL4oWM584xUrVsyzH/NwwtHkPl9fv35dmZmZ+uyzz/Thhx/qyJEjv7lfzvdEQkKChg0bpitXrjzwegEAwK+jJ1pA5Fx0TJkyRRMmTNDJkyf11Vdf6d1339WcOXPs2+Xu6OXuvC1fvlwjR45UVlaWSpcu/XCLB/6ibdu2KSEhQTVr1lRwcLAkydfXVykpKTp58qTKly8vm82muLg4mUwmde7c2b7ysXTn0T8e94Yjslgs+u677zR48GAFBgbKZrPp2LFjKlOmjCIjI7Vz506VL19ekZGRWrp0qWbMmCFJjKSCwxs4cKAqVKigxMREtWrVSh07dtSNGzfUvXt3jR49Wt26dVNqaqq6detmdKnAX5Jzvo6KilJERISeeuopde7cWT/++KPmz5+vQ4cO3bVP7j7+smXLNG3aNPXq1et3LToGAAAeHELaAiQ5OVlbtmxRdHS0JkyYoE6dOuns2bMKCAjQhQsXJP1fR++XnbfJkyfrgw8+UPPmzQ2rH/i9fjkK0NXVVS4uLtqzZ4+2b98u6U5b9/f3V/PmzdWuXTs9//zzOn/+vCZNmmQ/BsEsHN2tW7d05MgRWa1WpaamatGiRXr99dc1dOhQWSwWTZ48WQsXLpQk1axZU2azWRaLxdiigT/p1KlTunr1qiTJw8NDVatW1datWyVJvXr1Ut++fdWkSRNlZGSofv36WrVqlZydnWnzcHgZGRk6dOiQBgwYoO7du2vMmDEaOHCgXF1dtWDBAh09etS+rcViydPHnzJliiZOnKg2bdoYVT4AAPj/
CGnzsV/Oq3blyhVZrVYFBgZq69atGjNmjEaNGqWKFSsqJiZG33//vX2/X3beIiIi1KJFi4f+GYA/Knf7zczMlMViUaNGjTRq1CjVrVtXCQkJ2rt3ryTJxcVFo0aN0syZM9WzZ0+tWrVKLi4uys7OZiQh8gVPT09FRkbK399fBw4cUHp6uurVq6cqVarIx8dHbm5u2rVrV54bGywaA0e0Z88ede3aVWPHjtXOnTvl5uamfv366T//+Y/mzp0rSWrYsKH69++vKVOmaPDgwfYnJmjzcHSZmZk6ffq0rl+/bn+tYcOGateunZKTk/NMcZbT3pcvX27v47ds2dKIsgEAwC8Q0uZTuReMyVkUo2rVqipfvrxmzpypsLAwDRs2TF26dJHZbNbKlSvtd9lz9ssd0ObM4QY8ynK3+9jYWI0YMUKdOnXSrFmz5OzsrKFDh6pw4cKKj4/Xnj17JEleXl5q2bKlQkJC5OTkJIvFwgha5Ct169ZV7dq1tXHjRrm4uKhGjRq6ffu2MjIy1KNHD/s0H8xDC0e1f/9+eXt766WXXlK1atXUv39/jR49Wv/+9781bNgwnTlzRhcuXLhnG+d8D0fzy0EYNptNRYoU0fPPP69NmzbZ+/2SFBQUJB8fHx0+fFi7d++2v7506VJNnDhREydOpI8PAMAjhJA2H8o9kvDIkSMKDQ3Vxo0bVbJkSRUpUkQxMTHq2rWrOnfuLElyc3NTlSpV5OXlZT/Ghg0bFBkZSUALh5J7XrZ58+YpKChIHTp00NatW/X222/riSee0BtvvCEnJyctXbpUO3fuvOsYjKhCfpPzd1GnTh0dO3ZMs2fP1ltvvaUbN26obdu2kvJOcQM4ksjISPXt21f9+/fXwYMHFRAQoFWrVslisejDDz/U1KlTdfDgQZ08eZI2DoeX+2b04sWLNW7cOEVHR+vq1atq3769vLy8NG/ePB07dkySlJaWpsKFC+vll19Wv379JEn//e9/9dlnnykyMpIRtAAAPGJMNobO5Cu5L7Tnzp2rpKQk7dixQxUrVtSIESMUGBioPn36qHjx4goICFC1atUUHx+v1NRUJSYm2gOq5ORkpaamKigoyMiPA/xhaWlpevPNN9WjRw81bdpUu3fvVv/+/RUREaFKlSqpQoUKOnPmjCZPnqzatWsrNDTU6JKBh+Lq1auaO3euDh48qFKlSmnatGlycXGRxWLh5gQc0rx58xQXF6cPP/xQLi4uioqK0oULFzRu3DjVq1dPKSkpio6O1rp169SsWTPNnDmTGxJwWFar1R7QTp8+XUuXLlXVqlWVmpqqQoUKad68eTp16pTmzp2rQ4cOyd/f3z5H88qVK/Oc5y9evMhCwAAAPIIIafOp2NhYxcbGKjw8XDdu3NChQ4d0/Phx9evXTzVr1lRMTIy++OILeXh4yNvbWx988IH9Yt1sNnMBA4eR+4I7PT1d2dnZatmypRYuXKjU1FS99dZbGjJkiDp27KgxY8aoTp06eumll5ScnCx/f3/7BQ9QUGRkZMjV1VUmk0nZ2dk87g2HZLFYNHjwYDVs2FCdO3fWd999p7feekt+fn76+eef1a9fPzVp0kSSdOjQIQUEBHC+R76QkpKimJgYde7cWTVr1tS3336rWbNm6dq1a4qJiVGhQoW0detWnT59WsWLF1fXrl3tC+TRxwcA4NFGSJtPhYWFqWLFiurfv78k6cKFC1q3bp02bdqkQYMGKTg4WNnZ2bp+/bpKlCjBxTocXkJCglxdXdWpUyeNGTNG58+f1zfffKMxY8aoQ4cOkqRXX31VlStX1rhx4+z75R6ZAhQkjCiEI0tLS9MLL7ygN954Q61atdKECRNUvXp1BQUFafjw4bp9+7Y6d+6s7t272/dh1Dgcmc1m0969e9WzZ0/7AIs6depIkpKSkjRr1iylpqYqOjpaPj4+efaljw8AgGMgmcgHcufsNptNWVlZOnv2rC5evGh/3dfXV23btpWnp6fGjx+vdevW
ydnZWV5eXvYFY+i8wZGdOHFCs2bNksViUVBQkM6fP68GDRrY51TOzMyUyWRSxYoV8+xHQIuCioAWjszT01ORkZHy9/fXgQMHlJ6ernr16qlKlSry8fGRm5ubdu3alaePREALR5O7/ZpMJgUFBemtt97SlStXdPz4cWVmZkqSateurf79+6tUqVLq3Lmzrly5kuc49PEBAHAMfGM7uNyjALOzs2W1WuXq6qrOnTtrwYIF+uKLL9SoUSNJUtmyZVWpUiX7/LOenp72RwG5WIcjudfo14EDB+r06dNKTExUly5dlJKSop07d6p79+7y9/fXyZMnlZaWpm7duhlUNQDgfqpbt65MJpNCQ0Pl4uKiGjVq6Pbt28rIyFCPHj30/PPPS2LUOBxT7r5Oamqqbt++rTJlymjgwIGSpAkTJqhYsWJq0aKFXFxcVKtWLb3++uvasWOHihcvbmDlAADgzyKkdWC5O29xcXFKSkqSu7u7unXrplatWmnPnj1KSEiQ1WpVcHCw0tLS9NNPP+mZZ57R5cuXtW/fPntICziSnHZ/6tQplShRQl5eXvLw8FDVqlW1detWdenSRb169ZK/v7+SkpJ07tw51a9fXwMGDLDPy8aIKgBwbDnBa506dfTRRx9p9uzZ+vrrr5WWlqa2bdtKIqCFY7LZbPa+TnR0tD7//HNdu3ZN3t7e6t69uwYOHKisrCwNGzZMJpNJLVq0kLOzs+rVq6d69epJYnoPAAAcEXPS5gORkZFasWKFmjRpops3b+rUqVOKjY2VJM2ePVs7d+6Uj4+P/XHvtWvXasGCBdq8ebMWL14sFxcXgz8B8PvkvjGxZ88ehYaG6qmnnlL79u3VpEkTXb9+Xc8995xeffVV9e7d+57HYF42AMhfrl69qrlz5+rgwYMqVaqUpk2bZl8MlZAKjmzBggWaN2+exo4dq3/84x/q1q2bbt++rfnz58vX11dTp05VfHy8wsPD1a5dO9o7AAAOjqTCwe3du1cbN25UfHy8/Pz8lJiYqO3bt6tXr16aOXOmIiIidPDgQR04cEDFixdXx44dJd1ZSKxs2bIio4ejyB3Q7t+/X97e3nrppZdUuHBh9e/fXyEhIWrevLmGDRum3bt368KFCypTpsxdI6gIaAEgf/Hy8tLw4cOVkZEhV1dXFkOFw8tZYyIpKUl9+vRRy5YttXfvXp07d07jx49XSkqK0tPTFRoaqtTUVH3yyScKCQkxumwAAPAX0Xt1ML+ci/PatWsqV66c/Pz8dPToUa1Zs0YDBw7UhQsXNHDgQEVFRSkwMFABAQFKTk5WTEyMLl26pFWrVik+Pl6urq4Gfhrg98n92F9kZKSWL18uLy8v+fr6qnfv3lq1apXmz5+vDz/8UJcvX5arq6tOnjwpX19fgysHADwsbm5uksRiqHB4JpNJrq6ucnZ2VrFixbR9+3aFhoZqyJAheu655zRq1Cj99NNPmjdvnt577z0GXQAAkE+wrLkDyR1URUVFac2aNfL09JSbm5uuXr2qtWvXqlatWurWrZvq16+vc+fOKSQkRJ9//rksFouuXLmizZs3Ky0tTfHx8fL39zf4EwG/T85o2Hnz5tkD2RkzZsjNzU0TJ07UtWvXNHHiRE2dOlV///vflZKSohUrVkgSFy4AUMAwBy3yi5IlS2rmzJkKCwvT8OHD1bVrV0lS6dKl80xXZjKZ6O8AAJAPMMzAgeRcdGzatEmrV69WeHi4GjdurMqVK8tsNmvfvn3q16+fPD09Vbp0abVp00ZPP/20goKC5OzsrKZNmyo4OJgRJnBIFotF3333nQYPHqzAwEB99913OnbsmPz8/BQZGal+/fqpSZMmioyMVNeuXRUQECCJi3UAAOCYhg8fruTkZGVlZdkXAS5cuLC++eYbVapUKc+29HcAAHB8jKR1MNu2bVNCQoJq1qyp4OBgSZKvr69SUlJ08uRJlS9fXjabTXFxcTKZTOrcubOcnZ2VnZ0tSXJyciKghUO6
deuWjhw5IqvVqtTUVC1atEivv/66hg4dKovFosmTJ2vhwoWSpJo1a8psNstisRhbNAAAwJ9gsVhkMpk0a9YseXl5qWfPnnr55ZfVpUsXpaamasyYMZJ4YggAgPyEtO4RZ7PZ8twZd3V1lYuLi/bs2aPt27eradOmMplM8vf3V/PmzdWuXTs9+eSTkqRPPvnEfgyCWTg6T09PRUZGytnZWQcOHFB6errq1aunKlWqyMfHRz/88IN27dql1157zf43wyrHAADAETk5Oclqtapo0aJavXq11q5dq9TUVLm6uqpjx472QRj08QEAyD9MNm6/PrJyLxKWmZkpJycnOTk56dSpU5oyZYqysrLUs2dPNWjQQJJ09epVffPNN0pPT1e7du3k5ORE5w35Ss5Ni9DQUGVnZysqKkq3b99Wv3799MILL+j555/Psx0AAIAjs1gs97zpTB8fAID8h5D2EZU7ZIqNjdXRo0d15swZNWvWTM8995wsFoumTZsmq9Wqrl27Kigo6K5j/FqnDnB0CQkJ+uijjxQSEqKvv/5aaWlpWrp0qZycnAhoAQDAIyv3IIzf83oO+vUAAOR/hLSPuKioKC1ZskTDhg3TrVu37CvWf/rpp/r2228VGxsrm82mjh07qkmTJsYWCzwkV69e1dy5c3Xw4EGVKlVK06ZNk4uLCxcwAADgkZU7iD1z5ox9rQhfX19Jv/4kUO7XDx8+LB8fH5UsWfLhFQ4AAB4KQtpHWFpamt5880316NFDTZs21e7du9W/f39FRESoUqVKqlChgs6cOaPJkyerdu3aCg0NNbpk4KHKyMiQq6urTCYTj/0BAACHMG3aNO3YsUNXr16Vl5eXgoODNWTIEEl3B7W5f46Pj9fixYv14YcfqnLlyobUDgAAHhwSjUdI7k5Yenq6rFarTpw4IV9fX3355ZcaMGCAwsLC1KxZM40ZM0Z16tTRSy+9pBEjRsjf39/g6oGHz83NTRKL4wEAAMeQkJCgTz75RFOnTpXVatXFixc1fvx4/fzzz/rXv/4lk8lkvybIfW2wbNkyzZgxQ+PHjyegBQAgnyLVeITkdMISEhLk6uqqTp06qWXLlpoyZYq++eYbjRkzRh06dJAknT9/XoUKFZIkVatWTdL/nssKyK+YgxYAADxqUlJSVKxYMRUtWtTeTz9+/Ljat2+v+vXr27crU6aM+vbtqyeeeEI9evS4Z0A7ZcoURUREqFWrVkZ9HAAA8ICR6D2CTpw4oVmzZslisSgoKEjnz59XgwYN7J2yzMxMmUwmVaxYMc9+BLQAAACA8bKysrRp0yZ98cUXkqTTp09Lko4dO6ZLly7Zt7NYLKpfv746deqkr776ShkZGbJYLPaAdvny5QS0AAAUEIykNdi9Rr8OHDhQp0+fVmJiorp06aKUlBTt3LlT3bt3l7+/v06ePKm0tDR169bNoKoBAAAA/BoXFxedO3dOK1eu1OrVq5Wdna24uDi98MILWrx4sXbt2qXGjRvbFzz19PRUdna2fSon6U5AO2HCBL3//vtq2bKlUR8FAAA8JAy9NFhOQHvq1CldvXpVkuTh4aGqVatq69atkqRevXqpb9++atKkiTIyMlS/fn2tWrVKzs7OslgshtUOAAAA4N7ee+89eXp6ateuXXr66aclSQ0aNJCfn5+WLFmiHTt2SJJ+/vlnffvttypfvrykO3PtX7x4Ubt37yagBQCgADHZbDab0UUURLlH0O7Zs0ehoaF66qmn1L59ezVp0kTXr1/Xc889p1dffVW9e/e+5zFYzR4AAAB49GRlZSkrK0vvvvuusrOzdfLkSfXt21ft2rXT4cOHtXTpUm3btk3e3t4ym80ymUxauXKlXFxc7Me4evWqvLy8DPwUAADgYSKkNUDugHb//v0qWrSoNm7cqMKFCys6OlohISFq3ry5bt68qd27d2vAgAEqU6YMiyMBAAAADujdd9/V/v37NWDAAD333HO6efOmTp48qUOHDqlYsWJq
06aNnJ2dlZ2dLScnJ/r9AAAUQIS0D1nulVojIyO1fPlyeXl5ydfXV71795aPj4/mz5+vU6dO6fLly3J1ddXIkSPVuHFjgysHAAAA8EfkfvJtzJgx+vrrr9WvXz/94x//0KVLl+Tv72/f1mKx2OeoBQAABQ8hrUHmzZunuLg4ffjhh3JxcVFUVJQuXLigcePGqV69ekpJSVF0dLTWrVunZs2aaebMmXkCXgAAAACPvtzh65gxY/Tll1/KYrHoySef1Jw5c+jfAwAASYS0hrBYLBo8eLAaNmyozp0767vvvtNbb70lPz8//fzzz+rXr5+aNGkiSTp06JACAgLs0yMAAAAAMN66dev0zDPPyMPD439umzuoXbVqla5cuaLXXnuN9SUAAIAdyZ8Bbt26pSNHjshqtSo1NVWLFi3S66+/rqFDh8pisWjy5MlauHChJKlmzZoym82yWCzGFg0AAABA0p2BFEOGDNG8efN069at/7m9k5OTvT8fEhKinj172uegBQAAkAhpDeHp6anIyEj5+/vrwIEDSk9PV7169VSlShX5+PjIzc1Nu3btUu5BzsxPBQAAADwaatasqejoaM2dO1cxMTFKS0v7n/vkDmpzMJIWAADkIKQ1SN26dVW7dm1t3LhRLi4uqlGjhm7fvq2MjAz16NFDcXFxMplMYjYKAAAA4NGRM/q1efPmmjBhgmJiYrRkyZL/GdTabDb7wIt169bpP//5zwOvFQAAOA5CWoPkLBBQp04dHTt2TLNnz9Zbb72lGzduqG3btpLEQmEAAADAIyZn9GtkZKQOHz4sT09PTZs2TbGxsbp58+Y998ndr1++fLmGDBmiixcvPrSaAQDAo4/nawzWunVrpaSkaOfOnSpVqpRiYmLsj0IxxQEAAADw6NmwYYNWrlyp2bNnq0OHDjpz5oxGjx4tZ2dn9ejRI89iYrkD2mXLlun999/XzJkz1ahRI6PKBwAAjyBCWoN5eXlp+PDhysjIkKurq0wmk7Kzs5mfCgAAAHhEpaSkqFq1aqpbt64kqVq1aipSpIgGDRokq9Wq119/XZ6enncFtFOmTFFERIRatmxpZPkAAOARxHQHjwg3Nzf7HLQEtAAAAMCjwWq13vWal5eXbt68qfPnz9u3ady4sXr37q25c+dq3rx5SktLyzPFQU5A26pVq4daPwAAcAyEtI8Y5qAFAAAAHg1Wq1Vm851LpnPnzunGjRuSpMDAQF2+fFkrV67U9evX7dt4e3vr8ccf1/Hjx+1THiQkJOi9997TxIkTCWgBAMCvYsgmAAAAAPyCzWazh69Tp07V5s2blZGRoVdffVWvvfaaRo8erUGDBun27duqX7++/va3v2nz5s1q3769unfvbh984eHhocmTJzPFAQAA+E0mm81mM7oIAAAAAHhU5J5Ldtu2bQoPD9eoUaP073//W19//bWCgoL0zjvvaM+ePYqJidGZM2dUsmRJOTs7a/ny5XJxcWEhYAAA8IcQ0gIAAADAPWzYsEFbt25VtWrV1KtXL0nS0qVLlZiYqKefflr9+vWT2WzWjz/+qMzMTD355JMym80sBAwAAP4weg4AAAAAoLwjaNPS0nTgwAFt3bpVJUuWtG/z8ssvS5JWrlwpJycndejQQY8//rj9favVSkALAAD+MHoPAAAAAAq83IuEpaSkyM3NTcHBwSpWrJji4uLUqFEjNWrUSNKdoNZsNismJkZlypRRpUqV7MfJOQYAAMAfwXQHAAAAAPD/TZ8+XV988YVu3rypokWL6oknnlDx4sW1ZcsWvffee2rQoIF9261bt+qZZ55h7lkAAPCXcZsXAAAAACQtXrxYy5Yt06hRo7Rw4UIFBARo9erVCgoKUvPmzfXuu+9q79699u2bN28uJycnWSwWA6sGAAD5ASEtAAAAAEg6e/asunTporp16+rIkSNau3atJk2aJE9PT5UsWVLt2rVT3759dfjw4Tz7MZIWAAD8VYS0AAAAAAo0m80mm82m
H374QR4eHjp8+LDCwsI0ePBghYSE6NChQ9q3b5+Cg4M1YMAAVa1a1eiSAQBAPsPCYQAAAAAKNJPJJEl64YUXNGrUKE2bNk2TJ0/W888/L0nKzMxUZmamAgMDFRgYKEmyWCyMoAUAAPcNI2kBAAAAQFKjRo3UoUMHVahQQcWKFZMk/fzzzzpw4IB8fHzybEtACwAA7ieTzWazGV0EAAAAADwKrly5otjYWC1dulRly5aVzWaTi4uLVqxYIRcXF9lsNvvIWwAAgPuFkBYAAAAAcsnOztbx48d19OhRFSlSRM2bN5eTk5Oys7Pl7MyMcQAA4P4jpAUAAACA/4E5aAEAwINESAsAAAAAAAAABmLhMAAAAAAAAAAwECEtAAAAAAAAABiIkBYAAAAAAAAADERICwAAAAAAAAAGIqQFAAAAAAAAAAMR0gIAAAAAAACAgQhpAQCAQ7HZbEaXAAAAAAD3FSEtAAC47/75z3/Kz89PXbp0+dVtBg8eLD8/Pw0fPvx3H/fAgQPq3bv3/9zugw8+kJ+f3+8+LgAAAAAYydnoAgAAQP5kNpuVlJSkH3/8UT4+PnneS09P144dO/7wMRMTE3Xq1Kn/uV2nTp3UqFGjP3x8AAAAADACI2kBAMADUa1aNbm5uWnTpk13vbdjxw4VLlxYpUuXfiC/28fHR7Vr134gxwYAAACA+42QFgAAPBDu7u4KDg6+Z0i7YcMGtWrVSs7O//dQj9Vq1dy5c9WiRQsFBASoVatWWrx4sf394cOH69NPP9X58+fl5+enTz75RN9//738/Py0YMECPfvss6pVq5ZWrlx5z+kOVq1apRdffFG1atVSkyZNNHXqVGVmZkqSbt++rbFjx6px48YKCAjQs88+q/nz5z+g/zMAAAAAkBchLQAAeGDatGljn/IgR1pamnbt2qXnnnsuz7Zjx47VzJkz9fzzz2vOnDl69tlnFRERoVmzZkmS3nrrLQUHB+uxxx7T8uXL1aRJE/u+H3zwgd544w1NnjxZQUFBd9WRkJCgYcOGqXr16oqOjlbv3r21ePFivffee5KkiIgI7dq1S8OGDdP8+fPVrFkzTZ48WStXrnwA/1cAAAAAIC/mpAUAAA9MkyZNVLhwYW3atEndu3eXJG3ZskXe3t6qW7eufbszZ87o448/1jvvvGNfGKxhw4YymUyKiYnRK6+8ogoVKsjLy0uurq72qQzS09MlSa1bt1aHDh3uWYPVatWsWbPUvHlzeygrSbdu3dL69euVlZWl/fv3KygoSG3btpUk1a9fX+7u7vL29r7f/0sAAAAA4C6MpAUAAA9MoUKF1LRp0zxTHqxfv16tW7eWyWSyv/bVV1/JZrOpadOmys7Otv/btGlTZWRk6MCBA7/5e6pWrfqr7505c0ZXrlxRixYt8rzes2dPffLJJ3JxcVH9+vX18ccf64033lB8fLxSUlLUr1+/PKN1AQAAAOBBYSQtAAB4oFq3bq3+/fvrxx9/lJubm7788ku9/fbbeba5du2aJNlHsv7SxYsXf/N3uLu7/+p7Ocf+rVGxo0aNko+Pj9asWaN//etf+te//qXAwECNHTtW/v7+v/m7AQAAAOCvIqQFAAAPVOPGjeXh4aFNmzbJ3d1d5cqVU0BAQJ5tihYtKklatGiRPDw87jqGr6/vn/79Oce+evVqntdTU1OVnJyswMBAubu7680339Sbb76pCxcuaMeOHfrwww8VGhqq9evX/+nfDQAAAAC/B9MdAACAB8rV1VXNmzfX5s2btXHjxnuOlq1Xr56kO8FpjRo17P9evXpVUVFR9tGwZvMf77o88cQTKlGihHbs2JHn9dWrV6t3795KS0tTq1atFBcXJ+lOINy1a1e1bdtWFy5c+MO/DwAAAAD+KEbSAgCAB65Nmzbq06ePzGazRo8efdf7fn5+ev755/Xuu+/q/PnzCggI0JkzZzR9+nSVK1dOlSpVknRnVOzly5f1+eef/+Y8tLk5OTlpwIABGj9+
vLy9vdW0aVOdOXNGM2fOVNeuXVWqVClVr15d0dHRcnFxkZ+fn86cOaNPP/1UrVq1up//GwAAAADgnghpAQDAA9egQQMVLVpUZcqUUeXKle+5zcSJExUTE6Nly5bpxx9/lLe3t9q0aaO3335bTk5OkqT27dvr888/V79+/TRw4EC1adPmd/3+rl27yt3dXfPnz9fy5cvl4+OjN954Q2+88YYkafz48ZoxY4bi4uJ06dIleXt7q2PHjho0aND9+R8AAAAAAL/BZLPZbEYXAQAAAAAAAAAFFXPSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADAQIS0AAAAAAAAAGIiQFgAAAAAAAAAMREgLAAAAAAAAAAYipAUAAAAAAAAAAxHSAgAAAAAAAICBCGkBAAAAAAAAwECEtAAAAAAAAABgIEJaAAAAAAAAADDQ/wPiFcAck5FpoAAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 1400x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Visualize performance across the three pipeline versions\n",
    "plot_performance('evaluation/json_results', ['Basic RAG', 'Summary Indexing', 'Summary Indexing + Re-Ranking'], colors=['skyblue', 'lightgreen', 'salmon'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluation - Going Deeper with Promptfoo\n",
    "\n",
    "This guide has illustrated the importance of measuring prompt performance empirically when prompt engineering. You can read more about our empirical methodology for prompt engineering here. Using a Jupyter Notebook is a great way to start prompt engineering, but as your datasets grow larger and your prompts more numerous, it is important to leverage tooling that will scale with you.\n",
    "\n",
    "In this section of the guide, we will explore using Promptfoo, an open-source LLM evaluation toolkit. To get started, head over to the ./evaluation directory and check out the ./evaluation/README.md.\n",
    "\n",
    "Promptfoo makes it very easy to build automated test suites that compare different models, hyperparameter choices, and prompts against one another. \n",
    "\n",
    "As an example, you can run the cell below to see the average performance of Haiku vs. 3.5 Sonnet across all of our test cases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Overall Statistics:\n",
      "Best Performing Provider: 3.5 Sonnet: T-0.0 (85.00%)\n",
      "Worst Performing Provider: Haiku: T-0.0 (78.00%)\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "# Load the JSON file\n",
    "with open('data/end_to_end_results.json', 'r') as f:\n",
    "    data = json.load(f)\n",
    "\n",
    "# Extract the results\n",
    "results = data['results']['results']\n",
    "\n",
    "# Create a DataFrame\n",
    "df = pd.DataFrame(results)\n",
    "\n",
    "# Extract provider, prompt, and score information\n",
    "df['provider'] = df['provider'].apply(lambda x: x['label'] if isinstance(x, dict) else x)\n",
    "df['prompt'] = df['prompt'].apply(lambda x: x['label'] if isinstance(x, dict) else x)\n",
    "\n",
    "# Function to safely extract scores\n",
    "def extract_score(x):\n",
    "    if isinstance(x, dict) and 'score' in x:\n",
    "        return x['score'] * 100  # Convert to percentage\n",
    "    return np.nan\n",
    "\n",
    "df['score'] = df['gradingResult'].apply(extract_score)\n",
    "\n",
    "# Group by provider and prompt, then calculate mean scores\n",
    "result = df.groupby(['provider', 'prompt'])['score'].mean().unstack()\n",
    "\n",
    "# Fill NaN values with 0\n",
    "result = result.fillna(0)\n",
    "\n",
    "# Calculate the average score across all prompts for each provider\n",
    "result['Average'] = result.mean(axis=1)\n",
    "\n",
    "# Sort the result by the average score\n",
    "result = result.sort_values(by='Average', ascending=False)\n",
    "\n",
    "# Round the results to 2 decimal places\n",
    "result = result.round(2)\n",
    "\n",
    "# Calculate overall statistics\n",
    "overall_average = result['Average'].mean()\n",
    "overall_std = result['Average'].std()\n",
    "best_provider = result['Average'].idxmax()\n",
    "worst_provider = result['Average'].idxmin()\n",
    "\n",
    "print(\"\\nOverall Statistics:\")\n",
    "print(f\"Best Performing Provider: {best_provider} ({result.loc[best_provider, 'Average']:.2f}%)\")\n",
    "print(f\"Worst Performing Provider: {worst_provider} ({result.loc[worst_provider, 'Average']:.2f}%)\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py311",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}