mirror of
https://github.com/promptfoo/promptfoo.git
synced 2023-08-15 01:10:51 +03:00
Add colab notebook example
771
examples/colab-notebook/promptfoo_example.ipynb
Normal file
@@ -0,0 +1,771 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"This notebook shows how to:\n",
"\n",
"\n",
"1. Set up and configure promptfoo\n",
"2. Set up secrets (such as an OpenAI or Anthropic API key)\n",
"3. Run an eval of LLMs, prompts, and outputs\n",
"4. Run another eval that includes Python code in this notebook\n",
"\n"
],
"metadata": {
"id": "uK0lOwXqCcci"
}
},
{
"cell_type": "markdown",
"source": [
"# Node/Promptfoo setup"
],
"metadata": {
"id": "5yj5q5FMSn6v"
}
},
{
"cell_type": "markdown",
"source": [
"Start by installing the Node.js dependency."
],
"metadata": {
"id": "qBI5MsRN-mw9"
}
},
{
"cell_type": "code",
"source": [
"!curl -sL https://deb.nodesource.com/setup_18.x | sudo -E bash -\n",
"!sudo apt-get install -y nodejs"
],
"metadata": {
"id": "CLsWqecMRPNn"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Next, we'll install and initialize promptfoo."
],
"metadata": {
"id": "n61ejK4u-tFy"
}
},
{
"cell_type": "code",
"source": [
"# Set up promptfoo\n",
"%env npm_config_yes=true\n",
"!npx promptfoo@latest init"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "WNzZutVjRab-",
"outputId": "547a1c03-a9ed-484c-ecf7-38ddb101ef21"
},
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"env: npm_config_yes=true\n",
"\u001b[32m\u001b[1mWrote prompts.txt and promptfooconfig.yaml. Open README.md to get started!\u001b[22m\u001b[39m\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# Configure promptfoo"
],
"metadata": {
"id": "BWbwGrFLSzzQ"
}
},
{
"cell_type": "markdown",
"source": [
"First, we set up the prompts. See https://promptfoo.dev/docs/configuration/parameters for more info on prompt files."
],
"metadata": {
"id": "4WtI9tXZ98sB"
}
},
{
"cell_type": "code",
"source": [
"%%writefile prompts.txt\n",
"You're an ecommerce chat assistant for a shoe company.\n",
"Answer this user's question: {{name}}: \"{{question}}\"\n",
"---\n",
"You're a smart, bubbly chat assistant for a shoe company.\n",
"Answer this user's question: {{name}}: \"{{question}}\""
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8bgtMk0n86yj",
"outputId": "ce1392af-5271-4d97-9832-c912f3588e3e"
},
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Overwriting prompts.txt\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"Next, we set up the configuration. See https://promptfoo.dev/docs/configuration/guide for more info on configuration."
],
"metadata": {
"id": "T_zcpupd14Rm"
}
},
{
"cell_type": "code",
"source": [
"%%writefile promptfooconfig.yaml\n",
"prompts: [prompts.txt]\n",
"providers: [openai:chat:gpt-3.5-turbo-0613, openai:chat:gpt-4]\n",
"tests:\n",
"  - vars:\n",
"      name: Bob\n",
"      question: Can you help me find a pair of sandals on your website?\n",
"  - vars:\n",
"      name: Jane\n",
"      question: Do you have any discounts available?\n",
"  - vars:\n",
"      name: Dave\n",
"      question: What are your shipping and return policies?\n",
"  - vars:\n",
"      name: Jim\n",
"      question: Can you provide more info on your hiking boot options?\n",
"  - vars:\n",
"      name: Alice\n",
"      question: What is the latest trend in winter footwear?"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "EUst0Tmv9Pfy",
"outputId": "533b38c5-af6a-4265-a3e9-dd2c7f0a5e04"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Overwriting promptfooconfig.yaml\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# Set up secrets"
],
"metadata": {
"id": "5K21inptxl73"
}
},
{
"cell_type": "markdown",
"source": [
"This section loads a `secrets.json` file from your Google Drive. This is only necessary if you are using paid APIs like OpenAI or Anthropic.\n",
"\n",
"The file should look something like this:\n",
"\n",
"```\n",
"{\n",
"  \"OPENAI_API_KEY\": \"sk-abc123\"\n",
"}\n",
"```"
],
"metadata": {
"id": "2u7gsi86xqoC"
}
},
{
"cell_type": "code",
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OBPVTKY_x0kY",
"outputId": "ae01f547-641c-4724-f762-d0daad678abf"
},
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"import os\n",
"import json\n",
"\n",
"with open('/content/drive/MyDrive/Projects/promptfoo/secrets.json') as f:\n",
"    secrets = json.load(f)\n",
"\n",
"for key, value in secrets.items():\n",
"    os.environ[key] = value"
],
"metadata": {
"id": "jYr-v2_Cx-BL"
},
"execution_count": 6,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# Run the eval"
],
"metadata": {
"id": "yDgQDQiKvKCN"
}
},
{
"cell_type": "markdown",
"source": [
"First, run the eval. This produces a quick side-by-side table view."
],
"metadata": {
"id": "cdJXjemM1UDK"
}
},
{
"cell_type": "code",
"source": [
"!npx promptfoo@latest eval -c /content/promptfooconfig.yaml --no-progress-bar"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "d5hZGY2svL6D",
"outputId": "756f850d-0641-4a58-f551-03b6c9d33248"
},
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Creating cache folder at /root/.promptfoo/cache.\n",
"\n",
"\u001b[90m┌────────────────────\u001b[39m\u001b[90m┬────────────────────\u001b[39m\u001b[90m┬────────────────────\u001b[39m\u001b[90m┬────────────────────\u001b[39m\u001b[90m┬────────────────────\u001b[39m\u001b[90m┬────────────────────┐\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m name \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [openai:gpt-3.5-tu \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [openai:gpt-4] You \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [openai:gpt-3.5-tu \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [openai:gpt-4] You \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m rbo-0613] You're a \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m 're an ecommerce c \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m rbo-0613] You're a \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m 're a smart, bubbl \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m n ecommerce chat a \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m hat assistant for \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m smart, bubbly cha \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m y chat assistant f \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m ssistant for a sho \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m a shoe company. \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m t assistant for a \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m or a shoe company. \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m e company. \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m Answer this user's \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m shoe company. \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m Answer this user's \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m Answer this user's \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question: {{name} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m Answer this user's \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question: {{name} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question: {{name} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m }: \"{{question}}\" \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question: {{name} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m }: \"{{question}}\" \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m }: \"{{question}}\" \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m }: \"{{question}}\" \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
"\u001b[90m├────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────┤\u001b[39m\n",
"\u001b[90m│\u001b[39m Bob \u001b[90m│\u001b[39m Can you help me fi \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m nd a pair of sanda \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mOf course,\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mOf course,\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAbsolutely\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ls on your website \u001b[90m│\u001b[39m \u001b[1m Of course, Bob! I\u001b[22m \u001b[90m│\u001b[39m \u001b[1m Bob! I'd be happy\u001b[22m \u001b[90m│\u001b[39m \u001b[1m Bob! I'd be delig\u001b[22m \u001b[90m│\u001b[39m \u001b[1m, Bob! I'd be happ\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ? \u001b[90m│\u001b[39m \u001b[1m'd be happy to hel\u001b[22m \u001b[90m│\u001b[39m \u001b[1m to assist you. To\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhted to help you f\u001b[22m \u001b[90m│\u001b[39m \u001b[1my to assist you. T\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mp you find a pair \u001b[22m \u001b[90m│\u001b[39m \u001b[1m help you better, \u001b[22m \u001b[90m│\u001b[39m \u001b[1mind a pair of sand\u001b[22m \u001b[90m│\u001b[39m \u001b[1mo help you better,\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mof sandals on our \u001b[22m \u001b[90m│\u001b[39m \u001b[1mcould you please p\u001b[22m \u001b[90m│\u001b[39m \u001b[1mals on our website\u001b[22m \u001b[90m│\u001b[39m \u001b[1m could you please \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mwebsite. Could you\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrovide more detail\u001b[22m \u001b[90m│\u001b[39m \u001b[1m. Could you please\u001b[22m \u001b[90m│\u001b[39m \u001b[1mprovide me with mo\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m please let me kno\u001b[22m \u001b[90m│\u001b[39m \u001b[1ms? Are you looking\u001b[22m \u001b[90m│\u001b[39m \u001b[1m let me know what \u001b[22m \u001b[90m│\u001b[39m \u001b[1mre details? For in\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mw what specific ty\u001b[22m \u001b[90m│\u001b[39m \u001b[1m for a specific st\u001b[22m \u001b[90m│\u001b[39m \u001b[1mspecific type of s\u001b[22m \u001b[90m│\u001b[39m \u001b[1mstance, are you lo\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mpe of sandals you \u001b[22m \u001b[90m│\u001b[39m \u001b[1myle, color, or siz\u001b[22m \u001b[90m│\u001b[39m \u001b[1mandals you are loo\u001b[22m \u001b[90m│\u001b[39m \u001b[1moking for a specif\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mare looking for?\u001b[22m \u001b[90m│\u001b[39m \u001b[1me?\u001b[22m \u001b[90m│\u001b[39m \u001b[1mking for? Are you \u001b[22m \u001b[90m│\u001b[39m \u001b[1mic style, color, o\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1minterested in men'\u001b[22m \u001b[90m│\u001b[39m \u001b[1mr size?\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1ms or women's sanda\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mls? And do you hav\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1me any particular s\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mtyle, ...\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m├────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────┤\u001b[39m\n",
"\u001b[90m│\u001b[39m Jane \u001b[90m│\u001b[39m Do you have any di \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m scounts available? \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mYes, Jane,\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAbsolutely\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m Hello Jane! Thank\u001b[22m \u001b[90m│\u001b[39m \u001b[1m we do have discou\u001b[22m \u001b[90m│\u001b[39m \u001b[1m Hi Jane! Thank yo\u001b[22m \u001b[90m│\u001b[39m \u001b[1m, Jane! We often h\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m you for reaching \u001b[22m \u001b[90m│\u001b[39m \u001b[1mnts available from\u001b[22m \u001b[90m│\u001b[39m \u001b[1mu for reaching out\u001b[22m \u001b[90m│\u001b[39m \u001b[1mave various discou\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mout. Yes, we do ha\u001b[22m \u001b[90m│\u001b[39m \u001b[1m time to time. Cur\u001b[22m \u001b[90m│\u001b[39m \u001b[1m to us. Yes, we do\u001b[22m \u001b[90m│\u001b[39m \u001b[1mnts and promotiona\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mve discounts avail\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrently, we are off\u001b[22m \u001b[90m│\u001b[39m \u001b[1m have some excitin\u001b[22m \u001b[90m│\u001b[39m \u001b[1ml offers available\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mable. Could you pl\u001b[22m \u001b[90m│\u001b[39m \u001b[1mering a 10% discou\u001b[22m \u001b[90m│\u001b[39m \u001b[1mg discounts availa\u001b[22m \u001b[90m│\u001b[39m \u001b[1m. The specifics ca\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mease let me know w\u001b[22m \u001b[90m│\u001b[39m \u001b[1mnt on your first o\u001b[22m \u001b[90m│\u001b[39m \u001b[1mble at the moment.\u001b[22m \u001b[90m│\u001b[39m \u001b[1mn vary, so I recom\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mhich specific shoe\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrder when you sign\u001b[22m \u001b[90m│\u001b[39m \u001b[1m Could you please \u001b[22m \u001b[90m│\u001b[39m \u001b[1mmend checking our \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1ms you are interest\u001b[22m \u001b[90m│\u001b[39m \u001b[1m up for our newsle\u001b[22m \u001b[90m│\u001b[39m \u001b[1mlet me know which \u001b[22m \u001b[90m│\u001b[39m \u001b[1mwebsite or subscri\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1med in? This way, I\u001b[22m \u001b[90m│\u001b[39m \u001b[1mtter. Also, we hav\u001b[22m \u001b[90m│\u001b[39m \u001b[1mspecific shoes you\u001b[22m \u001b[90m│\u001b[39m \u001b[1mbing to our newsle\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m can provide you w\u001b[22m \u001b[90m│\u001b[39m \u001b[1me a sale section o\u001b[22m \u001b[90m│\u001b[39m \u001b[1m are interested in\u001b[22m \u001b[90m│\u001b[39m \u001b[1mtter for the most \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mith the most accur\u001b[22m \u001b[90m│\u001b[39m \u001b[1mn our website wher\u001b[22m \u001b[90m│\u001b[39m \u001b[1m? That way, I can \u001b[22m \u001b[90m│\u001b[39m \u001b[1mup-to-date informa\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mate information re\u001b[22m \u001b[90m│\u001b[39m \u001b[1me you can find sho\u001b[22m \u001b[90m│\u001b[39m \u001b[1mprovide you with t\u001b[22m \u001b[90m│\u001b[39m \u001b[1mtion. You can also\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mgarding any availa\u001b[22m \u001b[90m│\u001b[39m \u001b[1mes at discounted p\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhe most accurate i\u001b[22m \u001b[90m│\u001b[39m \u001b[1m check our social \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mble di...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrices....\u001b[22m \u001b[90m│\u001b[39m \u001b[1mnforma...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mmedia ...\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m├────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────┤\u001b[39m\n",
"\u001b[90m│\u001b[39m Dave \u001b[90m│\u001b[39m What are your ship \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ping and return po \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mWe offer f\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mOur shippi\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m licies? \u001b[90m│\u001b[39m \u001b[1m Hello Dave! Thank\u001b[22m \u001b[90m│\u001b[39m \u001b[1mree standard shipp\u001b[22m \u001b[90m│\u001b[39m \u001b[1m Hello Dave! Thank\u001b[22m \u001b[90m│\u001b[39m \u001b[1mng policy is as fo\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m you for your inte\u001b[22m \u001b[90m│\u001b[39m \u001b[1ming on all orders \u001b[22m \u001b[90m│\u001b[39m \u001b[1m you for your inte\u001b[22m \u001b[90m│\u001b[39m \u001b[1mllows: We offer fr\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mrest in our shoe c\u001b[22m \u001b[90m│\u001b[39m \u001b[1mover $50. For orde\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrest in our shoe c\u001b[22m \u001b[90m│\u001b[39m \u001b[1mee standard shippi\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mompany. Our shippi\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrs under $50, the \u001b[22m \u001b[90m│\u001b[39m \u001b[1mompany. I'd be hap\u001b[22m \u001b[90m│\u001b[39m \u001b[1mng on all orders o\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mng policy is as fo\u001b[22m \u001b[90m│\u001b[39m \u001b[1mstandard shipping \u001b[22m \u001b[90m│\u001b[39m \u001b[1mpy to provide you \u001b[22m \u001b[90m│\u001b[39m \u001b[1mver $50. For order\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mllows: We offer fr\u001b[22m \u001b[90m│\u001b[39m \u001b[1mfee is $5.99. Orde\u001b[22m \u001b[90m│\u001b[39m \u001b[1mwith information a\u001b[22m \u001b[90m│\u001b[39m \u001b[1ms under $50, a fla\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mee standard shippi\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrs are typically p\u001b[22m \u001b[90m│\u001b[39m \u001b[1mbout our shipping \u001b[22m \u001b[90m│\u001b[39m \u001b[1mt rate shipping fe\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mng on all orders w\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrocessed within 1-\u001b[22m \u001b[90m│\u001b[39m \u001b[1mand return policie\u001b[22m \u001b[90m│\u001b[39m \u001b[1me of $5.99 applies\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mithin the United S\u001b[22m \u001b[90m│\u001b[39m \u001b[1m2 business days an\u001b[22m \u001b[90m│\u001b[39m \u001b[1ms.\u001b[22m \u001b[90m│\u001b[39m \u001b[1m. Orders are typic\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mtates. Once your o\u001b[22m \u001b[90m│\u001b[39m \u001b[1md standard shippin\u001b[22m \u001b[90m│\u001b[39m \u001b[1mFor shipping, we o\u001b[22m \u001b[90m│\u001b[39m \u001b[1mally processed wit\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mrder is placed, it\u001b[22m \u001b[90m│\u001b[39m \u001b[1mg usually takes 5-\u001b[22m \u001b[90m│\u001b[39m \u001b[1mffer free standard\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhin 1-2 business d\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m typically takes 1\u001b[22m \u001b[90m│\u001b[39m \u001b[1m7 business days.\u001b[22m \u001b[90m│\u001b[39m \u001b[1m shipping on all o\u001b[22m \u001b[90m│\u001b[39m \u001b[1mays and standard s\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m-3 business days f\u001b[22m \u001b[90m│\u001b[39m \u001b[1mAs for returns, we\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrders within the c\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhipping usually ta\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mor pro...\u001b[22m \u001b[90m│\u001b[39m \u001b[1m accep...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mountry. Once your \u001b[22m \u001b[90m│\u001b[39m \u001b[1mkes 5-...\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mor...\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m├────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────┤\u001b[39m\n",
"\u001b[90m│\u001b[39m Jim \u001b[90m│\u001b[39m Can you provide mo \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m re info on your hi \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAssistant:\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAbsolutely\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mOf course,\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAbsolutely\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m king boot options? \u001b[90m│\u001b[39m \u001b[1m Of course, Jim! W\u001b[22m \u001b[90m│\u001b[39m \u001b[1m, Jim! We offer a \u001b[22m \u001b[90m│\u001b[39m \u001b[1m Jim! I'd be happy\u001b[22m \u001b[90m│\u001b[39m \u001b[1m, Jim! We offer a \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1me offer a wide ran\u001b[22m \u001b[90m│\u001b[39m \u001b[1mvariety of hiking \u001b[22m \u001b[90m│\u001b[39m \u001b[1m to provide you wi\u001b[22m \u001b[90m│\u001b[39m \u001b[1mvariety of hiking \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mge of hiking boot \u001b[22m \u001b[90m│\u001b[39m \u001b[1mboots designed for\u001b[22m \u001b[90m│\u001b[39m \u001b[1mth more informatio\u001b[22m \u001b[90m│\u001b[39m \u001b[1mboots designed to \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1moptions to suit di\u001b[22m \u001b[90m│\u001b[39m \u001b[1m different terrain\u001b[22m \u001b[90m│\u001b[39m \u001b[1mn on our hiking bo\u001b[22m \u001b[90m│\u001b[39m \u001b[1mprovide comfort, d\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mfferent needs and \u001b[22m \u001b[90m│\u001b[39m \u001b[1ms and weather cond\u001b[22m \u001b[90m│\u001b[39m \u001b[1mot options. We hav\u001b[22m \u001b[90m│\u001b[39m \u001b[1murability, and sup\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mpreferences. Our h\u001b[22m \u001b[90m│\u001b[39m \u001b[1mitions. \u001b[22m \u001b[90m│\u001b[39m \u001b[1me a wide range of \u001b[22m \u001b[90m│\u001b[39m \u001b[1mport for all your \u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1miking boots are de\u001b[22m \u001b[90m│\u001b[39m \u001b[1m1. **Trail Hiking \u001b[22m \u001b[90m│\u001b[39m \u001b[1mhiking boots desig\u001b[22m \u001b[90m│\u001b[39m \u001b[1moutdoor adventures\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1msigned with durabi\u001b[22m \u001b[90m│\u001b[39m \u001b[1mBoots**: These are\u001b[22m \u001b[90m│\u001b[39m \u001b[1mned to cater to di\u001b[22m \u001b[90m│\u001b[39m \u001b[1m. Here are some of\u001b[22m \u001b[90m│\u001b[39m\n",
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mlity, comfort, and\u001b[22m \u001b[90m│\u001b[39m \u001b[1m designed for mudd\u001b[22m \u001b[90m│\u001b[39m \u001b[1mfferent needs and \u001b[22m \u001b[90m│\u001b[39m \u001b[1m our popular optio\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m performance in mi\u001b[22m \u001b[90m│\u001b[39m \u001b[1my or rocky trails.\u001b[22m \u001b[90m│\u001b[39m \u001b[1mpreferences. Here \u001b[22m \u001b[90m│\u001b[39m \u001b[1mns:\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mnd. Here are some \u001b[22m \u001b[90m│\u001b[39m \u001b[1m They are waterpro\u001b[22m \u001b[90m│\u001b[39m \u001b[1mare a few of our p\u001b[22m \u001b[90m│\u001b[39m \u001b[1m1. **Trail Master \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mkey features and o\u001b[22m \u001b[90m│\u001b[39m \u001b[1mof, durable, and p\u001b[22m \u001b[90m│\u001b[39m \u001b[1mopular options:\u001b[22m \u001b[90m│\u001b[39m \u001b[1m3000**: This boot \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mptions you can con\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrovide excellent t\u001b[22m \u001b[90m│\u001b[39m \u001b[1m1. Trailblazer Hik\u001b[22m \u001b[90m│\u001b[39m \u001b[1mis waterproof and \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1msider:...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mraction.\u001b[22m \u001b[90m│\u001b[39m \u001b[1ming Boo...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mfeatures a rubber \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m2. *...\u001b[22m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1ms...\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────\u001b[39m\u001b[90m┼────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m Alice \u001b[90m│\u001b[39m What is the latest \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[\u001b[22m\u001b[39m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m trend in winter f \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mAs an ecom\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mThe latest\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mHello Alic\u001b[22m \u001b[90m│\u001b[39m \u001b[32m\u001b[1m22m\u001b[39mThe latest\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ootwear? \u001b[90m│\u001b[39m \u001b[1mmerce chat assista\u001b[22m \u001b[90m│\u001b[39m \u001b[1m trends in winter \u001b[22m \u001b[90m│\u001b[39m \u001b[1me! As a smart and \u001b[22m \u001b[90m│\u001b[39m \u001b[1m trends in winter \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mnt for a shoe comp\u001b[22m \u001b[90m│\u001b[39m \u001b[1mfootwear include c\u001b[22m \u001b[90m│\u001b[39m \u001b[1mbubbly chat assist\u001b[22m \u001b[90m│\u001b[39m \u001b[1mfootwear include c\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1many, I can tell yo\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhunky boots, espec\u001b[22m \u001b[90m│\u001b[39m \u001b[1mant for a shoe com\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhunky boots, espec\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mu that the latest \u001b[22m \u001b[90m│\u001b[39m \u001b[1mially those with t\u001b[22m \u001b[90m│\u001b[39m \u001b[1mpany, I'm here to \u001b[22m \u001b[90m│\u001b[39m \u001b[1mially those with l\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mtrend in winter fo\u001b[22m \u001b[90m│\u001b[39m \u001b[1mrack soles for ext\u001b[22m \u001b[90m│\u001b[39m \u001b[1mhelp you out. The \u001b[22m \u001b[90m│\u001b[39m \u001b[1mug soles for extra\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1motwear is the rise\u001b[22m \u001b[90m│\u001b[39m \u001b[1mra grip and a fash\u001b[22m \u001b[90m│\u001b[39m \u001b[1mlatest trend in wi\u001b[22m \u001b[90m│\u001b[39m \u001b[1m grip and a fashio\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m of chunky boots. \u001b[22m \u001b[90m│\u001b[39m \u001b[1mionable edge. Fur-\u001b[22m \u001b[90m│\u001b[39m \u001b[1mnter footwear is a\u001b[22m \u001b[90m│\u001b[39m \u001b[1mnable edge. Combat\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mChunky boots, with\u001b[22m \u001b[90m│\u001b[39m \u001b[1mlined boots and th\u001b[22m \u001b[90m│\u001b[39m \u001b[1mll about combining\u001b[22m \u001b[90m│\u001b[39m \u001b[1m boots are also ve\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m their thick soles\u001b[22m \u001b[90m│\u001b[39m \u001b[1mose with shearling\u001b[22m \u001b[90m│\u001b[39m \u001b[1m style and functio\u001b[22m \u001b[90m│\u001b[39m \u001b[1mry popular. For a \u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m and sturdy constr\u001b[22m \u001b[90m│\u001b[39m \u001b[1m details are also \u001b[22m \u001b[90m│\u001b[39m \u001b[1mnality. One popula\u001b[22m \u001b[90m│\u001b[39m \u001b[1mmore elegant look,\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1muction, are not on\u001b[22m \u001b[90m│\u001b[39m \u001b[1mpopular for their \u001b[22m \u001b[90m│\u001b[39m \u001b[1mr trend this seaso\u001b[22m \u001b[90m│\u001b[39m \u001b[1m knee-high and ove\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mly fashionable but\u001b[22m \u001b[90m│\u001b[39m \u001b[1mwarmth and style. \u001b[22m \u001b[90m│\u001b[39m \u001b[1mn is chunky boots \u001b[22m \u001b[90m│\u001b[39m \u001b[1mr-the-knee boots a\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1m also practical fo\u001b[22m \u001b[90m│\u001b[39m \u001b[1mOver-the-knee boot\u001b[22m \u001b[90m│\u001b[39m \u001b[1mwith lug soles. Th\u001b[22m \u001b[90m│\u001b[39m \u001b[1mre trending, parti\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[1mr the ...\u001b[22m \u001b[90m│\u001b[39m \u001b[1ms and ...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mese bo...\u001b[22m \u001b[90m│\u001b[39m \u001b[1mcularl...\u001b[22m \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m└────────────────────\u001b[39m\u001b[90m┴────────────────────\u001b[39m\u001b[90m┴────────────────────\u001b[39m\u001b[90m┴────────────────────\u001b[39m\u001b[90m┴────────────────────\u001b[39m\u001b[90m┴────────────────────┘\u001b[39m\n",
|
||||
"======================================================================\n",
|
||||
"\u001b[32m✔\u001b[39m Evaluation complete.\n",
|
||||
"\n",
|
||||
"Run \u001b[92mpromptfoo view\u001b[39m to use the local web viewer\n",
|
||||
"Run \u001b[92mpromptfoo share\u001b[39m to create a shareable URL\n",
|
||||
"======================================================================\n",
|
||||
"\u001b[32m\u001b[1mSuccesses: 20\u001b[22m\u001b[39m\n",
|
||||
"\u001b[31m\u001b[1mFailures: 0\u001b[22m\u001b[39m\n",
|
||||
"Token usage: Total 3511, Prompt 762, Completion 2749, Cached 0\n",
|
||||
"Done.\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Now we can share these results and open them in the web viewer, which is easier to read than the truncated table above and includes tools for drilling down into and grading the outputs."
|
||||
],
|
||||
"metadata": {
|
||||
"id": "OBU2upLn1b0x"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"!npx promptfoo@latest share --yes"
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "QQq6hV6006Cv",
|
||||
"outputId": "93dd5b45-acce-4051-8d19-185469a67bab"
|
||||
},
|
||||
"execution_count": 15,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"View results: \u001b[92m\u001b[1mhttps://app.promptfoo.dev/eval/f:ffaf05af-6d34-4ff3-b035-2a73e0c97375\u001b[22m\u001b[39m\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"# Eval a Python script in Colab"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "NB4Xw1mh2ILr"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"This next section provides an example of how to run an eval on Python code within Colab.\n",
|
||||
"\n",
|
||||
"The goal in this example is to set up a LangChain implementation and compare it against vanilla GPT-4. We're using LangChain here, but the same approach works for any script that accepts the prompt as a command-line argument and prints its result to stdout."
|
||||
],
|
||||
"metadata": {
|
||||
"id": "v8o97GM_2LWo"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"!pip install langchain openai"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "1MIC2FYu2aV1"
|
||||
},
|
||||
"execution_count": null,
|
||||
"outputs": []
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Here's a chain that uses `LLMMathChain` to do math:"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "RzTNvbVL-StX"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"%%writefile langchain_example.py\n",
|
||||
"import sys\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"from langchain import OpenAI\n",
|
||||
"from langchain.chains import LLMMathChain\n",
|
||||
"\n",
|
||||
"# Use temperature 0 for repeatable completions; the API key is read from the environment\n",
"llm = OpenAI(\n",
|
||||
" temperature=0,\n",
|
||||
" openai_api_key=os.getenv('OPENAI_API_KEY')\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"llm_math = LLMMathChain(llm=llm, verbose=True)\n",
|
||||
"\n",
|
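||||
"# The exec provider passes the rendered prompt as the first command-line argument\n",
"# and captures stdout, where verbose=True prints the chain trace and the answer\n",
"llm_math.run(sys.argv[1])"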
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "2O1JriWH2oX1",
|
||||
"outputId": "964857d5-cb79-4f59-a909-548dc7474c00"
|
||||
},
|
||||
"execution_count": 10,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"Overwriting langchain_example.py\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Next, we'll write the config, which lists the two providers to compare and the test cases."
|
||||
],
|
||||
"metadata": {
|
||||
"id": "rRyGlgld-a0C"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"%%writefile mathconfig.yaml\n",
|
||||
"prompts: mathprompt.txt\n",
|
||||
"providers:\n",
|
||||
" - openai:chat:gpt-4-0613\n",
|
||||
" - exec:python langchain_example.py\n",
|
||||
"tests:\n",
|
||||
" - vars:\n",
|
||||
" question: What is the cube root of 389017?\n",
|
||||
" - vars:\n",
|
||||
" question: If you have 101101 in binary, what number does it represent in base 10?\n",
|
||||
" - vars:\n",
|
||||
" question: What is the natural logarithm (ln) of 89234?\n",
|
||||
" - vars:\n",
|
||||
" question: If a geometric series has a first term of 3125 and a common ratio of 0.008, what is the sum of the first 20 terms?\n",
|
||||
" - vars:\n",
|
||||
" question: A number in base 7 is 3526. What is this number in base 10?\n",
|
||||
" - vars:\n",
|
||||
" question: If a complex number is represented as 3 + 4i, what is its magnitude?\n",
|
||||
" - vars:\n",
|
||||
" question: What is the fourth root of 1296?"
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "1SSUsw-X2yZn",
|
||||
"outputId": "7c6d1cc9-7fd3-4076-f709-3cfd50601750"
|
||||
},
|
||||
"execution_count": 17,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"Overwriting mathconfig.yaml\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Then provide the prompt template. `{{question}}` is filled in from each test case's `vars`:"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "c38wkAVA-dsh"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"%%writefile mathprompt.txt\n",
|
||||
"Think carefully and answer this math problem: {{question}}"
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "CQ8st-b92-HL",
|
||||
"outputId": "b5bc9cc9-e543-42c0-ba01-22a6090827f6"
|
||||
},
|
||||
"execution_count": 12,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"Overwriting mathprompt.txt\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
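||||
"Run the eval to produce a side-by-side comparison. The config defines no assertions, so `[PASS]` only means that a provider returned output, not that its answer is correct:"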
|
||||
],
|
||||
"metadata": {
|
||||
"id": "dfauBPgj-fIe"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"!npx promptfoo@latest eval -c mathconfig.yaml --no-progress-bar"
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "IAznI7r-3F1D",
|
||||
"outputId": "ce1da4dc-888d-4a3f-cf09-b20e4cfa0e84"
|
||||
},
|
||||
"execution_count": 19,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"\n",
|
||||
"\u001b[90m┌────────────────────────────────────────\u001b[39m\u001b[90m┬────────────────────────────────────────\u001b[39m\u001b[90m┬────────────────────────────────────────┐\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m question \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [openai:gpt-4-0613] Think carefully an \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m [exec:python langchain_example.py] Thi \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m d answer this math problem: {{question \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m nk carefully and answer this math prob \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m\u001b[1m\u001b[34m \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m }} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\u001b[1m\u001b[34m lem: {{question}} \u001b[39m\u001b[22m\u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m What is the cube root of 389017? \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe cube roo \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m t of 389017 is 73. \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m roblem: What is the cube root of 38901 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 7? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 389017**(1/3) \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"389017**(1/3)\").. \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m . \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Answer: 72.99999999999999 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m > Finished chain. \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m If you have 101101 in binary, what num \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe number 1 \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m ber does it represent in base 10? \u001b[90m│\u001b[39m 01101 in binary represents 45 in base \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 10. \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m roblem: If you have 101101 in binary, \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m what number does it represent in base \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 10? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 2**5 + 2**4 + 2**3 + 2**2 + 2**1 + 2** \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 0 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"2**5 + 2**4 + 2** \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 3 + 2... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m What is the natural logarithm (ln) of \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe natural \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m 89234? \u001b[90m│\u001b[39m logarithm of a number is not something \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m that can be easily calculated without \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m a calculator or a computer. However, \u001b[90m│\u001b[39m roblem: What is the natural logarithm \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m using a calculator, the natural logari \u001b[90m│\u001b[39m (ln) of 89234? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m thm (ln) of 89234 is approximately 11. \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 397210. \u001b[90m│\u001b[39m log(89234) \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"log(89234)\")... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Answer: 11.399017411862108 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m > Finished chain. \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m If a geometric series has a first term \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe sum of t \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m of 3125 and a common ratio of 0.008, \u001b[90m│\u001b[39m he first n terms of a geometric series \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m what is the sum of the first 20 terms? \u001b[90m│\u001b[39m can be found using the formula S_n = \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m a * (1 - r^n) / (1 - r), where a is th \u001b[90m│\u001b[39m roblem: If a geometric series has a fi \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m e first term, r is the common ratio, a \u001b[90m│\u001b[39m rst term of 3125 and a common ratio of \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m nd n is the number of terms. \u001b[90m│\u001b[39m 0.008, what is the sum of the first 2 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Substituting the given values into the \u001b[90m│\u001b[39m 0 terms? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m formula, we get: \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ... \u001b[90m│\u001b[39m 3125 * (1 - 0.008**20) / (1 - 0.008) \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m A number in base 7 is 3526. What is th \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe number i \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m is number in base 10? \u001b[90m│\u001b[39m s 1303 in base 10. \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m roblem: A number in base 7 is 3526. Wh \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m at is this number in base 10? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 7**4*3 + 7**3*5 + 7**2*2 + 7**1*6 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"7**4*3 + 7**3*5 + \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 7**2*2 + 7**1*6\")... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m If a complex number is represented as \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39mThe magnitud \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m 3 + 4i, what is its magnitude? \u001b[90m│\u001b[39m e of a complex number is calculated us \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ing the formula √(a² + b²), where a an \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m d b are the real and imaginary parts o \u001b[90m│\u001b[39m roblem: If a complex number is represe \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m f the complex number, respectively. \u001b[90m│\u001b[39m nted as 3 + 4i, what is its magnitude? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m So, for the complex number 3 + 4i, its \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m magnitude would be √(3² + 4²) = √(9 + \u001b[90m│\u001b[39m (3**2 + 4**2)**(1/2) \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 16) = √25... \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"(3**2 + 4**2)**(1 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m /2)\")... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Answer: 5.0 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m > Fin... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m├────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────\u001b[39m\u001b[90m┼────────────────────────────────────────┤\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m What is the fourth root of 1296? \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m6 \u001b[90m│\u001b[39m \u001b[32m\u001b[1m[PASS] \u001b[22m\u001b[39m> Entering n \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ew LLMMathChain chain... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Think carefully and answer this math p \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m roblem: What is the fourth root of 129 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 6? \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ```text \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m 1296**(1/4) \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ``` \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m ...numexpr.evaluate(\"1296**(1/4)\")... \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m Answer: 6.0 \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m│\u001b[39m \u001b[90m│\u001b[39m \u001b[90m│\u001b[39m > Finished chain. \u001b[90m│\u001b[39m\n",
|
||||
"\u001b[90m└────────────────────────────────────────\u001b[39m\u001b[90m┴────────────────────────────────────────\u001b[39m\u001b[90m┴────────────────────────────────────────┘\u001b[39m\n",
|
||||
"======================================================================\n",
|
||||
"\u001b[32m✔\u001b[39m Evaluation complete.\n",
|
||||
"\n",
|
||||
"Run \u001b[92mpromptfoo view\u001b[39m to use the local web viewer\n",
|
||||
"Run \u001b[92mpromptfoo share\u001b[39m to create a shareable URL\n",
|
||||
"======================================================================\n",
|
||||
"\u001b[32m\u001b[1mSuccesses: 14\u001b[22m\u001b[39m\n",
|
||||
"\u001b[31m\u001b[1mFailures: 0\u001b[22m\u001b[39m\n",
|
||||
"Token usage: Total 49, Prompt 34, Completion 15, Cached 448\n",
|
||||
"Done.\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Finally, create a shareable link to view the results in the web viewer:"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "UdjUshgS-kg8"
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"source": [
|
||||
"!npx promptfoo@latest share --yes"
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "ZOgBi_Er3rgv",
|
||||
"outputId": "3150bee8-7fa8-4fdd-eb34-d0b2790c076c"
|
||||
},
|
||||
"execution_count": 20,
|
||||
"outputs": [
|
||||
{
|
||||
"output_type": "stream",
|
||||
"name": "stdout",
|
||||
"text": [
|
||||
"View results: \u001b[92m\u001b[1mhttps://app.promptfoo.dev/eval/f:f447942f-42a9-46fa-b38e-3ec7bf860418\u001b[22m\u001b[39m\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||