more work

2023-08-24 23:49:44 +00:00
parent 14eae45d18
commit 40638a7848
9 changed files with 2708 additions and 463 deletions
--- a/examples/classify-recipes/generate-data.ipynb
+++ b/examples/classify-recipes/generate-data.ipynb
@@ -11,7 +11,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
@@ -53,7 +53,7 @@
    }
   ],
   "source": [
-    "%pip install openpipe==3.0.3 python-dotenv==1.0.0 joblib==1.3.2"
+    "%pip install openpipe==3.0.3 python-dotenv==1.0.0 joblib==1.3.2 datasets==2.14.4"
   ]
  },
  {
@@ -65,7 +65,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
@@ -138,14 +138,14 @@
    " - cook_time_over_30_mins\n",
    " - main_dish\n",
    "\n",
-    "That looks like a pretty random list, but there's actually an important unifying thread: we're looking for meals that my pescatarian brother can eat in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
+    "That looks like a pretty random list, but there's actually an important unifying thread: I'm looking for meals that my pescatarian brother/co-founder can make in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
    "\n",
-    "We'll use [OpenPipe](https://github.com/openpipe/openpipe) to track our calls and form a training dataset. Create an account and a project, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
+    "I'll use [OpenPipe](https://github.com/openpipe/openpipe) to track the API calls and form a training dataset. To follow along you'll need to create a free OpenPipe account, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
@@ -153,13 +153,7 @@
     "output_type": "stream",
     "text": [
      "Classifying first recipe:\n",
-      "------------------\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
+      "------------------\n",
      "{'has_non_fish_meat': False, 'requires_oven': False, 'requires_stove': True, 'cook_time_over_30_mins': True, 'main_dish': True}\n"
     ]
    }
@@ -225,7 +219,7 @@
    "                        \"requires_oven\",\n",
    "                        \"requires_stove\",\n",
    "                        \"cook_time_over_30_mins\",\n",
-    "                        \"main_course\",\n",
+    "                        \"main_dish\",\n",
    "                    ],\n",
    "                },\n",
    "            }\n",
@@ -246,12 +240,12 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- let's see if we can make it as good!"
+    "That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- we'll see if we can make it as good!"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
@@ -302,7 +296,12 @@
      "Classifying recipe 4100/5000: Prune Cake\n",
      "Classifying recipe 4200/5000: Strawberry Sorbet\n",
      "Classifying recipe 4300/5000: Lemonade Chicken\n",
-      "Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n"
+      "Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n",
+      "Classifying recipe 4500/5000: Grandma Dickrell'S Molasses Cake - 1936\n",
+      "Classifying recipe 4600/5000: Creamed Corn Casserole\n",
+      "Classifying recipe 4700/5000: Homemade Croutons\n",
+      "Classifying recipe 4800/5000: Potatoes With Leeks And Gruyere\n",
+      "Classifying recipe 4900/5000: Chocolate Oatmeal Cookie\n"
     ]
    }
   ],