more work

Kyle Corbitt
2023-08-24 23:49:44 +00:00
parent 14eae45d18
commit 40638a7848
9 changed files with 2708 additions and 463 deletions

@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@@ -53,7 +53,7 @@
}
],
"source": [
"%pip install openpipe==3.0.3 python-dotenv==1.0.0 joblib==1.3.2"
"%pip install openpipe==3.0.3 python-dotenv==1.0.0 joblib==1.3.2 datasets==2.14.4"
]
},
{
@@ -65,7 +65,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -138,14 +138,14 @@
" - cook_time_over_30_mins\n",
" - main_dish\n",
"\n",
"That looks like a pretty random list, but there's actually an important unifying thread: we're looking for meals that my pescatarian brother can eat in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
"That looks like a pretty random list, but there's actually an important unifying thread: I'm looking for meals that my pescatarian brother/co-founder can make in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
"\n",
"We'll use [OpenPipe](https://github.com/openpipe/openpipe) to track our calls and form a training dataset. Create an account and a project, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
"I'll use [OpenPipe](https://github.com/openpipe/openpipe) to track the API calls and form a training dataset. To follow along you'll need to create a free OpenPipe account, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
]
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -153,13 +153,7 @@
"output_type": "stream",
"text": [
"Classifying first recipe:\n",
"------------------\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"------------------\n",
"{'has_non_fish_meat': False, 'requires_oven': False, 'requires_stove': True, 'cook_time_over_30_mins': True, 'main_dish': True}\n"
]
}
@@ -225,7 +219,7 @@
" \"requires_oven\",\n",
" \"requires_stove\",\n",
" \"cook_time_over_30_mins\",\n",
" \"main_course\",\n",
" \"main_dish\",\n",
" ],\n",
" },\n",
" }\n",
@@ -246,12 +240,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- let's see if we can make it as good!"
"That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- we'll see if we can make it as good!"
]
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -302,7 +296,12 @@
"Classifying recipe 4100/5000: Prune Cake\n",
"Classifying recipe 4200/5000: Strawberry Sorbet\n",
"Classifying recipe 4300/5000: Lemonade Chicken\n",
"Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n"
"Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n",
"Classifying recipe 4500/5000: Grandma Dickrell'S Molasses Cake - 1936\n",
"Classifying recipe 4600/5000: Creamed Corn Casserole\n",
"Classifying recipe 4700/5000: Homemade Croutons\n",
"Classifying recipe 4800/5000: Potatoes With Leeks And Gruyere\n",
"Classifying recipe 4900/5000: Chocolate Oatmeal Cookie\n"
]
}
],