more benchmarking
@@ -11,7 +11,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": 11,
 "metadata": {},
 "outputs": [
 {
@@ -19,6 +19,8 @@
 "output_type": "stream",
 "text": [
 "Requirement already satisfied: openpipe==3.0.3 in /usr/local/lib/python3.10/dist-packages (3.0.3)\n",
+"Requirement already satisfied: python-dotenv==1.0.0 in /usr/local/lib/python3.10/dist-packages (1.0.0)\n",
+"Requirement already satisfied: joblib==1.3.2 in /usr/local/lib/python3.10/dist-packages (1.3.2)\n",
 "Requirement already satisfied: attrs<24.0.0,>=23.1.0 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (23.1.0)\n",
 "Requirement already satisfied: httpx<0.25.0,>=0.24.1 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (0.24.1)\n",
 "Requirement already satisfied: openai<0.28.0,>=0.27.8 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (0.27.9)\n",
@@ -63,14 +65,15 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 12,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Recipe dataset:\n"
+"Recipe dataset shape:\n",
+"------------------\n"
 ]
 },
 {
@@ -90,7 +93,7 @@
 "output_type": "stream",
 "text": [
 "First recipe:\n",
-" Shrimp Creole\n",
+"------------------ Shrimp Creole\n",
 "\n",
 "Ingredients:\n",
 "- 20 shrimp (8 oz.)\n",
@@ -128,27 +131,37 @@
 "source": [
 "Mm, delicious. Anyway, we need to generate a training dataset. We'll call GPT-4 on each of our examples.\n",
 "\n",
-"We'll use [OpenPipe](https://github.com/openpipe/openpipe) to track our calls and form a training dataset. Create an account and a project, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`."
+"In this case, I'll ask GPT-4 to classify each recipe along 5 dimensions:\n",
+" - has_non_fish_meat\n",
+" - requires_oven\n",
+" - requires_stove\n",
+" - cook_time_over_30_mins\n",
+" - main_dish\n",
+"\n",
+"That looks like a pretty random list, but there's actually an important unifying thread: we're looking for meals that my pescatarian brother can eat in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
+"\n",
+"We'll use [OpenPipe](https://github.com/openpipe/openpipe) to track our calls and form a training dataset. Create an account and a project, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": 13,
 "metadata": {},
 "outputs": [
 {
-"data": {
-"text/plain": [
-"{'has_non_fish_meat': False,\n",
-" 'requires_oven': True,\n",
-" 'requires_stove': True,\n",
-" 'cook_time_over_30_mins': False,\n",
-" 'main_dish': False}"
-]
-},
-"execution_count": 9,
-"metadata": {},
-"output_type": "execute_result"
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Classifying first recipe:\n",
+"------------------\n"
+]
+},
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"{'has_non_fish_meat': False, 'requires_oven': False, 'requires_stove': True, 'cook_time_over_30_mins': True, 'main_dish': True}\n"
+]
 }
 ],
 "source": [
@@ -157,10 +170,13 @@
 "import os\n",
 "import dotenv\n",
 "\n",
+"# Use `dotenv` to load the contents of the `.env` file into the environment\n",
 "dotenv.load_dotenv()\n",
 "\n",
+"# Configure OpenPipe using the API key from the environment\n",
 "configure_openpipe(api_key=os.environ[\"OPENPIPE_API_KEY\"])\n",
 "\n",
+"# Configure OpenAI using the API key from the environment\n",
 "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n",
 "\n",
 "\n",
@@ -222,12 +238,20 @@
 " return json.loads(completion.choices[0].message.function_call.arguments)\n",
 "\n",
 "\n",
-"classify_recipe(recipes[\"recipe\"][-1])\n"
+"print(\"Classifying first recipe:\\n------------------\")\n",
+"print(classify_recipe(recipes[\"recipe\"][0]))\n"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- let's see if we can make it as good!"
+]
+},
 {
 "cell_type": "code",
-"execution_count": 10,
+"execution_count": 14,
 "metadata": {},
 "outputs": [
 {
@@ -238,8 +262,6 @@
 "Classifying recipe 100/5000: Spoon Bread\n",
 "Classifying recipe 200/5000: Quadrangle Grille'S Pumpkin-Walnut Cheesecake\n",
 "Classifying recipe 300/5000: Broccoli Casserole\n",
-"Error reporting to OpenPipe: 520 is not a valid HTTPStatus\n",
-"520 is not a valid HTTPStatus\n",
 "Classifying recipe 400/5000: Paal Payasam (3-Ingredient Rice Pudding)\n",
 "Classifying recipe 500/5000: Dirt Dessert\n",
 "Classifying recipe 600/5000: Dolma, Stuffed Dried Peppers And Eggplants\n",
@@ -265,21 +287,22 @@
 "Classifying recipe 2600/5000: Pepperoni Bread\n",
 "Classifying recipe 2700/5000: Sabzi Polow\n",
 "Classifying recipe 2800/5000: Italian Vegetable Pizzas\n",
-"Error classifying recipe 2801: Bad gateway. {\"error\":{\"code\":502,\"message\":\"Bad gateway.\",\"param\":null,\"type\":\"cf_bad_gateway\"}} 502 {'error': {'code': 502, 'message': 'Bad gateway.', 'param': None, 'type': 'cf_bad_gateway'}} {'Date': 'Thu, 24 Aug 2023 15:44:45 GMT', 'Content-Type': 'application/json', 'Content-Length': '84', 'Connection': 'keep-alive', 'X-Frame-Options': 'SAMEORIGIN', 'Referrer-Policy': 'same-origin', 'Cache-Control': 'private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Expires': 'Thu, 01 Jan 1970 00:00:01 GMT', 'Server': 'cloudflare', 'CF-RAY': '7fbca943df684de1-MCI', 'alt-svc': 'h3=\":443\"; ma=86400'}\n",
 "Classifying recipe 2900/5000: Hot Fudge Sauce, Soda Shop Style\n",
 "Classifying recipe 3000/5000: Meatball Soup With Vegetables And Brown Rice\n",
 "Classifying recipe 3100/5000: Herbed Potatoes And Onions\n",
 "Classifying recipe 3200/5000: Apple Crunch Pie (2 Extra Servings)\n",
 "Classifying recipe 3300/5000: Pineapple-Orange Punch\n",
 "Classifying recipe 3400/5000: Turkey Veggie Burgers With Avocado Mayo\n",
-"Error reporting to OpenPipe: 520 is not a valid HTTPStatus\n",
-"520 is not a valid HTTPStatus\n",
 "Classifying recipe 3500/5000: Pear & Goat Cheese Salad\n",
 "Classifying recipe 3600/5000: Triple Chocolate Cookies\n",
 "Classifying recipe 3700/5000: Strawberry Banana Yogurt Pops\n",
-"Error classifying recipe 3779: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=600)\n",
 "Classifying recipe 3800/5000: Chicken Croquettes\n",
-"Classifying recipe 3900/5000: Mushroom Casserole\n"
+"Classifying recipe 3900/5000: Mushroom Casserole\n",
+"Classifying recipe 4000/5000: Vegetarian Summer Roll\n",
+"Classifying recipe 4100/5000: Prune Cake\n",
+"Classifying recipe 4200/5000: Strawberry Sorbet\n",
+"Classifying recipe 4300/5000: Lemonade Chicken\n",
+"Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n"
 ]
 }
 ],
@@ -295,12 +318,14 @@
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"Ok, we have our "
+"Ok, now that my recipes are classified I'll download the training data. \n",
+"\n",
+"Next up I'll train the model -- check out [./train.ipynb](./train.ipynb) for details! Just go to https://app.openpipe.ai/request-logs, select all the logs you created, and click \"Export\". The default 10% testing split is fine for this dataset size.\n",
+"\n",
+"I got two files from that: `train.jsonl` and `test.jsonl`. I moved both of them into this repository under `./data/`."
 ]
 }
 ],
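Note on the transient errors: the old cell output removed by this commit logged a 502 "Bad gateway" on recipe 2801 and a read timeout on recipe 3779, and the new output shows no such lines. The classification loop itself is not part of this diff, so how it handles failures is unknown. If skipped recipes ever matter, one option is to wrap each call in a retry with exponential backoff. The following is only a minimal sketch under that assumption; `call_with_retries` and its parameters are hypothetical names, not something the notebook defines.

```python
import random
import time


def call_with_retries(fn, *args, max_attempts=4, base_delay=1.0,
                      sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception with backoff.

    Hypothetical helper, not part of the notebook. Once max_attempts calls
    have failed, the last exception is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise
            # The delay doubles on each attempt (1s, 2s, 4s, ...), plus up
            # to 10% random jitter so parallel workers do not retry in step.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

In the notebook this might be used as `call_with_retries(classify_recipe, recipes["recipe"][i])`, where `classify_recipe` and `recipes` are the names visible in the diff above. Catching every `Exception` is deliberately broad for a sketch; a real loop would probably retry only on network and 5xx errors.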