generate-data and some eval
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now let's get to the fun part -- training a model. We'll start by installing our dependencies."
+"Now let's get to the fun part -- training a model. I'll start by installing the dependencies."
 ]
 },
 {
@@ -177,11 +177,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We'll use the [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) library to manage our training run. It includes a lot of neat tricks that speed up training without sacrificing quality.\n",
+"I'll use the [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) library to manage our training run. It includes a lot of neat tricks that speed up training without sacrificing quality.\n",
 "\n",
-"In this case we'll use 8-bit training to use less GPU RAM, and sample packing to maximize GPU utilization. You can read more about the available options at https://github.com/OpenAccess-AI-Collective/axolotl.\n",
+"In this case I'm using 8-bit training to use less GPU RAM, and sample packing to maximize GPU utilization. You can read more about the available options at https://github.com/OpenAccess-AI-Collective/axolotl.\n",
 "\n",
-"The training run options we're using here are defined in [training-args.yaml](./training-args.yaml).",
+"The training run options are defined in [training-config.yaml](./training-config.yaml)."
 ]
 },
 {
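
For context on what the config drives, here is a rough sketch of what 8-bit LoRA training setup looks like when done directly with transformers + peft; axolotl configures the equivalent from the YAML file, and the base model name and LoRA hyperparameters below are illustrative placeholders, not values taken from this repo's training-config.yaml.

```python
# Hypothetical sketch of 8-bit LoRA setup with transformers + peft.
# axolotl drives the equivalent from training-config.yaml; the model name
# and hyperparameters here are placeholders, not this repo's settings.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,   # quantize weights to 8-bit to cut GPU RAM
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # stabilize training on quantized weights

lora_config = LoraConfig(
    r=8,                 # adapter rank (placeholder)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```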
@@ -365,16 +365,16 @@
 }
 ],
 "source": [
-"!accelerate launch ./axolotl/scripts/finetune.py training-args.yaml"
+"!accelerate launch ./axolotl/scripts/finetune.py training-config.yaml"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Nice work! If you look on your filesystem you should see a new directory `./models/recipe-model`. This contains your trained model, which you can use to classify more recipes.\n",
+"Sweet! If you look on your filesystem you should see a new directory `./models/run1`. This contains your trained model, which you can use to classify more recipes.\n",
 "\n",
-"Before we using it though, we need to *merge* the model. We trained our model using [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora), which is a memory-efficient training method. But the inference library we'll use for testing doesn't support LoRA models yet, so we need to \"merge\" our LoRA model to transform it into a standard Llama2-style model. We've defined a helper to do that that we'll use below."
+"There's one more step though. I trained our model using [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora), which is a memory-efficient training method. But the inference library we'll use for testing doesn't support LoRA models directly yet, so we need to \"merge\" our LoRA model to transform it into a standard Llama2-shaped model. I've defined a small helper to do that called `merge_lora_model` that I'll use below."
 ]
 },
 {
@@ -418,7 +418,7 @@
 "from utils import merge_lora_model\n",
 "\n",
 "print(\"Merging model (this could take a while)\")\n",
-"final_model_dir = merge_lora_model(\"training-args.yaml\")\n",
+"final_model_dir = merge_lora_model(\"training-config.yaml\")\n",
 "print(f\"Final model saved to '{final_model_dir}'\")\n"
 ]
 }
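
The repo's `merge_lora_model` helper isn't shown in this diff, but a typical LoRA merge with peft looks roughly like the sketch below; the base model name and output path are placeholders, and the adapter directory assumes the `./models/run1` output mentioned in the notebook.

```python
# Rough sketch of a typical LoRA merge with peft. This is NOT the repo's
# merge_lora_model implementation; the base model name and output path
# are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder base model
adapter_dir = "./models/run1"            # LoRA adapter produced by training
output_dir = "./models/run1-merged"      # placeholder output location

base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_dir)  # attach the LoRA adapter
merged = model.merge_and_unload()                     # fold adapter weights into the base layers

merged.save_pretrained(output_dir)                    # standard Llama2-style checkpoint
AutoTokenizer.from_pretrained(base_model).save_pretrained(output_dir)
```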