Update building_evals.ipynb

multipe -> multiple
Ikko Eltociear Ashimine
2024-03-09 04:06:39 +09:00
committed by GitHub
parent 126b05a907
commit 7367e82f40

@@ -374,7 +374,7 @@
 "Now you know about different grading design patterns for evals, and are ready to start building your own. As you do, here are a few guiding pieces of wisdom to get you started.\n",
 "- Make your evals specific to your task whenever possible, and try to have the distribution in your eval represent ~ the real life distribution of questions and question difficulties.\n",
 "- The only way to know if a model-based grader can do a good job grading your task is to try. Try it out and read some samples to see if your task is a good candidate.\n",
-"- Often all that lies between you and an automatable eval is clever design. Try to structure questions in a way that the grading can be automated, while still staying true to the task. Reformatting questions into multipe choice is a common tactic here.\n",
+"- Often all that lies between you and an automatable eval is clever design. Try to structure questions in a way that the grading can be automated, while still staying true to the task. Reformatting questions into multiple choice is a common tactic here.\n",
 "- In general, your preference should be for higher volume and lower quality of questions over very low volume with high quality."
 ]
 }
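
For readers of this hunk, the multiple-choice tactic named in the changed line can be sketched roughly as follows. This is an illustration only, not code from building_evals.ipynb: the helper names (`format_multiple_choice`, `extract_letter`, `grade`, `accuracy`) and the sample completions are hypothetical, and no model is called. The idea is that once a question has lettered options, grading reduces to a string comparison.

```python
import re

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def format_multiple_choice(question, choices):
    """Render a question and its options as a lettered prompt (hypothetical helper)."""
    lines = [question]
    for letter, choice in zip(LETTERS, choices):
        lines.append(f"({letter}) {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def extract_letter(completion, num_choices):
    """Return the answer letter if the completion is just one valid letter, else None.

    Assumption: the model was told to reply with a single letter, so anything
    longer (for example a full sentence) is treated as unparseable.
    """
    match = re.fullmatch(r"\s*\(?([A-Za-z])\)?\.?\s*", completion)
    if match is None:
        return None
    letter = match.group(1).upper()
    return letter if letter in LETTERS[:num_choices] else None


def grade(completion, correct_letter, num_choices):
    """Code-based grading: exact match on the extracted letter."""
    return extract_letter(completion, num_choices) == correct_letter


def accuracy(completions, correct_letters, num_choices):
    """Fraction of completions graded correct."""
    results = [
        grade(c, k, num_choices) for c, k in zip(completions, correct_letters)
    ]
    return sum(results) / len(results)


prompt = format_multiple_choice("What is 2 + 2?", ["3", "4", "5"])
# Hard-coded stand-ins for model outputs; a real eval would sample these.
sample_completions = ["B", " (b). ", "I think B", "D"]
score = accuracy(sample_completions, ["B", "B", "B", "B"], num_choices=3)
```

The strict full-match parsing is a design choice: it counts a verbose reply such as "I think B" as wrong rather than guessing at the intended letter, which keeps the grader simple at the cost of some false negatives.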