Kyle Corbitt
2023-06-26 14:57:53 -07:00
parent ce783f6279
commit 12af15ae32
4 changed files with 25 additions and 16 deletions

View File

@@ -1,8 +1,8 @@
 (Note: this repository practices [Readme Driven Development](https://tom.preston-werner.com/2010/08/23/readme-driven-development.html). This README documents what we **want** our UX to be, not what it is right now. We'll remove this note once the repository is ready for outside testing. Thanks for your patience! 🙏)
-# 🛠 Prompt Bench
-Prompt Bench is a powerful toolset to optimize your LLM prompts. It lets you quickly generate, test and compare candidate prompts with realistic sample data.
+# 🛠 Prompt Lab
+Prompt Lab is a powerful toolset to optimize your LLM prompts. It lets you quickly generate, test and compare candidate prompts with realistic sample data.
 ## High-Level Features
@@ -13,17 +13,17 @@ Set up multiple prompt configurations and compare their output side-by-side. Eac
 Inspect prompt completions side-by-side.
 **Test Many Inputs**
-Prompt Bench lets you *template* a prompt. Use the templating feature to run the prompts you're testing against many potential inputs for broader coverage of your problem space than you'd get with manual testing.
+Prompt Lab lets you *template* a prompt. Use the templating feature to run the prompts you're testing against many potential inputs for broader coverage of your problem space than you'd get with manual testing.
 **Automatically Evaluate Prompt Quality**
-1. If you're extracting structured data, Prompt Bench lets you define the expected output for each input sample and will automatically score each prompt variant for accuracy.
-2. If you're generating free-form text, Prompt Bench lets you either (a) manually review outputs side-by-side to compare quality, or (b) configure a GPT-4 based evaluator to compare and score your completions automatically. 🧞‍♂️
+1. If you're extracting structured data, Prompt Lab lets you define the expected output for each input sample and will automatically score each prompt variant for accuracy.
+2. If you're generating free-form text, Prompt Lab lets you either (a) manually review outputs side-by-side to compare quality, or (b) configure a GPT-4 based evaluator to compare and score your completions automatically. 🧞‍♂️
 **🪄 Auto-generate Prompts and Data**
-Prompt Bench includes easy tools to generate both new prompt variants and new test inputs. It can even use the test inputs with incorrect results to guide the variant generation more intelligently!
+Prompt Lab includes easy tools to generate both new prompt variants and new test inputs. It can even use the test inputs with incorrect results to guide the variant generation more intelligently!
 ## Supported Models
-Prompt Bench currently supports GPT-3.5 and GPT-4. Wider model support is planned.
+Prompt Lab currently supports GPT-3.5 and GPT-4. Wider model support is planned.
 ## More Features
@@ -38,7 +38,7 @@ Prompt Bench currently supports GPT-3.5 and GPT-4. Wider model support is planne
 # Running Locally
-We'll have a hosted version of Prompt Bench soon to make onboarding easier but for now you can run it locally.
+We'll have a hosted version of Prompt Lab soon to make onboarding easier but for now you can run it locally.
 1. Install [NodeJS 20](https://nodejs.org/en/download/current) (earlier versions will likely work but aren't tested)
 2. Install `pnpm`: `npm i -g pnpm`
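
The templating feature described in this README can be illustrated with a short sketch. The `{{variable}}` syntax and the `fillTemplate` helper are assumptions made for illustration, not Prompt Lab's actual API:

```typescript
// Fill {{variable}} slots in a prompt template with values from a test
// scenario. Syntax and names here are illustrative assumptions.
type Scenario = Record<string, string>;

function fillTemplate(template: string, scenario: Scenario): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key: string) =>
    key in scenario ? scenario[key] : match
  );
}

const template =
  "Summarize the following support ticket in one sentence:\n{{ticket}}";
const scenarios: Scenario[] = [
  { ticket: "My password reset email never arrived." },
  { ticket: "The export button crashes the app on large projects." },
];

// Each scenario yields a concrete prompt to send to GPT-3.5 or GPT-4.
const prompts = scenarios.map((s) => fillTemplate(template, s));
console.log(prompts);
```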

View File

@@ -5,7 +5,7 @@ export default function AppNav(props: { children: React.ReactNode; title?: strin
  return (
    <Flex minH="100vh">
      <Head>
-        <title>{props.title ? `${props.title} | Prompt Bench` : "Prompt Bench"}</title>
+        <title>{props.title ? `${props.title} | Prompt Lab` : "Prompt Lab"}</title>
      </Head>
      {/* Placeholder for now */}
      <Box bgColor="gray.100" flexShrink={0} width="200px">

View File

@@ -13,7 +13,7 @@ export default function ScenarioHeader() {
  const [variables, setVariables] = useState<string[]>(initialVariables.map((v) => v.label));
  const [newVariable, setNewVariable] = useState<string>("");
-  const [editing, setEditing] = useState(true);
+  const [editing, setEditing] = useState(false);
  const utils = api.useContext();
  const setVarsMutation = api.experiments.setTemplateVariables.useMutation();
@@ -50,7 +50,7 @@ export default function ScenarioHeader() {
            rightIcon={<Icon as={BsChevronDown} />}
            onClick={() => setEditing(true)}
          >
-            Edit Vars
+            Configure
          </Button>
        )}
      </HStack>
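
A minimal sketch of the interaction this hunk implies: the variable editor now starts collapsed and opens via the "Configure" button. Component names and layout below are assumptions, not the actual ScenarioHeader markup:

```tsx
import { useState } from "react";
import { Button, HStack, Text } from "@chakra-ui/react";

export function VariableEditorToggle() {
  // Collapsed by default, matching the useState(false) change above.
  const [editing, setEditing] = useState(false);

  return (
    <HStack>
      {editing ? (
        <Text>variable editor goes here</Text>
      ) : (
        <Button onClick={() => setEditing(true)}>Configure</Button>
      )}
    </HStack>
  );
}
```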

View File

@@ -23,7 +23,16 @@ export const promptVariantsRouter = createTRPCRouter({
        })
      )
      .mutation(async ({ input }) => {
-        const maxSortIndex =
+        const lastScenario = await prisma.promptVariant.findFirst({
+          where: {
+            experimentId: input.experimentId,
+          },
+          orderBy: {
+            sortIndex: "desc",
+          },
+        });
+        const largestSortIndex =
          (
            await prisma.promptVariant.aggregate({
              where: {
@@ -33,14 +42,14 @@ export const promptVariantsRouter = createTRPCRouter({
              sortIndex: true,
            },
          })
-        )._max.sortIndex ?? 0;
+        )._max?.sortIndex ?? 0;
        const newScenario = await prisma.promptVariant.create({
          data: {
            experimentId: input.experimentId,
-            label: `Prompt Variant ${maxSortIndex + 1}`,
-            sortIndex: maxSortIndex + 1,
-            config: {},
+            label: `Prompt Variant ${largestSortIndex + 2}`,
+            sortIndex: (lastScenario?.sortIndex ?? 0) + 1,
+            config: lastScenario?.config ?? {},
          },
        });
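
Written out as a standalone function, the mutation after this change reads roughly as below. The Prisma model and field names come from the diff; the surrounding wiring (client setup, function signature) is an assumption:

```typescript
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function createVariant(experimentId: string) {
  // The most recently ordered variant seeds the new variant's sortIndex and config.
  const lastVariant = await prisma.promptVariant.findFirst({
    where: { experimentId },
    orderBy: { sortIndex: "desc" },
  });

  // Label numbering follows the diff: largest existing sortIndex plus 2,
  // presumably because sortIndex is 0-based while labels are 1-based.
  const largestSortIndex =
    (
      await prisma.promptVariant.aggregate({
        where: { experimentId },
        _max: { sortIndex: true },
      })
    )._max?.sortIndex ?? 0;

  return prisma.promptVariant.create({
    data: {
      experimentId,
      label: `Prompt Variant ${largestSortIndex + 2}`,
      sortIndex: (lastVariant?.sortIndex ?? 0) + 1,
      config: lastVariant?.config ?? {},
    },
  });
}
```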