OpenPipe-llm

Author	SHA1	Message	Date
Kyle Corbitt	60765e51ac	Remove model from promptVariant and add cost Storing the model on promptVariant is problematic because it isn't always in sync with the actual prompt definition. I'm removing it for now to see if we can get away with that -- might have to add it back in later if this causes trouble. Added `cost` to modelOutput as well so we can cache that, which is important given that the cost calculations won't be the same between different API providers.	2023-07-19 16:20:53 -07:00
arcticfly	4c97b9f147	Refine prompt (#63) * Remove unused ScenarioVariantCell fields * Refine deriveNewConstructFn * Fix prettier * Remove migration script * Add refine modal * Fix prettier * Fix diff checker overflow * Decrease diff height	2023-07-19 15:31:40 -07:00
arcticfly	58892d8b63	Remove unused fields, refine model translation (#62) * Remove unused ScenarioVariantCell fields * Refine deriveNewConstructFn * Fix prettier	2023-07-19 13:59:11 -07:00
Kyle Corbitt	2cb623f332	experiment page visual tweaks	2023-07-18 22:22:58 -07:00
Kyle Corbitt	1dcdba04a6	User accounts Allows for the creation of user accounts. A few notes on the specifics: - Experiments are the main access control objects. If you can view an experiment, you can view all its prompts/scenarios/evals. If you can edit it, you can edit or delete all of those as well. - Experiments are owned by Organizations in the database. Organizations can have multiple members and members can have roles of ADMIN, MEMBER or VIEWER. - Organizations can either be "personal" or general. Each user has a "personal" organization created as soon as they try to create an experiment. There's currently no UI support for creating general orgs or adding users to them; they're just in the database to future-proof all the ACL logic. - You can require that a user is signed-in to see a route using the `protectedProcedure` helper. When you use `protectedProcedure`, you also have to call `ctx.markAccessControlRun()` (or delegate to a function that does it for you; see accessControl.ts). This is to remind us to actually check for access control when we define a new endpoint.	2023-07-18 21:19:03 -07:00
arcticfly	e0e64c4207	Allow user to create a version of their current prompt with a new model (#58) * Add dropdown header for model switching * Allow variant duplication * Fix prettier * Use env variable to restrict prisma logs * Fix env.mjs * Remove unnecessary scroll bar from function call output * Properly record when 404 error occurs in queryLLM task * Add SelectedModelInfo in SelectModelModal * Add react-select * Calculate new prompt after switching model * Send newly selected model with creation request * Get new prompt construction function back from GPT-4 * Fix prettier * Fix prettier	2023-07-18 18:24:04 -07:00
arcticfly	fa5b1ab1c5	Allow user to duplicate prompt (#57) * Add dropdown header for model switching * Allow variant duplication * Fix prettier	2023-07-18 13:49:33 -07:00
David Corbitt	999a4c08fa	Fix lint and prettier	2023-07-18 11:11:20 -07:00
arcticfly	374d0237ee	Escape characters in Regex evaluations, minor UI fixes (#56) * Fix ScenariosHeader stickiness * Move meta tag from _app.tsx to _document.tsx * Show spinner when saving variant * Escape quotes and regex in evaluations	2023-07-18 11:07:04 -07:00
arcticfly	4131aa67d0	Continue polling VariantStats while LLM retrieval in progress, minor UI fixes (#54) * Prevent zoom in on iOS * Expand function return code background to fill cell * Keep OutputStats on far right of cells * Continue polling prompt stats while cells are retrieving from LLM * Add comment to _document.tsx * Fix prettier	2023-07-17 18:04:38 -07:00
Kyle Corbitt	7d41e94ca2	cache eval outputs and add gpt4 eval	2023-07-17 17:55:36 -07:00
Kyle Corbitt	011b12abb9	cache output evals	2023-07-17 17:52:30 -07:00
Kyle Corbitt	54369dba54	Fix seeds and update eval field names	2023-07-17 14:14:20 -07:00
arcticfly	6b84a59372	Properly catch completion errors (#51)	2023-07-17 10:50:25 -07:00
Kyle Corbitt	8db8aeacd3	Replace function chrome with comment Use a block comment to explain the expected prompt formatting instead of function chrome. The advantage here is that once a user builds a mental model of how OpenPipe works they can just delete the comment, instead of the function chrome sitting around and taking up space in the UI forever.	2023-07-17 10:30:22 -07:00
David Corbitt	3bf5eaf4a2	Properly extract scenario id in new experiment creation	2023-07-14 16:55:09 -06:00
Kyle Corbitt	26ee8698be	Make it so you can't delete the last prompt or scenario No reason for an experiment to have 0 prompts or 0 scenarios and it makes the UI look bad.	2023-07-14 15:49:42 -07:00
arcticfly	b98eb9b729	Trigger llm output retrieval on server (#39) * Rename tables, add graphile workers, update types * Add dev:worker command * Update pnpm-lock.yaml * Remove sentry config import from worker.ts * Stop generating new cells in cell router get query * Generate new cells for new scenarios, variants, and experiments * Remove most error throwing from queryLLM.task.ts * Remove promptVariantId and testScenarioId from ModelOutput * Remove duplicate index from ModelOutput * Move inputHash from cell to output * Add TODO * Add todo * Show cost and time for each cell * Always show output stats if there is output * Trigger LLM outputs when scenario variables are updated * Add newlines to ends of files * Add another newline * Cascade ModelOutput deletion * Fix linting and prettier * Return instead of throwing for non-pending cell * Remove pnpm dev:worker from pnpm:dev * Update pnpm-lock.yaml	2023-07-14 16:38:46 -06:00
Kyle Corbitt	a5378b106b	store model and use to calculate completion costs	2023-07-14 11:06:07 -07:00
Kyle Corbitt	4770ea34a8	Use javascript functions for prompt completions instead of templated json	2023-07-13 18:01:07 -07:00
David Corbitt	e555d13dd7	Limit prompt tokens to outputs from visible scenarios	2023-07-10 16:33:16 -06:00
arcticfly	187d6492f8	Reevaluate all prompt stats when scenario is hidden (#32) * Reevaluate when scenario is hidden * Add newline	2023-07-10 13:51:40 -06:00
arcticfly	96dacb0378	Move experiment scrollbar to bottom of page, make scenarios header sticky (#29) * Remove newline from promptVariants router * Move horizontal scroll bar to bottom of OutputsTable * Make scenarios header sticky	2023-07-10 12:40:02 -06:00
arcticfly	e64a94e06e	Record experiment updated in more places (#24) * Record experiment updated in more places * Update experiment updatedAt in same transaction	2023-07-10 12:00:24 -06:00
arcticfly	32a80f8475	Limit evaluations to visible test scenarios (#28)	2023-07-10 02:10:23 -06:00
Kyle Corbitt	a8db6cadfd	format with prettier 3	2023-07-08 22:12:47 -07:00
arcticfly	6b32619e87	Add streaming to the default prompt (#27)	2023-07-08 21:22:02 -07:00
jarenm	d7fb4a7236	add cost tooltip (#22) * add cost tooltip * add padding and style	2023-07-07 19:34:45 -07:00
arcticfly	0415a04dc6	Redirect to experiments page after deleting experiment (#21)	2023-07-07 18:01:38 -07:00
Kyle Corbitt	8e0722cd22	wrong denominator	2023-07-07 17:48:34 -07:00
arcticfly	db4476d1cb	Change website layout (#18) * Add basic experiments page * Isolate experiment components * Fix grid on small screens * Change nav bar * Add padding to logo * Fix linking * Remove right margin on ExperimentCard flask * Change favicon * Use humanize in formatTimePast * Add TODO	2023-07-07 14:47:54 -07:00
Kyle Corbitt	46344d8fc4	small bugfixes	2023-07-07 12:22:27 -07:00
arcticfly	a2c7ef73ec	Retry requests that receive 429 (#15) * List number of scenarios * Retry requests after 429 * Rename requestCallback * Add sleep function * Allow manual retry on frontend * Remove unused utility functions * Auto refetch * Display wait time with Math.ceil * Take one second modulo into account * Add pluralize	2023-07-06 21:39:23 -07:00
arcticfly	fe501a80cb	Add total token cost to variant stats (#13) * Add total token cost to variant stats * Copy over token counts for new variants * Update invalidate call	2023-07-06 15:33:49 -07:00
Kyle Corbitt	1fa0d7bc62	bugfixes	2023-07-06 15:22:35 -07:00
arcticfly	92c240e7b8	Add request cost to OutputStats (#12)	2023-07-06 14:36:31 -07:00
Kyle Corbitt	f728027ef6	add evaluations	2023-07-06 13:44:03 -07:00
arcticfly	1ae5612d55	Add promptTokens and completionTokens to model output (#11) * Default to streaming in config * Add tokens to database * Add NEXT_PUBLIC_SOCKET_URL to .env.example * Disable streaming for functions * Add newline to types	2023-07-06 13:12:59 -07:00
Kyle Corbitt	4275e6b19b	settings drawer	2023-07-05 21:34:00 -07:00
Kyle Corbitt	6510b26b1e	re-attach scenarios	2023-07-05 21:34:00 -07:00
Kyle Corbitt	63f6b646dc	scenario vars can't have spaces	2023-07-05 13:54:02 -07:00
David Corbitt	d434545fdf	Rename channelId to channel	2023-07-03 20:22:38 -07:00
David Corbitt	5f11b258ca	Streaming works for normal text	2023-07-03 19:51:34 -07:00
David Corbitt	9a6cb8dc95	Update autogen.ts	2023-07-03 17:35:29 -07:00
arcticfly	6389bd54de	Save and display timeToComplete on model outputs (#6) * Calculate and save timeToComplete on model outputs * Add output stats to function call as well * Record timeToComplete before parsing response json * Add default value for timeToComplete	2023-06-30 21:29:28 -07:00
Kyle Corbitt	74d2493a1b	README updates and minor tweaks	2023-06-28 11:55:25 -07:00
Kyle Corbitt	15e4fe7e5a	more robust sidebar layout	2023-06-28 10:37:37 -07:00
Kyle Corbitt	0a675cd7f7	autogen scenarios	2023-06-27 13:19:41 -07:00
Kyle Corbitt	ab32995eb9	slightly better error handling	2023-06-27 10:48:09 -07:00
Kyle Corbitt	267a5381f3	final lint errors	2023-06-26 23:46:10 -07:00

71 Commits