11 Commits

Author SHA1 Message Date
arcticfly
1fb428ef4a Add scenario editing modal, twitter sentiment seeding (#101)
* testing agi-eval benchmark

* Add scenario modal editor

* Add initial values to ScenarioEditorModal

* Add seedTwitterSentiment.ts

---------

Co-authored-by: Kyle Corbitt <kyle@corbt.com>
2023-08-01 01:26:43 -07:00
arcticfly
26b6fa4f0c Requeue rate-limited query model tasks (#99)
* Continue polling stats until all evals complete

* Return evaluation changes early, before it has run

* Add task for running new eval

* requeue rate-limited tasks

* Fix prettier
2023-07-26 16:30:50 -07:00
Kyle Corbitt
60765e51ac Remove model from promptVariant and add cost
Storing the model on promptVariant is problematic because it isn't always in sync with the actual prompt definition. I'm removing it for now to see if we can get away with that -- might have to add it back in later if this causes trouble.

Added `cost` to modelOutput as well so we can cache that, which is important given that the cost calculations won't be the same between different API providers.
2023-07-19 16:20:53 -07:00
arcticfly
4131aa67d0 Continue polling VariantStats while LLM retrieval in progress, minor UI fixes (#54)
* Prevent zoom in on iOS

* Expand function return code background to fill cell

* Keep OutputStats on far right of cells

* Continue polling prompt stats while cells are retrieving from LLM

* Add comment to _document.tsx

* Fix prettier
2023-07-17 18:04:38 -07:00
Kyle Corbitt
011b12abb9 cache output evals
2023-07-17 17:52:30 -07:00
Kyle Corbitt
54369dba54 Fix seeds and update eval field names
2023-07-17 14:14:20 -07:00
Kyle Corbitt
a8db6cadfd format with prettier 3
2023-07-08 22:12:47 -07:00
jarenm
d7fb4a7236 add cost tooltip (#22)
* add cost tooltip

* add padding and style
2023-07-07 19:34:45 -07:00
Kyle Corbitt
46344d8fc4 small bugfixes
2023-07-07 12:22:27 -07:00
arcticfly
fe501a80cb Add total token cost to variant stats (#13)
* Add total token cost to variant stats

* Copy over token counts for new variants

* Update invalidate call
2023-07-06 15:33:49 -07:00
Kyle Corbitt
f728027ef6 add evaluations
2023-07-06 13:44:03 -07:00