21 Commits

Author SHA1 Message Date
arcticfly
b98bce8944 Add Datasets (#118)
* Add dataset (without entries)

* Fix dataset hook

* Add dataset rows

* Add buttons to import/generate data

* Add GenerateDataModal

* Autogenerate and save data

* Fix prettier

* Fix types

* Add dataset pagination

* Fix prettier

* Use useDisclosure

* Allow generate data modal fadeaway

* hide/show data in env var

* Fix prettier
2023-08-04 11:52:03 -07:00
arcticfly
1fb428ef4a Add scenario editing modal, twitter sentiment seeding (#101)
* testing agi-eval benchmark

* Add scenario modal editor

* Add initial values to ScenarioEditorModal

* Add seedTwitterSentiment.ts

---------

Co-authored-by: Kyle Corbitt <kyle@corbt.com>
2023-08-01 01:26:43 -07:00
Kyle Corbitt
e1cbeccb90 Better streaming
- Always stream the visible scenarios, if the modelProvider supports it
- Never stream the invisible scenarios

Also actually runs our query tasks in a background worker, which we weren't quite doing before.
2023-07-24 18:34:30 -07:00
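The streaming rule described in this commit can be written as a single predicate. The following is a minimal sketch, not the repository's code: the `ModelProvider` and `Scenario` shapes and the `shouldStream` helper are hypothetical names chosen for illustration.

```typescript
// Hypothetical types; the field names are illustrative and not taken from the codebase.
interface ModelProvider {
  name: string;
  supportsStreaming: boolean;
}

interface Scenario {
  id: string;
  visible: boolean;
}

// Stream only when the scenario is visible AND the provider can stream.
// Invisible scenarios are never streamed, whatever the provider supports.
function shouldStream(provider: ModelProvider, scenario: Scenario): boolean {
  return scenario.visible && provider.supportsStreaming;
}
```

Keeping the decision in one place means the caller only has to branch on a boolean when it requests a completion.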
Kyle Corbitt
2e395e4d39 Paginate scenarios
Show 10 scenarios at a time and let the user page through them, so the interface stays responsive even with thousands of scenarios.
2023-07-22 16:10:16 -07:00
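The paging arithmetic behind a change like this can be sketched as follows. This is an in-memory illustration under stated assumptions: `paginate`, `Page`, and `PAGE_SIZE` are hypothetical names, and a real implementation would more likely push the offset and limit into the database query than slice an array.

```typescript
// Hypothetical helper; only the page size of 10 comes from the commit message.
const PAGE_SIZE = 10;

interface Page<T> {
  items: T[];
  page: number; // 1-based index of the page actually returned
  totalPages: number;
}

function paginate<T>(all: readonly T[], page: number, pageSize: number = PAGE_SIZE): Page<T> {
  // An empty list still has one (empty) page, so the UI always has a page to show.
  const totalPages = Math.max(1, Math.ceil(all.length / pageSize));
  // Clamp the requested page into the valid range [1, totalPages].
  const current = Math.min(Math.max(1, Math.floor(page)), totalPages);
  const start = (current - 1) * pageSize;
  return { items: all.slice(start, start + pageSize), page: current, totalPages };
}
```

For example, with 25 scenarios, `paginate(scenarios, 3)` yields the last 5 items and reports 3 total pages.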
Kyle Corbitt
61e5f0775d separate scenarios from prompts in outputs table
2023-07-22 07:38:19 -07:00
Kyle Corbitt
1dcdba04a6 User accounts
Allows for the creation of user accounts. A few notes on the specifics:

 - Experiments are the main access control objects. If you can view an experiment, you can view all its prompts/scenarios/evals. If you can edit it, you can edit or delete all of those as well.
 - Experiments are owned by Organizations in the database. Organizations can have multiple members and members can have roles of ADMIN, MEMBER or VIEWER.
 - Organizations can either be "personal" or general. Each user has a "personal" organization created as soon as they try to create an experiment. There's currently no UI support for creating general orgs or adding users to them; they're just in the database to future-proof all the ACL logic.
 - You can require that a user is signed in to see a route using the `protectedProcedure` helper. When you use `protectedProcedure`, you also have to call `ctx.markAccessControlRun()` (or delegate to a function that does it for you; see accessControl.ts). This is to remind us to actually check for access control when we define a new endpoint.
2023-07-18 21:19:03 -07:00
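The "remind us to check access control" mechanism from the notes above could work roughly as sketched here. This is a standalone illustration, not the project's implementation: only the name `markAccessControlRun` comes from the commit message, while `protectedHandler`, `Ctx`, and the error messages are hypothetical, and handlers are synchronous for brevity.

```typescript
// Standalone sketch. Only `markAccessControlRun` is a name from the commit message;
// everything else is a hypothetical stand-in.
interface Ctx {
  userId: string;
  markAccessControlRun: () => void;
}

// Wraps a handler so that it (1) rejects signed-out callers and
// (2) fails loudly if the handler returns without marking access control as run.
function protectedHandler<I, O>(
  handler: (ctx: Ctx, input: I) => O,
): (userId: string | null, input: I) => O {
  return (userId, input) => {
    if (userId === null) {
      throw new Error("UNAUTHORIZED: must be signed in");
    }
    const state = { accessControlRun: false };
    const ctx: Ctx = {
      userId,
      markAccessControlRun: () => {
        state.accessControlRun = true;
      },
    };
    const result = handler(ctx, input);
    if (!state.accessControlRun) {
      // The handler forgot to check permissions, so refuse to return its result.
      throw new Error("ACCESS_CONTROL_NOT_RUN: handler never checked access control");
    }
    return result;
  };
}
```

In this sketch a handler that forgets the check fails on every call, so the omission surfaces the first time the endpoint is exercised rather than going unnoticed.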
Kyle Corbitt
011b12abb9 cache output evals
2023-07-17 17:52:30 -07:00
arcticfly
b98eb9b729 Trigger llm output retrieval on server (#39)
* Rename tables, add graphile workers, update types

* Add dev:worker command

* Update pnpm-lock.yaml

* Remove sentry config import from worker.ts

* Stop generating new cells in cell router get query

* Generate new cells for new scenarios, variants, and experiments

* Remove most error throwing from queryLLM.task.ts

* Remove promptVariantId and testScenarioId from ModelOutput

* Remove duplicate index from ModelOutput

* Move inputHash from cell to output

* Add TODO

* Add todo

* Show cost and time for each cell

* Always show output stats if there is output

* Trigger LLM outputs when scenario variables are updated

* Add newlines to ends of files

* Add another newline

* Cascade ModelOutput deletion

* Fix linting and prettier

* Return instead of throwing for non-pending cell

* Remove pnpm dev:worker from pnpm:dev

* Update pnpm-lock.yaml
2023-07-14 16:38:46 -06:00
Kyle Corbitt
4770ea34a8 Use javascript functions for prompt completions instead of templated json
2023-07-13 18:01:07 -07:00
arcticfly
187d6492f8 Reevaluate all prompt stats when scenario is hidden (#32)
* Reevaluate when scenario is hidden

* Add newline
2023-07-10 13:51:40 -06:00
arcticfly
e64a94e06e Record experiment updated in more places (#24)
* Record experiment updated in more places

* Update experiment updatedAt in same transaction
2023-07-10 12:00:24 -06:00
Kyle Corbitt
a8db6cadfd format with prettier 3
2023-07-08 22:12:47 -07:00
arcticfly
db4476d1cb Change website layout (#18)
* Add basic experiments page

* Isolate experiment components

* Fix grid on small screens

* Change nav bar

* Add padding to logo

* Fix linking

* Remove right margin on ExperimentCard flask

* Change favicon

* Use humanize in formatTimePast

* Add TODO
2023-07-07 14:47:54 -07:00
Kyle Corbitt
0a675cd7f7 autogen scenarios
2023-06-27 13:19:41 -07:00
Kyle Corbitt
267a5381f3 final lint errors
2023-06-26 23:46:10 -07:00
Kyle Corbitt
bca35c9eb2 tighter types and linting
2023-06-26 23:40:05 -07:00
Kyle Corbitt
15087f6bcd hiding and reordering scenarios
2023-06-26 11:05:49 -07:00
Kyle Corbitt
8534477236 can add scenarios and it mostly works
2023-06-23 20:00:46 -07:00
Kyle Corbitt
5fcefad461 rip out mantine table
2023-06-23 17:03:26 -07:00
Kyle Corbitt
2b0c2ad603 editing scenarios is kinda working
2023-06-23 16:18:28 -07:00
Kyle Corbitt
788911898b basic table with no info
2023-06-22 08:05:22 -07:00