Add scenario editing modal, Twitter sentiment seeding (#101)

* testing agi-eval benchmark

* Add scenario modal editor

* Add initial values to ScenarioEditorModal

* Add seedTwitterSentiment.ts

---------

Co-authored-by: Kyle Corbitt <kyle@corbt.com>
arcticfly
2023-08-01 01:26:43 -07:00
committed by GitHub
parent 6316eaae6d
commit 1fb428ef4a
11 changed files with 621 additions and 102 deletions

@@ -107,6 +107,7 @@
"@types/uuid": "^9.0.2",
"@typescript-eslint/eslint-plugin": "^5.59.6",
"@typescript-eslint/parser": "^5.59.6",
"csv-parse": "^5.4.0",
"eslint": "^8.40.0",
"eslint-config-next": "^13.4.2",
"eslint-plugin-unused-imports": "^2.0.0",

pnpm-lock.yaml generated
@@ -1,4 +1,4 @@
lockfileVersion: '6.1'
lockfileVersion: '6.0'
settings:
autoInstallPeers: true
@@ -259,6 +259,9 @@ devDependencies:
'@typescript-eslint/parser':
specifier: ^5.59.6
version: 5.59.6(eslint@8.40.0)(typescript@5.0.4)
csv-parse:
specifier: ^5.4.0
version: 5.4.0
eslint:
specifier: ^8.40.0
version: 8.40.0
@@ -4087,6 +4090,10 @@ packages:
/csstype@3.1.2:
resolution: {integrity: sha512-I7K1Uu0MBPzaFKg4nI5Q7Vs2t+3gWWW648spaF+Rg7pI9ds18Ugn+lvg4SHczUdKlHI5LWBXyqfS8+DufyBsgQ==}
/csv-parse@5.4.0:
resolution: {integrity: sha512-JiQosUWiOFgp4hQn0an+SBoV9IKdqzhROM0iiN4LB7UpfJBlsSJlWl9nq4zGgxgMAzHJ6V4t29VAVD+3+2NJAg==}
dev: true
/d@1.0.1:
resolution: {integrity: sha512-m62ShEObQ39CfralilEQRjH6oAMtNCV1xJyEx5LpRYUVN+EviphDgUc/F3hnYbADmkiNs67Y+3ylmlG7Lnu+FA==}
dependencies:

@@ -0,0 +1,84 @@
Text,sentiment,emotion
@dell your customer service is horrible especially agent syedfaisal who has made this experience of purchasing a new computer downright awful and Ill reconsider ever buying a Dell in the future @DellTech,negative,anger
@zacokalo @Dell @DellCares @Dell give the man what he paid for!,neutral,anger
"COOKING STREAM DAY!!! Ty to @Alienware for sponsoring this stream! Ill be making a bunch of Japanese Alien themed foods hehe
Come check it out! https://t.co/m06tJQ06zk
#alienwarepartner #intelgaming @Dell @IntelGaming https://t.co/qOdQX2E8VD",positive,joy
@emijuju_ @Alienware @Dell @intel Beautiful 😍❤️😻,positive,joy
"What's your biggest data management challenge? • Cloud complexity? • Lengthy tech refresh cycles? • Capital budget constraints? Solve your challenges with as-a-Storage. Get simplicity, agility &amp; control with @Dell #APEX. https://t.co/mCblMtH931 https://t.co/eepKNZ4Ai3",neutral,optimism
"This week we were at the ""Top Gun"" themed @Dell Product Expo. Eddie Muñoz met Maverick look-alike, California Tom Cruise (Jerome LeBlanc)!
""I feel the need, the need for speed."" - Maverick
#topgun #topgunmaverick #dell #delltechnologies #lockncharge https://t.co/QHYH2EbMjq",positive,joy
"Itsss been more than a week...i m following up with dell for troubleshootings...my https://t.co/lWhg2YKhQa suffering so as my hard earned money...hightly disappointed...contd..
@DellCares @Dell",negative,sadness
"@ashu_k7 @Dell Pathetic!!!!! I Dont mind taking legal action, this is deficency of service for which the customer is nt getting help..",negative,anger
@ashu_k7 @Dell Making life unhappy is the new tag line of #Dell,negative,sadness
"@Dell If you are buying a Dell, make sure you are making your life hell.
Better buy other laptops. If you wanted to opt for Dell better opt for garbage on the streets.",negative,anger
"MY DESK'S FINAL FORM? Seriously, I'm finally happy with my monitor setup here... and I'll keep this setup whenever I move... FOREVER. What do you think?
https://t.co/WJZ2JXtOnX
@Alienware @Dell cheers. https://t.co/6Whhldfpv0",positive,joy
"@Dell Dell Alienware computer has had software problems with SupportAssist since purchase. Dell, despite paying for Premium Support, has never fixed issues. Latest solution was to erase everything and reload....SupportAssist still doesn't work.",negative,anger
"HUGE congratulations to Startup Battle 3.0 winner ➡️ @Ox_Fulfillment x @cyborgcharu for being featured in @BusinessInsider &amp; @Dell showcasing the journey at Ox! 🚀🚀🚀
We love to see our portfolio companies continuing to BUILD SOMETHING FROM NOTHING! 🔥 https://t.co/awBkn5ippB",positive,joy
@Dell happy Friday!,positive,joy
"@intel Core i5 1135G7 - 4732 points
@intel Core i5 1235 - 6619 points
@Dell Latitude 5420 x 5430.
Cinebench R23. Good job Intel!",positive,joy
@Dell india we purchased 52 docking station and we have around 100 users using dell laptop as well as dell monitor now they are refusing to replace my faulty product and disconnecting my every call....,negative,anger
"It's another year ans another day But cant fill it in yet the child hood dreams.
It's my birthdy today. Can anyone of you guys bless me with a simplest gaming oc that can run
@DOTA2 ?
@Dell @HP @VastGG @Acer @Alienware @Lenovo @toshiba @IBM @Fujitsu_Global @NEC https://t.co/69G8tL9sN8",neutral,joy
"@idoccor @Dell That's always the decision—wait, or, look elsewhere. In this case, I think I unfortunately need to wait since there are only two monitors with these specs and I don't like the other one 😂",negative,sadness
"@MichaelDell @Dell @DellCares For how long this will continue. It is high time you either fix the problem for good or replace the complete laptop. Spent over 60+ hours with Customer Care teams, which is not helping. Cannot keep going on like this.",negative,anger
"@Dell @DellCares but no, not really",neutral,sadness
"Business innovation requires insight, agility and efficiency. How do you get there? RP PRO, LLC recommends starting by proactively managing IT infrastructure with #OpenManage Systems from @Dell. https://t.co/fBcK1lfFMu https://t.co/xWHLkkHCjn",neutral,optimism
@Dell Yessirrrrr #NationalCoffeeDay,positive,joy
"New blog post from @Dell shared on https://t.co/EgfPChB8AT
Re-routing Our Connected and Autonomous Future https://t.co/AW8EHQrbd6
#future #futuretech #techinnovation https://t.co/koX8stKPsr",neutral,joy
"In a free-market economy, the folks @IronMountain can set prices as they see fit. Their customers are also free to find better prices at competitors like @Dell
@H3CGlobal @HPE
https://t.co/reZ56DNTBI",neutral,optimism
"Delighted to chat with many of our partners here in person at @Intel Innovation! @Dell, @Lenovo, @Supermicro_SMCI, @QuantaQCT #IntelON https://t.co/BxIeGW8deN",positive,joy
"A special gracias to our Startup Chica San Antonio 2022 sponsors @eBay, @jcpenney, @Barbie, @HEB, @Dell, @Honda, @SouthsideSATX💜✨ https://t.co/lZ6WWkziHl",positive,joy
"When your team decides to start supporting developers, your #ops must change too. More from @cote and @Dell Developer Community Manager @barton808: https://t.co/W6f1oMiTgV",neutral,optimism
@EmDStowers @LASERGIANT1 @ohwormongod @Ludovician_Vega @Dell our boy snitchin,neutral,anger
A 1st place dmi:Design Value Award goes to @Dell for a packaging modernization initiative that helped them get closer to their corporate Moonshot Sustainability Goal of 100% recycled or renewable packaging by 2030. More at https://t.co/dnhZWWLCQC #designvalue #DVA22,positive,optimism
Reducing deployment and maintenance complexity is the goal behind @dell and @WindRiver's new collaboration. https://t.co/2PxQgPuHUU,positive,optimism
@jaserhunter @Dell Love the sales pitch lol,positive,joy
@Dell india we purchased 52 docking station and we have around 100 users using dell laptop as well as dell monitor now they are refusing to replace my faulty product and disconnecting my every call....,negative,anger
@ashu_k7 @Dell One more example.. their technical support is also worse. https://t.co/20atSgI4fg,negative,anger
*angry screeches about @Dell proprietary MBR windows 8.1 partitions not being able to save as an img in clonezilla *,negative,anger
@socialitebooks @BBYC_Gamers @Dell @Alienware @BestBuyCanada @intelcanada Congratulations!!!,positive,joy
"Thank you to the @dell team for coming out to volunteer today! We truly appreciate your hard work and look forward to seeing you again soon!
If you and your team are interested in helping out at the UMLAUF, visit our website for more information: https://t.co/lVfsZT2ogS https://t.co/eLz0FY0y4M",positive,joy
"@TheCaramelGamer @intel @bravadogaming @Intel_Africa @Dell @DellTech @DellTechMEA @Alienware @IntelUK we love to see it.
Also also actually actually whoever did that artwork? 🔥🔥🔥 am a fan.",positive,joy
"LOVING MY DELL 2 IN 1 LAPTOP
YAYY 🥳🥳
@Dell #DellInspiron #DellLaptop https://t.co/vib96jf3tC",positive,joy
@Azure @OracleItalia @AWS_Italy @lenovoitalia @Dell discussing the future of #HPC during the #hpcroundtable22 in Turin today #highperformancecomputing https://t.co/jJ1WqBulPF,neutral,joy
Attracting talent @AmericanChamber. @marg_cola @Dell speaks of quality of life connectivity and the Opportunity for development being so crucial. Housing availability is now impacting on decision making for potential candidates. #WhyCork,positive,optimism
.@Dell partners with @WindRiver on modular cloud-native telecommunications infrastructure https://t.co/4SWATspwCP @SiliconANGLE @Mike_Wheatley @holgermu @constellationr,neutral,joy
@Dell Not buy Dell Inspiron laptop,neutral,sadness
"@dell #delltechforum reminding us IDC have predicted that by 2024, 50% of everything we consume in technology will be as a service https://t.co/3UBiZJX0LE",neutral,optimism
@RachMurph @HETTShow @Dell Thank you for coming! Great evening,positive,joy
Congratulations to Jason M of Moncton NB on winning a @Dell @Alienware m15 R7 15.6″ gaming laptop from @BestBuyCanada and @intelcanada's gaming days #contest on the blog. Visit https://t.co/VryaY5Rvv9 to learn about tech and for chances to win new tech. https://t.co/T6n0dzF6oL,positive,joy
@MattVisiwig @Dell Sour taste for sure 😶 But don't let ego distract you from what you really want to buy 😁,neutral,optimism
"Massive thank you goes to sponsors @HendersonLoggie @lindsaysnews @Dell @unity, all of our fantastic judges and mentors and the team at @EGX and @ExCeLLondon.
Big congratulations also to all of our other @AbertayDare teams - an amazing year! #Dare2022 https://t.co/jYe4agO7lW",positive,joy
"@timetcetera @rahaug Nah, I just need @Dell to start paying me comissions 😂",neutral,joy
"""Whether youre an engineer, a designer, or work in supply chain management or sales, there are always opportunities to think about sustainability and how you can do things more efficiently."" 👏 — Oliver Campbell, Director of Packaging Engineering, @Dell https://t.co/vUJLTWNFwP https://t.co/GJWAzGfAxJ",positive,optimism
"Hi, my name is @listerepvp and I support @Dell, always.",positive,joy
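The rows above rely on standard CSV quoting: a field that contains a comma or a line break is wrapped in double quotes, and a literal double quote inside such a field is written twice. The seed script in this commit reads the file with csv-parse. The sketch below is only a dependency-free illustration of those quoting rules, not csv-parse's implementation; `parseCsv` and the variable names are hypothetical, and the sample rows are copied from the dataset.

```typescript
// Minimal RFC 4180-style parser (illustrative only): handles quoted fields,
// doubled quotes (""), and line breaks inside quoted fields.
function parseCsv(input: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (input.charAt(i + 1) === '"') {
          field += '"'; // a doubled quote is an escaped literal quote
          i++;
        } else {
          inQuotes = false; // closing quote of the field
        }
      } else {
        field += ch; // commas and newlines are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && input.charAt(i + 1) === "\n") i++; // CRLF is one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush a final record that has no trailing newline.
  if (field.length > 0 || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Header plus three rows from the dataset: plain, multi-line quoted, comma inside quotes.
const sample = [
  "Text,sentiment,emotion",
  "@Dell happy Friday!,positive,joy",
  '"LOVING MY DELL 2 IN 1 LAPTOP',
  "YAYY 🥳🥳",
  '@Dell #DellInspiron #DellLaptop https://t.co/vib96jf3tC",positive,joy',
  '"@Dell @DellCares but no, not really",neutral,sadness',
].join("\n");

const [header, ...records] = parseCsv(sample);
const scenarios = records.map((r) => ({
  text: r[0] ?? "",
  sentiment: r[1] ?? "",
  emotion: r[2] ?? "",
}));

console.log(header);
console.log(scenarios.length);
```

Six physical lines become one header and three records here, because the quoted tweet spans three lines. That is why the file cannot be split on newlines the way the JSONL datasets are.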

prisma/seedAgiEval.ts Normal file
@@ -0,0 +1,127 @@
import { prisma } from "~/server/db";
import { generateNewCell } from "~/server/utils/generateNewCell";
import dedent from "dedent";
import { execSync } from "child_process";
import fs from "fs";
const defaultId = "11111111-1111-1111-1111-111111111112";
await prisma.organization.deleteMany({
where: { id: defaultId },
});
// If there's an existing org, just seed into it
const org =
(await prisma.organization.findFirst({})) ??
(await prisma.organization.create({
data: { id: defaultId },
}));
// Clone the repo from git@github.com:microsoft/AGIEval.git into a tmp dir if it doesn't exist
const tmpDir = "/tmp/agi-eval";
if (!fs.existsSync(tmpDir)) {
execSync(`git clone git@github.com:microsoft/AGIEval.git ${tmpDir}`);
}
const datasets = [
"sat-en",
"sat-math",
"lsat-rc",
"lsat-ar",
"aqua-rat",
"logiqa-en",
"lsat-lr",
"math",
];
type Scenario = {
passage: string | null;
question: string;
options: string[] | null;
label: string;
};
for (const dataset of datasets) {
const experimentName = `AGI-Eval: ${dataset}`;
const oldExperiment = await prisma.experiment.findFirst({
where: {
label: experimentName,
organizationId: org.id,
},
});
if (oldExperiment) {
await prisma.experiment.deleteMany({
where: { id: oldExperiment.id },
});
}
const experiment = await prisma.experiment.create({
data: {
id: oldExperiment?.id ?? undefined,
label: experimentName,
organizationId: org.id,
},
});
const scenarios: Scenario[] = fs
.readFileSync(`${tmpDir}/data/v1/${dataset}.jsonl`, "utf8")
.split("\n")
.filter((line) => line.length > 0)
.map((line) => JSON.parse(line) as Scenario);
console.log("scenarios", scenarios.length);
await prisma.testScenario.createMany({
data: scenarios.slice(0, 30).map((scenario, i) => ({
experimentId: experiment.id,
sortIndex: i,
variableValues: {
passage: scenario.passage,
question: scenario.question,
options: scenario.options?.join("\n"),
label: scenario.label,
},
})),
});
await prisma.templateVariable.createMany({
data: ["passage", "question", "options", "label"].map((label) => ({
experimentId: experiment.id,
label,
})),
});
await prisma.promptVariant.createMany({
data: [
{
experimentId: experiment.id,
label: "Prompt Variant 1",
sortIndex: 0,
model: "gpt-3.5-turbo-0613",
modelProvider: "openai/ChatCompletion",
constructFnVersion: 1,
constructFn: dedent`
definePrompt("openai/ChatCompletion", {
model: "gpt-3.5-turbo-0613",
messages: [
{
role: "user",
content: \`Passage: ${"$"}{scenario.passage}\n\nQuestion: ${"$"}{scenario.question}\n\nOptions: ${"$"}{scenario.options}\n\n Respond with just the letter of the best option in the format Answer: (A).\`
}
],
temperature: 0,
})`,
},
],
});
await prisma.evaluation.createMany({
data: [
{
experimentId: experiment.id,
label: "Eval",
evalType: "CONTAINS",
value: "Answer: ({{label}})",
},
],
});
}
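The `${"$"}{scenario.passage}` pattern in the seeded `constructFn` may look odd. A minimal sketch of what it does, using a plain template literal (the `dedent` tag is omitted, and `ScenarioVars`, `storedSource`, and `render` are hypothetical names): interpolating the string `"$"` emits a literal dollar sign, so the placeholder is written into the stored source unevaluated and can be resolved later. Using `new Function` for the deferred step is an assumption made for illustration; the diff does not show how the application evaluates the stored source.

```typescript
// Hypothetical stand-in for the object the stored function later receives.
type ScenarioVars = { question: string };

// ${"$"} interpolates a literal "$", so the "{scenario.question}" that follows
// stays plain text instead of being evaluated while the seed script runs.
const storedSource: string = `content: \`Question: ${"$"}{scenario.question}\``;

// The stored text still contains the unevaluated placeholder.
console.log(storedSource);

// Assumption: the stored source is evaluated later with `scenario` in scope.
// new Function is used here only to illustrate that deferred evaluation.
const render = new Function(
  "scenario",
  `return \`Question: ${"$"}{scenario.question}\`;`,
) as (scenario: ScenarioVars) => string;

console.log(render({ question: "2 + 2 = ?" }));
```

Writing `\${scenario.question}` is the more common way to escape the placeholder in an ordinary template literal; the sketch keeps the form used in the commit.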

@@ -0,0 +1,113 @@
import { prisma } from "~/server/db";
import dedent from "dedent";
import fs from "fs";
import { parse } from "csv-parse/sync";
const defaultId = "11111111-1111-1111-1111-111111111112";
await prisma.organization.deleteMany({
where: { id: defaultId },
});
// If there's an existing org, just seed into it
const org =
(await prisma.organization.findFirst({})) ??
(await prisma.organization.create({
data: { id: defaultId },
}));
type Scenario = {
text: string;
sentiment: string;
emotion: string;
};
const experimentName = `Twitter Sentiment Analysis`;
const oldExperiment = await prisma.experiment.findFirst({
where: {
label: experimentName,
organizationId: org.id,
},
});
if (oldExperiment) {
await prisma.experiment.deleteMany({
where: { id: oldExperiment.id },
});
}
const experiment = await prisma.experiment.create({
data: {
id: oldExperiment?.id ?? undefined,
label: experimentName,
organizationId: org.id,
},
});
const content = fs.readFileSync("./prisma/datasets/validated_tweets.csv", "utf8");
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const records: any[] = parse(content, { delimiter: ",", from_line: 2 });
console.log("records", records);
const scenarios: Scenario[] = records.map((row) => ({
text: row[0],
sentiment: row[1],
emotion: row[2],
}));
console.log("scenarios", scenarios.length);
await prisma.testScenario.createMany({
data: scenarios.slice(0, 30).map((scenario, i) => ({
experimentId: experiment.id,
sortIndex: i,
variableValues: {
text: scenario.text,
sentiment: scenario.sentiment,
emotion: scenario.emotion,
},
})),
});
await prisma.templateVariable.createMany({
data: ["text", "sentiment", "emotion"].map((label) => ({
experimentId: experiment.id,
label,
})),
});
await prisma.promptVariant.createMany({
data: [
{
experimentId: experiment.id,
label: "Prompt Variant 1",
sortIndex: 0,
model: "gpt-3.5-turbo-0613",
modelProvider: "openai/ChatCompletion",
constructFnVersion: 1,
constructFn: dedent`
definePrompt("openai/ChatCompletion", {
model: "gpt-3.5-turbo-0613",
messages: [
{
role: "user",
content: \`Text: ${"$"}{scenario.text}\n\nRespond with the sentiment (negative|neutral|positive) and emotion (optimism|joy|anger|sadness) of the tweet in this format: "answer: <sentiment>-<emotion>".\`
}
],
temperature: 0,
})`,
},
],
});
await prisma.evaluation.createMany({
data: [
{
experimentId: experiment.id,
label: "Eval",
evalType: "CONTAINS",
value: "answer: {{sentiment}}-{{emotion}}",
},
],
});
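The evaluation seeded above pairs `evalType: "CONTAINS"` with a value that has `{{variable}}` placeholders. The evaluator itself is not part of this diff, so the following is only a sketch of how such a check could work, assuming placeholders are filled from the scenario's variable values and the result is matched as a case-sensitive substring. `fillTemplate` and `passesContains` are hypothetical names.

```typescript
type VariableValues = Record<string, string>;

// Replace each {{name}} placeholder with the scenario's value for that variable.
// Assumption: unknown variables are left as-is so a typo stays visible.
function fillTemplate(template: string, values: VariableValues): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(values, key) ? values[key] : undefined;
    return value === undefined ? placeholder : value;
  });
}

// A CONTAINS-style check: the model output must include the filled-in template.
function passesContains(output: string, template: string, values: VariableValues): boolean {
  return output.includes(fillTemplate(template, values));
}

// One row from the seeded dataset.
const tweet: VariableValues = {
  text: "@Dell happy Friday!",
  sentiment: "positive",
  emotion: "joy",
};
const template = "answer: {{sentiment}}-{{emotion}}";

console.log(fillTemplate(template, tweet));
console.log(passesContains("answer: positive-joy", template, tweet));
console.log(passesContains("answer: negative-anger", template, tweet));
```

Under these assumptions the prompt's requested format and the evaluation value have to agree character for character, which is presumably why the prompt spells out `"answer: <sentiment>-<emotion>"` exactly.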

@@ -37,7 +37,6 @@ export const FloatingLabelInput = ({
borderColor={isFocused ? "blue.500" : "gray.400"}
autoComplete="off"
value={value}
maxHeight={32}
overflowY="auto"
overflowX="hidden"
{...props}

@@ -1,15 +1,27 @@
import { type DragEvent } from "react";
import { useEffect, type DragEvent } from "react";
import { api } from "~/utils/api";
import { isEqual } from "lodash-es";
import { type Scenario } from "./types";
import { useExperiment, useExperimentAccess, useHandledAsyncCallback } from "~/utils/hooks";
import { useState } from "react";
import { Box, Button, Flex, HStack, Icon, Spinner, Stack, Tooltip, VStack } from "@chakra-ui/react";
import {
Box,
Button,
HStack,
Icon,
IconButton,
Spinner,
Stack,
Tooltip,
VStack,
Text,
} from "@chakra-ui/react";
import { cellPadding } from "../constants";
import { BsX } from "react-icons/bs";
import { BsArrowsAngleExpand, BsX } from "react-icons/bs";
import { RiDraggable } from "react-icons/ri";
import { FloatingLabelInput } from "./FloatingLabelInput";
import { ScenarioEditorModal } from "./ScenarioEditorModal";
export default function ScenarioEditor({
scenario,
@@ -28,6 +40,10 @@ export default function ScenarioEditor({
const [values, setValues] = useState<Record<string, string>>(savedValues);
useEffect(() => {
if (savedValues) setValues(savedValues);
}, [savedValues]);
const experiment = useExperiment();
const vars = api.templateVars.list.useQuery({ experimentId: experiment.data?.id ?? "" });
@@ -71,83 +87,98 @@ export default function ScenarioEditor({
[reorderMutation, scenario.id],
);
return (
<HStack
alignItems="flex-start"
px={cellPadding.x}
py={cellPadding.y}
spacing={0}
height="100%"
draggable={!variableInputHovered}
onDragStart={(e) => {
e.dataTransfer.setData("text/plain", scenario.id);
e.currentTarget.style.opacity = "0.4";
}}
onDragEnd={(e) => {
e.currentTarget.style.opacity = "1";
}}
onDragOver={(e) => {
e.preventDefault();
setIsDragTarget(true);
}}
onDragLeave={() => {
setIsDragTarget(false);
}}
onDrop={onReorder}
backgroundColor={isDragTarget ? "gray.100" : "transparent"}
>
{canModify && props.canHide && (
<Stack
alignSelf="flex-start"
opacity={props.hovered ? 1 : 0}
spacing={0}
ml={-cellPadding.x}
>
<Tooltip label="Hide scenario" hasArrow>
{/* for some reason the tooltip can't position itself properly relative to the icon without the wrapping box */}
<Button
variant="unstyled"
color="gray.400"
height="unset"
width="unset"
minW="unset"
onClick={onHide}
_hover={{
color: "gray.800",
cursor: "pointer",
}}
>
<Icon as={hidingInProgress ? Spinner : BsX} boxSize={hidingInProgress ? 4 : 6} />
</Button>
</Tooltip>
<Icon
as={RiDraggable}
boxSize={6}
color="gray.400"
_hover={{ color: "gray.800", cursor: "pointer" }}
/>
</Stack>
)}
const [scenarioEditorModalOpen, setScenarioEditorModalOpen] = useState(false);
{variableLabels.length === 0 ? (
<Box color="gray.500">{vars.data ? "No scenario variables configured" : "Loading..."}</Box>
) : (
<VStack spacing={4} flex={1} py={2}>
{variableLabels.map((key) => {
const value = values[key] ?? "";
const layoutDirection = value.length > 20 ? "column" : "row";
return (
<Flex
key={key}
direction={layoutDirection}
alignItems={layoutDirection === "column" ? "flex-start" : "center"}
flexWrap="wrap"
width="full"
return (
<>
<HStack
alignItems="flex-start"
px={cellPadding.x}
py={cellPadding.y}
spacing={0}
height="100%"
draggable={!variableInputHovered}
onDragStart={(e) => {
e.dataTransfer.setData("text/plain", scenario.id);
e.currentTarget.style.opacity = "0.4";
}}
onDragEnd={(e) => {
e.currentTarget.style.opacity = "1";
}}
onDragOver={(e) => {
e.preventDefault();
setIsDragTarget(true);
}}
onDragLeave={() => {
setIsDragTarget(false);
}}
onDrop={onReorder}
backgroundColor={isDragTarget ? "gray.100" : "transparent"}
>
{canModify && props.canHide && (
<Stack
alignSelf="flex-start"
opacity={props.hovered ? 1 : 0}
spacing={0}
ml={-cellPadding.x}
>
<Tooltip label="Hide scenario" hasArrow>
{/* for some reason the tooltip can't position itself properly relative to the icon without the wrapping box */}
<Button
variant="unstyled"
color="gray.400"
height="unset"
width="unset"
minW="unset"
onClick={onHide}
_hover={{
color: "gray.800",
cursor: "pointer",
}}
>
<Icon as={hidingInProgress ? Spinner : BsX} boxSize={hidingInProgress ? 4 : 6} />
</Button>
</Tooltip>
<Icon
as={RiDraggable}
boxSize={6}
color="gray.400"
_hover={{ color: "gray.800", cursor: "pointer" }}
/>
</Stack>
)}
{variableLabels.length === 0 ? (
<Box color="gray.500">
{vars.data ? "No scenario variables configured" : "Loading..."}
</Box>
) : (
<VStack spacing={4} flex={1} py={2}>
<HStack justifyContent="space-between" w="100%">
<Text color="gray.500">Scenario</Text>
<IconButton
className="fullscreen-toggle"
aria-label="Maximize"
icon={<BsArrowsAngleExpand />}
onClick={() => setScenarioEditorModalOpen(true)}
boxSize={6}
borderRadius={4}
p={1.5}
minW={0}
colorScheme="gray"
color="gray.500"
variant="ghost"
/>
</HStack>
{variableLabels.map((key) => {
const value = values[key] ?? "";
return (
<FloatingLabelInput
key={key}
label={key}
isDisabled={!canModify}
style={{ width: "100%" }}
maxHeight={32}
value={value}
onChange={(e) => {
setValues((prev) => ({ ...prev, [key]: e.target.value }));
@@ -162,27 +193,34 @@ export default function ScenarioEditor({
onMouseEnter={() => setVariableInputHovered(true)}
onMouseLeave={() => setVariableInputHovered(false)}
/>
</Flex>
);
})}
{hasChanged && (
<HStack justify="right">
<Button
size="sm"
onMouseDown={() => {
setValues(savedValues);
}}
colorScheme="gray"
>
Reset
</Button>
<Button size="sm" onMouseDown={onSave} colorScheme="blue">
Save
</Button>
</HStack>
)}
</VStack>
);
})}
{hasChanged && (
<HStack justify="right">
<Button
size="sm"
onMouseDown={() => {
setValues(savedValues);
}}
colorScheme="gray"
>
Reset
</Button>
<Button size="sm" onMouseDown={onSave} colorScheme="blue">
Save
</Button>
</HStack>
)}
</VStack>
)}
</HStack>
{scenarioEditorModalOpen && (
<ScenarioEditorModal
scenarioId={scenario.id}
initialValues={savedValues}
onClose={() => setScenarioEditorModalOpen(false)}
/>
)}
</HStack>
</>
);
}

@@ -0,0 +1,132 @@
import {
Button,
HStack,
Icon,
Modal,
ModalBody,
ModalCloseButton,
ModalContent,
ModalFooter,
ModalHeader,
ModalOverlay,
Spinner,
Text,
VStack,
} from "@chakra-ui/react";
import { useEffect, useState } from "react";
import { BsFileTextFill } from "react-icons/bs";
import { isEqual } from "lodash-es";
import { api } from "~/utils/api";
import {
useScenario,
useHandledAsyncCallback,
useExperiment,
useExperimentAccess,
} from "~/utils/hooks";
import { FloatingLabelInput } from "./FloatingLabelInput";
export const ScenarioEditorModal = ({
scenarioId,
initialValues,
onClose,
}: {
scenarioId: string;
initialValues: Record<string, string>;
onClose: () => void;
}) => {
const utils = api.useContext();
const experiment = useExperiment();
const { canModify } = useExperimentAccess();
const scenario = useScenario(scenarioId);
const savedValues = scenario.data?.variableValues as Record<string, string>;
const [values, setValues] = useState<Record<string, string>>(initialValues);
useEffect(() => {
if (savedValues) setValues(savedValues);
}, [savedValues]);
const hasChanged = !isEqual(savedValues, values);
const mutation = api.scenarios.replaceWithValues.useMutation();
const [onSave, saving] = useHandledAsyncCallback(async () => {
await mutation.mutateAsync({
id: scenarioId,
values,
});
await utils.scenarios.list.invalidate();
}, [mutation, values]);
console.log("scenario", scenario);
const vars = api.templateVars.list.useQuery({ experimentId: experiment.data?.id ?? "" });
const variableLabels = vars.data?.map((v) => v.label) ?? [];
return (
<Modal
isOpen
onClose={onClose}
size={{ base: "xl", sm: "2xl", md: "3xl", lg: "5xl", xl: "7xl" }}
>
<ModalOverlay />
<ModalContent w={1200}>
<ModalHeader>
<HStack>
<Icon as={BsFileTextFill} />
<Text>Scenario</Text>
</HStack>
</ModalHeader>
<ModalCloseButton />
<ModalBody maxW="unset">
<VStack spacing={8}>
{values &&
variableLabels.map((key) => {
const value = values[key] ?? "";
return (
<FloatingLabelInput
key={key}
label={key}
isDisabled={!canModify}
_disabled={{ opacity: 1 }}
style={{ width: "100%" }}
value={value}
onChange={(e) => {
setValues((prev) => ({ ...prev, [key]: e.target.value }));
}}
onKeyDown={(e) => {
if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) {
e.preventDefault();
e.currentTarget.blur();
onSave();
}
}}
/>
);
})}
</VStack>
</ModalBody>
<ModalFooter>
{canModify && (
<HStack>
<Button
colorScheme="gray"
onClick={() => setValues(savedValues)}
minW={24}
isDisabled={!hasChanged}
>
<Text>Reset</Text>
</Button>
<Button colorScheme="blue" onClick={onSave} minW={24} isDisabled={!hasChanged}>
{saving ? <Spinner boxSize={4} /> : <Text>Save</Text>}
</Button>
</HStack>
)}
</ModalFooter>
</ModalContent>
</Modal>
);
};

@@ -48,12 +48,12 @@ export default function VariantStats(props: { variant: PromptVariant }) {
fontSize="xs"
py={cellPadding.y}
>
{showNumFinished && (
<Text>
{data.outputCount} / {data.scenarioCount}
</Text>
)}
<HStack px={cellPadding.x}>
{showNumFinished && (
<Text>
{data.outputCount} / {data.scenarioCount}
</Text>
)}
{data.evalResults.map((result) => {
const passedFrac = result.passCount / result.totalCount;
return (

@@ -41,7 +41,21 @@ export const scenariosRouter = createTRPCRouter({
count,
};
}),
get: protectedProcedure.input(z.object({ id: z.string() })).query(async ({ input, ctx }) => {
const scenario = await prisma.testScenario.findUnique({
where: {
id: input.id,
},
});
if (!scenario) {
throw new Error(`Scenario with id ${input.id} does not exist`);
}
await requireCanViewExperiment(scenario.experimentId, ctx);
return scenario;
}),
create: protectedProcedure
.input(
z.object({

@@ -107,4 +107,8 @@ export const useScenarios = () => {
);
};
export const useScenario = (scenarioId: string) => {
return api.scenarios.get.useQuery({ id: scenarioId });
};
export const useVisibleScenarioIds = () => useScenarios().data?.scenarios.map((s) => s.id) ?? [];