bump python client

2023-08-25 15:39:29 -07:00
65 changed files with 314 additions and 4124 deletions
--- a/README.md
+++ b/README.md
@@ -1,52 +1,14 @@
- <p align="center">
-  <a href="https://openpipe.ai">
-    <img height="70" src="https://github.com/openpipe/openpipe/assets/41524992/70af25fb-1f90-42d9-8a20-3606e3b5aaba" alt="logo">
-  </a>
-</p>
-<h1 align="center">
-  OpenPipe
-</h1>
+# OpenPipe

-<p align="center">
-  <i>Turn expensive prompts into cheap fine-tuned models.</i>
-</p>
+OpenPipe is a flexible playground for comparing and optimizing LLM prompts. It lets you quickly generate, test and compare candidate prompts, and can automatically [translate](#-translate-between-model-apis) those prompts between models.

-<p align="center">
-  <a href="/LICENSE"><img alt="License Apache-2.0" src="https://img.shields.io/github/license/openpipe/openpipe?style=flat-square"></a>
-  <a href='http://makeapullrequest.com'><img alt='PRs Welcome' src='https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square'/></a>
-  <a href="https://github.com/openpipe/openpipe/graphs/commit-activity"><img alt="GitHub commit activity" src="https://img.shields.io/github/commit-activity/m/openpipe/openpipe?style=flat-square"/></a>
-  <a href="https://github.com/openpipe/openpipe/issues"><img alt="GitHub closed issues" src="https://img.shields.io/github/issues-closed/openpipe/openpipe?style=flat-square"/></a>
-</p>
-
-<p align="center">
-  <a href="https://app.openpipe.ai/">Hosted App</a> - <a href="#running-locally">Running Locally</a> - <a href="#sample-experiments">Experiments</a>
-</p>
-
-<br>
-Use powerful but expensive LLMs to fine-tune smaller and cheaper models suited to your exact needs. Evaluate model and prompt combinations in the playground. Query your past requests and export optimized training data. Try it out at https://app.openpipe.ai or <a href="#running-locally">run it locally</a>.
-<br>
-
-
-## 🪛 Features
-
- * <b>Experiment</b>
-   * Bulk-test wide-reaching scenarios using code templating.
-   * Seamlessly translate prompts across different model APIs.
-   * Tap into autogenerated scenarios for fresh test perspectives.
-
- * <b>Fine-Tune (Beta)</b>
-   * Easy integration with OpenPipe's SDK in both Python and JS.
-   * Swiftly query logs using intuitive built-in filters.
-   * Export data in multiple training formats, including Alpaca and ChatGPT, with deduplication.
-   
-<img src="https://github.com/openpipe/openpipe/assets/41524992/eaa8b92d-4536-4f63-bbef-4b0b1a60f6b5" alt="fine-tune demo">
-
-<!-- <img height="400px" src="https://github.com/openpipe/openpipe/assets/41524992/66bb1843-cb72-4130-a369-eec2df3b8201" alt="playground demo"> -->
+<img src="https://github.com/openpipe/openpipe/assets/41524992/66bb1843-cb72-4130-a369-eec2df3b8201" alt="demo">

+You can use our hosted version of OpenPipe at https://openpipe.ai. You can also clone this repository and [run it locally](#running-locally).

 ## Sample Experiments

-These are sample experiments users have created that show how OpenPipe works. Feel free to fork them and start experimenting yourself.
+These are simple experiments users have created that show how OpenPipe works. Feel free to fork them and start experimenting yourself.

 - [Twitter Sentiment Analysis](https://app.openpipe.ai/experiments/62c20a73-2012-4a64-973c-4b665ad46a57)
 - [Reddit User Needs](https://app.openpipe.ai/experiments/22222222-2222-2222-2222-222222222222)
@@ -55,25 +17,37 @@ These are sample experiments users have created that show how OpenPipe works. Fe

 ## Supported Models

-#### OpenAI
-  - [GPT 3.5 Turbo](https://platform.openai.com/docs/guides/gpt/chat-completions-api)
-  - [GPT 3.5 Turbo 16k](https://platform.openai.com/docs/guides/gpt/chat-completions-api)
-  - [GPT 4](https://openai.com/gpt-4)
-#### Llama2
-  - [7b chat](https://replicate.com/a16z-infra/llama7b-v2-chat)
-  - [13b chat](https://replicate.com/a16z-infra/llama13b-v2-chat)
-  - [70b chat](https://replicate.com/replicate/llama70b-v2-chat)
-#### Llama2 Fine-Tunes
-  - [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B)
-  - [Open-Orca/OpenOrca-Platypus2-13B](https://huggingface.co/Open-Orca/OpenOrca-Platypus2-13B)
-  - [NousResearch/Nous-Hermes-Llama2-13b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b)
-  - [jondurbin/airoboros-l2-13b-gpt4-2.0](https://huggingface.co/jondurbin/airoboros-l2-13b-gpt4-2.0)
-  - [lmsys/vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5)
-  - [Gryphe/MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b)
-  - [NousResearch/Nous-Hermes-llama-2-7b](https://huggingface.co/NousResearch/Nous-Hermes-llama-2-7b)
-#### Anthropic
-  - [Claude 1 Instant](https://www.anthropic.com/index/introducing-claude)
-  - [Claude 2](https://www.anthropic.com/index/claude-2)
+- All models available through the OpenAI [chat completion API](https://platform.openai.com/docs/guides/gpt/chat-completions-api)
+- Llama2 [7b chat](https://replicate.com/a16z-infra/llama7b-v2-chat), [13b chat](https://replicate.com/a16z-infra/llama13b-v2-chat), [70b chat](https://replicate.com/replicate/llama70b-v2-chat).
+- Anthropic's [Claude 1 Instant](https://www.anthropic.com/index/introducing-claude) and [Claude 2](https://www.anthropic.com/index/claude-2)
+
+## Features
+
+### 🔍 Visualize Responses
+
+Inspect prompt completions side-by-side.
+
+### 🧪 Bulk-Test
+
+OpenPipe lets you _template_ a prompt. Use the templating feature to run the prompts you're testing against many potential inputs for broad coverage of your problem space.
+
+### 📟 Translate between Model APIs
+
+Write your prompt in one format and automatically convert it to work with any other model.
+
+<!-- <img width="480" alt="Screenshot 2023-08-01 at 11 55 38 PM" src="https://github.com/OpenPipe/OpenPipe/assets/41524992/1e19ccf2-96b6-4e93-a3a5-1449710d1b5b" alt="translate between models"> -->
+
+### 🛠️ Refine Your Prompts Automatically
+
+Use a growing database of best-practice refinements to improve your prompts automatically.
+
+<!-- <img width="480" alt="Screenshot 2023-08-01 at 11 55 38 PM" src="https://github.com/OpenPipe/OpenPipe/assets/41524992/87a27fe7-daef-445c-a5e2-1c82b23f9f99" alt="add function call"> -->
+
+### 🪄 Auto-generate Test Scenarios
+
+OpenPipe includes a tool to generate new test scenarios based on your existing prompts and scenarios. Just click "Autogenerate Scenario" to try it out!
+
+<!-- <img width="600" src="https://github.com/openpipe/openpipe/assets/41524992/219a844e-3f4e-4f6b-8066-41348b42977b" alt="auto-generate"> -->

 ## Running Locally

--- a/app/Dockerfile
+++ b/app/Dockerfile
@@ -23,6 +23,7 @@ ARG NEXT_PUBLIC_SOCKET_URL
 ARG NEXT_PUBLIC_HOST
 ARG NEXT_PUBLIC_SENTRY_DSN
 ARG SENTRY_AUTH_TOKEN
+ARG NEXT_PUBLIC_FF_SHOW_BETA_FEATURES

 WORKDIR /code
 COPY --from=deps /code/node_modules ./node_modules
--- a/app/LICENSE
+++ b/app/LICENSE
--- a/app/package.json
+++ b/app/package.json
@@ -48,7 +48,6 @@
    "@trpc/react-query": "^10.26.0",
    "@trpc/server": "^10.26.0",
    "@vercel/og": "^0.5.9",
-    "archiver": "^6.0.0",
    "ast-types": "^0.14.2",
    "chroma-js": "^2.4.2",
    "concurrently": "^8.2.0",
@@ -79,8 +78,7 @@
    "nextjs-routes": "^2.0.1",
    "nodemailer": "^6.9.4",
    "openai": "4.0.0-beta.7",
-    "openpipe": "^0.3.0",
-    "openpipe-dev": "workspace:^",
+    "openpipe": "workspace:*",
    "pg": "^8.11.2",
    "pluralize": "^8.0.0",
    "posthog-js": "^1.75.3",
@@ -101,7 +99,6 @@
    "replicate": "^0.12.3",
    "socket.io": "^4.7.1",
    "socket.io-client": "^4.7.1",
-    "stream-buffers": "^3.0.2",
    "superjson": "1.12.2",
    "trpc-openapi": "^1.2.0",
    "tsx": "^3.12.7",
@@ -114,7 +111,6 @@
  },
  "devDependencies": {
    "@openapi-contrib/openapi-schema-to-json-schema": "^4.0.5",
-    "@types/archiver": "^5.3.2",
    "@types/babel__core": "^7.20.1",
    "@types/babel__standalone": "^7.1.4",
    "@types/chroma-js": "^2.4.0",
@@ -131,7 +127,6 @@
    "@types/react": "^18.2.6",
    "@types/react-dom": "^18.2.4",
    "@types/react-syntax-highlighter": "^15.5.7",
-    "@types/stream-buffers": "^3.0.4",
    "@types/uuid": "^9.0.2",
    "@typescript-eslint/eslint-plugin": "^5.59.6",
    "@typescript-eslint/parser": "^5.59.6",
--- a/app/prisma/deleteOneFineTune.ts
+++ b/app/prisma/deleteOneFineTune.ts
@@ -1,12 +0,0 @@
-import { prisma } from "~/server/db";
-
-// delete most recent fineTune
-const mostRecentFineTune = await prisma.fineTune.findFirst({
-  orderBy: { createdAt: "desc" },
-});
-
-if (mostRecentFineTune) {
-  await prisma.fineTune.delete({
-    where: { id: mostRecentFineTune.id },
-  });
-}
--- a/app/prisma/seedDashboard.ts
+++ b/app/prisma/seedDashboard.ts
@@ -80,7 +80,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    },
    respStatus: 200,
    respPayload: {
-      id: "chatcmpl-7",
+      id: "chatcmpl-7lNspqePJWVyXwXebupxb1eMozo6Q",
      model: "gpt-3.5-turbo-0613",
      usage: {
        total_tokens: 241,
@@ -108,7 +108,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    inputTokens: 236,
    outputTokens: 5,
    finishReason: "stop",
-    tags: [{ name: "prompt_id", value: "define_func" }],
+    tags: [],
  },
  {
    reqPayload: {
@@ -167,7 +167,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    },
    respStatus: 200,
    respPayload: {
-      id: "chatcmpl-7",
+      id: "chatcmpl-7lNifmc5AncyAvleZRDBhAcLFYBIT",
      model: "gpt-3.5-turbo-0613",
      usage: {
        total_tokens: 227,
@@ -210,7 +210,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    },
    respStatus: 200,
    respPayload: {
-      id: "chatcmpl-7",
+      id: "chatcmpl-7lNh1TtrsJVgz3Nj70bKkZZk7xPi7",
      model: "gpt-3.5-turbo-0613",
      usage: {
        total_tokens: 21,
@@ -234,7 +234,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    inputTokens: 14,
    outputTokens: 7,
    finishReason: "stop",
-    tags: [{ name: "prompt_id", value: "translate_text" }],
+    tags: [{ name: "prompt_id", value: "id2" }],
  },
  {
    reqPayload: {
@@ -281,7 +281,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    },
    respStatus: 200,
    respPayload: {
-      id: "chatcmpl-7",
+      id: "chatcmpl-7lQS3MktOT8BTgNEytl9dkyssCQqL",
      model: "gpt-4-0613",
      usage: {
        total_tokens: 2910,
@@ -311,7 +311,7 @@ const MODEL_RESPONSE_TEMPLATES: {
    outputTokens: 108,
    finishReason: "stop",
    tags: [
-      { name: "prompt_id", value: "chatcmpl-7" },
+      { name: "prompt_id", value: "chatcmpl-7lQS3MktOT8BTgNEytl9dkyssCQqL" },
      { name: "some_other_tag", value: "some_other_value" },
    ],
  },
@@ -339,7 +339,7 @@ const loggedCallsToCreate: Prisma.LoggedCallCreateManyInput[] = [];
 const loggedCallModelResponsesToCreate: Prisma.LoggedCallModelResponseCreateManyInput[] = [];
 const loggedCallsToUpdate: Prisma.LoggedCallUpdateArgs[] = [];
 const loggedCallTagsToCreate: Prisma.LoggedCallTagCreateManyInput[] = [];
-for (let i = 0; i < 11437; i++) {
+for (let i = 0; i < 1437; i++) {
  const loggedCallId = uuidv4();
  const loggedCallModelResponseId = uuidv4();
  const template =
--- a/app/src/components/ChangeModelModal/ChangeModelModal.tsx
+++ b/app/src/components/ChangeModelModal/ChangeModelModal.tsx
@@ -1,4 +1,3 @@
-import { useState, useMemo, useCallback } from "react";
 import {
  Button,
  HStack,
@@ -15,18 +14,16 @@ import {
  VStack,
 } from "@chakra-ui/react";
 import { type PromptVariant } from "@prisma/client";
-import { isString } from "lodash-es";
+import { isObject, isString } from "lodash-es";
+import { useState } from "react";
 import { RiExchangeFundsFill } from "react-icons/ri";
-
 import { type ProviderModel } from "~/modelProviders/types";
 import { api } from "~/utils/api";
-import { useExperiment, useHandledAsyncCallback } from "~/utils/hooks";
+import { useExperiment, useHandledAsyncCallback, useVisibleScenarioIds } from "~/utils/hooks";
 import { lookupModel, modelLabel } from "~/utils/utils";
 import CompareFunctions from "../RefinePromptModal/CompareFunctions";
 import { ModelSearch } from "./ModelSearch";
 import { ModelStatsCard } from "./ModelStatsCard";
-import { maybeReportError } from "~/utils/errorHandling/maybeReportError";
-import { useAppStore } from "~/state/store";

 export const ChangeModelModal = ({
  variant,
@@ -35,43 +32,48 @@ export const ChangeModelModal = ({
  variant: PromptVariant;
  onClose: () => void;
 }) => {
-  const editorOptionsMap = useAppStore((s) => s.sharedVariantEditor.editorOptionsMap);
-  const originalPromptFn = useMemo(
-    () => editorOptionsMap[variant.uiId]?.getContent() || "",
-    [editorOptionsMap, variant.uiId],
-  );
-
  const originalModel = lookupModel(variant.modelProvider, variant.model);
  const [selectedModel, setSelectedModel] = useState({
    provider: variant.modelProvider,
    model: variant.model,
  } as ProviderModel);
  const [convertedModel, setConvertedModel] = useState<ProviderModel | undefined>();
-  const [modifiedPromptFn, setModifiedPromptFn] = useState<string>();
+  const visibleScenarios = useVisibleScenarioIds();
+
+  const utils = api.useContext();

  const experiment = useExperiment();

-  const { mutateAsync: getModifiedPromptMutateAsync } =
+  const { mutateAsync: getModifiedPromptMutateAsync, data: modifiedPromptFn } =
    api.promptVariants.getModifiedPromptFn.useMutation();

  const [getModifiedPromptFn, modificationInProgress] = useHandledAsyncCallback(async () => {
    if (!experiment) return;

-    const resp = await getModifiedPromptMutateAsync({
+    await getModifiedPromptMutateAsync({
      id: variant.id,
-      originalPromptFn,
      newModel: selectedModel,
    });
-    if (maybeReportError(resp)) return;
-    setModifiedPromptFn(resp.payload);
    setConvertedModel(selectedModel);
  }, [getModifiedPromptMutateAsync, onClose, experiment, variant, selectedModel]);

-  const replaceVariant = useCallback(() => {
-    if (!modifiedPromptFn) return;
-    editorOptionsMap[variant.uiId]?.setContent(modifiedPromptFn);
+  const replaceVariantMutation = api.promptVariants.replaceVariant.useMutation();
+
+  const [replaceVariant, replacementInProgress] = useHandledAsyncCallback(async () => {
+    if (
+      !variant.experimentId ||
+      !modifiedPromptFn ||
+      (isObject(modifiedPromptFn) && "status" in modifiedPromptFn)
+    )
+      return;
+    await replaceVariantMutation.mutateAsync({
+      id: variant.id,
+      promptConstructor: modifiedPromptFn,
+      streamScenarios: visibleScenarios,
+    });
+    await utils.promptVariants.list.invalidate();
    onClose();
-  }, [variant.uiId, editorOptionsMap, onClose, modifiedPromptFn]);
+  }, [replaceVariantMutation, variant, onClose, modifiedPromptFn]);

  const originalLabel = modelLabel(variant.modelProvider, variant.model);
  const selectedLabel = modelLabel(selectedModel.provider, selectedModel.model);
@@ -128,9 +130,9 @@ export const ChangeModelModal = ({
              colorScheme="blue"
              onClick={replaceVariant}
              minW={24}
-              isDisabled={!convertedModel || modificationInProgress}
+              isDisabled={!convertedModel || modificationInProgress || replacementInProgress}
            >
-              Accept
+              {replacementInProgress ? <Spinner boxSize={4} /> : <Text>Accept</Text>}
            </Button>
          </HStack>
        </ModalFooter>
--- a/app/src/components/InfoCircle.tsx
+++ b/app/src/components/InfoCircle.tsx
@@ -1,14 +0,0 @@
-import { Tooltip, Icon, VStack } from "@chakra-ui/react";
-import { RiInformationFill } from "react-icons/ri";
-
-const InfoCircle = ({ tooltipText }: { tooltipText: string }) => {
-  return (
-    <Tooltip label={tooltipText} fontSize="sm" shouldWrapChildren maxW={80}>
-      <VStack>
-        <Icon as={RiInformationFill} boxSize={5} color="gray.500" />
-      </VStack>
-    </Tooltip>
-  );
-};
-
-export default InfoCircle;
--- a/app/src/components/OutputsTable/OutputCell/OutputCell.tsx
+++ b/app/src/components/OutputsTable/OutputCell/OutputCell.tsx
@@ -147,10 +147,9 @@ export default function OutputCell({
                <ResponseLog
                  time={response.receivedAt}
                  title="Response received from API"
-                  message={[
-                    response.statusCode ? `Status: ${response.statusCode}\n` : "",
-                    response.errorMessage ?? "",
-                  ].join("")}
+                  message={`statusCode: ${response.statusCode ?? ""}\n ${
+                    response.errorMessage ?? ""
+                  }`}
                />
              )}
            </Fragment>
--- a/app/src/components/OutputsTable/VariantEditor.tsx
+++ b/app/src/components/OutputsTable/VariantEditor.tsx
@@ -10,7 +10,7 @@ import {
 } from "@chakra-ui/react";
 import { useCallback, useEffect, useRef, useState } from "react";
 import { FiMaximize, FiMinimize } from "react-icons/fi";
-import { type CreatedEditor, editorBackground } from "~/state/sharedVariantEditor.slice";
+import { editorBackground } from "~/state/sharedVariantEditor.slice";
 import { useAppStore } from "~/state/store";
 import { api } from "~/utils/api";
 import {
@@ -24,10 +24,8 @@ import { type PromptVariant } from "./types";
 export default function VariantEditor(props: { variant: PromptVariant }) {
  const { canModify } = useExperimentAccess();
  const monaco = useAppStore.use.sharedVariantEditor.monaco();
-  const updateOptionsForEditor = useAppStore.use.sharedVariantEditor.updateOptionsForEditor();
-  const editorRef = useRef<CreatedEditor | null>(null);
+  const editorRef = useRef<ReturnType<NonNullable<typeof monaco>["editor"]["create"]> | null>(null);
  const containerRef = useRef<HTMLDivElement | null>(null);
-  const lastSavedFnRef = useRef(props.variant.promptConstructor);
  const [editorId] = useState(() => `editor_${Math.random().toString(36).substring(7)}`);
  const [isChanged, setIsChanged] = useState(false);

@@ -50,18 +48,22 @@ export default function VariantEditor(props: { variant: PromptVariant }) {
  }, [isFullscreen, toggleFullscreen]);

  const lastSavedFn = props.variant.promptConstructor;
-  useEffect(() => {
-    // Store in ref so that we can access it dynamically
-    lastSavedFnRef.current = lastSavedFn;
-  }, [lastSavedFn]);

  const modifierKey = useModifierKeyLabel();

  const checkForChanges = useCallback(() => {
    if (!editorRef.current) return;
    const currentFn = editorRef.current.getValue();
-    setIsChanged(currentFn.length > 0 && currentFn !== lastSavedFnRef.current);
-  }, [editorRef]);
+    setIsChanged(currentFn.length > 0 && currentFn !== lastSavedFn);
+  }, [lastSavedFn]);
+
+  const matchUpdatedSavedFn = useCallback(() => {
+    if (!editorRef.current) return;
+    editorRef.current.setValue(lastSavedFn);
+    setIsChanged(false);
+  }, [lastSavedFn]);
+
+  useEffect(matchUpdatedSavedFn, [matchUpdatedSavedFn, lastSavedFn]);

  const replaceVariant = api.promptVariants.replaceVariant.useMutation();
  const utils = api.useContext();
@@ -134,11 +136,6 @@ export default function VariantEditor(props: { variant: PromptVariant }) {
        readOnly: !canModify,
      });

-      updateOptionsForEditor(props.variant.uiId, {
-        getContent: () => editorRef.current?.getValue() || "",
-        setContent: (content) => editorRef.current?.setValue(content),
-      });
-
      // Workaround because otherwise the commands only work on whatever
      // editor was loaded on the page last.
      // https://github.com/microsoft/monaco-editor/issues/2947#issuecomment-1422265201
@@ -158,7 +155,7 @@ export default function VariantEditor(props: { variant: PromptVariant }) {
        });
      });

-      const checkForChangesListener = editorRef.current.onDidChangeModelContent(checkForChanges);
+      editorRef.current.onDidChangeModelContent(checkForChanges);

      const resizeObserver = new ResizeObserver(() => {
        editorRef.current?.layout();
@@ -167,7 +164,6 @@ export default function VariantEditor(props: { variant: PromptVariant }) {

      return () => {
        resizeObserver.disconnect();
-        checkForChangesListener.dispose();
        editorRef.current?.dispose();
      };
    }
@@ -175,7 +171,7 @@ export default function VariantEditor(props: { variant: PromptVariant }) {
    // We intentionally skip the onSave and props.savedConfig dependencies here because
    // we don't want to re-render the editor from scratch
    /* eslint-disable-next-line react-hooks/exhaustive-deps */
-  }, [monaco, editorId, updateOptionsForEditor]);
+  }, [monaco, editorId]);

  useEffect(() => {
    if (!editorRef.current) return;
--- a/app/src/components/RefinePromptModal/RefinePromptModal.tsx
+++ b/app/src/components/RefinePromptModal/RefinePromptModal.tsx
@@ -1,4 +1,3 @@
-import { useState, useMemo, useCallback } from "react";
 import {
  Button,
  Modal,
@@ -10,23 +9,22 @@ import {
  ModalOverlay,
  VStack,
  Text,
+  Spinner,
  HStack,
  Icon,
  SimpleGrid,
 } from "@chakra-ui/react";
 import { BsStars } from "react-icons/bs";
 import { api } from "~/utils/api";
-import { useHandledAsyncCallback } from "~/utils/hooks";
+import { useHandledAsyncCallback, useVisibleScenarioIds } from "~/utils/hooks";
 import { type PromptVariant } from "@prisma/client";
-
+import { useState } from "react";
 import CompareFunctions from "./CompareFunctions";
 import { CustomInstructionsInput } from "../CustomInstructionsInput";
 import { RefineAction } from "./RefineAction";
-import { isString } from "lodash-es";
+import { isObject, isString } from "lodash-es";
 import { type RefinementAction, type SupportedProvider } from "~/modelProviders/types";
 import frontendModelProviders from "~/modelProviders/frontendModelProviders";
-import { useAppStore } from "~/state/store";
-import { maybeReportError } from "~/utils/errorHandling/maybeReportError";

 export const RefinePromptModal = ({
  variant,
@@ -35,23 +33,19 @@ export const RefinePromptModal = ({
  variant: PromptVariant;
  onClose: () => void;
 }) => {
-  const editorOptionsMap = useAppStore((s) => s.sharedVariantEditor.editorOptionsMap);
-  const originalPromptFn = useMemo(
-    () => editorOptionsMap[variant.uiId]?.getContent() || "",
-    [editorOptionsMap, variant.uiId],
-  );
+  const utils = api.useContext();
+  const visibleScenarios = useVisibleScenarioIds();

  const refinementActions =
    frontendModelProviders[variant.modelProvider as SupportedProvider].refinementActions || {};

-  const { mutateAsync: getModifiedPromptMutateAsync } =
+  const { mutateAsync: getModifiedPromptMutateAsync, data: refinedPromptFn } =
    api.promptVariants.getModifiedPromptFn.useMutation();
  const [instructions, setInstructions] = useState<string>("");

  const [activeRefineActionLabel, setActiveRefineActionLabel] = useState<string | undefined>(
    undefined,
  );
-  const [refinedPromptFn, setRefinedPromptFn] = useState<string>();

  const [getModifiedPromptFn, modificationInProgress] = useHandledAsyncCallback(
    async (label?: string) => {
@@ -60,22 +54,31 @@ export const RefinePromptModal = ({
        ? (refinementActions[label] as RefinementAction).instructions
        : instructions;
      setActiveRefineActionLabel(label);
-      const resp = await getModifiedPromptMutateAsync({
+      await getModifiedPromptMutateAsync({
        id: variant.id,
-        originalPromptFn,
        instructions: updatedInstructions,
      });
-      if (maybeReportError(resp)) return;
-      setRefinedPromptFn(resp.payload);
    },
    [getModifiedPromptMutateAsync, onClose, variant, instructions, setActiveRefineActionLabel],
  );

-  const replaceVariant = useCallback(() => {
-    if (!refinedPromptFn) return;
-    editorOptionsMap[variant.uiId]?.setContent(refinedPromptFn);
+  const replaceVariantMutation = api.promptVariants.replaceVariant.useMutation();
+
+  const [replaceVariant, replacementInProgress] = useHandledAsyncCallback(async () => {
+    if (
+      !variant.experimentId ||
+      !refinedPromptFn ||
+      (isObject(refinedPromptFn) && "status" in refinedPromptFn)
+    )
+      return;
+    await replaceVariantMutation.mutateAsync({
+      id: variant.id,
+      promptConstructor: refinedPromptFn,
+      streamScenarios: visibleScenarios,
+    });
+    await utils.promptVariants.list.invalidate();
    onClose();
-  }, [variant.uiId, editorOptionsMap, onClose, refinedPromptFn]);
+  }, [replaceVariantMutation, variant, onClose, refinedPromptFn]);

  return (
    <Modal
@@ -123,7 +126,7 @@ export const RefinePromptModal = ({
              />
            </VStack>
            <CompareFunctions
-              originalFunction={originalPromptFn}
+              originalFunction={variant.promptConstructor}
              newFunction={isString(refinedPromptFn) ? refinedPromptFn : undefined}
              maxH="40vh"
            />
@@ -136,9 +139,9 @@ export const RefinePromptModal = ({
              colorScheme="blue"
              onClick={replaceVariant}
              minW={24}
-              isDisabled={!refinedPromptFn}
+              isDisabled={replacementInProgress || !refinedPromptFn}
            >
-              Accept
+              {replacementInProgress ? <Spinner boxSize={4} /> : <Text>Accept</Text>}
            </Button>
          </HStack>
        </ModalFooter>
--- a/app/src/components/experiments/ExperimentCard.tsx
+++ b/app/src/components/experiments/ExperimentCard.tsx
@@ -98,7 +98,9 @@ export const NewExperimentCard = () => {
    >
      <VStack align="center" justify="center" w="full" h="full" p={4} onClick={createExperiment}>
        <Icon as={isLoading ? Spinner : BsPlusSquare} boxSize={8} />
-        <Text ml={2}>New Experiment</Text>
+        <Text display={{ base: "none", md: "block" }} ml={2}>
+          New Experiment
+        </Text>
      </VStack>
    </Card>
  );
--- a/app/src/components/nav/AppShell.tsx
+++ b/app/src/components/nav/AppShell.tsx
@@ -16,13 +16,13 @@ import Link from "next/link";
 import { BsGearFill, BsGithub, BsPersonCircle } from "react-icons/bs";
 import { IoStatsChartOutline } from "react-icons/io5";
 import { RiHome3Line, RiFlaskLine } from "react-icons/ri";
-import { AiOutlineThunderbolt } from "react-icons/ai";
+import { FaRobot } from "react-icons/fa";
 import { signIn, useSession } from "next-auth/react";
+import { env } from "~/env.mjs";
 import ProjectMenu from "./ProjectMenu";
 import NavSidebarOption from "./NavSidebarOption";
 import IconLink from "./IconLink";
 import { BetaModal } from "./BetaModal";
-import { useAppStore } from "~/state/store";

 const Divider = () => <Box h="1px" bgColor="gray.300" w="full" />;

@@ -75,7 +75,7 @@ const NavSidebar = () => {

            <IconLink icon={RiHome3Line} label="Dashboard" href="/dashboard" beta />
            <IconLink icon={IoStatsChartOutline} label="Request Logs" href="/request-logs" beta />
-            <IconLink icon={AiOutlineThunderbolt} label="Fine Tunes" href="/fine-tunes" beta />
+            <IconLink icon={FaRobot} label="Fine Tunes" href="/fine-tunes" beta />
            <IconLink icon={RiFlaskLine} label="Experiments" href="/experiments" />
            <VStack w="full" alignItems="flex-start" spacing={0} pt={8}>
              <Text
@@ -167,9 +167,6 @@ export default function AppShell({
    }
  }, [requireAuth, user, authLoading]);

-  const flags = useAppStore((s) => s.featureFlags.featureFlags);
-  const flagsLoaded = useAppStore((s) => s.featureFlags.flagsLoaded);
-
  return (
    <>
      <Flex h={vh} w="100vw">
@@ -181,7 +178,7 @@ export default function AppShell({
          {children}
        </Box>
      </Flex>
-      {requireBeta && flagsLoaded && !flags.betaAccess && <BetaModal />}
+      {requireBeta && !env.NEXT_PUBLIC_FF_SHOW_BETA_FEATURES && <BetaModal />}
    </>
  );
 }
--- a/app/src/components/nav/ProjectMenu.tsx
+++ b/app/src/components/nav/ProjectMenu.tsx
@@ -14,7 +14,6 @@ import {
  Link as ChakraLink,
  Image,
  Box,
-  Portal,
 } from "@chakra-ui/react";
 import { useEffect } from "react";
 import Link from "next/link";
@@ -110,66 +109,64 @@ export default function ProjectMenu() {
            </HStack>
          </NavSidebarOption>
        </PopoverTrigger>
-        <Portal>
-          <PopoverContent
-            _focusVisible={{ outline: "unset" }}
-            w={220}
-            ml={{ base: 2, md: 0 }}
-            boxShadow="0 0 40px 4px rgba(0, 0, 0, 0.1);"
-            fontSize="sm"
-          >
-            <VStack alignItems="flex-start" spacing={1} py={1}>
-              <Text px={3} py={2}>
-                {user?.user.email}
-              </Text>
-              <Divider />
-              <Text alignSelf="flex-start" fontWeight="bold" px={3} pt={2}>
-                Your Projects
-              </Text>
-              <VStack spacing={0} w="full" px={1}>
-                {projects?.map((proj) => (
-                  <ProjectOption
-                    key={proj.id}
-                    proj={proj}
-                    isActive={proj.id === selectedProjectId}
-                    onClose={popover.onClose}
-                  />
-                ))}
-                <HStack
-                  as={Button}
-                  variant="ghost"
-                  colorScheme="blue"
-                  color="blue.400"
-                  fontSize="sm"
-                  justifyContent="flex-start"
-                  onClick={createProject}
-                  w="full"
-                  borderRadius={4}
-                  spacing={0}
-                >
-                  <Text>Add project</Text>
-                  <Icon as={isLoading ? Spinner : BsPlus} boxSize={4} strokeWidth={0.5} />
-                </HStack>
-              </VStack>
-
-              <Divider />
-              <VStack w="full" px={1}>
-                <ChakraLink
-                  onClick={() => {
-                    signOut().catch(console.error);
-                  }}
-                  _hover={{ bgColor: "gray.200", textDecoration: "none" }}
-                  w="full"
-                  py={2}
-                  px={2}
-                  borderRadius={4}
-                >
-                  <Text>Sign out</Text>
-                </ChakraLink>
-              </VStack>
+        <PopoverContent
+          _focusVisible={{ outline: "unset" }}
+          w={220}
+          ml={{ base: 2, md: 0 }}
+          boxShadow="0 0 40px 4px rgba(0, 0, 0, 0.1);"
+          fontSize="sm"
+        >
+          <VStack alignItems="flex-start" spacing={1} py={1}>
+            <Text px={3} py={2}>
+              {user?.user.email}
+            </Text>
+            <Divider />
+            <Text alignSelf="flex-start" fontWeight="bold" px={3} pt={2}>
+              Your Projects
+            </Text>
+            <VStack spacing={0} w="full" px={1}>
+              {projects?.map((proj) => (
+                <ProjectOption
+                  key={proj.id}
+                  proj={proj}
+                  isActive={proj.id === selectedProjectId}
+                  onClose={popover.onClose}
+                />
+              ))}
+              <HStack
+                as={Button}
+                variant="ghost"
+                colorScheme="blue"
+                color="blue.400"
+                fontSize="sm"
+                justifyContent="flex-start"
+                onClick={createProject}
+                w="full"
+                borderRadius={4}
+                spacing={0}
+              >
+                <Text>Add project</Text>
+                <Icon as={isLoading ? Spinner : BsPlus} boxSize={4} strokeWidth={0.5} />
+              </HStack>
            </VStack>
-          </PopoverContent>
-        </Portal>
+
+            <Divider />
+            <VStack w="full" px={1}>
+              <ChakraLink
+                onClick={() => {
+                  signOut().catch(console.error);
+                }}
+                _hover={{ bgColor: "gray.200", textDecoration: "none" }}
+                w="full"
+                py={2}
+                px={2}
+                borderRadius={4}
+              >
+                <Text>Sign out</Text>
+              </ChakraLink>
+            </VStack>
+          </VStack>
+        </PopoverContent>
      </Popover>
    </VStack>
  );
--- a/app/src/components/requestLogs/ExportButton.tsx
+++ b/app/src/components/requestLogs/ExportButton.tsx
@@ -1,210 +0,0 @@
-import { useState, useEffect } from "react";
-import {
-  Modal,
-  ModalOverlay,
-  ModalContent,
-  ModalHeader,
-  ModalCloseButton,
-  ModalBody,
-  ModalFooter,
-  HStack,
-  VStack,
-  Icon,
-  Text,
-  Button,
-  Checkbox,
-  NumberInput,
-  NumberInputField,
-  NumberInputStepper,
-  NumberIncrementStepper,
-  NumberDecrementStepper,
-  Collapse,
-  Flex,
-  useDisclosure,
-  type UseDisclosureReturn,
-} from "@chakra-ui/react";
-import { BiExport } from "react-icons/bi";
-
-import { useHandledAsyncCallback } from "~/utils/hooks";
-import { api } from "~/utils/api";
-import { useAppStore } from "~/state/store";
-import ActionButton from "./ActionButton";
-import InputDropdown from "../InputDropdown";
-import { FiChevronUp, FiChevronDown } from "react-icons/fi";
-import InfoCircle from "../InfoCircle";
-
-const SUPPORTED_EXPORT_FORMATS = ["alpaca-finetune", "openai-fine-tune", "unformatted"];
-
-const ExportButton = () => {
-  const selectedLogIds = useAppStore((s) => s.selectedLogs.selectedLogIds);
-
-  const disclosure = useDisclosure();
-
-  return (
-    <>
-      <ActionButton
-        onClick={disclosure.onOpen}
-        label="Export"
-        icon={BiExport}
-        isDisabled={selectedLogIds.size === 0}
-      />
-      <ExportLogsModal disclosure={disclosure} />
-    </>
-  );
-};
-
-export default ExportButton;
-
-const ExportLogsModal = ({ disclosure }: { disclosure: UseDisclosureReturn }) => {
-  const selectedProjectId = useAppStore((s) => s.selectedProjectId);
-  const selectedLogIds = useAppStore((s) => s.selectedLogs.selectedLogIds);
-  const clearSelectedLogIds = useAppStore((s) => s.selectedLogs.clearSelectedLogIds);
-
-  const [selectedExportFormat, setSelectedExportFormat] = useState(SUPPORTED_EXPORT_FORMATS[0]);
-  const [testingSplit, setTestingSplit] = useState(10);
-  const [removeDuplicates, setRemoveDuplicates] = useState(true);
-  const [showAdvancedOptions, setShowAdvancedOptions] = useState(false);
-
-  useEffect(() => {
-    if (disclosure.isOpen) {
-      setSelectedExportFormat(SUPPORTED_EXPORT_FORMATS[0]);
-      setTestingSplit(10);
-      setRemoveDuplicates(true);
-    }
-  }, [disclosure.isOpen]);
-
-  const exportLogsMutation = api.loggedCalls.export.useMutation();
-
-  const [exportLogs, exportInProgress] = useHandledAsyncCallback(async () => {
-    if (!selectedProjectId || !selectedLogIds.size || !testingSplit || !selectedExportFormat)
-      return;
-    const response = await exportLogsMutation.mutateAsync({
-      projectId: selectedProjectId,
-      selectedLogIds: Array.from(selectedLogIds),
-      testingSplit,
-      selectedExportFormat,
-      removeDuplicates,
-    });
-
-    const dataUrl = `data:application/pdf;base64,${response}`;
-    const blob = await fetch(dataUrl).then((res) => res.blob());
-    const url = URL.createObjectURL(blob);
-    const a = document.createElement("a");
-
-    a.href = url;
-    a.download = `data.zip`;
-    document.body.appendChild(a);
-    a.click();
-    document.body.removeChild(a);
-
-    disclosure.onClose();
-    clearSelectedLogIds();
-  }, [
-    exportLogsMutation,
-    selectedProjectId,
-    selectedLogIds,
-    testingSplit,
-    selectedExportFormat,
-    removeDuplicates,
-  ]);
-
-  return (
-    <Modal size={{ base: "xl", md: "2xl" }} {...disclosure}>
-      <ModalOverlay />
-      <ModalContent w={1200}>
-        <ModalHeader>
-          <HStack>
-            <Icon as={BiExport} />
-            <Text>Export Logs</Text>
-          </HStack>
-        </ModalHeader>
-        <ModalCloseButton />
-        <ModalBody maxW="unset">
-          <VStack w="full" spacing={8} pt={4} alignItems="flex-start">
-            <Text>
-              We'll export the <b>{selectedLogIds.size}</b> logs you have selected in the format of
-              your choice.
-            </Text>
-            <VStack alignItems="flex-start" spacing={4}>
-              <Flex
-                flexDir={{ base: "column", md: "row" }}
-                alignItems={{ base: "flex-start", md: "center" }}
-              >
-                <HStack w={48} alignItems="center" spacing={1}>
-                  <Text fontWeight="bold">Format:</Text>
-                  <InfoCircle tooltipText="Format logs for for fine tuning or export them without formatting." />
-                </HStack>
-                <InputDropdown
-                  options={SUPPORTED_EXPORT_FORMATS}
-                  selectedOption={selectedExportFormat}
-                  onSelect={(option) => setSelectedExportFormat(option)}
-                  inputGroupProps={{ w: 48 }}
-                />
-              </Flex>
-              <Flex
-                flexDir={{ base: "column", md: "row" }}
-                alignItems={{ base: "flex-start", md: "center" }}
-              >
-                <HStack w={48} alignItems="center" spacing={1}>
-                  <Text fontWeight="bold">Testing Split:</Text>
-                  <InfoCircle tooltipText="The percent of your logs that will be reserved for testing and saved in another file. Logs are split randomly." />
-                </HStack>
-                <HStack>
-                  <NumberInput
-                    defaultValue={10}
-                    onChange={(_, num) => setTestingSplit(num)}
-                    min={1}
-                    max={100}
-                    w={48}
-                  >
-                    <NumberInputField />
-                    <NumberInputStepper>
-                      <NumberIncrementStepper />
-                      <NumberDecrementStepper />
-                    </NumberInputStepper>
-                  </NumberInput>
-                </HStack>
-              </Flex>
-            </VStack>
-            <VStack alignItems="flex-start" spacing={0}>
-              <Button
-                variant="unstyled"
-                color="blue.600"
-                onClick={() => setShowAdvancedOptions(!showAdvancedOptions)}
-              >
-                <HStack>
-                  <Text>Advanced Options</Text>
-                  <Icon as={showAdvancedOptions ? FiChevronUp : FiChevronDown} />
-                </HStack>
-              </Button>
-              <Collapse in={showAdvancedOptions} unmountOnExit={true}>
-                <VStack align="stretch" pt={4}>
-                  <HStack>
-                    <Checkbox
-                      colorScheme="blue"
-                      isChecked={removeDuplicates}
-                      onChange={(e) => setRemoveDuplicates(e.target.checked)}
-                    >
-                      <Text>Remove duplicates</Text>
-                    </Checkbox>
-                    <InfoCircle tooltipText="To avoid overfitting and speed up training, automatically deduplicate logs with matching input and output." />
-                  </HStack>
-                </VStack>
-              </Collapse>
-            </VStack>
-          </VStack>
-        </ModalBody>
-        <ModalFooter>
-          <HStack>
-            <Button colorScheme="gray" onClick={disclosure.onClose} minW={24}>
-              Cancel
-            </Button>
-            <Button colorScheme="blue" onClick={exportLogs} isLoading={exportInProgress} minW={24}>
-              Export
-            </Button>
-          </HStack>
-        </ModalFooter>
-      </ModalContent>
-    </Modal>
-  );
-};
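Review note on the removed ExportButton.tsx: the deleted handler wrapped the base64 response in a `data:application/pdf` URL and then saved the result as `data.zip`. If this download path is ever restored, one option is to decode the payload directly and label it as a zip archive. The sketch below is only an illustration under assumptions: `base64ToBytes` and `zipBlobFromBase64` are hypothetical names that do not exist in the repo, and the code assumes a runtime with global `atob` and `Blob` (browsers, or Node 18+).

```typescript
// Hypothetical helper (not in the repo): decode a base64 string into raw bytes.
function base64ToBytes(base64: string) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper (not in the repo): wrap the decoded bytes in a Blob whose
// MIME type matches the `.zip` file name used for the download.
function zipBlobFromBase64(base64: string): Blob {
  return new Blob([base64ToBytes(base64)], { type: "application/zip" });
}
```

In the browser, the resulting Blob could be passed to `URL.createObjectURL` in the same way the removed code did.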
--- a/app/src/components/requestLogs/FineTuneButton.tsx
+++ b/app/src/components/requestLogs/FineTuneButton.tsx
@@ -16,7 +16,7 @@ import {
  type UseDisclosureReturn,
  Input,
 } from "@chakra-ui/react";
-import { AiTwotoneThunderbolt } from "react-icons/ai";
+import { FaRobot } from "react-icons/fa";
 import humanId from "human-id";
 import { useRouter } from "next/router";

@@ -39,7 +39,7 @@ const FineTuneButton = () => {
      <ActionButton
        onClick={disclosure.onOpen}
        label="Fine Tune"
-        icon={AiTwotoneThunderbolt}
+        icon={FaRobot}
        isDisabled={selectedLogIds.size === 0}
      />
      <FineTuneModal disclosure={disclosure} />
@@ -90,7 +90,7 @@ const FineTuneModal = ({ disclosure }: { disclosure: UseDisclosureReturn }) => {
      <ModalContent w={1200}>
        <ModalHeader>
          <HStack>
-            <Icon as={AiTwotoneThunderbolt} />
+            <Icon as={FaRobot} />
            <Text>Fine Tune</Text>
          </HStack>
        </ModalHeader>
--- a/app/src/env.mjs
+++ b/app/src/env.mjs
@@ -46,6 +46,7 @@ export const env = createEnv({
    NEXT_PUBLIC_SOCKET_URL: z.string().url().default("http://localhost:3318"),
    NEXT_PUBLIC_HOST: z.string().url().default("http://localhost:3000"),
    NEXT_PUBLIC_SENTRY_DSN: z.string().optional(),
+    NEXT_PUBLIC_FF_SHOW_BETA_FEATURES: z.string().optional(),
  },

  /**
@@ -67,6 +68,7 @@ export const env = createEnv({
    NEXT_PUBLIC_SENTRY_DSN: process.env.NEXT_PUBLIC_SENTRY_DSN,
    SENTRY_AUTH_TOKEN: process.env.SENTRY_AUTH_TOKEN,
    OPENPIPE_API_KEY: process.env.OPENPIPE_API_KEY,
+    NEXT_PUBLIC_FF_SHOW_BETA_FEATURES: process.env.NEXT_PUBLIC_FF_SHOW_BETA_FEATURES,
    SENDER_EMAIL: process.env.SENDER_EMAIL,
    SMTP_HOST: process.env.SMTP_HOST,
    SMTP_PORT: process.env.SMTP_PORT,
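Review note on env.mjs: `NEXT_PUBLIC_FF_SHOW_BETA_FEATURES` is validated as an optional string, so any consumer has to decide which string values count as "enabled". This diff does not show how the app reads the variable. The sketch below is one possible interpretation; `isFlagEnabled` and the accepted values (`"true"`, `"1"`) are assumptions, not the project's actual behavior.

```typescript
// Hypothetical helper (not in the repo): interpret an optional string env value
// as a boolean feature flag. Unset or unrecognized values count as disabled.
function isFlagEnabled(raw: string | undefined): boolean {
  if (raw === undefined) return false;
  const normalized = raw.trim().toLowerCase();
  return normalized === "true" || normalized === "1";
}
```

Treating unknown values as disabled keeps a typo in the environment from silently exposing beta features.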
--- a/app/src/modelProviders/openai-ChatCompletion/getCompletion.ts
+++ b/app/src/modelProviders/openai-ChatCompletion/getCompletion.ts
@@ -2,7 +2,7 @@
 import { isArray, isString } from "lodash-es";
 import { APIError } from "openai";
 import { type ChatCompletion, type CompletionCreateParams } from "openai/resources/chat";
-import mergeChunks from "openpipe/openai/mergeChunks";
+import mergeChunks from "openpipe/src/openai/mergeChunks";
 import { openai } from "~/server/utils/openai";
 import { type CompletionResponse } from "../types";

--- a/app/src/modelProviders/openpipe-chat/getCompletion.ts
+++ b/app/src/modelProviders/openpipe-chat/getCompletion.ts
@@ -17,23 +17,10 @@ const modelEndpoints: Record<OpenpipeChatInput["model"], string> = {
  "NousResearch/Nous-Hermes-llama-2-7b": "https://ua1bpc6kv3dgge-8000.proxy.runpod.net/v1",
 };

-const CUSTOM_MODELS_ENABLED = false;
-
 export async function getCompletion(
  input: OpenpipeChatInput,
  onStream: ((partialOutput: OpenpipeChatOutput) => void) | null,
 ): Promise<CompletionResponse<OpenpipeChatOutput>> {
-  // Temporarily disable these models because of GPU constraints
-
-  if (!CUSTOM_MODELS_ENABLED) {
-    return {
-      type: "error",
-      message:
-        "We've disabled this model temporarily because of GPU capacity constraints. Check back later.",
-      autoRetry: false,
-    };
-  }
-
  const { model, messages, ...rest } = input;

  const templatedPrompt = frontendModelProvider.models[model].templatePrompt?.(messages);
--- a/app/src/pages/request-logs/index.tsx
+++ b/app/src/pages/request-logs/index.tsx
@@ -11,7 +11,6 @@ import { FiFilter } from "react-icons/fi";
 import LogFilters from "~/components/requestLogs/LogFilters/LogFilters";
 import ColumnVisiblityDropdown from "~/components/requestLogs/ColumnVisiblityDropdown";
 import FineTuneButton from "~/components/requestLogs/FineTuneButton";
-import ExportButton from "~/components/requestLogs/ExportButton";

 export default function LoggedCalls() {
  const selectedLogIds = useAppStore((s) => s.selectedLogs.selectedLogIds);
@@ -36,7 +35,6 @@ export default function LoggedCalls() {
              icon={RiFlaskLine}
              isDisabled={selectedLogIds.size === 0}
            />
-            <ExportButton />
            <ColumnVisiblityDropdown />
            <ActionButton
              onClick={() => {
--- a/app/src/server/api/routers/loggedCalls.router.ts
+++ b/app/src/server/api/routers/loggedCalls.router.ts
@@ -1,16 +1,11 @@
 import { z } from "zod";
 import { type Expression, type SqlBool, sql, type RawBuilder } from "kysely";
 import { jsonArrayFrom } from "kysely/helpers/postgres";
-import archiver from "archiver";
-import { WritableStreamBuffer } from "stream-buffers";
-import { type JsonValue } from "type-fest";
-import { shuffle } from "lodash-es";

 import { createTRPCRouter, protectedProcedure } from "~/server/api/trpc";
 import { kysely, prisma } from "~/server/db";
 import { comparators, defaultFilterableFields } from "~/state/logFiltersSlice";
 import { requireCanViewProject } from "~/utils/accessControl";
-import hashObject from "~/server/utils/hashObject";

 // create comparator type based off of comparators
 const comparatorToSqlExpression = (comparator: (typeof comparators)[number], value: string) => {
@@ -185,102 +180,4 @@ export const loggedCallsRouter = createTRPCRouter({

      return tags.map((tag) => tag.name);
    }),
-  export: protectedProcedure
-    .input(
-      z.object({
-        projectId: z.string(),
-        selectedLogIds: z.string().array(),
-        testingSplit: z.number(),
-        selectedExportFormat: z.string(),
-        removeDuplicates: z.boolean(),
-      }),
-    )
-    .mutation(async ({ input, ctx }) => {
-      await requireCanViewProject(input.projectId, ctx);
-
-      // Fetch the real data using Prisma
-      const loggedCallsFromDb = await ctx.prisma.loggedCallModelResponse.findMany({
-        where: {
-          originalLoggedCall: {
-            projectId: input.projectId,
-            id: { in: input.selectedLogIds },
-          },
-          statusCode: 200,
-        },
-      });
-
-      // Convert the database data into the desired format
-      let formattedLoggedCalls: { instruction: JsonValue[]; output: JsonValue }[] =
-        loggedCallsFromDb.map((call) => ({
-          instruction: (call.reqPayload as unknown as Record<string, unknown>)
-            .messages as JsonValue[],
-          output: (call.respPayload as unknown as { choices: { message: unknown }[] }).choices[0]
-            ?.message as JsonValue,
-        }));
-
-      if (input.removeDuplicates) {
-        const deduplicatedLoggedCalls = [];
-        const loggedCallHashSet = new Set<string>();
-        for (const loggedCall of formattedLoggedCalls) {
-          const loggedCallHash = hashObject(loggedCall);
-          if (!loggedCallHashSet.has(loggedCallHash)) {
-            loggedCallHashSet.add(loggedCallHash);
-            deduplicatedLoggedCalls.push(loggedCall);
-          }
-        }
-        formattedLoggedCalls = deduplicatedLoggedCalls;
-      }
-
-      // Remove duplicate messages from instructions
-      const instructionMessageHashMap = new Map<string, number>();
-      for (const loggedCall of formattedLoggedCalls) {
-        for (const message of loggedCall.instruction) {
-          const hash = hashObject(message);
-          if (instructionMessageHashMap.has(hash)) {
-            // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
-            instructionMessageHashMap.set(hash, instructionMessageHashMap.get(hash)! + 1);
-          } else {
-            instructionMessageHashMap.set(hash, 0);
-          }
-        }
-      }
-      for (const loggedCall of formattedLoggedCalls) {
-        loggedCall.instruction = loggedCall.instruction.filter((message) => {
-          const hash = hashObject(message);
-          // If the same message appears in a single instruction multiple times, there is some danger of
-          // it being removed from all logged calls. This is enough of an edge case that we don't
-          // need to worry about it for now.
-          // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
-          return instructionMessageHashMap.get(hash)! < formattedLoggedCalls.length;
-        });
-      }
-
-      // Stringify instructions and outputs
-      const stringifiedLoggedCalls = shuffle(formattedLoggedCalls).map((loggedCall) => ({
-        instruction: JSON.stringify(loggedCall.instruction),
-        output: JSON.stringify(loggedCall.output),
-      }));
-
-      const splitIndex = Math.floor((stringifiedLoggedCalls.length * input.testingSplit) / 100);
-
-      const testingData = stringifiedLoggedCalls.slice(0, splitIndex);
-      const trainingData = stringifiedLoggedCalls.slice(splitIndex);
-
-      // Convert arrays to JSONL format
-      const trainingDataJSONL = trainingData.map((item) => JSON.stringify(item)).join("\n");
-      const testingDataJSONL = testingData.map((item) => JSON.stringify(item)).join("\n");
-
-      const output = new WritableStreamBuffer();
-      const archive = archiver("zip");
-
-      archive.pipe(output);
-      archive.append(trainingDataJSONL, { name: "train.jsonl" });
-      archive.append(testingDataJSONL, { name: "test.jsonl" });
-      await archive.finalize();
-
-      // Convert buffer to base64
-      const base64 = output.getContents().toString("base64");
-
-      return base64;
-    }),
 });
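Review note on the removed `export` procedure: its core was two steps, dropping logged calls with identical input and output, then slicing the shuffled list at `floor(length * testingSplit / 100)` into testing and training sets. The sketch below restates those two steps in isolation. It is not the removed implementation: `dedupeExamples` and `splitTrainTest` are hypothetical names, and `JSON.stringify` stands in for the repo's `hashObject` helper, which makes the key sensitive to property order. Shuffling is left to the caller.

```typescript
type Example = { instruction: unknown[]; output: unknown };

// Hypothetical helper: keep the first occurrence of each distinct example.
// JSON.stringify is a stand-in for hashObject (assumption) and is sensitive
// to property order.
function dedupeExamples(examples: Example[]): Example[] {
  const seen = new Set<string>();
  const unique: Example[] = [];
  for (const example of examples) {
    const key = JSON.stringify(example);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(example);
    }
  }
  return unique;
}

// Hypothetical helper: reserve `testingPercent` percent of the rows for testing,
// using the same floor-based split index as the removed code.
function splitTrainTest<T>(rows: T[], testingPercent: number) {
  const splitIndex = Math.floor((rows.length * testingPercent) / 100);
  return { testing: rows.slice(0, splitIndex), training: rows.slice(splitIndex) };
}
```

Because the split index is floored, a small selection can yield an empty testing set: 5 rows at a 10 percent split gives a split index of 0.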
--- a/app/src/server/api/routers/promptVariants.router.ts
+++ b/app/src/server/api/routers/promptVariants.router.ts
@@ -298,7 +298,6 @@ export const promptVariantsRouter = createTRPCRouter({
    .input(
      z.object({
        id: z.string(),
-        originalPromptFn: z.string(),
        instructions: z.string().optional(),
        newModel: z
          .object({
@@ -316,21 +315,22 @@ export const promptVariantsRouter = createTRPCRouter({
      });
      await requireCanModifyExperiment(existing.experimentId, ctx);

+      const constructedPrompt = await parsePromptConstructor(existing.promptConstructor);
+
+      if ("error" in constructedPrompt) {
+        return error(constructedPrompt.error);
+      }
+
      const model = input.newModel
        ? modelProviders[input.newModel.provider].models[input.newModel.model]
        : undefined;

-      const promptConstructionFn = await deriveNewConstructFn(
-        existing,
-        input.originalPromptFn,
-        model,
-        input.instructions,
-      );
+      const promptConstructionFn = await deriveNewConstructFn(existing, model, input.instructions);

      // TODO: Validate promptConstructionFn
      // TODO: Record in some sort of history

-      return success(promptConstructionFn);
+      return promptConstructionFn;
    }),

  replaceVariant: protectedProcedure
--- a/app/src/server/utils/deriveNewContructFn.ts
+++ b/app/src/server/utils/deriveNewContructFn.ts
@@ -12,20 +12,14 @@ const isolate = new ivm.Isolate({ memoryLimit: 128 });

 export async function deriveNewConstructFn(
  originalVariant: PromptVariant | null,
-  originalPromptFn?: string,
  newModel?: Model,
  instructions?: string,
 ) {
-  if (originalPromptFn && !newModel && !instructions) {
-    return originalPromptFn;
+  if (originalVariant && !newModel && !instructions) {
+    return originalVariant.promptConstructor;
  }
-  if (originalVariant && originalPromptFn && (newModel || instructions)) {
-    return await requestUpdatedPromptFunction(
-      originalVariant,
-      originalPromptFn,
-      newModel,
-      instructions,
-    );
+  if (originalVariant && (newModel || instructions)) {
+    return await requestUpdatedPromptFunction(originalVariant, newModel, instructions);
  }
  return dedent`
    prompt = {
@@ -42,7 +36,6 @@ export async function deriveNewConstructFn(
 const NUM_RETRIES = 5;
 const requestUpdatedPromptFunction = async (
  originalVariant: PromptVariant,
-  originalPromptFn: string,
  newModel?: Model,
  instructions?: string,
 ) => {
@@ -62,7 +55,7 @@ const requestUpdatedPromptFunction = async (
        },
        {
          role: "user",
-          content: `This is the current prompt constructor function:\n---\n${originalPromptFn}`,
+          content: `This is the current prompt constructor function:\n---\n${originalVariant.promptConstructor}`,
        },
      ];
      if (newModel) {
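Review note on deriveNewContructFn.ts: with `originalPromptFn` gone, the function now branches only on whether a variant exists and whether a new model or instructions were supplied. The sketch below models that three-way decision as a pure function so it can be reasoned about without the LLM call. `Variant`, `Derivation`, and `chooseDerivation` are hypothetical names introduced for illustration, and the sketch returns a label instead of calling `requestUpdatedPromptFunction`.

```typescript
// Hypothetical, simplified stand-in for the PromptVariant type.
type Variant = { promptConstructor: string };

type Derivation =
  | { kind: "reuse"; promptConstructor: string } // return the stored constructor unchanged
  | { kind: "rewrite"; promptConstructor: string } // ask the model to rewrite the stored constructor
  | { kind: "default" }; // no variant: fall back to the default template

// Mirrors the branch order in the diff: reuse, then rewrite, then default.
function chooseDerivation(
  variant: Variant | null,
  hasNewModel: boolean,
  instructions?: string,
): Derivation {
  if (variant && !hasNewModel && !instructions) {
    return { kind: "reuse", promptConstructor: variant.promptConstructor };
  }
  if (variant && (hasNewModel || instructions)) {
    return { kind: "rewrite", promptConstructor: variant.promptConstructor };
  }
  return { kind: "default" };
}
```

As in the diff, an empty `instructions` string is falsy, so it is treated the same as no instructions.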
--- a/app/src/server/utils/openai.ts
+++ b/app/src/server/utils/openai.ts
@@ -1,6 +1,6 @@
 import fs from "fs";
 import path from "path";
-import OpenAI, { type ClientOptions } from "openpipe/openai";
+import OpenAI, { type ClientOptions } from "openpipe/src/openai";

 import { env } from "~/env.mjs";

@@ -17,7 +17,14 @@ try {
  // Set a dummy key so it doesn't fail at build time
  config = {
    apiKey: env.OPENAI_API_KEY ?? "dummy-key",
+    openpipe: {
+      apiKey: env.OPENPIPE_API_KEY,
+      baseUrl: "http://localhost:3000/api/v1",
+    },
  };
 }

+// export const openai = env.OPENPIPE_API_KEY ? new OpenAI.OpenAI(config) : new OriginalOpenAI(config);
+
 export const openai = new OpenAI(config);
+`
--- a/app/src/state/featureFlags.ts
+++ b/app/src/state/featureFlags.ts
@@ -1,23 +0,0 @@
-import { type SliceCreator } from "./store";
-
-export type FeatureFlagsSlice = {
-  flagsLoaded: boolean;
-  featureFlags: {
-    betaAccess: boolean;
-  };
-  setFeatureFlags: (flags: string[] | undefined) => void;
-};
-
-export const createFeatureFlagsSlice: SliceCreator<FeatureFlagsSlice> = (set) => ({
-  flagsLoaded: false,
-  featureFlags: {
-    betaAccess: false,
-  },
-  setFeatureFlags: (flags) =>
-    set((state) => {
-      state.featureFlags.featureFlags = {
-        betaAccess: flags?.includes("betaAccess") ?? false,
-      };
-      state.featureFlags.flagsLoaded = true;
-    }),
-});
--- a/app/src/state/sharedVariantEditor.slice.ts
+++ b/app/src/state/sharedVariantEditor.slice.ts
@@ -1,26 +1,16 @@
-import loader, { type Monaco } from "@monaco-editor/loader";
-
 import { type RouterOutputs } from "~/utils/api";
 import { type SliceCreator } from "./store";
+import loader from "@monaco-editor/loader";
 import formatPromptConstructor from "~/promptConstructor/format";

 export const editorBackground = "#fafafa";

-export type CreatedEditor = ReturnType<Monaco["editor"]["create"]>;
-
-type EditorOptions = {
-  getContent: () => string;
-  setContent: (content: string) => void;
-};
-
 export type SharedVariantEditorSlice = {
-  monaco: null | Monaco;
+  monaco: null | ReturnType<typeof loader.__getMonacoInstance>;
  loadMonaco: () => Promise<void>;
  scenarioVars: RouterOutputs["scenarioVars"]["list"];
  updateScenariosModel: () => void;
  setScenarioVars: (scenarioVars: RouterOutputs["scenarioVars"]["list"]) => void;
-  editorOptionsMap: Record<string, EditorOptions>;
-  updateOptionsForEditor: (uiId: string, { getContent, setContent }: EditorOptions) => void;
 };

 export const createVariantEditorSlice: SliceCreator<SharedVariantEditorSlice> = (set, get) => ({
@@ -103,10 +93,4 @@ export const createVariantEditorSlice: SliceCreator<SharedVariantEditorSlice> =
      );
    }
  },
-  editorOptionsMap: {},
-  updateOptionsForEditor: (uiId, options) => {
-    set((state) => {
-      state.sharedVariantEditor.editorOptionsMap[uiId] = options;
-    });
-  },
 });
--- a/app/src/state/store.ts
+++ b/app/src/state/store.ts
@@ -11,8 +11,7 @@ import { type APIClient } from "~/utils/api";
 import { type PersistedState, persistOptions } from "./persist";
 import { type SelectedLogsSlice, createSelectedLogsSlice } from "./selectedLogsSlice";
 import { type LogFiltersSlice, createLogFiltersSlice } from "./logFiltersSlice";
-import { type ColumnVisibilitySlice, createColumnVisibilitySlice } from "./columnVisiblitySlice";
-import { type FeatureFlagsSlice, createFeatureFlagsSlice } from "./featureFlags";
+import { createColumnVisibilitySlice, type ColumnVisibilitySlice } from "./columnVisiblitySlice";

 enableMapSet();

@@ -29,7 +28,6 @@ export type State = {
  selectedLogs: SelectedLogsSlice;
  logFilters: LogFiltersSlice;
  columnVisibility: ColumnVisibilitySlice;
-  featureFlags: FeatureFlagsSlice;
 };

 export type SliceCreator<T> = StateCreator<State, [["zustand/immer", never]], [], T>;
@@ -64,7 +62,6 @@ const useBaseStore = create<State, [["zustand/persist", PersistedState], ["zusta
      selectedLogs: createSelectedLogsSlice(set, get, ...rest),
      logFilters: createLogFiltersSlice(set, get, ...rest),
      columnVisibility: createColumnVisibilitySlice(set, get, ...rest),
-      featureFlags: createFeatureFlagsSlice(set, get, ...rest),
    })),
    persistOptions,
  ),
--- a/app/src/utils/analytics/posthog.tsx
+++ b/app/src/utils/analytics/posthog.tsx
@@ -1,12 +1,11 @@
 "use client";
 import { useSession } from "next-auth/react";
 import React, { type ReactNode, useEffect } from "react";
-import { PostHogProvider, useActiveFeatureFlags } from "posthog-js/react";
+import { PostHogProvider } from "posthog-js/react";

 import posthog from "posthog-js";
 import { env } from "~/env.mjs";
 import { useRouter } from "next/router";
-import { useAppStore } from "~/state/store";

 // Make sure we're in the browser
 const inBrowser = typeof window !== "undefined";
@@ -25,14 +24,6 @@ export const PosthogAppProvider = ({ children }: { children: ReactNode }) => {
    };
  }, [router.events]);

-  const setFeatureFlags = useAppStore((s) => s.featureFlags.setFeatureFlags);
-  const activeFlags = useActiveFeatureFlags();
-  useEffect(() => {
-    if (activeFlags) {
-      setFeatureFlags(activeFlags);
-    }
-  }, [activeFlags, setFeatureFlags]);
-
  useEffect(() => {
    if (env.NEXT_PUBLIC_POSTHOG_KEY && inBrowser && session && session.user) {
      posthog.init(env.NEXT_PUBLIC_POSTHOG_KEY, {
--- a/client-libs/python/openpipe/shared.py
+++ b/client-libs/python/openpipe/shared.py
@@ -6,11 +6,11 @@ from openpipe.api_client.client import AuthenticatedClient
 from openpipe.api_client.models.report_json_body_tags import (
    ReportJsonBodyTags,
 )
-import toml
 import time
 import os

-version = toml.load("pyproject.toml")["tool"]["poetry"]["version"]
+# TODO: sync with pyproject.toml
+version = "3.0.3"

 configured_client = AuthenticatedClient(
    base_url="https://app.openpipe.ai/api/v1", token=""
--- a/client-libs/python/pyproject.toml
+++ b/client-libs/python/pyproject.toml
@@ -1,12 +1,13 @@
 [tool.poetry]
 name = "openpipe"
-version = "3.0.1"
+version = "3.0.3"
 description = "Python client library for the OpenPipe service"
 authors = ["Kyle Corbitt <kyle@openpipe.ai>"]
 license = "Apache-2.0"
 readme = "README.md"
 homepage = "https://github.com/OpenPipe/OpenPipe"
 repository = "https://github.com/OpenPipe/OpenPipe"
+include = ["pyproject.toml"]

 [tool.poetry.dependencies]
 python = "^3.9"
--- a/client-libs/typescript/README.md
+++ b/client-libs/typescript/README.md
@@ -1,70 +0,0 @@
-# OpenPipe Node API Library
-
-[![NPM version](https://img.shields.io/npm/v/openpipe.svg)](https://npmjs.org/package/openpipe)
-
-This library wraps TypeScript or Javascript OpenAI API calls and logs additional data to the configured `OPENPIPE_BASE_URL` for further processing.
-
-It is fully compatible with OpenAI's sdk and logs both streaming and non-streaming requests and responses.
-
-<!-- To learn more about using OpenPipe, check out our [Documentation](https://docs.openpipe.ai/docs/api). -->
-
-## Installation
-
-```sh
-npm install --save openpipe
-# or
-yarn add openpipe
-```
-
-## Usage
-
-1. Create a project at https://app.openpipe.ai
-2. Find your project's API key at https://app.openpipe.ai/project/settings
-3. Configure the OpenPipe client as shown below.
-
-```js
-// import OpenAI from 'openai'
-import OpenAI from "openpipe/openai";
-
-// Fully compatible with original OpenAI initialization
-const openai = new OpenAI({
-  apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
-  // openpipe key is optional
-  openpipe: {
-    apiKey: "my api key", // defaults to process.env["OPENPIPE_API_KEY"]
-    baseUrl: "my url", // defaults to process.env["OPENPIPE_BASE_URL"] or https://app.openpipe.ai/api/v1 if not set
-  },
-});
-
-async function main() {
-  // Allows optional openpipe object
-  const completion = await openai.chat.completions.create({
-    messages: [{ role: "user", content: "Say this is a test" }],
-    model: "gpt-3.5-turbo",
-    // optional
-    openpipe: {
-      // Add custom searchable tags
-      tags: {
-        prompt_id: "getCompletion",
-        any_key: "any_value",
-      },
-    },
-  });
-
-  console.log(completion.choices);
-}
-
-main();
-```
-
-## FAQ
-
-<i>How do I report calls to my self-hosted instance?</i>
-
-Start an instance by following the instructions on [Running Locally](https://github.com/OpenPipe/OpenPipe#running-locally). Once it's running, point your `OPENPIPE_BASE_URL` to your self-hosted instance.
-
-<i>What if my `OPENPIPE_BASE_URL` is misconfigured or my instance goes down? Will my OpenAI calls stop working?</i>
-
-Your OpenAI calls will continue to function as expected no matter what. The sdk handles logging errors gracefully without affecting OpenAI inference.
-
-See the [GitHub repo](https://github.com/OpenPipe/OpenPipe) for more details.
--- a/client-libs/typescript/build.sh
+++ b/client-libs/typescript/build.sh
@@ -1,27 +0,0 @@
-#!/usr/bin/env bash
-
-# Adapted from https://github.com/openai/openai-node/blob/master/build
-
-set -exuo pipefail
-
-rm -rf dist /tmp/openpipe-build-dist
-
-mkdir /tmp/openpipe-build-dist
-
-cp -rp * /tmp/openpipe-build-dist
-
-# Rename package name in package.json
-python3 -c "
-import json
-with open('/tmp/openpipe-build-dist/package.json', 'r') as f:
-    data = json.load(f)
-data['name'] = 'openpipe'
-with open('/tmp/openpipe-build-dist/package.json', 'w') as f:
-    json.dump(data, f, indent=4)
-"
-
-rm -rf /tmp/openpipe-build-dist/node_modules
-mv /tmp/openpipe-build-dist dist
-
-# build to .js files
-(cd dist && npm exec tsc -- --noEmit false)
--- a/client-libs/typescript/index.ts
+++ b/client-libs/typescript/index.ts
@@ -1 +1,3 @@
-export * as openai from "./openai";
+// main.ts or index.ts at the root level
+export * as OpenAI from "./src/openai";
+export * as OpenAILegacy from "./src/openai-legacy";
--- a/client-libs/typescript/package.json
+++ b/client-libs/typescript/package.json
@@ -1,17 +1,14 @@
 {
-  "name": "openpipe-dev",
-  "version": "0.3.3",
+  "name": "openpipe",
+  "version": "0.1.0",
  "type": "module",
  "description": "Metrics and auto-evaluation for LLM calls",
  "scripts": {
-    "build": "./build.sh",
+    "build": "tsc",
    "test": "vitest"
  },
-  "main": "./index.ts",
-  "publishConfig": {
-    "access": "public",
-    "main": "./index.js"
-  },
+  "main": "dist/index.js",
+  "types": "dist/index.d.ts",
  "keywords": [],
  "author": "",
  "license": "Apache-2.0",
--- a/client-libs/typescript/publish.sh
+++ b/client-libs/typescript/publish.sh
@@ -1,9 +0,0 @@
-#!/usr/bin/env bash
-
-# Adapted from https://github.com/openai/openai-node/blob/master/build
-
-set -exuo pipefail
-
-./build.sh
-
-(cd dist && pnpm publish --access public)
--- a/client-libs/typescript/src/codegen/OPClient.ts
+++ b/client-libs/typescript/src/codegen/OPClient.ts
--- a/client-libs/typescript/src/codegen/core/ApiError.ts
+++ b/client-libs/typescript/src/codegen/core/ApiError.ts
--- a/client-libs/typescript/src/codegen/core/ApiRequestOptions.ts
+++ b/client-libs/typescript/src/codegen/core/ApiRequestOptions.ts
--- a/client-libs/typescript/src/codegen/core/ApiResult.ts
+++ b/client-libs/typescript/src/codegen/core/ApiResult.ts
--- a/client-libs/typescript/src/codegen/core/BaseHttpRequest.ts
+++ b/client-libs/typescript/src/codegen/core/BaseHttpRequest.ts
--- a/client-libs/typescript/src/codegen/core/CancelablePromise.ts
+++ b/client-libs/typescript/src/codegen/core/CancelablePromise.ts
--- a/client-libs/typescript/src/codegen/core/NodeHttpRequest.ts
+++ b/client-libs/typescript/src/codegen/core/NodeHttpRequest.ts
--- a/client-libs/typescript/src/codegen/core/OpenAPI.ts
+++ b/client-libs/typescript/src/codegen/core/OpenAPI.ts
--- a/client-libs/typescript/src/codegen/core/request.ts
+++ b/client-libs/typescript/src/codegen/core/request.ts
--- a/client-libs/typescript/src/codegen/index.ts
+++ b/client-libs/typescript/src/codegen/index.ts
--- a/client-libs/typescript/src/codegen/services/DefaultService.ts
+++ b/client-libs/typescript/src/codegen/services/DefaultService.ts
--- a/client-libs/typescript/src/openai-legacy/index.ts
+++ b/client-libs/typescript/src/openai-legacy/index.ts
@@ -0,0 +1,85 @@
+import * as openPipeClient from "../codegen";
+import * as openai from "openai-legacy";
+import { version } from "../../package.json";
+
+// Anything we don't override we want to pass through to openai directly
+export * as openAILegacy from "openai-legacy";
+
+type OPConfigurationParameters = {
+  apiKey?: string;
+  basePath?: string;
+};
+
+export class Configuration extends openai.Configuration {
+  public qkConfig?: openPipeClient.Configuration;
+
+  constructor(
+    config: openai.ConfigurationParameters & {
+      opParameters?: OPConfigurationParameters;
+    }
+  ) {
+    super(config);
+    if (config.opParameters) {
+      this.qkConfig = new openPipeClient.Configuration(config.opParameters);
+    }
+  }
+}
+
+type CreateChatCompletion = InstanceType<typeof openai.OpenAIApi>["createChatCompletion"];
+
+export class OpenAIApi extends openai.OpenAIApi {
+  public openPipeApi?: openPipeClient.DefaultApi;
+
+  constructor(config: Configuration) {
+    super(config);
+    if (config.qkConfig) {
+      this.openPipeApi = new openPipeClient.DefaultApi(config.qkConfig);
+    }
+  }
+
+  public async createChatCompletion(
+    createChatCompletionRequest: Parameters<CreateChatCompletion>[0],
+    options?: Parameters<CreateChatCompletion>[1]
+  ): ReturnType<CreateChatCompletion> {
+    const requestedAt = Date.now();
+    let resp: Awaited<ReturnType<CreateChatCompletion>> | null = null;
+    let respPayload: openai.CreateChatCompletionResponse | null = null;
+    let statusCode: number | undefined = undefined;
+    let errorMessage: string | undefined;
+    try {
+      resp = await super.createChatCompletion(createChatCompletionRequest, options);
+      respPayload = resp.data;
+      statusCode = resp.status;
+    } catch (err) {
+      console.error("Error in createChatCompletion");
+      if (typeof err === "object" && err !== null && "isAxiosError" in err && err.isAxiosError) {
+        errorMessage = err.response?.data?.error?.message;
+        respPayload = err.response?.data;
+        statusCode = err.response?.status;
+      } else if (typeof err === "object" && err !== null && "message" in err) {
+        errorMessage = err.message.toString();
+      }
+      throw err;
+    } finally {
+      this.openPipeApi
+        ?.externalApiReport({
+          requestedAt,
+          receivedAt: Date.now(),
+          reqPayload: createChatCompletionRequest,
+          respPayload: respPayload,
+          statusCode: statusCode,
+          errorMessage,
+          tags: {
+            client: "openai-js",
+            clientVersion: version,
+          },
+        })
+        .catch((err) => {
+          console.error("Error reporting to OP", err);
+        });
+    }
+
+    // The report above is fire-and-forget; return the OpenAI response unchanged.
+    return resp;
+  }
+}
--- a/client-libs/typescript/src/openai/index.test.ts
+++ b/client-libs/typescript/src/openai/index.test.ts
@@ -80,7 +80,6 @@ test("bad call streaming", async () => {
      stream: true,
    });
  } catch (e) {
-    // @ts-expect-error need to check for error type
    await e.openpipe.reportingFinished;
    const lastLogged = await lastLoggedCall();
    expect(lastLogged?.modelResponse?.errorMessage).toEqual(
@@ -97,9 +96,7 @@ test("bad call", async () => {
      messages: [{ role: "system", content: "count to 10" }],
    });
  } catch (e) {
-    // @ts-expect-error need to check for error type
    assert("openpipe" in e);
-    // @ts-expect-error need to check for error type
    await e.openpipe.reportingFinished;
    const lastLogged = await lastLoggedCall();
    expect(lastLogged?.modelResponse?.errorMessage).toEqual(
@@ -123,8 +120,7 @@ test("caching", async () => {

  await completion.openpipe.reportingFinished;
  const firstLogged = await lastLoggedCall();
-
-  expect(completion.choices[0]?.message.content).toEqual(
+  expect(completion.choices[0].message.content).toEqual(
    firstLogged?.modelResponse?.respPayload.choices[0].message.content,
  );

--- a/client-libs/typescript/src/openai/index.ts
+++ b/client-libs/typescript/src/openai/index.ts
--- a/client-libs/typescript/src/openai/mergeChunks.ts
+++ b/client-libs/typescript/src/openai/mergeChunks.ts
--- a/client-libs/typescript/src/openai/streaming.ts
+++ b/client-libs/typescript/src/openai/streaming.ts
--- a/client-libs/typescript/src/shared.ts
+++ b/client-libs/typescript/src/shared.ts
@@ -1,5 +1,4 @@
-import pkg from "./package.json";
-
+import pkg from "../package.json";
 import { DefaultService } from "./codegen";

 export type OpenPipeConfig = {
--- a/client-libs/typescript/tsconfig.json
+++ b/client-libs/typescript/tsconfig.json
@@ -14,12 +14,9 @@
    "isolatedModules": true,
    "incremental": true,
    "noUncheckedIndexedAccess": true,
-    "noEmit": true,
-    "sourceMap": true,
-    "declaration": true,
-    "declarationMap": true,
-    "rootDir": "."
+    "baseUrl": ".",
+    "outDir": "dist"
  },
-  "include": ["**/*.ts"],
+  "include": ["src/**/*.ts"],
  "exclude": ["node_modules"]
 }
--- a/examples/.gitignore
+++ b/examples/.gitignore
@@ -1,4 +0,0 @@
-axolotl/
-models/
-data/
-wandb/
--- a/examples/classify-recipes/.env.example
+++ b/examples/classify-recipes/.env.example
@@ -1,7 +0,0 @@
-OPENAI_API_KEY="[your OpenAI API key]"
-OPENPIPE_API_KEY="[your OpenPipe API key from https://app.openpipe.ai/project/settings]"
-
-# You'll need this to download the Llama 2 weights from Hugging Face
-HUGGING_FACE_HUB_TOKEN="[Your Hugging Face Hub token]"
-
-WANDB_API_KEY="[Optionally, you can set a Weights & Biases API key to track your training run. Create it at https://wandb.ai/settings]"
--- a/examples/classify-recipes/.ipynb_checkpoints/benchmark-checkpoint.ipynb
+++ b/examples/classify-recipes/.ipynb_checkpoints/benchmark-checkpoint.ipynb
@@ -1,473 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Current Time: 2023-08-24 21:25:06\n",
-      "Current Time: 2023-08-24 21:25:36\n"
-     ]
-    }
-   ],
-   "source": [
-    "import time\n",
-    "\n",
-    "while True:\n",
-    "    current_time = time.strftime(\"%Y-%m-%d %H:%M:%S\", time.localtime())\n",
-    "    print(f\"Current Time: {current_time}\")\n",
-    "    time.sleep(30)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "I'm pretty happy with my model's accuracy relative to GPT-4. How does it compare cost-wise?\n",
-    "\n",
-    "I'll really push this to its limits -- let's see how quickly our poor model can classify the [full 2-million-recipe dataset](https://huggingface.co/datasets/corbt/all-recipes) 😈."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Requirement already satisfied: datasets==2.14.4 in /usr/local/lib/python3.10/dist-packages (2.14.4)\n",
-      "Requirement already satisfied: vllm==0.1.3 in /usr/local/lib/python3.10/dist-packages (0.1.3)\n",
-      "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (1.24.4)\n",
-      "Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (12.0.1)\n",
-      "Requirement already satisfied: dill<0.3.8,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.3.7)\n",
-      "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2.0.3)\n",
-      "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2.28.1)\n",
-      "Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (4.66.1)\n",
-      "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (3.3.0)\n",
-      "Requirement already satisfied: multiprocess in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.70.15)\n",
-      "Requirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2023.6.0)\n",
-      "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (3.8.5)\n",
-      "Requirement already satisfied: huggingface-hub<1.0.0,>=0.14.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.16.4)\n",
-      "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (23.1)\n",
-      "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (6.0)\n",
-      "Requirement already satisfied: ninja in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (5.9.5)\n",
-      "Requirement already satisfied: ray>=2.5.1 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.6.3)\n",
-      "Requirement already satisfied: sentencepiece in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.1.99)\n",
-      "Requirement already satisfied: torch>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.0.1+cu118)\n",
-      "Requirement already satisfied: transformers>=4.31.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (4.33.0.dev0)\n",
-      "Requirement already satisfied: xformers>=0.0.19 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.0.21)\n",
-      "Requirement already satisfied: fastapi in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.101.1)\n",
-      "Requirement already satisfied: uvicorn in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.23.2)\n",
-      "Requirement already satisfied: pydantic<2 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.10.12)\n",
-      "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (23.1.0)\n",
-      "Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (2.1.1)\n",
-      "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (6.0.4)\n",
-      "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (4.0.3)\n",
-      "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.9.2)\n",
-      "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.4.0)\n",
-      "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.3.1)\n",
-      "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.4) (3.9.0)\n",
-      "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.4) (4.7.1)\n",
-      "Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (8.1.7)\n",
-      "Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.18.0)\n",
-      "Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.0.5)\n",
-      "Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.24.1)\n",
-      "Requirement already satisfied: grpcio>=1.42.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.57.0)\n",
-      "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (3.4)\n",
-      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (1.26.13)\n",
-      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (2022.12.7)\n",
-      "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.0)\n",
-      "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.1.2)\n",
-      "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (2.0.0)\n",
-      "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (3.25.0)\n",
-      "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (15.0.7)\n",
-      "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (2023.8.8)\n",
-      "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.13.3)\n",
-      "Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.3.2)\n",
-      "Requirement already satisfied: starlette<0.28.0,>=0.27.0 in /usr/local/lib/python3.10/dist-packages (from fastapi->vllm==0.1.3) (0.27.0)\n",
-      "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2.8.2)\n",
-      "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2023.3)\n",
-      "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2023.3)\n",
-      "Requirement already satisfied: h11>=0.8 in /usr/local/lib/python3.10/dist-packages (from uvicorn->vllm==0.1.3) (0.14.0)\n",
-      "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas->datasets==2.14.4) (1.16.0)\n",
-      "Requirement already satisfied: anyio<5,>=3.4.0 in /usr/local/lib/python3.10/dist-packages (from starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (3.7.1)\n",
-      "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=2.0.0->vllm==0.1.3) (2.1.2)\n",
-      "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (2023.6.1)\n",
-      "Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.29.1)\n",
-      "Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.8.10)\n",
-      "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=2.0.0->vllm==0.1.3) (1.2.1)\n",
-      "Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.3.0)\n",
-      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.1.2)\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
-      "\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.2.1\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython -m pip install --upgrade pip\u001b[0m\n",
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
-   "source": [
-    "%pip install datasets==2.14.4 vllm==0.1.3"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Number of recipes: 2,147,248\n"
-     ]
-    }
-   ],
-   "source": [
-    "from datasets import load_dataset\n",
-    "\n",
-    "all_recipes = load_dataset(\"corbt/all-recipes\")[\"train\"][\"input\"]\n",
-    "\n",
-    "print(f\"Number of recipes: {len(all_recipes):,}\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "INFO 08-24 19:38:29 llm_engine.py:70] Initializing an LLM engine with config: model='./models/run1/merged', tokenizer='./models/run1/merged', tokenizer_mode=auto, trust_remote_code=False, dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)\n",
-      "INFO 08-24 19:39:48 llm_engine.py:196] # GPU blocks: 3419, # CPU blocks: 512\n"
-     ]
-    }
-   ],
-   "source": [
-    "from vllm import LLM, SamplingParams\n",
-    "\n",
-    "llm = LLM(model=\"./models/run1/merged\", max_num_batched_tokens=4096)\n",
-    "\n",
-    "sampling_params = SamplingParams(\n",
-    "    # 120 should be fine for the work we're doing here.\n",
-    "    max_tokens=120,\n",
-    "    # This is a deterministic task so temperature=0 is best.\n",
-    "    temperature=0,\n",
-    ")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Start time: 1692906050.3340027\n",
-      "Processing recipes 0 to 10,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:51<00:00, 34.30it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 10,000 to 20,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:54<00:00, 33.98it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 20,000 to 30,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:53<00:00, 34.11it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 30,000 to 40,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:53<00:00, 34.11it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 40,000 to 50,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts:  48%|████▊     | 4796/10000 [02:21<03:18, 26.22it/s]"
-     ]
-    },
-    {
-     "ename": "KeyboardInterrupt",
-     "evalue": "",
-     "output_type": "error",
-     "traceback": [
-      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
-      "\u001b[0;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
-      "Cell \u001b[0;32mIn[6], line 12\u001b[0m\n\u001b[1;32m     10\u001b[0m \u001b[39mfor\u001b[39;00m i \u001b[39min\u001b[39;00m \u001b[39mrange\u001b[39m(\u001b[39m0\u001b[39m, \u001b[39mlen\u001b[39m(all_recipes), BATCH_SIZE):\n\u001b[1;32m     11\u001b[0m     \u001b[39mprint\u001b[39m(\u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mProcessing recipes \u001b[39m\u001b[39m{\u001b[39;00mi\u001b[39m:\u001b[39;00m\u001b[39m,\u001b[39m\u001b[39m}\u001b[39;00m\u001b[39m to \u001b[39m\u001b[39m{\u001b[39;00mi\u001b[39m+\u001b[39mBATCH_SIZE\u001b[39m:\u001b[39;00m\u001b[39m,\u001b[39m\u001b[39m}\u001b[39;00m\u001b[39m...\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[0;32m---> 12\u001b[0m     outputs \u001b[39m=\u001b[39m llm\u001b[39m.\u001b[39;49mgenerate(all_recipes[i:i\u001b[39m+\u001b[39;49mBATCH_SIZE], sampling_params\u001b[39m=\u001b[39;49msampling_params)\n\u001b[1;32m     14\u001b[0m     all_outputs\u001b[39m.\u001b[39mextend([o\u001b[39m.\u001b[39moutputs[\u001b[39m0\u001b[39m]\u001b[39m.\u001b[39mtext \u001b[39mfor\u001b[39;00m o \u001b[39min\u001b[39;00m outputs])\n\u001b[1;32m     16\u001b[0m end_time \u001b[39m=\u001b[39m time\u001b[39m.\u001b[39mtime()\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py:130\u001b[0m, in \u001b[0;36mLLM.generate\u001b[0;34m(self, prompts, sampling_params, prompt_token_ids, use_tqdm)\u001b[0m\n\u001b[1;32m    128\u001b[0m         token_ids \u001b[39m=\u001b[39m prompt_token_ids[i]\n\u001b[1;32m    129\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_add_request(prompt, sampling_params, token_ids)\n\u001b[0;32m--> 130\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_run_engine(use_tqdm)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py:150\u001b[0m, in \u001b[0;36mLLM._run_engine\u001b[0;34m(self, use_tqdm)\u001b[0m\n\u001b[1;32m    148\u001b[0m outputs: List[RequestOutput] \u001b[39m=\u001b[39m []\n\u001b[1;32m    149\u001b[0m \u001b[39mwhile\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mllm_engine\u001b[39m.\u001b[39mhas_unfinished_requests():\n\u001b[0;32m--> 150\u001b[0m     step_outputs \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mllm_engine\u001b[39m.\u001b[39;49mstep()\n\u001b[1;32m    151\u001b[0m     \u001b[39mfor\u001b[39;00m output \u001b[39min\u001b[39;00m step_outputs:\n\u001b[1;32m    152\u001b[0m         \u001b[39mif\u001b[39;00m output\u001b[39m.\u001b[39mfinished:\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py:313\u001b[0m, in \u001b[0;36mLLMEngine.step\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    307\u001b[0m     \u001b[39mreturn\u001b[39;00m [\n\u001b[1;32m    308\u001b[0m         RequestOutput\u001b[39m.\u001b[39mfrom_seq_group(seq_group)\n\u001b[1;32m    309\u001b[0m         \u001b[39mfor\u001b[39;00m seq_group \u001b[39min\u001b[39;00m scheduler_outputs\u001b[39m.\u001b[39mignored_seq_groups\n\u001b[1;32m    310\u001b[0m     ]\n\u001b[1;32m    312\u001b[0m \u001b[39m# Execute the model.\u001b[39;00m\n\u001b[0;32m--> 313\u001b[0m output \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_run_workers(\n\u001b[1;32m    314\u001b[0m     \u001b[39m\"\u001b[39;49m\u001b[39mexecute_model\u001b[39;49m\u001b[39m\"\u001b[39;49m,\n\u001b[1;32m    315\u001b[0m     seq_group_metadata_list\u001b[39m=\u001b[39;49mseq_group_metadata_list,\n\u001b[1;32m    316\u001b[0m     blocks_to_swap_in\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_swap_in,\n\u001b[1;32m    317\u001b[0m     blocks_to_swap_out\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_swap_out,\n\u001b[1;32m    318\u001b[0m     blocks_to_copy\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_copy,\n\u001b[1;32m    319\u001b[0m )\n\u001b[1;32m    320\u001b[0m \u001b[39m# Update the scheduler with the model outputs.\u001b[39;00m\n\u001b[1;32m    321\u001b[0m seq_groups \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mscheduler\u001b[39m.\u001b[39mupdate(output)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py:470\u001b[0m, in \u001b[0;36mLLMEngine._run_workers\u001b[0;34m(self, method, get_all_outputs, *args, **kwargs)\u001b[0m\n\u001b[1;32m    467\u001b[0m     \u001b[39melse\u001b[39;00m:\n\u001b[1;32m    468\u001b[0m         executor \u001b[39m=\u001b[39m \u001b[39mgetattr\u001b[39m(worker, method)\n\u001b[0;32m--> 470\u001b[0m     output \u001b[39m=\u001b[39m executor(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m    471\u001b[0m     all_outputs\u001b[39m.\u001b[39mappend(output)\n\u001b[1;32m    473\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mparallel_config\u001b[39m.\u001b[39mworker_use_ray:\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py:115\u001b[0m, in \u001b[0;36mcontext_decorator.<locals>.decorate_context\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    112\u001b[0m \u001b[39m@functools\u001b[39m\u001b[39m.\u001b[39mwraps(func)\n\u001b[1;32m    113\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mdecorate_context\u001b[39m(\u001b[39m*\u001b[39margs, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs):\n\u001b[1;32m    114\u001b[0m     \u001b[39mwith\u001b[39;00m ctx_factory():\n\u001b[0;32m--> 115\u001b[0m         \u001b[39mreturn\u001b[39;00m func(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py:293\u001b[0m, in \u001b[0;36mWorker.execute_model\u001b[0;34m(self, seq_group_metadata_list, blocks_to_swap_in, blocks_to_swap_out, blocks_to_copy)\u001b[0m\n\u001b[1;32m    289\u001b[0m input_tokens, input_positions, input_metadata \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_prepare_inputs(\n\u001b[1;32m    290\u001b[0m     seq_group_metadata_list)\n\u001b[1;32m    292\u001b[0m \u001b[39m# Execute the model.\u001b[39;00m\n\u001b[0;32m--> 293\u001b[0m output \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mmodel(\n\u001b[1;32m    294\u001b[0m     input_ids\u001b[39m=\u001b[39;49minput_tokens,\n\u001b[1;32m    295\u001b[0m     positions\u001b[39m=\u001b[39;49minput_positions,\n\u001b[1;32m    296\u001b[0m     kv_caches\u001b[39m=\u001b[39;49m\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mgpu_cache,\n\u001b[1;32m    297\u001b[0m     input_metadata\u001b[39m=\u001b[39;49minput_metadata,\n\u001b[1;32m    298\u001b[0m     cache_events\u001b[39m=\u001b[39;49mcache_events,\n\u001b[1;32m    299\u001b[0m )\n\u001b[1;32m    300\u001b[0m \u001b[39mreturn\u001b[39;00m output\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1501\u001b[0m, in \u001b[0;36mModule._call_impl\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m   1496\u001b[0m \u001b[39m# If we don't have any hooks, we want to skip the rest of the logic in\u001b[39;00m\n\u001b[1;32m   1497\u001b[0m \u001b[39m# this function, and just call forward.\u001b[39;00m\n\u001b[1;32m   1498\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m (\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_pre_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_pre_hooks\n\u001b[1;32m   1499\u001b[0m         \u001b[39mor\u001b[39;00m _global_backward_pre_hooks \u001b[39mor\u001b[39;00m _global_backward_hooks\n\u001b[1;32m   1500\u001b[0m         \u001b[39mor\u001b[39;00m _global_forward_hooks \u001b[39mor\u001b[39;00m _global_forward_pre_hooks):\n\u001b[0;32m-> 1501\u001b[0m     \u001b[39mreturn\u001b[39;00m forward_call(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m   1502\u001b[0m \u001b[39m# Do not call functions when jit is used\u001b[39;00m\n\u001b[1;32m   1503\u001b[0m full_backward_hooks, non_full_backward_hooks \u001b[39m=\u001b[39m [], []\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py:255\u001b[0m, in \u001b[0;36mLlamaForCausalLM.forward\u001b[0;34m(self, input_ids, positions, kv_caches, input_metadata, cache_events)\u001b[0m\n\u001b[1;32m    245\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mforward\u001b[39m(\n\u001b[1;32m    246\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m    247\u001b[0m     input_ids: torch\u001b[39m.\u001b[39mTensor,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    251\u001b[0m     cache_events: Optional[List[torch\u001b[39m.\u001b[39mcuda\u001b[39m.\u001b[39mEvent]],\n\u001b[1;32m    252\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m Dict[\u001b[39mint\u001b[39m, SequenceOutputs]:\n\u001b[1;32m    253\u001b[0m     hidden_states \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mmodel(input_ids, positions, kv_caches,\n\u001b[1;32m    254\u001b[0m                                input_metadata, cache_events)\n\u001b[0;32m--> 255\u001b[0m     next_tokens \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49msampler(\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mlm_head\u001b[39m.\u001b[39;49mweight, hidden_states,\n\u001b[1;32m    256\u001b[0m                                input_metadata)\n\u001b[1;32m    257\u001b[0m     \u001b[39mreturn\u001b[39;00m next_tokens\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1501\u001b[0m, in \u001b[0;36mModule._call_impl\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m   1496\u001b[0m \u001b[39m# If we don't have any hooks, we want to skip the rest of the logic in\u001b[39;00m\n\u001b[1;32m   1497\u001b[0m \u001b[39m# this function, and just call forward.\u001b[39;00m\n\u001b[1;32m   1498\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m (\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_pre_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_pre_hooks\n\u001b[1;32m   1499\u001b[0m         \u001b[39mor\u001b[39;00m _global_backward_pre_hooks \u001b[39mor\u001b[39;00m _global_backward_hooks\n\u001b[1;32m   1500\u001b[0m         \u001b[39mor\u001b[39;00m _global_forward_hooks \u001b[39mor\u001b[39;00m _global_forward_pre_hooks):\n\u001b[0;32m-> 1501\u001b[0m     \u001b[39mreturn\u001b[39;00m forward_call(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m   1502\u001b[0m \u001b[39m# Do not call functions when jit is used\u001b[39;00m\n\u001b[1;32m   1503\u001b[0m full_backward_hooks, non_full_backward_hooks \u001b[39m=\u001b[39m [], []\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/sampler.py:44\u001b[0m, in \u001b[0;36mSampler.forward\u001b[0;34m(self, embedding, hidden_states, input_metadata, embedding_bias)\u001b[0m\n\u001b[1;32m     36\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mforward\u001b[39m(\n\u001b[1;32m     37\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m     38\u001b[0m     embedding: torch\u001b[39m.\u001b[39mTensor,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m     42\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m Dict[\u001b[39mint\u001b[39m, SequenceOutputs]:\n\u001b[1;32m     43\u001b[0m     \u001b[39m# Get the hidden states that we use for sampling.\u001b[39;00m\n\u001b[0;32m---> 44\u001b[0m     hidden_states \u001b[39m=\u001b[39m _prune_hidden_states(hidden_states, input_metadata)\n\u001b[1;32m     46\u001b[0m     \u001b[39m# Get the logits for the next tokens.\u001b[39;00m\n\u001b[1;32m     47\u001b[0m     logits \u001b[39m=\u001b[39m torch\u001b[39m.\u001b[39mmatmul(hidden_states, embedding\u001b[39m.\u001b[39mt())\n",
-      "\u001b[0;31mKeyboardInterrupt\u001b[0m: "
-     ]
-    }
-   ],
-   "source": [
-    "# We'll process our recipes in batches of 10,000.\n",
-    "\n",
-    "import time\n",
-    "\n",
-    "BATCH_SIZE = 10000\n",
-    "all_outputs = []\n",
-    "\n",
-    "start_time = time.time()\n",
-    "print(f\"Start time: {start_time}\")\n",
-    "for i in range(0, len(all_recipes), BATCH_SIZE):\n",
-    "    print(f\"Processing recipes {i:,} to {min(i + BATCH_SIZE, len(all_recipes)):,}...\")\n",
-    "    outputs = llm.generate(\n",
-    "        all_recipes[i : i + BATCH_SIZE], sampling_params=sampling_params\n",
-    "    )\n",
-    "\n",
-    "    all_outputs.extend([o.outputs[0].text for o in outputs])\n",
-    "\n",
-    "end_time = time.time()\n",
-    "print(f\"End time: {end_time}\")\n",
-    "print(f\"Total hours: {((end_time - start_time) / 3600):.2f}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Nice! I've processed all 2,147,248 recipes in under 17 hours. Let's compare the cost with GPT-3.5 and GPT-4. I'll base the OpenAI estimates on the average token counts from the 5,000 samples used to generate our model's training data."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 19,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Model</th>\n",
-       "      <th>Cost to Classify One Recipe</th>\n",
-       "      <th>Cost to Classify Entire Dataset</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Llama 2 7B (finetuned)</td>\n",
-       "      <td>0.000009</td>\n",
-       "      <td>18.86</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>GPT-3.5</td>\n",
-       "      <td>0.000481</td>\n",
-       "      <td>1,033.26</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>GPT-3.5 (finetuned)</td>\n",
-       "      <td>0.004044</td>\n",
-       "      <td>8,683.47</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>GPT-4</td>\n",
-       "      <td>0.010800</td>\n",
-       "      <td>23,190.28</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "                    Model  Cost to Classify One Recipe  \\\n",
-       "0  Llama 2 7B (finetuned)                     0.000009   \n",
-       "1                 GPT-3.5                     0.000481   \n",
-       "2     GPT-3.5 (finetuned)                     0.004044   \n",
-       "3                   GPT-4                     0.010800   \n",
-       "\n",
-       "  Cost to Classify Entire Dataset  \n",
-       "0                           18.86  \n",
-       "1                        1,033.26  \n",
-       "2                        8,683.47  \n",
-       "3                       23,190.28  "
-      ]
-     },
-     "execution_count": 19,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "\n",
-    "# I used an on-demand Nvidia L40 on RunPod for this, at an hourly cost of $1.14.\n",
-    "finetuned_hourly_cost = 1.14\n",
-    "\n",
-    "finetuned_total_hours = 16.54\n",
-    "\n",
-    "finetuned_avg_cost = finetuned_hourly_cost * finetuned_total_hours / len(all_recipes)\n",
-    "\n",
-    "# Average input and output token counts reported by OpenAI for the 5,000 recipes I sent them\n",
-    "avg_input_tokens = 276\n",
-    "avg_output_tokens = 42\n",
-    "\n",
-    "# Token pricing from https://openai.com/pricing\n",
-    "gpt_4_avg_cost = avg_input_tokens * 0.03 / 1000 + avg_output_tokens * 0.06 / 1000\n",
-    "\n",
-    "gpt_35_avg_cost = avg_input_tokens * 0.0015 / 1000 + avg_output_tokens * 0.0016 / 1000\n",
-    "\n",
-    "gpt_35_finetuned_avg_cost = (\n",
-    "    avg_input_tokens * 0.012 / 1000 + avg_output_tokens * 0.016 / 1000 + 0.06 / 1000\n",
-    ")\n",
-    "\n",
-    "# Multiply the number of recipes\n",
-    "# gpt_4_cost = len(all_recipes) * gpt_4_avg_cost\n",
-    "# gpt_35_cost = len(all_recipes) * gpt_35_avg_cost\n",
-    "# gpt_35_finetuned_cost = len(all_recipes) * gpt_35_finetuned_avg_cost\n",
-    "\n",
-    "# Let's put this in a dataframe for easier comparison.\n",
-    "\n",
-    "costs = pd.DataFrame(\n",
-    "    {\n",
-    "        \"Model\": [\n",
-    "            \"Llama 2 7B (finetuned)\",\n",
-    "            \"GPT-3.5\",\n",
-    "            \"GPT-3.5 (finetuned)\",\n",
-    "            \"GPT-4\",\n",
-    "        ],\n",
-    "        \"Cost to Classify One Recipe\": [\n",
-    "            finetuned_avg_cost,\n",
-    "            gpt_35_avg_cost,\n",
-    "            gpt_35_finetuned_avg_cost,\n",
-    "            gpt_4_avg_cost,\n",
-    "        ],\n",
-    "    }\n",
-    ")\n",
-    "\n",
-    "costs[\"Cost to Classify Entire Dataset\"] = (\n",
-    "    costs[\"Cost to Classify One Recipe\"] * len(all_recipes)\n",
-    ").map(lambda x: f\"{x:,.2f}\")\n",
-    "\n",
-    "\n",
-    "costs\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "...and just for fun, let's figure out how many recipes my pescatarian basement-dwelling brother can make! 😂"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
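As a sanity check on the cost table in the notebook above, here is a minimal sketch that redoes the arithmetic in plain Python, without pandas. The numbers are the ones used in the notebook; the helper `per_recipe_cost` and the constant names are illustrative and do not appear in the original.

```python
# Sanity check of the per-recipe cost arithmetic.
# Names here are illustrative; only the numbers come from the notebook.

NUM_RECIPES = 2_147_248  # dataset size printed by the notebook

# Self-hosted fine-tuned Llama 2 7B: GPU rental cost spread over every recipe.
GPU_HOURLY_COST = 1.14  # on-demand Nvidia L40 on RunPod, USD per hour
GPU_TOTAL_HOURS = 16.54
llama_cost = GPU_HOURLY_COST * GPU_TOTAL_HOURS / NUM_RECIPES

# Average token counts per request, as used in the notebook.
AVG_INPUT_TOKENS = 276
AVG_OUTPUT_TOKENS = 42


def per_recipe_cost(input_price_per_1k, output_price_per_1k, extra=0.0):
    """Average USD cost of one request, given per-1K-token prices."""
    return (
        AVG_INPUT_TOKENS * input_price_per_1k / 1000
        + AVG_OUTPUT_TOKENS * output_price_per_1k / 1000
        + extra
    )


# Prices are the per-1K-token values hard-coded in the notebook.
costs = {
    "Llama 2 7B (finetuned)": llama_cost,
    "GPT-3.5": per_recipe_cost(0.0015, 0.0016),
    "GPT-3.5 (finetuned)": per_recipe_cost(0.012, 0.016, extra=0.06 / 1000),
    "GPT-4": per_recipe_cost(0.03, 0.06),
}

for model, cost in costs.items():
    total = cost * NUM_RECIPES
    print(f"{model:<24} {cost:.6f} per recipe, {total:>10,.2f} total")
```

With these inputs, the GPT-4 estimate comes out roughly three orders of magnitude above the self-hosted model, which matches the table in the notebook output.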
--- a/examples/classify-recipes/README.md
+++ b/examples/classify-recipes/README.md
@@ -1,10 +0,0 @@
-# OpenPipe demo: fine-tuning your own model
-
-Hi there! This directory gives you a brief overview of how to fine-tune a competitive model from start to finish. Review the notebooks here in the following order:
-
-1. [./generate-data.ipynb](./generate-data.ipynb): Demonstrates how to generate a sample dataset of GPT-4 completions, store it using OpenPipe, and then export it in a format suitable for training a model.
-2. [./train.ipynb](./train.ipynb): Trains a Llama 2 7B model on the dataset from step (1).
-3. [./evaluate.ipynb](./evaluate.ipynb): Evaluates the model we trained using a special test set that we set aside in step (1).
-4. [./benchmark.ipynb](./benchmark.ipynb): Compares costs and completion latencies between our fine-tuned model, GPT-3.5, and GPT-4.
-
-If you want to follow along yourself, I recommend using [RunPod](https://www.runpod.io/). The training scripts we use will run on any of their GPUs with 24 GB of VRAM or more.
--- a/examples/classify-recipes/init.py
+++ b/examples/classify-recipes/init.py
--- a/examples/classify-recipes/benchmark.ipynb
+++ b/examples/classify-recipes/benchmark.ipynb
@@ -1,432 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "I'm pretty happy with my model's accuracy relative to GPT-4. How does it compare cost-wise?\n",
-    "\n",
-    "I'll really push this to its limits -- let's see how quickly our poor model can classify the [full 2-million-recipe dataset](https://huggingface.co/datasets/corbt/all-recipes) 😈."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Requirement already satisfied: datasets==2.14.4 in /usr/local/lib/python3.10/dist-packages (2.14.4)\n",
-      "Requirement already satisfied: vllm==0.1.3 in /usr/local/lib/python3.10/dist-packages (0.1.3)\n",
-      "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (1.24.4)\n",
-      "Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (12.0.1)\n",
-      "Requirement already satisfied: dill<0.3.8,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.3.7)\n",
-      "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2.0.3)\n",
-      "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2.28.1)\n",
-      "Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (4.66.1)\n",
-      "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (3.3.0)\n",
-      "Requirement already satisfied: multiprocess in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.70.15)\n",
-      "Requirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (2023.6.0)\n",
-      "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (3.8.5)\n",
-      "Requirement already satisfied: huggingface-hub<1.0.0,>=0.14.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (0.16.4)\n",
-      "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (23.1)\n",
-      "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.14.4) (6.0)\n",
-      "Requirement already satisfied: ninja in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (5.9.5)\n",
-      "Requirement already satisfied: ray>=2.5.1 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.6.3)\n",
-      "Requirement already satisfied: sentencepiece in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.1.99)\n",
-      "Requirement already satisfied: torch>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.0.1+cu118)\n",
-      "Requirement already satisfied: transformers>=4.31.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (4.33.0.dev0)\n",
-      "Requirement already satisfied: xformers>=0.0.19 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.0.21)\n",
-      "Requirement already satisfied: fastapi in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.101.1)\n",
-      "Requirement already satisfied: uvicorn in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.23.2)\n",
-      "Requirement already satisfied: pydantic<2 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.10.12)\n",
-      "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (23.1.0)\n",
-      "Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (2.1.1)\n",
-      "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (6.0.4)\n",
-      "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (4.0.3)\n",
-      "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.9.2)\n",
-      "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.4.0)\n",
-      "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.14.4) (1.3.1)\n",
-      "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.4) (3.9.0)\n",
-      "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets==2.14.4) (4.7.1)\n",
-      "Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (8.1.7)\n",
-      "Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.18.0)\n",
-      "Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.0.5)\n",
-      "Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.24.1)\n",
-      "Requirement already satisfied: grpcio>=1.42.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.57.0)\n",
-      "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (3.4)\n",
-      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (1.26.13)\n",
-      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.14.4) (2022.12.7)\n",
-      "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.0)\n",
-      "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.1.2)\n",
-      "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (2.0.0)\n",
-      "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (3.25.0)\n",
-      "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (15.0.7)\n",
-      "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (2023.8.8)\n",
-      "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.13.3)\n",
-      "Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.3.2)\n",
-      "Requirement already satisfied: starlette<0.28.0,>=0.27.0 in /usr/local/lib/python3.10/dist-packages (from fastapi->vllm==0.1.3) (0.27.0)\n",
-      "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2.8.2)\n",
-      "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2023.3)\n",
-      "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.14.4) (2023.3)\n",
-      "Requirement already satisfied: h11>=0.8 in /usr/local/lib/python3.10/dist-packages (from uvicorn->vllm==0.1.3) (0.14.0)\n",
-      "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas->datasets==2.14.4) (1.16.0)\n",
-      "Requirement already satisfied: anyio<5,>=3.4.0 in /usr/local/lib/python3.10/dist-packages (from starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (3.7.1)\n",
-      "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=2.0.0->vllm==0.1.3) (2.1.2)\n",
-      "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (2023.6.1)\n",
-      "Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.29.1)\n",
-      "Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.8.10)\n",
-      "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=2.0.0->vllm==0.1.3) (1.2.1)\n",
-      "Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.3.0)\n",
-      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.1.2)\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
-      "\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.2.1\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython -m pip install --upgrade pip\u001b[0m\n",
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
-   "source": [
-    "%pip install datasets==2.14.4 vllm==0.1.3"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Number of recipes: 2,147,248\n"
-     ]
-    }
-   ],
-   "source": [
-    "from datasets import load_dataset\n",
-    "\n",
-    "all_recipes = load_dataset(\"corbt/all-recipes\")[\"train\"][\"input\"]\n",
-    "\n",
-    "print(f\"Number of recipes: {len(all_recipes):,}\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "INFO 08-24 19:38:29 llm_engine.py:70] Initializing an LLM engine with config: model='./models/run1/merged', tokenizer='./models/run1/merged', tokenizer_mode=auto, trust_remote_code=False, dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)\n",
-      "INFO 08-24 19:39:48 llm_engine.py:196] # GPU blocks: 3419, # CPU blocks: 512\n"
-     ]
-    }
-   ],
-   "source": [
-    "from vllm import LLM, SamplingParams\n",
-    "\n",
-    "llm = LLM(model=\"./models/run1/merged\", max_num_batched_tokens=4096)\n",
-    "\n",
-    "sampling_params = SamplingParams(\n",
-    "    # 120 should be fine for the work we're doing here.\n",
-    "    max_tokens=120,\n",
-    "    # This is a deterministic task so temperature=0 is best.\n",
-    "    temperature=0,\n",
-    ")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Start time: 1692906050.3340027\n",
-      "Processing recipes 0 to 10,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:51<00:00, 34.30it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 10,000 to 20,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:54<00:00, 33.98it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 20,000 to 30,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:53<00:00, 34.11it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 30,000 to 40,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 10000/10000 [04:53<00:00, 34.11it/s]\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Processing recipes 40,000 to 50,000...\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts:  48%|████▊     | 4796/10000 [02:21<03:18, 26.22it/s]"
-     ]
-    },
-    {
-     "ename": "KeyboardInterrupt",
-     "evalue": "",
-     "output_type": "error",
-     "traceback": [
-      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
-      "\u001b[0;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
-      "Cell \u001b[0;32mIn[6], line 12\u001b[0m\n\u001b[1;32m     10\u001b[0m \u001b[39mfor\u001b[39;00m i \u001b[39min\u001b[39;00m \u001b[39mrange\u001b[39m(\u001b[39m0\u001b[39m, \u001b[39mlen\u001b[39m(all_recipes), BATCH_SIZE):\n\u001b[1;32m     11\u001b[0m     \u001b[39mprint\u001b[39m(\u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mProcessing recipes \u001b[39m\u001b[39m{\u001b[39;00mi\u001b[39m:\u001b[39;00m\u001b[39m,\u001b[39m\u001b[39m}\u001b[39;00m\u001b[39m to \u001b[39m\u001b[39m{\u001b[39;00mi\u001b[39m+\u001b[39mBATCH_SIZE\u001b[39m:\u001b[39;00m\u001b[39m,\u001b[39m\u001b[39m}\u001b[39;00m\u001b[39m...\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[0;32m---> 12\u001b[0m     outputs \u001b[39m=\u001b[39m llm\u001b[39m.\u001b[39;49mgenerate(all_recipes[i:i\u001b[39m+\u001b[39;49mBATCH_SIZE], sampling_params\u001b[39m=\u001b[39;49msampling_params)\n\u001b[1;32m     14\u001b[0m     all_outputs\u001b[39m.\u001b[39mextend([o\u001b[39m.\u001b[39moutputs[\u001b[39m0\u001b[39m]\u001b[39m.\u001b[39mtext \u001b[39mfor\u001b[39;00m o \u001b[39min\u001b[39;00m outputs])\n\u001b[1;32m     16\u001b[0m end_time \u001b[39m=\u001b[39m time\u001b[39m.\u001b[39mtime()\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py:130\u001b[0m, in \u001b[0;36mLLM.generate\u001b[0;34m(self, prompts, sampling_params, prompt_token_ids, use_tqdm)\u001b[0m\n\u001b[1;32m    128\u001b[0m         token_ids \u001b[39m=\u001b[39m prompt_token_ids[i]\n\u001b[1;32m    129\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_add_request(prompt, sampling_params, token_ids)\n\u001b[0;32m--> 130\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_run_engine(use_tqdm)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py:150\u001b[0m, in \u001b[0;36mLLM._run_engine\u001b[0;34m(self, use_tqdm)\u001b[0m\n\u001b[1;32m    148\u001b[0m outputs: List[RequestOutput] \u001b[39m=\u001b[39m []\n\u001b[1;32m    149\u001b[0m \u001b[39mwhile\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mllm_engine\u001b[39m.\u001b[39mhas_unfinished_requests():\n\u001b[0;32m--> 150\u001b[0m     step_outputs \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mllm_engine\u001b[39m.\u001b[39;49mstep()\n\u001b[1;32m    151\u001b[0m     \u001b[39mfor\u001b[39;00m output \u001b[39min\u001b[39;00m step_outputs:\n\u001b[1;32m    152\u001b[0m         \u001b[39mif\u001b[39;00m output\u001b[39m.\u001b[39mfinished:\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py:313\u001b[0m, in \u001b[0;36mLLMEngine.step\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    307\u001b[0m     \u001b[39mreturn\u001b[39;00m [\n\u001b[1;32m    308\u001b[0m         RequestOutput\u001b[39m.\u001b[39mfrom_seq_group(seq_group)\n\u001b[1;32m    309\u001b[0m         \u001b[39mfor\u001b[39;00m seq_group \u001b[39min\u001b[39;00m scheduler_outputs\u001b[39m.\u001b[39mignored_seq_groups\n\u001b[1;32m    310\u001b[0m     ]\n\u001b[1;32m    312\u001b[0m \u001b[39m# Execute the model.\u001b[39;00m\n\u001b[0;32m--> 313\u001b[0m output \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_run_workers(\n\u001b[1;32m    314\u001b[0m     \u001b[39m\"\u001b[39;49m\u001b[39mexecute_model\u001b[39;49m\u001b[39m\"\u001b[39;49m,\n\u001b[1;32m    315\u001b[0m     seq_group_metadata_list\u001b[39m=\u001b[39;49mseq_group_metadata_list,\n\u001b[1;32m    316\u001b[0m     blocks_to_swap_in\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_swap_in,\n\u001b[1;32m    317\u001b[0m     blocks_to_swap_out\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_swap_out,\n\u001b[1;32m    318\u001b[0m     blocks_to_copy\u001b[39m=\u001b[39;49mscheduler_outputs\u001b[39m.\u001b[39;49mblocks_to_copy,\n\u001b[1;32m    319\u001b[0m )\n\u001b[1;32m    320\u001b[0m \u001b[39m# Update the scheduler with the model outputs.\u001b[39;00m\n\u001b[1;32m    321\u001b[0m seq_groups \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mscheduler\u001b[39m.\u001b[39mupdate(output)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py:470\u001b[0m, in \u001b[0;36mLLMEngine._run_workers\u001b[0;34m(self, method, get_all_outputs, *args, **kwargs)\u001b[0m\n\u001b[1;32m    467\u001b[0m     \u001b[39melse\u001b[39;00m:\n\u001b[1;32m    468\u001b[0m         executor \u001b[39m=\u001b[39m \u001b[39mgetattr\u001b[39m(worker, method)\n\u001b[0;32m--> 470\u001b[0m     output \u001b[39m=\u001b[39m executor(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m    471\u001b[0m     all_outputs\u001b[39m.\u001b[39mappend(output)\n\u001b[1;32m    473\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mparallel_config\u001b[39m.\u001b[39mworker_use_ray:\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py:115\u001b[0m, in \u001b[0;36mcontext_decorator.<locals>.decorate_context\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    112\u001b[0m \u001b[39m@functools\u001b[39m\u001b[39m.\u001b[39mwraps(func)\n\u001b[1;32m    113\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mdecorate_context\u001b[39m(\u001b[39m*\u001b[39margs, \u001b[39m*\u001b[39m\u001b[39m*\u001b[39mkwargs):\n\u001b[1;32m    114\u001b[0m     \u001b[39mwith\u001b[39;00m ctx_factory():\n\u001b[0;32m--> 115\u001b[0m         \u001b[39mreturn\u001b[39;00m func(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py:293\u001b[0m, in \u001b[0;36mWorker.execute_model\u001b[0;34m(self, seq_group_metadata_list, blocks_to_swap_in, blocks_to_swap_out, blocks_to_copy)\u001b[0m\n\u001b[1;32m    289\u001b[0m input_tokens, input_positions, input_metadata \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_prepare_inputs(\n\u001b[1;32m    290\u001b[0m     seq_group_metadata_list)\n\u001b[1;32m    292\u001b[0m \u001b[39m# Execute the model.\u001b[39;00m\n\u001b[0;32m--> 293\u001b[0m output \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mmodel(\n\u001b[1;32m    294\u001b[0m     input_ids\u001b[39m=\u001b[39;49minput_tokens,\n\u001b[1;32m    295\u001b[0m     positions\u001b[39m=\u001b[39;49minput_positions,\n\u001b[1;32m    296\u001b[0m     kv_caches\u001b[39m=\u001b[39;49m\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mgpu_cache,\n\u001b[1;32m    297\u001b[0m     input_metadata\u001b[39m=\u001b[39;49minput_metadata,\n\u001b[1;32m    298\u001b[0m     cache_events\u001b[39m=\u001b[39;49mcache_events,\n\u001b[1;32m    299\u001b[0m )\n\u001b[1;32m    300\u001b[0m \u001b[39mreturn\u001b[39;00m output\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1501\u001b[0m, in \u001b[0;36mModule._call_impl\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m   1496\u001b[0m \u001b[39m# If we don't have any hooks, we want to skip the rest of the logic in\u001b[39;00m\n\u001b[1;32m   1497\u001b[0m \u001b[39m# this function, and just call forward.\u001b[39;00m\n\u001b[1;32m   1498\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m (\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_pre_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_pre_hooks\n\u001b[1;32m   1499\u001b[0m         \u001b[39mor\u001b[39;00m _global_backward_pre_hooks \u001b[39mor\u001b[39;00m _global_backward_hooks\n\u001b[1;32m   1500\u001b[0m         \u001b[39mor\u001b[39;00m _global_forward_hooks \u001b[39mor\u001b[39;00m _global_forward_pre_hooks):\n\u001b[0;32m-> 1501\u001b[0m     \u001b[39mreturn\u001b[39;00m forward_call(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m   1502\u001b[0m \u001b[39m# Do not call functions when jit is used\u001b[39;00m\n\u001b[1;32m   1503\u001b[0m full_backward_hooks, non_full_backward_hooks \u001b[39m=\u001b[39m [], []\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py:255\u001b[0m, in \u001b[0;36mLlamaForCausalLM.forward\u001b[0;34m(self, input_ids, positions, kv_caches, input_metadata, cache_events)\u001b[0m\n\u001b[1;32m    245\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mforward\u001b[39m(\n\u001b[1;32m    246\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m    247\u001b[0m     input_ids: torch\u001b[39m.\u001b[39mTensor,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    251\u001b[0m     cache_events: Optional[List[torch\u001b[39m.\u001b[39mcuda\u001b[39m.\u001b[39mEvent]],\n\u001b[1;32m    252\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m Dict[\u001b[39mint\u001b[39m, SequenceOutputs]:\n\u001b[1;32m    253\u001b[0m     hidden_states \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mmodel(input_ids, positions, kv_caches,\n\u001b[1;32m    254\u001b[0m                                input_metadata, cache_events)\n\u001b[0;32m--> 255\u001b[0m     next_tokens \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49msampler(\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mlm_head\u001b[39m.\u001b[39;49mweight, hidden_states,\n\u001b[1;32m    256\u001b[0m                                input_metadata)\n\u001b[1;32m    257\u001b[0m     \u001b[39mreturn\u001b[39;00m next_tokens\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1501\u001b[0m, in \u001b[0;36mModule._call_impl\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m   1496\u001b[0m \u001b[39m# If we don't have any hooks, we want to skip the rest of the logic in\u001b[39;00m\n\u001b[1;32m   1497\u001b[0m \u001b[39m# this function, and just call forward.\u001b[39;00m\n\u001b[1;32m   1498\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m (\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_backward_pre_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_hooks \u001b[39mor\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_forward_pre_hooks\n\u001b[1;32m   1499\u001b[0m         \u001b[39mor\u001b[39;00m _global_backward_pre_hooks \u001b[39mor\u001b[39;00m _global_backward_hooks\n\u001b[1;32m   1500\u001b[0m         \u001b[39mor\u001b[39;00m _global_forward_hooks \u001b[39mor\u001b[39;00m _global_forward_pre_hooks):\n\u001b[0;32m-> 1501\u001b[0m     \u001b[39mreturn\u001b[39;00m forward_call(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n\u001b[1;32m   1502\u001b[0m \u001b[39m# Do not call functions when jit is used\u001b[39;00m\n\u001b[1;32m   1503\u001b[0m full_backward_hooks, non_full_backward_hooks \u001b[39m=\u001b[39m [], []\n",
-      "File \u001b[0;32m/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/sampler.py:44\u001b[0m, in \u001b[0;36mSampler.forward\u001b[0;34m(self, embedding, hidden_states, input_metadata, embedding_bias)\u001b[0m\n\u001b[1;32m     36\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mforward\u001b[39m(\n\u001b[1;32m     37\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m     38\u001b[0m     embedding: torch\u001b[39m.\u001b[39mTensor,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m     42\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m Dict[\u001b[39mint\u001b[39m, SequenceOutputs]:\n\u001b[1;32m     43\u001b[0m     \u001b[39m# Get the hidden states that we use for sampling.\u001b[39;00m\n\u001b[0;32m---> 44\u001b[0m     hidden_states \u001b[39m=\u001b[39m _prune_hidden_states(hidden_states, input_metadata)\n\u001b[1;32m     46\u001b[0m     \u001b[39m# Get the logits for the next tokens.\u001b[39;00m\n\u001b[1;32m     47\u001b[0m     logits \u001b[39m=\u001b[39m torch\u001b[39m.\u001b[39mmatmul(hidden_states, embedding\u001b[39m.\u001b[39mt())\n",
-      "\u001b[0;31mKeyboardInterrupt\u001b[0m: "
-     ]
-    }
-   ],
-   "source": [
-    "# We'll process our recipes in batches of 10,000.\n",
-    "\n",
-    "import time\n",
-    "\n",
-    "BATCH_SIZE = 10000\n",
-    "all_outputs = []\n",
-    "\n",
-    "start_time = time.time()\n",
-    "print(f\"Start time: {start_time}\")\n",
-    "for i in range(0, len(all_recipes), BATCH_SIZE):\n",
-    "    print(f\"Processing recipes {i:,} to {min(i + BATCH_SIZE, len(all_recipes)):,}...\")\n",
-    "    outputs = llm.generate(\n",
-    "        all_recipes[i : i + BATCH_SIZE], sampling_params=sampling_params\n",
-    "    )\n",
-    "\n",
-    "    all_outputs.extend([o.outputs[0].text for o in outputs])\n",
-    "\n",
-    "end_time = time.time()\n",
-    "print(f\"End time: {end_time}\")\n",
-    "print(f\"Total hours: {((end_time - start_time) / 3600):.2f}\")\n",
-    "\n",
-    "# I ended up running this in a separate script so it could keep running in the background.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Nice! I've processed all 2,147,248 recipes in under 17 hours. Let's do a cost comparison with GPT-3.5 and GPT-4. I'll use the GPT-4 latency/cost numbers based on the 5000 samples used to generate our model's training data."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 19,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Model</th>\n",
-       "      <th>Cost to Classify One Recipe</th>\n",
-       "      <th>Cost to Classify Entire Dataset</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Llama 2 7B (finetuned)</td>\n",
-       "      <td>0.000009</td>\n",
-       "      <td>18.86</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>GPT-3.5</td>\n",
-       "      <td>0.000481</td>\n",
-       "      <td>1,033.26</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>GPT-3.5 (finetuned)</td>\n",
-       "      <td>0.004044</td>\n",
-       "      <td>8,683.47</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>GPT-4</td>\n",
-       "      <td>0.010800</td>\n",
-       "      <td>23,190.28</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "                    Model  Cost to Classify One Recipe  \\\n",
-       "0  Llama 2 7B (finetuned)                     0.000009   \n",
-       "1                 GPT-3.5                     0.000481   \n",
-       "2     GPT-3.5 (finetuned)                     0.004044   \n",
-       "3                   GPT-4                     0.010800   \n",
-       "\n",
-       "  Cost to Classify Entire Dataset  \n",
-       "0                           18.86  \n",
-       "1                        1,033.26  \n",
-       "2                        8,683.47  \n",
-       "3                       23,190.28  "
-      ]
-     },
-     "execution_count": 19,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "\n",
-    "# I used an on-demand Nvidia L40 on RunPod for this, at an hourly cost of $1.14.\n",
-    "finetuned_hourly_cost = 1.14\n",
-    "\n",
-    "finetuned_total_hours = 16.5\n",
-    "\n",
-    "finetuned_avg_cost = finetuned_hourly_cost * finetuned_total_hours / len(all_recipes)\n",
-    "\n",
-    "# The average input and output tokens for OpenAI, based on the 5000 recipes I\n",
-    "# sent them when generating training data.\n",
-    "avg_input_tokens = 276\n",
-    "avg_output_tokens = 42\n",
-    "\n",
-    "# Token pricing from https://openai.com/pricing\n",
-    "gpt_4_avg_cost = avg_input_tokens * 0.03 / 1000 + avg_output_tokens * 0.06 / 1000\n",
-    "\n",
-    "gpt_35_avg_cost = avg_input_tokens * 0.0015 / 1000 + avg_output_tokens * 0.0016 / 1000\n",
-    "\n",
-    "gpt_35_finetuned_avg_cost = (\n",
-    "    avg_input_tokens * 0.012 / 1000 + avg_output_tokens * 0.016 / 1000 + 0.06 / 1000\n",
-    ")\n",
-    "\n",
-    "costs = pd.DataFrame(\n",
-    "    {\n",
-    "        \"Model\": [\n",
-    "            \"Llama 2 7B (finetuned)\",\n",
-    "            \"GPT-3.5\",\n",
-    "            \"GPT-3.5 (finetuned)\",\n",
-    "            \"GPT-4\",\n",
-    "        ],\n",
-    "        \"Cost to Classify One Recipe\": [\n",
-    "            finetuned_avg_cost,\n",
-    "            gpt_35_avg_cost,\n",
-    "            gpt_35_finetuned_avg_cost,\n",
-    "            gpt_4_avg_cost,\n",
-    "        ],\n",
-    "    }\n",
-    ")\n",
-    "\n",
-    "costs[\"Cost to Classify Entire Dataset\"] = (\n",
-    "    costs[\"Cost to Classify One Recipe\"] * len(all_recipes)\n",
-    ").map(lambda x: f\"{x:,.2f}\")\n",
-    "\n",
-    "\n",
-    "costs\n"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
--- a/examples/classify-recipes/evaluate.ipynb
+++ b/examples/classify-recipes/evaluate.ipynb
@@ -1,663 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "I have a model in `./models/run1/merged` that was trained on GPT-4's outputs to classify recipes. I need to figure out whether it does a good job at classifying recipes. I'll install dependencies first."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Requirement already satisfied: vllm==0.1.3 in /usr/local/lib/python3.10/dist-packages (0.1.3)\n",
-      "Requirement already satisfied: pandas==2.0.3 in /usr/local/lib/python3.10/dist-packages (2.0.3)\n",
-      "Requirement already satisfied: ninja in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (5.9.5)\n",
-      "Requirement already satisfied: ray>=2.5.1 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.6.3)\n",
-      "Requirement already satisfied: sentencepiece in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.1.99)\n",
-      "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.24.4)\n",
-      "Requirement already satisfied: torch>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (2.0.1+cu118)\n",
-      "Requirement already satisfied: transformers>=4.31.0 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (4.33.0.dev0)\n",
-      "Requirement already satisfied: xformers>=0.0.19 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.0.21)\n",
-      "Requirement already satisfied: fastapi in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.101.1)\n",
-      "Requirement already satisfied: uvicorn in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (0.23.2)\n",
-      "Requirement already satisfied: pydantic<2 in /usr/local/lib/python3.10/dist-packages (from vllm==0.1.3) (1.10.12)\n",
-      "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas==2.0.3) (2.8.2)\n",
-      "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==2.0.3) (2023.3)\n",
-      "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas==2.0.3) (2023.3)\n",
-      "Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<2->vllm==0.1.3) (4.7.1)\n",
-      "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas==2.0.3) (1.16.0)\n",
-      "Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (8.1.7)\n",
-      "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (3.9.0)\n",
-      "Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.18.0)\n",
-      "Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.0.5)\n",
-      "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (23.1)\n",
-      "Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (4.24.1)\n",
-      "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (6.0)\n",
-      "Requirement already satisfied: aiosignal in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.3.1)\n",
-      "Requirement already satisfied: frozenlist in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.4.0)\n",
-      "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (2.28.1)\n",
-      "Requirement already satisfied: grpcio>=1.42.0 in /usr/local/lib/python3.10/dist-packages (from ray>=2.5.1->vllm==0.1.3) (1.57.0)\n",
-      "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (1.11.1)\n",
-      "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.0)\n",
-      "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (3.1.2)\n",
-      "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=2.0.0->vllm==0.1.3) (2.0.0)\n",
-      "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (3.25.0)\n",
-      "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=2.0.0->vllm==0.1.3) (15.0.7)\n",
-      "Requirement already satisfied: huggingface-hub<1.0,>=0.15.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.16.4)\n",
-      "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (2023.8.8)\n",
-      "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.13.3)\n",
-      "Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (0.3.2)\n",
-      "Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->vllm==0.1.3) (4.66.1)\n",
-      "Requirement already satisfied: starlette<0.28.0,>=0.27.0 in /usr/local/lib/python3.10/dist-packages (from fastapi->vllm==0.1.3) (0.27.0)\n",
-      "Requirement already satisfied: h11>=0.8 in /usr/local/lib/python3.10/dist-packages (from uvicorn->vllm==0.1.3) (0.14.0)\n",
-      "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.15.1->transformers>=4.31.0->vllm==0.1.3) (2023.6.0)\n",
-      "Requirement already satisfied: anyio<5,>=3.4.0 in /usr/local/lib/python3.10/dist-packages (from starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (3.7.1)\n",
-      "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=2.0.0->vllm==0.1.3) (2.1.2)\n",
-      "Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (23.1.0)\n",
-      "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (2023.6.1)\n",
-      "Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.29.1)\n",
-      "Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray>=2.5.1->vllm==0.1.3) (0.8.10)\n",
-      "Requirement already satisfied: charset-normalizer<3,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->ray>=2.5.1->vllm==0.1.3) (2.1.1)\n",
-      "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->ray>=2.5.1->vllm==0.1.3) (3.4)\n",
-      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->ray>=2.5.1->vllm==0.1.3) (1.26.13)\n",
-      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->ray>=2.5.1->vllm==0.1.3) (2022.12.7)\n",
-      "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=2.0.0->vllm==0.1.3) (1.2.1)\n",
-      "Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.3.0)\n",
-      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.3) (1.1.2)\n",
-      "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
-      "\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.2.1\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython3.10 -m pip install --upgrade pip\u001b[0m\n",
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
-   "source": [
-    "%pip install vllm==0.1.3 pandas==2.0.3"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Remember I got a \"test.jsonl\" file from OpenPipe back in [./prepare.ipynb](./prepare.ipynb)? That's data from our dataset that we didn't use in training, so we can use it to check our model's performance."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import pandas as pd\n",
-    "\n",
-    "test_data = pd.read_json(\"./data/test.jsonl\", lines=True)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "During training, Axolotl transformed our data into an instruction/response format known as the \"Alpaca format\", named after [the project that introduced it](https://github.com/tatsu-lab/stanford_alpaca). I need to transform my test data into the same format for best results."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Sample prompt:\n",
-      "--------------\n",
-      "### Instruction:\n",
-      "[{\"role\":\"system\",\"content\":\"Your goal is to classify a recipe along several dimensions.Pay attention to the instructions.\"},{\"role\":\"user\",\"content\":\"Pan Gravy\\n\\nIngredients:\\n- 1/3 cup all purpose flour\\n- 1/3 cup turkey drippings\\n- 3 cup water or broth\\n- 1/8 to 1/4 teaspoon salt\\n- 1/8 tsp pepper\\n\\nDirections:\\n- In a skillet or roasting pan, add flour to drippings; blend well.\\n- Cook over medium heat 2 to 3 minutes until smooth and light brown, stirring constantly.\\n- Add water; cook until mixture boils and thickens, stirring constantly.\\n- Stir in salt and pepper.\\n- *Flour and drippings can be decreased to 1/4 cup each for thinner gravy.\\n- *\"}]\n",
-      "\n",
-      "### Response:\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "from axolotl.prompters import UnpromptedPrompter\n",
-    "\n",
-    "prompter = UnpromptedPrompter()\n",
-    "\n",
-    "\n",
-    "def format_prompt(instruction: str) -> str:\n",
-    "    return next(prompter.build_prompt(instruction))\n",
-    "\n",
-    "\n",
-    "prompts = test_data[\"instruction\"].apply(format_prompt)\n",
-    "\n",
-    "print(f\"Sample prompt:\\n--------------\\n{prompts[0]}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Next up, I'll use [vLLM](https://vllm.readthedocs.io/en/latest/) to efficiently process all the prompts in our test data with our own model."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "INFO 08-25 03:58:49 llm_engine.py:70] Initializing an LLM engine with config: model='./models/run1/merged', tokenizer='./models/run1/merged', tokenizer_mode=auto, trust_remote_code=False, dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)\n",
-      "INFO 08-25 03:59:40 llm_engine.py:196] # GPU blocks: 3419, # CPU blocks: 512\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Processed prompts: 100%|██████████| 500/500 [00:37<00:00, 13.42it/s]"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Sample output:\n",
-      "--------------\n",
-      "{\"role\":\"assistant\",\"content\":null,\"function_call\":{\"name\":\"classify\",\"arguments\":\"{\\n\\\"has_non_fish_meat\\\": true,\\n\\\"requires_oven\\\": false,\\n\\\"requires_stove\\\": true,\\n\\\"cook_time_over_30_mins\\\": false,\\n\\\"main_dish\\\": false\\n}\"}}\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "from vllm import LLM, SamplingParams\n",
-    "\n",
-    "llm = LLM(model=\"./models/run1/merged\", max_num_batched_tokens=4096)\n",
-    "\n",
-    "sampling_params = SamplingParams(\n",
-    "    # 120 should be fine for the work we're doing here.\n",
-    "    max_tokens=120,\n",
-    "    # This is a deterministic task so temperature=0 is best.\n",
-    "    temperature=0,\n",
-    ")\n",
-    "\n",
-    "my_outputs = llm.generate(prompts, sampling_params=sampling_params)\n",
-    "my_outputs = [o.outputs[0].text for o in my_outputs]\n",
-    "\n",
-    "test_data[\"my_outputs\"] = my_outputs\n",
-    "\n",
-    "print(f\"Sample output:\\n--------------\\n{my_outputs[0]}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Ok, we have our outputs! There are 5 categories we classify each recipe on, so let's check what percentage of the time our model's output matches GPT-4's. I'll write a quick eval function for that:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Overall accuracy: 0.95\n"
-     ]
-    }
-   ],
-   "source": [
-    "import json\n",
-    "\n",
-    "\n",
-    "def parse_fn_call(response):\n",
-    "    \"\"\"Parse the function call arguments from the response\"\"\"\n",
-    "    response_dict = json.loads(response)\n",
-    "    args_dict = json.loads(response_dict[\"function_call\"][\"arguments\"])\n",
-    "\n",
-    "    return args_dict\n",
-    "\n",
-    "\n",
-    "def calculate_accuracy(row):\n",
-    "    \"\"\"Calculate the fraction of my model's outputs that match the reference outputs\"\"\"\n",
-    "    true_outputs = parse_fn_call(row[\"output\"])\n",
-    "    my_outputs = parse_fn_call(row[\"my_outputs\"])\n",
-    "\n",
-    "    num_matching_outputs = 0\n",
-    "    for key in true_outputs.keys():\n",
-    "        if key in my_outputs and true_outputs[key] == my_outputs[key]:\n",
-    "            num_matching_outputs += 1\n",
-    "\n",
-    "    return num_matching_outputs / len(true_outputs)\n",
-    "\n",
-    "\n",
-    "test_data[\"accuracy\"] = test_data.apply(calculate_accuracy, axis=1)\n",
-    "\n",
-    "print(f\"Overall accuracy: {test_data['accuracy'].mean():.2f}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Not bad! However, there are still a few rows where the model outputs don't match. Let's take a closer look."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Alligator Sauce Piquant\n",
-      "\n",
-      "Ingredients:\n",
-      "- 2 lb. alligator, boneless and cubed *\n",
-      "- 4 onions, diced\n",
-      "- 1 c. parsley, chopped\n",
-      "- 4 stalks celery, chopped\n",
-      "- 1 bell pepper, diced\n",
-      "- 1 c. catsup\n",
-      "- 2 Tbsp. Heinz steak sauce\n",
-      "- 2 Tbsp. soy sauce\n",
-      "- 2 Tbsp. Louisiana hot sauce\n",
-      "- 2 Tbsp. cornstarch\n",
-      "- 1 tsp. salt\n",
-      "- 2 tsp. red pepper (ground)\n",
-      "- 1/4 c. cooking oil\n",
-      "\n",
-      "Directions:\n",
-      "- *Alligator must be free of all fat; also dark meat is the best (leg and body meat), boneless.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>GPT-4</th>\n",
-       "      <th>My model</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>cook_time_over_30_mins</th>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>main_dish</th>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "                        GPT-4  My model\n",
-       "cook_time_over_30_mins   True     False\n",
-       "main_dish                True     False"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Veggie Casserole\n",
-      "\n",
-      "Ingredients:\n",
-      "- 1 (8 oz.) bag mixed veggies (corn, peas, carrots, green beans), steamed\n",
-      "- 1 c. celery\n",
-      "- 1 c. onions\n",
-      "- 1 c. Cheddar cheese\n",
-      "- 1 c. mayonnaise\n",
-      "\n",
-      "Directions:\n",
-      "- Mix above ingredients.\n",
-      "- Bake at 350° for 30 minutes, until bubbly.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>GPT-4</th>\n",
-       "      <th>My model</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>main_dish</th>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "           GPT-4  My model\n",
-       "main_dish  False      True"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Rhonda'S Butter Chess Pie\n",
-      "\n",
-      "Ingredients:\n",
-      "- 5 eggs\n",
-      "- 1 stick melted butter\n",
-      "- 2 c. sugar\n",
-      "- 1 tsp. vanilla\n",
-      "- 1 Tbsp. cornstarch\n",
-      "- 1/2 c. buttermilk\n",
-      "- unbaked 9-inch deep dish pie shell\n",
-      "\n",
-      "Directions:\n",
-      "- Mix eggs with sugar and cornstarch until smooth.\n",
-      "- Add melted butter, vanilla and buttermilk.\n",
-      "- Bake at 350° for 30 minutes or until done.\n",
-      "- Let cool and chill.\n",
-      "- Similar to Furr's Butter Chess Pie.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>GPT-4</th>\n",
-       "      <th>My model</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>cook_time_over_30_mins</th>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "                        GPT-4  My model\n",
-       "cook_time_over_30_mins  False      True"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Broccoli Gorgonzola Cream Soup\n",
-      "\n",
-      "Ingredients:\n",
-      "- 2 heads Broccoli\n",
-      "- 700 milliliters Water\n",
-      "- 1 Onion, Peeled And Cut Into Chunks\n",
-      "- 1 pinch Salt\n",
-      "- 1 teaspoon Oregano\n",
-      "- 1 Potato, Peeled And Cut Into Chunks\n",
-      "- 200 grams Crumbled Gorgonzola\n",
-      "- 1 Tablespoon Finely Grated Parmesan\n",
-      "\n",
-      "Directions:\n",
-      "- Cut off the hard trunks of the broccoli and cut it into small pieces. Prepare a pot with water, add broccoli, onion, salt and oregano and boil for about 30 minutes.\n",
-      "- Add the peeled potato and boil for another 20 minutes. When vegetables are cooked, strain and save the stock.\n",
-      "- Using a hand blender, puree vegetables, adding as much stock as desired. Bring soup back to heat over low heat, and sir in gorgonzola. Remove from heat and add Parmesan.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>GPT-4</th>\n",
-       "      <th>My model</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>main_dish</th>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "           GPT-4  My model\n",
-       "main_dish  False      True"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Wild Rice With Cucumber And Feta\n",
-      "\n",
-      "Ingredients:\n",
-      "- 1 (8.5-ounce) package precooked wild rice (such as Archer Farms)\n",
-      "- 1 cup diced English cucumber\n",
-      "- 1 1/2 tablespoons olive oil\n",
-      "- 1 tablespoon fresh lemon juice\n",
-      "- 2 ounces crumbled feta cheese\n",
-      "- 1/2 teaspoon pepper\n",
-      "- 1/4 teaspoon salt\n",
-      "\n",
-      "Directions:\n",
-      "- Prepare rice according to the package directions.\n",
-      "- Combine cooked rice, cucumber, olive oil, lemon juice, and crumbled feta cheese in a medium bowl; toss to coat. Stir in pepper and salt.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>GPT-4</th>\n",
-       "      <th>My model</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>main_dish</th>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "           GPT-4  My model\n",
-       "main_dish   True     False"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    }
-   ],
-   "source": [
-    "import numpy as np\n",
-    "\n",
-    "np.random.seed(42)\n",
-    "\n",
-    "for row in test_data[test_data.accuracy < 1].sample(5).itertuples():\n",
-    "    print(json.loads(row.instruction)[1][\"content\"])\n",
-    "\n",
-    "    gpt4_output = parse_fn_call(row.output)\n",
-    "    my_output = parse_fn_call(row.my_outputs)\n",
-    "\n",
-    "    table = pd.DataFrame(\n",
-    "        {\n",
-    "            \"GPT-4\": gpt4_output,\n",
-    "            \"My model\": my_output,\n",
-    "        }\n",
-    "    )\n",
-    "\n",
-    "    table = table[table[\"GPT-4\"] != table[\"My model\"]]\n",
-    "    display(table)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Looking at the outputs, it's clear that our model still makes some mistakes. But at the same time, there are plenty of examples like \"Rhonda's Butter Chess Pie\" where our model gets it right, even though GPT-4 got it wrong! And there are also cases like the \"Veggie Casserole\", where the \"right\" answer is truly ambiguous and really both answers are defensible.\n",
-    "\n",
-    "Interested in cost/latency benchmarking? You can check out [./benchmarking.ipynb](./benchmarking.ipynb) for an overview of my findings!"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  },
-  "orig_nbformat": 4
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
--- a/examples/classify-recipes/generate-data.ipynb
+++ b/examples/classify-recipes/generate-data.ipynb
@@ -1,353 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In this notebook I'm using the OpenPipe client to capture a set of calls to the OpenAI API.\n",
-    "\n",
-    "For this example I'll blithely throw engineering best practices to the wind and use the notebook itself to manage dependencies. 😁"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Requirement already satisfied: openpipe==3.0.3 in /usr/local/lib/python3.10/dist-packages (3.0.3)\n",
-      "Requirement already satisfied: python-dotenv==1.0.0 in /usr/local/lib/python3.10/dist-packages (1.0.0)\n",
-      "Requirement already satisfied: joblib==1.3.2 in /usr/local/lib/python3.10/dist-packages (1.3.2)\n",
-      "Requirement already satisfied: attrs<24.0.0,>=23.1.0 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (23.1.0)\n",
-      "Requirement already satisfied: httpx<0.25.0,>=0.24.1 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (0.24.1)\n",
-      "Requirement already satisfied: openai<0.28.0,>=0.27.8 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (0.27.9)\n",
-      "Requirement already satisfied: python-dateutil<3.0.0,>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (2.8.2)\n",
-      "Requirement already satisfied: toml<0.11.0,>=0.10.2 in /usr/local/lib/python3.10/dist-packages (from openpipe==3.0.3) (0.10.2)\n",
-      "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (2022.12.7)\n",
-      "Requirement already satisfied: httpcore<0.18.0,>=0.15.0 in /usr/local/lib/python3.10/dist-packages (from httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (0.17.3)\n",
-      "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (3.4)\n",
-      "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (1.3.0)\n",
-      "Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->openpipe==3.0.3) (2.28.1)\n",
-      "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->openpipe==3.0.3) (4.66.1)\n",
-      "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->openpipe==3.0.3) (3.8.5)\n",
-      "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.8.2->openpipe==3.0.3) (1.16.0)\n",
-      "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore<0.18.0,>=0.15.0->httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (0.14.0)\n",
-      "Requirement already satisfied: anyio<5.0,>=3.0 in /usr/local/lib/python3.10/dist-packages (from httpcore<0.18.0,>=0.15.0->httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (3.7.1)\n",
-      "Requirement already satisfied: charset-normalizer<3,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (2.1.1)\n",
-      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (1.26.13)\n",
-      "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (6.0.4)\n",
-      "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (4.0.3)\n",
-      "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (1.9.2)\n",
-      "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (1.4.0)\n",
-      "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->openpipe==3.0.3) (1.3.1)\n",
-      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5.0,>=3.0->httpcore<0.18.0,>=0.15.0->httpx<0.25.0,>=0.24.1->openpipe==3.0.3) (1.1.2)\n",
-      "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
-      "\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.2.1\u001b[0m\n",
-      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython3.10 -m pip install --upgrade pip\u001b[0m\n",
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
-   "source": [
-    "%pip install openpipe==3.0.3 python-dotenv==1.0.0 joblib==1.3.2 datasets==2.14.4"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "When working with remote datasets (or any data, really), it's a good idea to visually inspect some samples to make sure it looks like you expect. I'll print a recipe."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Recipe dataset shape:\n",
-      "------------------\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "Dataset({\n",
-       "    features: ['recipe'],\n",
-       "    num_rows: 5000\n",
-       "})"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "First recipe:\n",
-      "------------------ Shrimp Creole\n",
-      "\n",
-      "Ingredients:\n",
-      "- 20 shrimp (8 oz.)\n",
-      "- 2 c. (16 oz. can) tomato sauce\n",
-      "- 1 small onion, chopped\n",
-      "- 1 celery stalk, chopped\n",
-      "- 1/4 green bell pepper, diced\n",
-      "- 1/4 c. sliced mushrooms\n",
-      "- 3 Tbsp. parsley\n",
-      "- 1/2 tsp. pepper\n",
-      "- 1 to 1-1/2 c. brown rice, prepared according to pkg. directions (not included in exchanges)\n",
-      "\n",
-      "Directions:\n",
-      "- Peel, devein and wash shrimp; set aside.\n",
-      "- (If shrimp are frozen, let thaw first in the refrigerator.)\n",
-      "- Simmer tomato sauce, onion, celery, green pepper, mushrooms, parsley and pepper in skillet for 30 minutes.\n",
-      "- Add shrimp and cook 10 to 15 minutes more, until shrimp are tender.\n",
-      "- Serve over brown rice.\n",
-      "- Serves 2.\n"
-     ]
-    }
-   ],
-   "source": [
-    "from datasets import load_dataset\n",
-    "\n",
-    "recipes = load_dataset(\"corbt/unlabeled-recipes\")[\"train\"]\n",
-    "print(\"Recipe dataset shape:\\n------------------\")\n",
-    "display(recipes)\n",
-    "print(\"First recipe:\\n------------------\", recipes[\"recipe\"][0])\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Mm, delicious. Anyway, we need to generate a training dataset. We'll call GPT-4 on each of our examples.\n",
-    "\n",
-    "In this case, I'll ask GPT-4 to classify each recipe along 5 dimensions:\n",
-    " - has_non_fish_meat\n",
-    " - requires_oven\n",
-    " - requires_stove\n",
-    " - cook_time_over_30_mins\n",
-    " - main_dish\n",
-    "\n",
-    "That looks like a pretty random list, but there's actually an important unifying thread: I'm looking for meals that my pescatarian brother/co-founder can make in his kitchen-less, near-window-less basement apartment in San Francisco! (If you haven't tried to get an apartment in SF you probably think I'm joking 😂.)\n",
-    "\n",
-    "I'll use [OpenPipe](https://github.com/openpipe/openpipe) to track the API calls and form a training dataset. To follow along you'll need to create a free OpenPipe account, then copy your API key from https://app.openpipe.ai/project/settings into a file called `.env`. You can see an example in [./.env.example](./.env.example)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Classifying first recipe:\n",
-      "------------------\n",
-      "{'has_non_fish_meat': False, 'requires_oven': False, 'requires_stove': True, 'cook_time_over_30_mins': True, 'main_dish': True}\n"
-     ]
-    }
-   ],
-   "source": [
-    "from openpipe import openai, configure_openpipe\n",
-    "import json\n",
-    "import os\n",
-    "import dotenv\n",
-    "\n",
-    "# Use `dotenv` to load the contents of the `.env` file into the environment\n",
-    "dotenv.load_dotenv()\n",
-    "\n",
-    "# Configure OpenPipe using the API key from the environment\n",
-    "configure_openpipe(api_key=os.environ[\"OPENPIPE_API_KEY\"])\n",
-    "\n",
-    "# Configure OpenAI using the API key from the environment\n",
-    "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n",
-    "\n",
-    "\n",
-    "def classify_recipe(recipe: str):\n",
-    "    completion = openai.ChatCompletion.create(\n",
-    "        model=\"gpt-4\",\n",
-    "        messages=[\n",
-    "            {\n",
-    "                \"role\": \"system\",\n",
-    "                \"content\": \"Your goal is to classify a recipe along several dimensions. Pay attention to the instructions.\",\n",
-    "            },\n",
-    "            {\n",
-    "                \"role\": \"user\",\n",
-    "                \"content\": recipe,\n",
-    "            },\n",
-    "        ],\n",
-    "        functions=[\n",
-    "            {\n",
-    "                \"name\": \"classify\",\n",
-    "                \"parameters\": {\n",
-    "                    \"type\": \"object\",\n",
-    "                    \"properties\": {\n",
-    "                        \"has_non_fish_meat\": {\n",
-    "                            \"type\": \"boolean\",\n",
-    "                            \"description\": \"True if the recipe contains any meat or meat products (eg. chicken broth) besides fish\",\n",
-    "                        },\n",
-    "                        \"requires_oven\": {\n",
-    "                            \"type\": \"boolean\",\n",
-    "                            \"description\": \"True if the recipe requires an oven\",\n",
-    "                        },\n",
-    "                        \"requires_stove\": {\n",
-    "                            \"type\": \"boolean\",\n",
-    "                            \"description\": \"True if the recipe requires a stove\",\n",
-    "                        },\n",
-    "                        \"cook_time_over_30_mins\": {\n",
-    "                            \"type\": \"boolean\",\n",
-    "                            \"description\": \"True if the recipe takes over 30 minutes to prepare and cook, including waiting time\",\n",
-    "                        },\n",
-    "                        \"main_dish\": {\n",
-    "                            \"type\": \"boolean\",\n",
-    "                            \"description\": \"True if the recipe can be served as a main dish\",\n",
-    "                        },\n",
-    "                    },\n",
-    "                    \"required\": [\n",
-    "                        \"has_non_fish_meat\",\n",
-    "                        \"requires_oven\",\n",
-    "                        \"requires_stove\",\n",
-    "                        \"cook_time_over_30_mins\",\n",
-    "                        \"main_dish\",\n",
-    "                    ],\n",
-    "                },\n",
-    "            }\n",
-    "        ],\n",
-    "        function_call={\n",
-    "            \"name\": \"classify\",\n",
-    "        },\n",
-    "        openpipe={\"tags\": {\"prompt_id\": \"classify-recipe\"}, \"cache\": True},\n",
-    "    )\n",
-    "    return json.loads(completion.choices[0].message.function_call.arguments)\n",
-    "\n",
-    "\n",
-    "print(\"Classifying first recipe:\\n------------------\")\n",
-    "print(classify_recipe(recipes[\"recipe\"][0]))\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "That's working, so I'll go ahead and classify all 5000 recipes with GPT-4. Using GPT-4 for this is slowwww and costs about $40. The model I'm fine-tuning will be much faster -- we'll see if we can make it as good!"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Classifying recipe 0/5000: Shrimp Creole\n",
-      "Classifying recipe 100/5000: Spoon Bread\n",
-      "Classifying recipe 200/5000: Quadrangle Grille'S Pumpkin-Walnut Cheesecake\n",
-      "Classifying recipe 300/5000: Broccoli Casserole\n",
-      "Classifying recipe 400/5000: Paal Payasam (3-Ingredient Rice Pudding)\n",
-      "Classifying recipe 500/5000: Dirt Dessert\n",
-      "Classifying recipe 600/5000: Dolma, Stuffed Dried Peppers And Eggplants\n",
-      "Classifying recipe 700/5000: Party Pecan Pies\n",
-      "Classifying recipe 800/5000: Pie Crust\n",
-      "Classifying recipe 900/5000: Russian Dressing(Salad Dressing)  \n",
-      "Classifying recipe 1000/5000: O'Brien Potatoes\n",
-      "Classifying recipe 1100/5000: Monster Cookies\n",
-      "Classifying recipe 1200/5000: Striped Fruit Pops\n",
-      "Classifying recipe 1300/5000: Cute Heart-Shaped Fried Egg\n",
-      "Classifying recipe 1400/5000: Steak Marinade\n",
-      "Classifying recipe 1500/5000: Bbq Sauce For Fish Recipe\n",
-      "Classifying recipe 1600/5000: Barbecue Ranch Salad\n",
-      "Classifying recipe 1700/5000: White Fudge\n",
-      "Classifying recipe 1800/5000: Seaton Chocolate Chip Cookies\n",
-      "Classifying recipe 1900/5000: Beef Stroganoff\n",
-      "Classifying recipe 2000/5000: Lemon Delight\n",
-      "Classifying recipe 2100/5000: Cream Cheese Chicken Chili\n",
-      "Classifying recipe 2200/5000: Bean Salad\n",
-      "Classifying recipe 2300/5000: Green Beans Almondine\n",
-      "Classifying recipe 2400/5000: Radish-And-Avocado Salad\n",
-      "Classifying recipe 2500/5000: Salsa Rojo\n",
-      "Classifying recipe 2600/5000: Pepperoni Bread\n",
-      "Classifying recipe 2700/5000: Sabzi Polow\n",
-      "Classifying recipe 2800/5000: Italian Vegetable Pizzas\n",
-      "Classifying recipe 2900/5000: Hot Fudge Sauce, Soda Shop Style\n",
-      "Classifying recipe 3000/5000: Meatball Soup With Vegetables And Brown Rice\n",
-      "Classifying recipe 3100/5000: Herbed Potatoes And Onions\n",
-      "Classifying recipe 3200/5000: Apple Crunch Pie (2 Extra Servings)\n",
-      "Classifying recipe 3300/5000: Pineapple-Orange Punch\n",
-      "Classifying recipe 3400/5000: Turkey Veggie Burgers With Avocado Mayo\n",
-      "Classifying recipe 3500/5000: Pear & Goat Cheese Salad\n",
-      "Classifying recipe 3600/5000: Triple Chocolate Cookies\n",
-      "Classifying recipe 3700/5000: Strawberry Banana Yogurt Pops\n",
-      "Classifying recipe 3800/5000: Chicken Croquettes\n",
-      "Classifying recipe 3900/5000: Mushroom Casserole\n",
-      "Classifying recipe 4000/5000: Vegetarian Summer Roll\n",
-      "Classifying recipe 4100/5000: Prune Cake\n",
-      "Classifying recipe 4200/5000: Strawberry Sorbet\n",
-      "Classifying recipe 4300/5000: Lemonade Chicken\n",
-      "Classifying recipe 4400/5000: Crock-Pot Vegetarian Chili\n",
-      "Classifying recipe 4500/5000: Grandma Dickrell'S Molasses Cake - 1936\n",
-      "Classifying recipe 4600/5000: Creamed Corn Casserole\n",
-      "Classifying recipe 4700/5000: Homemade Croutons\n",
-      "Classifying recipe 4800/5000: Potatoes With Leeks And Gruyere\n",
-      "Classifying recipe 4900/5000: Chocolate Oatmeal Cookie\n"
-     ]
-    }
-   ],
-   "source": [
-    "for i, recipe in enumerate(recipes[\"recipe\"]):\n",
-    "    if i % 100 == 0:\n",
-    "        recipe_title = recipe.split(\"\\n\")[0]\n",
-    "        print(f\"Classifying recipe {i}/{len(recipes)}: {recipe_title}\")\n",
-    "    try:\n",
-    "        classify_recipe(recipe)\n",
-    "    except Exception as e:\n",
-    "        print(f\"Error classifying recipe {i}: {e}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Ok, now that my recipes are classified I'll download the training data. Just go to https://app.openpipe.ai/request-logs, select all the logs you created, and click \"Export\". The default 10% testing split is fine for this dataset size.\n",
-    "\n",
-    "I got two files from that: `train.jsonl` and `test.jsonl`. I moved both of them into this repository under `./data/`.\n",
-    "\n",
-    "Next up I'll train the model -- check out [./train.ipynb](./train.ipynb) for details!"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  },
-  "orig_nbformat": 4
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
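Note on the removed notebook above: `classify_recipe` returns the decoded function-call arguments without checking them against the schema it sends. A minimal sketch of such a check follows, assuming the five boolean fields listed in the notebook's `classify` schema. The helper name `parse_classification` is hypothetical and is not part of the OpenPipe or OpenAI clients.

```python
import json

# The five boolean fields declared in the notebook's `classify` function schema.
EXPECTED_KEYS = (
    "has_non_fish_meat",
    "requires_oven",
    "requires_stove",
    "cook_time_over_30_mins",
    "main_dish",
)


def parse_classification(arguments: str) -> dict:
    """Hypothetical helper: decode function-call arguments and validate them."""
    parsed = json.loads(arguments)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in EXPECTED_KEYS if key not in parsed]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    wrong_type = [key for key in EXPECTED_KEYS if not isinstance(parsed[key], bool)]
    if wrong_type:
        raise ValueError(f"non-boolean fields: {wrong_type}")
    # Return only the expected fields, in schema order.
    return {key: parsed[key] for key in EXPECTED_KEYS}


# The classification the notebook printed for the first recipe (Shrimp Creole).
example = json.dumps(
    {
        "has_non_fish_meat": False,
        "requires_oven": False,
        "requires_stove": True,
        "cook_time_over_30_mins": True,
        "main_dish": True,
    }
)
print(parse_classification(example))
```

Raising `ValueError` fits the notebook's loop, which already catches exceptions per recipe and prints them.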
--- a/examples/classify-recipes/train.ipynb
+++ b/examples/classify-recipes/train.ipynb
--- a/examples/classify-recipes/training-config.yaml
+++ b/examples/classify-recipes/training-config.yaml
@@ -1,73 +0,0 @@
-# This file is used by the training script in train.ipynb. You can read more about
-# the format and see more examples at https://github.com/OpenAccess-AI-Collective/axolotl.
-# One of the parameters you might want to play around with is `num_epochs`: if you have a
-# smaller dataset, increasing it can give better results.
-
-base_model: meta-llama/Llama-2-7b-hf
-base_model_config: meta-llama/Llama-2-7b-hf
-model_type: LlamaForCausalLM
-tokenizer_type: LlamaTokenizer
-is_llama_derived_model: true
-
-load_in_8bit: true
-load_in_4bit: false
-strict: false
-
-datasets:
-  - path: ./data/train.jsonl
-    type: alpaca_instruct.load_no_prompt
-dataset_prepared_path: ./data/last_run_prepared
-val_set_size: 0.05
-output_dir: ./models/run1
-
-sequence_len: 4096
-sample_packing: true
-
-adapter: lora
-lora_model_dir:
-lora_r: 32
-lora_alpha: 16
-lora_dropout: 0.05
-lora_target_linear: true
-lora_fan_in_fan_out:
-
-# This will report stats from your training run to https://wandb.ai/. If you don't want to create a wandb account you can comment this section out.
-wandb_project: classify-recipes
-wandb_entity:
-wandb_watch:
-wandb_run_id: run1
-wandb_log_model:
-
-gradient_accumulation_steps: 4
-micro_batch_size: 2
-num_epochs: 5
-optimizer: adamw_bnb_8bit
-lr_scheduler: cosine
-learning_rate: 0.0002
-
-train_on_inputs: false
-group_by_length: false
-bf16: true
-fp16: false
-tf32: false
-
-gradient_checkpointing: true
-early_stopping_patience:
-resume_from_checkpoint:
-local_rank:
-logging_steps: 1
-xformers_attention:
-flash_attention: true
-
-warmup_steps: 10
-eval_steps: 20
-save_steps: 60
-debug:
-deepspeed:
-weight_decay: 0.0
-fsdp:
-fsdp_config:
-special_tokens:
-  bos_token: "<s>"
-  eos_token: "</s>"
-  unk_token: "<unk>"
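Note on the removed config above: `micro_batch_size` and `gradient_accumulation_steps` together determine how many sequences contribute to each optimizer step. A rough sketch of that arithmetic follows, using the values from the config. The `num_gpus` parameter is an assumption, since the config does not say how many devices were used, and the token figure is only an upper bound that holds when every packed sequence is full.

```python
# Values copied from the removed training-config.yaml.
config = {
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "sequence_len": 4096,
}


def effective_batch_size(cfg: dict, num_gpus: int = 1) -> int:
    """Sequences per optimizer step (num_gpus is an assumed parameter)."""
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_gpus


def max_tokens_per_step(cfg: dict, num_gpus: int = 1) -> int:
    """Upper bound on tokens per optimizer step, reached only if every sequence is full."""
    return effective_batch_size(cfg, num_gpus) * cfg["sequence_len"]


print(effective_batch_size(config))  # 2 * 4 * 1 = 8
print(max_tokens_per_step(config))   # 8 * 4096 = 32768
```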
--- a/examples/classify-recipes/utils.py
+++ b/examples/classify-recipes/utils.py
@@ -1,37 +0,0 @@
-import yaml
-from transformers import AutoModelForCausalLM, AutoTokenizer
-import torch
-from peft import PeftModel
-import os
-
-
-def merge_lora_model(config_file: str):
-    config = yaml.load(open(config_file, "r"), Loader=yaml.FullLoader)
-
-    base_model = config["base_model"]
-    lora_model = config["output_dir"]
-    merged_model = f"{lora_model}/merged"
-
-    if os.path.exists(merged_model):
-        print(f"Model {merged_model} already exists, skipping")
-        return merged_model
-
-    print("Loading base model")
-    model = AutoModelForCausalLM.from_pretrained(
-        base_model,
-        return_dict=True,
-        torch_dtype=torch.float16,
-    )
-
-    print("Loading PEFT model")
-    model = PeftModel.from_pretrained(model, lora_model)
-    print("Running merge_and_unload")
-    model = model.merge_and_unload()
-
-    tokenizer = AutoTokenizer.from_pretrained(base_model)
-
-    model.save_pretrained(merged_model)
-    tokenizer.save_pretrained(merged_model)
-    print(f"Model saved to {merged_model}")
-
-    return merged_model
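Note on the removed helper above: `merge_and_unload` folds the trained LoRA adapter back into the base weights so the merged model can be loaded without PEFT. The toy sketch below illustrates the standard LoRA merge rule, `W' = W + (lora_alpha / r) * B @ A`, on plain Python lists. It is not PEFT's implementation, and the function names and matrices are made up for illustration. The example uses `lora_alpha=1` and `r=2`, which gives the same 0.5 scaling as the config's `lora_alpha: 16` and `lora_r: 32`.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, lora_alpha, r):
    """Toy LoRA merge: weight + (lora_alpha / r) * (lora_b @ lora_a)."""
    scaling = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Illustrative 2x2 matrices (not real model weights).
weight = [[1, 0], [0, 1]]
lora_b = [[1, 0], [0, 2]]
lora_a = [[2, 0], [0, 4]]

merged = merge_lora(weight, lora_a, lora_b, lora_alpha=1, r=2)
print(merged)  # [[2.0, 0.0], [0.0, 5.0]]
```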
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -80,9 +80,6 @@ importers:
      '@vercel/og':
        specifier: ^0.5.9
        version: 0.5.9
-      archiver:
-        specifier: ^6.0.0
-        version: 6.0.0
      ast-types:
        specifier: ^0.14.2
        version: 0.14.2
@@ -174,10 +171,7 @@ importers:
        specifier: 4.0.0-beta.7
        version: 4.0.0-beta.7(encoding@0.1.13)
      openpipe:
-        specifier: ^0.3.0
-        version: 0.3.0
-      openpipe-dev:
-        specifier: workspace:^
+        specifier: workspace:*
        version: link:../client-libs/typescript
      pg:
        specifier: ^8.11.2
@@ -239,9 +233,6 @@ importers:
      socket.io-client:
        specifier: ^4.7.1
        version: 4.7.1
-      stream-buffers:
-        specifier: ^3.0.2
-        version: 3.0.2
      superjson:
        specifier: 1.12.2
        version: 1.12.2
@@ -273,9 +264,6 @@ importers:
      '@openapi-contrib/openapi-schema-to-json-schema':
        specifier: ^4.0.5
        version: 4.0.5
-      '@types/archiver':
-        specifier: ^5.3.2
-        version: 5.3.2
      '@types/babel__core':
        specifier: ^7.20.1
        version: 7.20.1
@@ -324,9 +312,6 @@ importers:
      '@types/react-syntax-highlighter':
        specifier: ^15.5.7
        version: 15.5.7
-      '@types/stream-buffers':
-        specifier: ^3.0.4
-        version: 3.0.4
      '@types/uuid':
        specifier: ^9.0.2
        version: 9.0.2
@@ -2965,12 +2950,6 @@ packages:
    resolution: {integrity: sha512-+Wt0NFAeflVSNiUnHIDNN3C8jP7XIRmYrcgJ6IsAnm0lK4p/FkpCpeu1aig5qxrgZx30PHNDLZ/3FttVSEW2aQ==}
    dev: false

-  /@types/archiver@5.3.2:
-    resolution: {integrity: sha512-IctHreBuWE5dvBDz/0WeKtyVKVRs4h75IblxOACL92wU66v+HGAfEYAOyXkOFphvRJMhuXdI9huDXpX0FC6lCw==}
-    dependencies:
-      '@types/readdir-glob': 1.1.1
-    dev: true
-
  /@types/babel__core@7.20.1:
    resolution: {integrity: sha512-aACu/U/omhdk15O4Nfb+fHgH/z3QsfQzpnvRZhYhThms83ZnAOZz7zZAWO7mn2yyNQaA4xTO8GLK3uqFU4bYYw==}
    dependencies:
@@ -3286,12 +3265,6 @@ packages:
      '@types/scheduler': 0.16.3
      csstype: 3.1.2

-  /@types/readdir-glob@1.1.1:
-    resolution: {integrity: sha512-ImM6TmoF8bgOwvehGviEj3tRdRBbQujr1N+0ypaln/GWjaerOB26jb93vsRHmdMtvVQZQebOlqt2HROark87mQ==}
-    dependencies:
-      '@types/node': 20.4.10
-    dev: true
-
  /@types/request@2.48.8:
    resolution: {integrity: sha512-whjk1EDJPcAR2kYHRbFl/lKeeKYTi05A15K9bnLInCVroNDCtXce57xKdI0/rQaA3K+6q0eFyUBPmqfSndUZdQ==}
    dependencies:
@@ -3323,12 +3296,6 @@ packages:
      '@types/node': 20.4.10
    dev: true

-  /@types/stream-buffers@3.0.4:
-    resolution: {integrity: sha512-qU/K1tb2yUdhXkLIATzsIPwbtX6BpZk0l3dPW6xqWyhfzzM1ECaQ/8faEnu3CNraLiQ9LHyQQPBGp7N9Fbs25w==}
-    dependencies:
-      '@types/node': 20.4.10
-    dev: true
-
  /@types/tough-cookie@4.0.2:
    resolution: {integrity: sha512-Q5vtl1W5ue16D+nIaW8JWebSSraJVlK+EthKn7e7UcD4KWsaSJ8BqGPXNaPghgtcn/fhvrN17Tv8ksUsQpiplw==}
    dev: false
@@ -3732,51 +3699,6 @@ packages:
      picomatch: 2.3.1
    dev: false

-  /archiver-utils@2.1.0:
-    resolution: {integrity: sha512-bEL/yUb/fNNiNTuUz979Z0Yg5L+LzLxGJz8x79lYmR54fmTIb6ob/hNQgkQnIUDWIFjZVQwl9Xs356I6BAMHfw==}
-    engines: {node: '>= 6'}
-    dependencies:
-      glob: 7.2.3
-      graceful-fs: 4.2.11
-      lazystream: 1.0.1
-      lodash.defaults: 4.2.0
-      lodash.difference: 4.5.0
-      lodash.flatten: 4.4.0
-      lodash.isplainobject: 4.0.6
-      lodash.union: 4.6.0
-      normalize-path: 3.0.0
-      readable-stream: 2.3.8
-    dev: false
-
-  /archiver-utils@3.0.3:
-    resolution: {integrity: sha512-fXzpEZTKgBJMWy0eUT0/332CAQnJ27OJd7sGcvNZzxS2Yzg7iITivMhXOm+zUTO4vT8ZqlPCqiaLPmB8qWhWRA==}
-    engines: {node: '>= 10'}
-    dependencies:
-      glob: 7.2.3
-      graceful-fs: 4.2.11
-      lazystream: 1.0.1
-      lodash.defaults: 4.2.0
-      lodash.difference: 4.5.0
-      lodash.flatten: 4.4.0
-      lodash.isplainobject: 4.0.6
-      lodash.union: 4.6.0
-      normalize-path: 3.0.0
-      readable-stream: 3.6.2
-    dev: false
-
-  /archiver@6.0.0:
-    resolution: {integrity: sha512-EPGa+bYaxaMiCT8DCbEDqFz8IjeBSExrJzyUOJx2FBkFJ/OZzJuso3lMSk901M50gMqXxTQcumlGajOFlXhVhw==}
-    engines: {node: '>= 12.0.0'}
-    dependencies:
-      archiver-utils: 3.0.3
-      async: 3.2.4
-      buffer-crc32: 0.2.13
-      readable-stream: 3.6.2
-      readdir-glob: 1.1.3
-      tar-stream: 2.2.0
-      zip-stream: 4.1.0
-    dev: false
-
  /argparse@2.0.1:
    resolution: {integrity: sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q==}

@@ -3915,10 +3837,6 @@ packages:
      tslib: 2.6.1
    dev: false

-  /async@3.2.4:
-    resolution: {integrity: sha512-iAB+JbDEGXhyIUavoDl9WP/Jj106Kz9DEn1DPgYw5ruDn0e3Wgi3sKFm55sASdGBNOQB8F59d9qQ7deqrHA8wQ==}
-    dev: false
-
  /asynckit@0.4.0:
    resolution: {integrity: sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q==}

@@ -4038,14 +3956,6 @@ packages:
    engines: {node: '>=8'}
    dev: false

-  /bl@4.1.0:
-    resolution: {integrity: sha512-1W07cM9gS6DcLperZfFSj+bWLtaPGSOHWhPiGzXmvVJbRLdG82sH/Kn8EtW1VqWVA54AKf2h5k5BbnIbwF3h6w==}
-    dependencies:
-      buffer: 5.7.1
-      inherits: 2.0.4
-      readable-stream: 3.6.2
-    dev: false
-
  /bluebird@3.7.2:
    resolution: {integrity: sha512-XpNj6GDQzdfW+r2Wnn7xiSAd7TM3jzkxGXBGTtWKuSXv1xUV+azxAm8jdWZN06QTQk+2N2XB9jRDkvbmQmcRtg==}
    dev: false
@@ -4098,10 +4008,6 @@ packages:
      node-releases: 2.0.13
      update-browserslist-db: 1.0.11(browserslist@4.21.10)

-  /buffer-crc32@0.2.13:
-    resolution: {integrity: sha512-VO9Ht/+p3SN7SKWqcrgEzjGbRSJYTx+Q1pTQC0wrWqHx0vpJraQ6GtHx8tvcg1rlK1byhU5gccxgOgj7B0TDkQ==}
-    dev: false
-
  /buffer-from@0.1.2:
    resolution: {integrity: sha512-RiWIenusJsmI2KcvqQABB83tLxCByE3upSP8QU3rJDMVFGPWLvPQJt/O1Su9moRWeH7d+Q2HYb68f6+v+tw2vg==}
    dev: false
@@ -4114,13 +4020,6 @@ packages:
    engines: {node: '>=4'}
    dev: false

-  /buffer@5.7.1:
-    resolution: {integrity: sha512-EHcyIPBQ4BSGlvjB16k5KgAJ27CIsHY/2JBmCRReo48y9rQ3MaUzWX3KVlBa4U7MyX02HdVj0K7C3WaB3ju7FQ==}
-    dependencies:
-      base64-js: 1.5.1
-      ieee754: 1.2.1
-    dev: false
-
  /busboy@1.6.0:
    resolution: {integrity: sha512-8SFQbg/0hQ9xy3UNTB0YEnsNBbWfhf7RtnzpL7TkBiTBRfrQ9Fxcnz7VJsleJpyp6rVLvXiuORqjlHi5q+PYuA==}
    engines: {node: '>=10.16.0'}
@@ -4347,16 +4246,6 @@ packages:
    resolution: {integrity: sha512-W9pAhw0ja1Edb5GVdIF1mjZw/ASI0AlShXM83UUGe2DVr5TdAPEA1OA8m/g8zWp9x6On7gqufY+FatDbC3MDQg==}
    dev: false

-  /compress-commons@4.1.1:
-    resolution: {integrity: sha512-QLdDLCKNV2dtoTorqgxngQCMA+gWXkM/Nwu7FpeBhk/RdkzimqC3jueb/FDmaZeXh+uby1jkBqE3xArsLBE5wQ==}
-    engines: {node: '>= 10'}
-    dependencies:
-      buffer-crc32: 0.2.13
-      crc32-stream: 4.0.2
-      normalize-path: 3.0.0
-      readable-stream: 3.6.2
-    dev: false
-
  /compute-scroll-into-view@1.0.20:
    resolution: {integrity: sha512-UCB0ioiyj8CRjtrvaceBLqqhZCVP+1B8+NWQhmdsm0VXOJtobBCf1dBQmebCCo34qZmUwZfIH2MZLqNHazrfjg==}
    dev: false
@@ -4464,20 +4353,6 @@ packages:
      yaml: 1.10.2
    dev: false

-  /crc-32@1.2.2:
-    resolution: {integrity: sha512-ROmzCKrTnOwybPcJApAA6WBWij23HVfGVNKqqrZpuyZOHqK2CwHSvpGuyt/UNNvaIjEd8X5IFGp4Mh+Ie1IHJQ==}
-    engines: {node: '>=0.8'}
-    hasBin: true
-    dev: false
-
-  /crc32-stream@4.0.2:
-    resolution: {integrity: sha512-DxFZ/Hk473b/muq1VJ///PMNLj0ZMnzye9thBpmjpJKCc5eMgB95aK8zCGrGfQ90cWo561Te6HK9D+j4KPdM6w==}
-    engines: {node: '>= 10'}
-    dependencies:
-      crc-32: 1.2.2
-      readable-stream: 3.6.2
-    dev: false
-
  /create-emotion@10.0.27:
    resolution: {integrity: sha512-fIK73w82HPPn/RsAij7+Zt8eCE8SptcJ3WoRMfxMtjteYxud8GDTKKld7MYwAX2TVhrw29uR1N/bVGxeStHILg==}
    dependencies:
@@ -4860,12 +4735,6 @@ packages:
      iconv-lite: 0.6.3
    dev: false

-  /end-of-stream@1.4.4:
-    resolution: {integrity: sha512-+uw1inIHVPQoaVuHzRyXd21icM+cnt4CzD5rW+NC1wjOUSTOs+Te7FOv7AhN7vS9x/oIyhLP5PR1H+phQAHu5Q==}
-    dependencies:
-      once: 1.4.0
-    dev: false
-
  /engine.io-client@6.5.2:
    resolution: {integrity: sha512-CQZqbrpEYnrpGqC07a9dJDz4gePZUgTPMU3NKJPSeQOyw27Tst4Pl3FemKoFGAlHzgZmKjoRmiJvbWfhCXUlIg==}
    dependencies:
@@ -5708,10 +5577,6 @@ packages:
    engines: {node: '>= 0.6'}
    dev: false

-  /fs-constants@1.0.0:
-    resolution: {integrity: sha512-y6OAwoSIf7FyjMIv94u+b5rdheZEjzR63GTyZJm5qh4Bi+2YgwLCcI/fPFZkL5PSixOt6ZNKm+w+Hfp/Bciwow==}
-    dev: false
-
  /fs-extra@11.1.1:
    resolution: {integrity: sha512-MGIE4HOvQCeUCzmlHs0vXpih4ysz4wg9qiSAu6cd42lVwPbTM1TjV7RusoyQqMmk/95gdQZX72u+YW+c3eEpFQ==}
    engines: {node: '>=14.14'}
@@ -6104,10 +5969,6 @@ packages:
      safer-buffer: 2.1.2
    dev: false

-  /ieee754@1.2.1:
-    resolution: {integrity: sha512-dcyqhDvX1C46lXZcVqCpK+FtMRQVdIMN6/Df5js2zouUsqG7I6sFxitIC+7KYK29KdXOLHdu9zL4sFnoVQnqaA==}
-    dev: false
-
  /ignore@5.2.4:
    resolution: {integrity: sha512-MAb38BcSbH0eHNBxn7ql2NH/kX33OkB3lZ1BNdh7ENeRChHTYsTvWrMubiIAMNS2llXEEgZ1MUOBtXChP3kaFQ==}
    engines: {node: '>= 4'}
@@ -6573,13 +6434,6 @@ packages:
      language-subtag-registry: 0.3.22
    dev: true

-  /lazystream@1.0.1:
-    resolution: {integrity: sha512-b94GiNHQNy6JNTrt5w6zNyffMrNkXZb3KTkCZJb2V1xaEGCk093vkZ2jk3tpaeP33/OiXC+WvK9AxUebnf5nbw==}
-    engines: {node: '>= 0.6.3'}
-    dependencies:
-      readable-stream: 2.3.8
-    dev: false
-
  /levn@0.4.1:
    resolution: {integrity: sha512-+bT2uH4E5LGE7h/n3evcS/sQlJXCpIp6ym8OWJ5eV6+67Dsql/LaaT7qJBAt2rzfoa/5QBGBhxDix1dMt2kQKQ==}
    engines: {node: '>= 0.8.0'}
@@ -6648,22 +6502,6 @@ packages:
    resolution: {integrity: sha512-/u14pXGviLaweY5JI0IUzgzF2J6Ne8INyzAZjImcryjgkZ+ebruBxy2/JaOOkTqScddcYtakjhSaeemV8lR0tA==}
    dev: false

-  /lodash.defaults@4.2.0:
-    resolution: {integrity: sha512-qjxPLHd3r5DnsdGacqOMU6pb/avJzdh9tFX2ymgoZE27BmjXrNy/y4LoaiTeAb+O3gL8AfpJGtqfX/ae2leYYQ==}
-    dev: false
-
-  /lodash.difference@4.5.0:
-    resolution: {integrity: sha512-dS2j+W26TQ7taQBGN8Lbbq04ssV3emRw4NY58WErlTO29pIqS0HmoT5aJ9+TUQ1N3G+JOZSji4eugsWwGp9yPA==}
-    dev: false
-
-  /lodash.flatten@4.4.0:
-    resolution: {integrity: sha512-C5N2Z3DgnnKr0LOpv/hKCgKdb7ZZwafIrsesve6lmzvZIRZRGaZ/l6Q8+2W7NaT+ZwO3fFlSCzCzrDCFdJfZ4g==}
-    dev: false
-
-  /lodash.isplainobject@4.0.6:
-    resolution: {integrity: sha512-oSXzaWypCMHkPC3NvBEaPHf0KsA5mvPrOPgQWDsbg8n7orZ290M0BmC/jgRZ4vcJ6DTAhjrsSYgdsW/F+MFOBA==}
-    dev: false
-
  /lodash.merge@4.6.2:
    resolution: {integrity: sha512-0KpjqXRVvrYyCsX1swR/XTK0va6VQkQM6MNo7PqW77ByjAhoARA8EfrP1N4+KlKj8YS0ZUCtRT/YUuhyYDujIQ==}
    dev: true
@@ -6672,10 +6510,6 @@ packages:
    resolution: {integrity: sha512-GK3g5RPZWTRSeLSpgP8Xhra+pnjBC56q9FZYe1d5RN3TJ35dbkGy3YqBSMbyCrlbi+CM9Z3Jk5yTL7RCsqboyQ==}
    dev: false

-  /lodash.union@4.6.0:
-    resolution: {integrity: sha512-c4pB2CdGrGdjMKYLA+XiRDO7Y0PRQbm/Gzg8qMj+QH+pFVAoTp5sBpO0odL3FjoPCGjK96p6qsP+yQoiLoOBcw==}
-    dev: false
-
  /lodash@4.17.21:
    resolution: {integrity: sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==}
    dev: false
@@ -7250,19 +7084,6 @@ packages:
      oidc-token-hash: 5.0.3
    dev: false

-  /openpipe@0.3.0:
-    resolution: {integrity: sha512-0hhk3Aq0kUxzvNb36vm9vssxMHYZvgJOg5wKeepRhVthW4ygBWftHZjR4PHyOtvjcRmnJ/v4h8xd/IINu5ypnQ==}
-    dependencies:
-      encoding: 0.1.13
-      form-data: 4.0.0
-      lodash-es: 4.17.21
-      node-fetch: 2.6.12(encoding@0.1.13)
-      openai-beta: /openai@4.0.0-beta.7(encoding@0.1.13)
-      openai-legacy: /openai@3.3.0
-    transitivePeerDependencies:
-      - debug
-    dev: false
-
  /optionator@0.9.3:
    resolution: {integrity: sha512-JjCoypp+jKn1ttEFExxhetCKeJt9zhAgAve5FXHixTvFDW/5aEktX9bufBKLRRMdU7bNtpLfcGu94B3cdEJgjg==}
    engines: {node: '>= 0.8.0'}
@@ -8034,21 +7855,6 @@ packages:
      util-deprecate: 1.0.2
    dev: false

-  /readable-stream@3.6.2:
-    resolution: {integrity: sha512-9u/sniCrY3D5WdsERHzHE4G2YCXqoG5FTHUiCC4SIbr6XcLZBY05ya9EKjYek9O5xOAwjGq+1JdGBAS7Q9ScoA==}
-    engines: {node: '>= 6'}
-    dependencies:
-      inherits: 2.0.4
-      string_decoder: 1.1.1
-      util-deprecate: 1.0.2
-    dev: false
-
-  /readdir-glob@1.1.3:
-    resolution: {integrity: sha512-v05I2k7xN8zXvPD9N+z/uhXPaj0sUFCe2rcWZIpBsqxfP7xXFQ0tipAd/wjj1YxWyWtUS5IDJpOG82JKt2EAVA==}
-    dependencies:
-      minimatch: 5.1.6
-    dev: false
-
  /readdirp@3.6.0:
    resolution: {integrity: sha512-hOS089on8RduqdbhvQ5Z37A0ESjsqz6qnRcffsMU3495FuTdqSm+7bhJ29JvIOsBDEEnan5DPu9t3To9VRlMzA==}
    engines: {node: '>=8.10.0'}
@@ -8504,11 +8310,6 @@ packages:
    resolution: {integrity: sha512-Rz6yejtVyWnVjC1RFvNmYL10kgjC49EOghxWn0RFqlCHGFpQx+Xe7yW3I4ceK1SGrWIGMjD5Kbue8W/udkbMJg==}
    dev: true

-  /stream-buffers@3.0.2:
-    resolution: {integrity: sha512-DQi1h8VEBA/lURbSwFtEHnSTb9s2/pwLEaFuNhXwy1Dx3Sa0lOuYT2yNUr4/j2fs8oCAMANtrZ5OrPZtyVs3MQ==}
-    engines: {node: '>= 0.10.0'}
-    dev: false
-
  /streamsearch@1.1.0:
    resolution: {integrity: sha512-Mcc5wHehp9aXz1ax6bZUyY5afg9u2rv5cqQI3mRrYkGC8rW2hM02jWuwjtL++LS5qinSyhj2QfLyNsuc+VsExg==}
    engines: {node: '>=10.0.0'}
@@ -8656,17 +8457,6 @@ packages:
    resolution: {integrity: sha512-GNzQvQTOIP6RyTfE2Qxb8ZVlNmw0n88vp1szwWRimP02mnTsx3Wtn5qRdqY9w2XduFNUgvOwhNnQsjwCp+kqaQ==}
    engines: {node: '>=6'}

-  /tar-stream@2.2.0:
-    resolution: {integrity: sha512-ujeqbceABgwMZxEJnk2HDY2DlnUZ+9oEcb1KzTVfYHio0UE6dG71n60d8D2I4qNvleWrrXpmjpt7vZeF1LnMZQ==}
-    engines: {node: '>=6'}
-    dependencies:
-      bl: 4.1.0
-      end-of-stream: 1.4.4
-      fs-constants: 1.0.0
-      inherits: 2.0.4
-      readable-stream: 3.6.2
-    dev: false
-
  /terser-webpack-plugin@5.3.9(webpack@5.88.2):
    resolution: {integrity: sha512-ZuXsqE07EcggTWQjXUj+Aot/OMcD0bMKGgF63f7UxYcu5/AJF53aIpK1YoP5xR9l6s/Hy2b+t1AM0bLNPRuhwA==}
    engines: {node: '>= 10.13.0'}
@@ -9551,15 +9341,6 @@ packages:
    resolution: {integrity: sha512-N+d4UJSJbt/R3wqY7Coqs5pcV0aUj2j9IaQ3rNj9bVCLld8tTGKRa2USARjnvZJWVx1NDmQev8EknoczaOQDOA==}
    dev: false

-  /zip-stream@4.1.0:
-    resolution: {integrity: sha512-zshzwQW7gG7hjpBlgeQP9RuyPGNxvJdzR8SUM3QhxCnLjWN2E7j3dOvpeDcQoETfHx0urRS7EtmVToql7YpU4A==}
-    engines: {node: '>= 10'}
-    dependencies:
-      archiver-utils: 2.1.0
-      compress-commons: 4.1.1
-      readable-stream: 3.6.2
-    dev: false
-
  /zod-to-json-schema@3.21.4(zod@3.21.4):
    resolution: {integrity: sha512-fjUZh4nQ1s6HMccgIeE0VP4QG/YRGPmyjO9sAh890aQKPEk3nqbfUXhMFaC+Dr5KvYBm8BCyvfpZf2jY9aGSsw==}
    peerDependencies: