# ell [WIP, unreleased, experimental]

> **Important:** This repository is currently pre-v1.0, highly experimental, and not yet packaged for general use. It contains numerous bugs, and the schemas are subject to frequent change. While we welcome contributions, please be aware that submitting pull requests at this stage is at your own discretion, as the codebase is rapidly evolving.

**JOIN THE DISCORD!** We are developing this in public and I want all hands on deck regarding design decisions! Join us at https://discord.gg/Yqyga6Qr
## What is ell?
ell is a lightweight, functional prompt engineering framework built on a few core principles:
### 1. Prompts are programs, not strings.

Prompts aren't just strings; they are all the code that leads to strings being sent to a language model. In ell, we think of one particular way of using a language model as a discrete subroutine called a **language model program** (LMP).
```python
import ell

@ell.lm(model="gpt-4o")
def hello(world: str):
    """You are a helpful assistant that writes in lower case."""  # System message
    return f"Say hello to {world[::-1]} with a poem."  # User message

hello("sama")
```
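To make the first principle concrete, here is a simplified, illustrative sketch of how a decorator like `@ell.lm` can treat a function as a language model program: the docstring becomes the system message and the return value becomes the user message. This is not ell's actual implementation, and the model call is stubbed out so the sketch only builds the request that would be sent.

```python
import functools

def toy_lm(model: str):
    """A simplified stand-in for a decorator like @ell.lm (illustration only)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            system_message = fn.__doc__         # the docstring becomes the system message
            user_message = fn(*args, **kwargs)  # the return value becomes the user message
            # A real implementation would call the model here; we just return
            # the request that would be sent.
            return {
                "model": model,
                "messages": [
                    {"role": "system", "content": system_message},
                    {"role": "user", "content": user_message},
                ],
            }
        return wrapper
    return decorator

@toy_lm(model="gpt-4o")
def hello(world: str):
    """You are a helpful assistant that writes in lower case."""
    return f"Say hello to {world[::-1]} with a poem."

request = hello("sama")
# request["messages"][1]["content"] == "Say hello to amas with a poem."
```

Because the prompt is ordinary code, everything that shapes the final string (here, `world[::-1]`) is part of the program and can be versioned and tracked.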
### 2. Prompts are actually parameters of a machine learning model.
- Add notes on serialization and lexical closures ...
### 3. Every call to a language model is worth its weight in credits.
...
## Installation

To install ell, follow these steps:

1. Clone the repository:

   ```
   git clone https://github.com/MadcowD/ell.git
   ```

2. Navigate to the cloned directory:

   ```
   cd ell
   ```

3. Install the package using pip:

   ```
   pip install .
   ```
This will install ell and its dependencies, including building the necessary frontend components for ell-studio.
Note: Ensure you have Node.js and npm installed on your system, as they are required for building the frontend components during the installation process.
## Using ell-studio

To visualize and track your prompts over the course of prompt engineering, you need to install a store into your project.
```python
from ell.stores.sql import SQLiteStore  # import path may differ in this pre-release

# The entry point to your program
if __name__ == "__main__":
    store = SQLiteStore('sqlite_example')  # <---- automatically stores all LMP invocations and version history in a SQLite database in ./sqlite_example
    store.install(autocommit=True)  # <---- installs the store into your project so that tracking happens and changelog commit messages are automatically generated using gpt-4o-mini
    greeting = your_language_model_program()
    print(greeting[::-1])
```
Now you can visualize your prompt engineering process just like TensorBoard by using ell.studio:

```
python3 -m ell.studio --storage-dir ./sqlite_example
```

Then visit http://localhost:8000 in your browser to see your prompt engineering process visualized.
## Contributing

To contribute, install ell in development mode using:

```
pip install -e .
```

To run ell-studio you must run the backend and the frontend separately. To start the frontend:

```
cd ell-studio
npm run dev
```

To start the backend:

```
python -m ell.studio --storage-dir ./sqlite_example --dev # the dev flag is important!
```

You can then visualize your prompts by visiting the frontend at http://localhost:3000.
## Todos

### Metrics
- Design the metrics functionality. (maybe link WandB lol)
### Bugs
- Fix weird rehashing issue of the main prompt whenever subprompt changes? Or just make commits more of a background deal.
- Trace not writing on first invocation.
- Rewrite lexical closures
- Serialize lstrs in the JSON dumps in Python the same way as the DB serializes them for the frontend (__lstr vs SerializedLstr) <- these are pydantic models and so we can reuse them
- Handle failure to serialize.
### Tests
- Add comprehensive unit tests for all core functionalities.
- Implement integration tests for end-to-end workflows.
- Optimize backend performance and run benchmarks.
### Trace Functionality
- Visualize trace in graph
- Implement Langsmith-style invocations and traces
- Improve UX on traces
- [o] Complete full trace implementation on invocation page
- Enhance UX around traces in dependency graphs
- [o] Implement argument pass-through functionality
- Working trace graph thing
### Version History
- Auto-document commit changes
- Implement version history diff view (possibly with automatic commit messages using GPT-4)
- Add interactive diff view
- Highlight changes in source when switching versions
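The diff-view items above can be prototyped with nothing more than the standard library. This sketch (illustrative only, not ell's implementation) renders a unified diff between two hypothetical versions of an LMP's source, the kind of view a version-history page could show:

```python
import difflib

# Two versions of an LMP's source (illustrative strings, not real history)
v1 = '''@ell.lm(model="gpt-4o")
def hello(world: str):
    """You are a helpful assistant that writes in lower case."""
    return f"Say hello to {world} with a poem."
'''
v2 = v1.replace("{world}", "{world[::-1]}")

# Unified diff, suitable for rendering as a side panel or inline view
diff = "\n".join(difflib.unified_diff(
    v1.splitlines(), v2.splitlines(),
    fromfile="hello@v1", tofile="hello@v2", lineterm=""))
print(diff)
```

An automatic commit message could then be generated by sending this diff to a small model, as the autocommit feature already does for the store.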
### LM Functionality
- Freezing and unfreezing LMPs
- Support multimodal inputs
- Implement function calling
- Add persistent chatting capability
- Integrate with various LLM providers
### Use Cases
- Develop RAG (Retrieval-Augmented Generation) example
- Implement embeddings functionality
- Create examples for tool use and agents
- Demonstrate Chain of Thought (CoT) reasoning
- Showcase optimization techniques
### Store
- Improve developer experience around logging mechanisms
- Drastically improve query runtime and add performance tests.
### DX (Developer Experience)
- Enhance UX for the LMP details page
- Improve layout for the different parts
- Add Dependency Graph on LMP page
- Implement VSCode-style file explorer
- Ensure and test Jupyter compatibility
- Continue UI/UX improvements for the visualization component
- Update LMP Details to be function-based for easier result viewing
- Implement easy navigation similar to VSCode (cmd+shift+p or spotlight)
- Optimize display of dependencies on prompt pages
- Explore alternative backend options
- Automatic API key management as a nice experience: if no key is specified, ask for it and store it in the ~/.ell/api_keys directory for the user's convenience.
- Need a CLI for managing API keys, etc.
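The API-key todo above sketches a prompt-and-cache flow that could look roughly like this (all names here are hypothetical, not ell's API; only the `~/.ell/api_keys` location comes from the todo):

```python
from pathlib import Path

def get_api_key(provider="openai", key_dir=None, prompt=input):
    """Hypothetical helper: return a stored API key, prompting for it and
    caching it under ~/.ell/api_keys on first use."""
    key_dir = Path(key_dir) if key_dir else Path.home() / ".ell" / "api_keys"
    key_file = key_dir / provider
    if key_file.exists():
        # Cached from a previous run; no prompt needed
        return key_file.read_text().strip()
    key = prompt(f"Enter your {provider} API key: ").strip()
    key_dir.mkdir(parents=True, exist_ok=True)
    key_file.write_text(key)
    key_file.chmod(0o600)  # keep the key readable only by the current user
    return key
```

Injecting `prompt` as a parameter keeps the helper testable and would let a future CLI reuse the same storage logic.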
### Packaging
- Write comprehensive documentation
- Prepare package for distribution
- Refine and organize examples
- Create production build for ell studio
- Draft contribution guidelines
### Miscellaneous

- Implement metric tracking system
- Add built-ins for classifiers (e.g., logit debiasing)
- Develop evaluator framework
- Create timeline visualization
- Implement comment system
- Add easy-to-use human evaluation tools
- Implement keyboard shortcuts for navigating invocations
- Ensure all components are linkable
- Add comparison mode for language models and double-blind setup for evaluations
- Integrate AI-assisted evaluations and metrics
- Consider developing as a VSCode plugin
- Implement organization system for multiple prompts (e.g., by module)
- Add live updates and new content indicators
- Update stores to use schema type hints and serialize to model dump in Flask (or consider switching to FastAPI)
