* adding code to fetch and convert devin's output for evaluation
* update README.md
* update code for fetching and processing devin's outputs
* update code for fetching and processing devin's outputs
* minimal docker sandbox
* make container_image an argument (falls back to ubuntu);
increase timeout to avoid returning too early for long-running commands
* add a minimal working (imperfect) example
* fix typo
* change default container name
* attempt to fix "Bad file descriptor" error
* handle ctrl+D
* add Python gitignore
* push sandbox to shared dockerhub for ease of use
* move codeact example into research folder
* add README for opendevin
* change container image name to opendevin dockerhub
* move folder; change example to a more general agent
* update Message and Role
* update docker sandbox to support mounting folder and switch to user with correct permission
* use host network mode
* handle errors when attrs are not set yet
* convert codeact agent into a compatible agent
* add workspace to gitignore
* make sure the agent interface adjustment works for langchain_agent
* move agent to langchains_agent
* remove old .env
* remove the old agent folder
* add preliminary version of Agent abstraction
* add preliminary version of the main.py
* merge controlloop and main into an Agent class
* add init
* fix json import
* fix missing arg
* get langchains_agent working after abstraction
* rename `research` to `agenthub`
* rename: rename research to agenthub
---------
Co-authored-by: huybery <huybery@gmail.com>
* Creating the folder for experiments, adding initial analysis of Devin's outputs on SWE-bench
* typo fixed
* update devin's evaluation outputs analysis
* remove data files from commit
* add logistics for managing the experiment folder in README
* Change folder naming and update logistics.
---------
Co-authored-by: Bowen Li <libowen.ne@gmail.com>
* initialize control loop
* add todo
* more todo
* add dockerignore
* add notes to prompt
* encourage llm to finish
* add debug env
* update prompts a bit
* fix task prompts
* add basic regression framework
* add hello-world regression case
* add hello-name test case
* fix workspace ignore
* document regression script
* add python-cli test case
* add default git config
* add help regression test
* add node rewrite test case
* add react-todo test case
* fix dockerfile
* add ability to run background commands
* add client-server test case
* update regression readme
* better support for background commands
* update tests
* fix bug in command removal