mirror of https://github.com/serge-chat/serge.git
synced 2024-01-15 09:32:12 +03:00
Updates to README.md (#414)
committed by GitHub
parent dfb98c6885
commit b1fb7009e7
@@ -0,0 +1,46 @@
.DS_Store
.github
.gitignore
.idea
.jekyll-cache
.jekyll-metadata
.sass-cache
tests
_releaser
_site
CONTRIBUTING.md
Dockerfile
docker-compose.yml
docker-compose.dev.yml
/vendor
.vscode/

**/node_modules/
**/dist
.git
npm-debug.log
.coverage
.coverage.*
.env
.aws

__pycache__
*.pyc
*.pyo
*.pyd
.Python
env
pip-log.txt
pip-delete-this-directory.txt
.tox
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.log
.git
.mypy_cache
.pytest_cache
.hypothesis
2
.github/workflows/ci.yml
vendored
@@ -6,6 +6,7 @@ on:
      - "main"
    paths-ignore:
      - "**.md"
      - LICENSE
      - "docker-compose.yml"
      - "docker-compose.dev.yml"
      - ".github/ISSUE_TEMPLATE/*.yml"
@@ -15,6 +16,7 @@ on:
      - "*"
    paths-ignore:
      - "**.md"
      - LICENSE
      - "docker-compose.yml"
      - "docker-compose.dev.yml"
      - ".github/ISSUE_TEMPLATE/*.yml"
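The `paths-ignore` lists in this workflow mean that a push or pull request skips CI only when every changed file matches one of the listed patterns. A rough Python sketch of that rule follows. The names `should_run` and `IGNORED` are hypothetical, and `fnmatch` only approximates GitHub's glob syntax (its `*` also crosses `/`), so this illustrates the rule and is not GitHub's actual matcher.

```python
from fnmatch import fnmatchcase

# Patterns copied from the paths-ignore lists in the workflow diff.
IGNORED = [
    "**.md",
    "LICENSE",
    "docker-compose.yml",
    "docker-compose.dev.yml",
    ".github/ISSUE_TEMPLATE/*.yml",
]


def should_run(changed_files, ignored=IGNORED):
    """Return True if at least one changed file is not covered by paths-ignore."""
    return any(
        not any(fnmatchcase(path, pattern) for pattern in ignored)
        for path in changed_files
    )
```

Under this approximation, a commit that touches only `README.md` and `LICENSE` would not trigger the workflow, while one that also touches a source file would.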
4
LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2023 Georgi Gerganov
Copyright (c) 2023 Nathan Sarrazin

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
SOFTWARE.
217
README.md
@@ -1,125 +1,130 @@
# Serge - LLaMA made easy 🦙
# 🦙 Serge - LLaMA made easy

[](https://discord.gg/62Hc6FEYQH)

A chat interface based on [llama.cpp](https://github.com/ggerganov/llama.cpp) for running Alpaca models. Entirely self-hosted, no API keys needed. Fits on 4GB of RAM and runs on the CPU.
Serge is a chat interface crafted with [llama.cpp](https://github.com/ggerganov/llama.cpp) for running Alpaca models. No API keys, entirely self-hosted!

- **SvelteKit** frontend
- **Redis** for storing chat history & parameters
- **FastAPI + langchain** for the API, wrapping calls to [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [python bindings](https://github.com/abetlen/llama-cpp-python)
🌐 **SvelteKit** frontend
💾 **Redis** for storing chat history & parameters
⚙️ **FastAPI + LangChain** for the API, wrapping calls to [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [python bindings](https://github.com/abetlen/llama-cpp-python)

[demo.webm](https://user-images.githubusercontent.com/25119303/226897188-914a6662-8c26-472c-96bd-f51fc020abf6.webm)
🎥 Demo: [demo.webm](https://user-images.githubusercontent.com/25119303/226897188-914a6662-8c26-472c-96bd-f51fc020abf6.webm)

## Getting started
## ⚡️ Quick start

Setting up Serge is very easy. Starting it up can be done in a single command:

```
🐳 Docker:
```bash
docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:main
    ghcr.io/serge-chat/serge:latest
```

Then just go to http://localhost:8008/ and you're good to go!

The API documentation can be found at http://localhost:8008/api/docs

#### Windows

Make sure you have docker desktop installed, WSL2 configured and enough free RAM to run models. (see below)

#### Kubernetes & docker compose

Setting up Serge on Kubernetes or docker compose can be found in the wiki: https://github.com/serge-chat/serge/wiki/Integrating-Serge-in-your-orchestration#kubernetes-example

## Models

Currently the following models are supported:

#### Alpaca
- Alpaca-LoRA-65B
- GPT4-Alpaca-LoRA-30B

#### GPT4All
- GPT4All-13B

#### Guanaco
- Guanaco-7B
- Guanaco-13B
- Guanaco-33B
- Guanaco-65B

#### Koala
- Koala-7B
- Koala-13B

#### Lazarus
- Lazarus-30B

#### Nous
- Nous-Hermes-13B

#### OpenAssistant
- OpenAssistant-30B

#### Samantha
- Samantha-7B
- Samantha-13B
- Samantha-33B

#### Stable
- Stable-Vicuna-13B

#### Vicuna
- Vicuna-CoT-7B
- Vicuna-CoT-13B
- Vicuna-v1.1-7B
- Vicuna-v1.1-13B

#### Wizard
- Wizard-Mega-13B
- Wizard-Vicuna-Uncensored-7B
- Wizard-Vicuna-Uncensored-13B
- Wizard-Vicuna-Uncensored-30B
- WizardLM-30B
- WizardLM-Uncensored-7B
- WizardLM-Uncensored-13B
- WizardLM-Uncensored-30B

If you have existing weights from another project you can add them to the `serge_weights` volume using `docker cp`.

### :warning: A note on _memory usage_

LLaMA will just crash if you don't have enough available memory for your model.

- 7B requires about 4.5GB of free RAM
- 7B-q6_K requires about 8.03 GB of free RAM
- 13B requires about 12GB free
- 13B-q6_K requires about 13.18 GB free
- 30B requires about 20GB free
- 30B-q6_K requires about 29.19 GB free

## Support

Feel free to join the discord if you need help with the setup: https://discord.gg/62Hc6FEYQH

## Contributing

Serge is always open for contributions! If you catch a bug or have a feature idea, feel free to open an issue or a PR.

If you want to run Serge in development mode (with hot-module reloading for svelte & autoreload for FastAPI) you can do so like this:
🐙 Docker Compose:
```yaml
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
```

Then, just visit http://localhost:8008/, You can find the API documentation at http://localhost:8008/api/docs
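
After starting the container, the web UI may take a moment to begin answering on port 8008. As a minimal sketch, assuming only that the service responds to a plain HTTP GET at the URL given in the README, a small polling helper could look like this. The names `is_up` and `wait_until_up` are hypothetical and not part of Serge.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if an HTTP GET to `url` gets a 2xx or 3xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_up(url, attempts=30, delay=2.0, probe=is_up, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `attempts` run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_up("http://localhost:8008/")` would then return `True` once the page is reachable, or `False` after roughly a minute with the default settings.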

## 🖥️ Windows Setup

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

## ☁️Kubernetes & Docker Compose Setup

Instructions for setting up Serge on Kubernetes can be found in the [wiki](https://github.com/serge-chat/serge/wiki/Integrating-Serge-in-your-orchestration#kubernetes-example).

## 🧠 Supported Models

We currently support the following models:

- Alpaca 🦙
  - Alpaca-LoRA-65B
  - GPT4-Alpaca-LoRA-30B
- GPT4All 🌍
  - GPT4All-13B
- Guanaco 🦙
  - Guanaco-7B
  - Guanaco-13B
  - Guanaco-33B
  - Guanaco-65B
- Koala 🐨
  - Koala-7B
  - Koala-13B
- Lazarus 💀
  - Lazarus-30B
- Nous 🧠
  - Nous-Hermes-13B
- OpenAssistant 🎙️
  - OpenAssistant-30B
- Samantha 👩
  - Samantha-7B
  - Samantha-13B
  - Samantha-33B
- Stable 🐎
  - Stable-Vicuna-13B
- Vicuna 🦙
  - Vicuna-CoT-7B
  - Vicuna-CoT-13B
  - Vicuna-v1.1-7B
  - Vicuna-v1.1-13B
- Wizard 🧙
  - Wizard-Mega-13B
  - Wizard-Vicuna-Uncensored-7B
  - Wizard-Vicuna-Uncensored-13B
  - Wizard-Vicuna-Uncensored-30B
  - WizardLM-30B
  - WizardLM-Uncensored-7B
  - WizardLM-Uncensored-13B
  - WizardLM-Uncensored-30B

Additional weights can be added to the `serge_weights` volume using `docker cp`:

```bash
docker cp ./my_weight.bin serge:/usr/src/app/weights/
```

## ⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

| Model    | RAM Required |
|----------|--------------|
| 7B       | 4.5GB        |
| 7B-q6_K  | 8.03GB       |
| 13B      | 12GB         |
| 13B-q6_K | 13.18GB      |
| 30B      | 20GB         |
| 30B-q6_K | 29.19GB      |
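
The table above can be turned into a quick pre-flight check before picking a model. This is only a sketch: the figures are copied from the table, and `RAM_REQUIRED_GB` and `models_that_fit` are hypothetical names, not part of Serge.

```python
# Approximate free-RAM requirements in GB, copied from the table above.
RAM_REQUIRED_GB = {
    "7B": 4.5,
    "7B-q6_K": 8.03,
    "13B": 12.0,
    "13B-q6_K": 13.18,
    "30B": 20.0,
    "30B-q6_K": 29.19,
}


def models_that_fit(free_gb, requirements=RAM_REQUIRED_GB):
    """Return the model sizes whose listed requirement is within `free_gb`."""
    return [name for name, needed in requirements.items() if needed <= free_gb]
```

For example, with 8 GB free only the plain 7B entry qualifies, since 7B-q6_K is listed at 8.03GB.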

## 💬 Support

Need help? Join our [Discord](https://discord.gg/62Hc6FEYQH)

## 🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

```bash
git clone https://github.com/serge-chat/serge.git
DOCKER_BUILDKIT=1 docker compose -f docker-compose.dev.yml up -d --build
```

You can test the production image with

```
DOCKER_BUILDKIT=1 docker compose up -d --build
```
```
@@ -1,4 +1,3 @@
version: "3.9"
services:
  serge:
    restart: on-failure

@@ -1,18 +1,14 @@
version: "3.9"
services:
  serge:
    restart: on-failure
    build:
      context: .
      dockerfile: Dockerfile
      target: release
    volumes:
      - datadb:/data/db
      - weights:/usr/src/app/weights/
      - /etc/localtime:/etc/localtime:ro
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - "8008:8008"
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  datadb:
  weights:
  weights:
  datadb: