# Serge - LLaMA made easy 🦙
Serge is a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!
- 🌐 SvelteKit frontend
- 💾 Redis for storing chat history & parameters
- ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings
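Under the hood, the API layer talks to the model through the llama.cpp Python bindings (`llama-cpp-python`). The snippet below is only a minimal sketch of what such a call looks like, not Serge's actual code; the weights path is a hypothetical example.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path; point this at a weight file inside the weights volume.
llm = Llama(model_path="weights/llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048)

output = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=128,
    stop=["Q:"],  # stop generating when the model starts a new question
)
print(output["choices"][0]["text"])
```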
## 🎥 Demo
## ⚡️ Quick start
### 🐳 Docker

```bash
docker run -d \
  --name serge \
  -v weights:/usr/src/app/weights \
  -v datadb:/data/db/ \
  -p 8008:8008 \
  ghcr.io/serge-chat/serge:latest
```
### 🐙 Docker Compose

```yaml
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
```
Then, just visit http://localhost:8008/. You can find the API documentation at http://localhost:8008/api/docs.
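Once the container is up, a quick way to verify that both endpoints respond is a small standard-library script like this one (the URLs are the ones listed above; this is just a smoke test, not part of Serge):

```python
import urllib.request

# Check that the web UI and the API docs both answer with HTTP 200.
for url in ("http://localhost:8008/", "http://localhost:8008/api/docs"):
    with urllib.request.urlopen(url, timeout=10) as resp:
        print(f"{url} -> {resp.status}")
```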
## 🖥️ Windows Setup
Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
## ☁️ Kubernetes & Docker Compose Setup
Instructions for setting up Serge on Kubernetes can be found in the wiki.
## 🧠 Supported Models
| Category | Models |
|---|---|
| Alpaca 🦙 | Alpaca-LoRA-65B, GPT4-Alpaca-LoRA-30B |
| Chronos 🌑 | Chronos-13B, Chronos-33B, Chronos-Hermes-13B |
| GPT4All 🌍 | GPT4All-13B |
| Koala 🐨 | Koala-7B, Koala-13B |
| LLaMA 🦙 | FinLLaMA-33B, LLaMA-Supercot-30B, LLaMA2 7B, LLaMA2 13B, LLaMA2 70B |
| Lazarus 💀 | Lazarus-30B |
| Nous 🧠 | Nous-Hermes-13B |
| OpenAssistant 🎙️ | OpenAssistant-30B |
| Orca 🐬 | Orca-Mini-v2-7B, Orca-Mini-v2-13B, OpenOrca-Preview1-13B |
| Samantha 👩 | Samantha-7B, Samantha-13B, Samantha-33B |
| Vicuna 🦙 | Stable-Vicuna-13B, Vicuna-CoT-7B, Vicuna-CoT-13B, Vicuna-v1.1-7B, Vicuna-v1.1-13B, VicUnlocked-30B, VicUnlocked-65B |
| Wizard 🧙 | Wizard-Mega-13B, WizardLM-Uncensored-7B, WizardLM-Uncensored-13B, WizardLM-Uncensored-30B, WizardCoder-Python-13B-V1.0 |
Additional weights can be added to the `serge_weights` volume using `docker cp`:

```bash
docker cp ./my_weight.bin serge:/usr/src/app/weights/
```
## ⚠️ Memory Usage
LLaMA will crash if you don't have enough available memory for the model:
| Model | Max RAM Required |
|---|---|
| 7B | 4.5GB |
| 7B-q2_K | 5.37GB |
| 7B-q3_K_L | 6.10GB |
| 7B-q4_1 | 6.71GB |
| 7B-q4_K_M | 6.58GB |
| 7B-q5_1 | 7.56GB |
| 7B-q5_K_M | 7.28GB |
| 7B-q6_K | 8.03GB |
| 7B-q8_0 | 9.66GB |
| 13B | 12GB |
| 13B-q2_K | 8.01GB |
| 13B-q3_K_L | 9.43GB |
| 13B-q4_1 | 10.64GB |
| 13B-q4_K_M | 10.37GB |
| 13B-q5_1 | 12.26GB |
| 13B-q5_K_M | 11.73GB |
| 13B-q6_K | 13.18GB |
| 13B-q8_0 | 16.33GB |
| 33B | 20GB |
| 33B-q2_K | 16.21GB |
| 33B-q3_K_L | 19.78GB |
| 33B-q4_1 | 22.83GB |
| 33B-q4_K_M | 22.12GB |
| 33B-q5_1 | 26.90GB |
| 33B-q5_K_M | 25.55GB |
| 33B-q6_K | 29.19GB |
| 33B-q8_0 | 37.06GB |
| 65B | 50GB |
| 65B-q2_K | 29.95GB |
| 65B-q3_K_L | 37.15GB |
| 65B-q4_1 | 43.31GB |
| 65B-q4_K_M | 41.85GB |
| 65B-q5_1 | 51.47GB |
| 65B-q5_K_M | 48.74GB |
| 65B-q6_K | 56.06GB |
| 65B-q8_0 | 71.87GB |
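As a rough pre-flight check against the table above, you can compare a model's listed requirement with the memory currently available on the host. Below is a sketch using the third-party `psutil` package; the 10.37 GB figure is just the 13B-q4_K_M row used as an example.

```python
import psutil  # pip install psutil

required_gb = 10.37  # e.g. the 13B-q4_K_M row from the table above
available_gb = psutil.virtual_memory().available / 1024**3

print(f"Available RAM: {available_gb:.2f} GB, required: {required_gb} GB")
if available_gb < required_gb:
    print("Not enough free memory; pick a smaller or more heavily quantized model.")
```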
## 💬 Support
Need help? Join our Discord
## ⭐️ Stargazers
## 🧾 License
Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License.
## 🤝 Contributing
If you discover a bug or have a feature idea, feel free to open an issue or PR.
To run Serge in development mode:
```bash
git clone https://github.com/serge-chat/serge.git
cd serge
docker compose -f docker-compose.dev.yml up -d --build
```