mirror of
https://github.com/exo-explore/exo.git
synced 2025-10-23 02:57:14 +03:00
update README with docs on exo run command
15
README.md
@@ -130,13 +130,13 @@ exo starts a ChatGPT-like WebUI (powered by [tinygrad tinychat](https://github.c
 For developers, exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000/v1/chat/completions. Examples with curl:
 
-#### Llama 3.1 8B:
+#### Llama 3.2 3B:
 
 ```sh
 curl http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
-   "model": "llama-3.1-8b",
+   "model": "llama-3.2-3b",
    "messages": [{"role": "user", "content": "What is the meaning of exo?"}],
    "temperature": 0.7
  }'
 ```
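The curl request in the hunk above maps one-to-one onto a JSON payload. As a minimal sketch, the same payload can be built and sent programmatically; note that `build_chat_request` and `send_chat_request` are illustrative helper names, not part of exo — only the endpoint URL and the payload fields come from the README.

```python
import json
from urllib import request

# Default exo ChatGPT-compatible endpoint (from the README).
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, prompt, temperature=0.7):
    """Build the JSON payload exo's chat completions endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def send_chat_request(payload):
    """POST the payload; requires a running exo node on localhost:8000."""
    req = request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("llama-3.2-3b", "What is the meaning of exo?")
print(json.dumps(payload, indent=2))
```

Calling `send_chat_request(payload)` then returns the parsed OpenAI-style response, assuming a node is running locally.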
@@ -201,6 +201,17 @@ Linux devices will automatically default to using the **tinygrad** inference eng
 
 You can read about tinygrad-specific env vars [here](https://docs.tinygrad.org/env_vars/). For example, you can configure tinygrad to use the cpu by specifying `CLANG=1`.
 
+### Example Usage on a single device with "exo run" command
+
+```sh
+exo run llama-3.2-3b
+```
+
+With a custom prompt:
+
+```sh
+exo run llama-3.2-3b --prompt "What is the meaning of exo?"
+```
 
 ## Debugging
 