mirror of https://github.com/abetlen/llama-cpp-python.git
llama_cpp server: add missing top_k param to CreateChatCompletionRequest
`llama.create_chat_completion` definitely has a `top_k` argument, but it's missing from `CreateChatCompletionRequest`. Decision: add it.
@@ -169,6 +169,7 @@ class CreateChatCompletionRequest(BaseModel):
     model: str = model_field
 
     # llama.cpp specific parameters
+    top_k: int = 40
     repeat_penalty: float = 1.1
 
     class Config:
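
For context, here is a minimal sketch (mine, not the server's actual module) of what the added field does. The surrounding fields are trimmed to the ones visible in the hunk, and `model_field` is replaced with a plain default. The point: pydantic ignores unknown keys by default, so before this commit a client-supplied `top_k` validated fine but was silently dropped and never reached `llama.create_chat_completion`.

```python
from pydantic import BaseModel

class CreateChatCompletionRequest(BaseModel):
    model: str = "llama-2"  # stand-in for the real model_field default

    # llama.cpp specific parameters
    top_k: int = 40          # the field this commit adds
    repeat_penalty: float = 1.1

# With the field declared, a client-supplied value is validated and kept;
# before this commit pydantic ignored the unknown key, so top_k silently
# fell back to llama.create_chat_completion's own default.
req = CreateChatCompletionRequest(top_k=20)
assert req.top_k == 20
```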