Automatic CPU speed & power optimizer for Linux
Updated 2024-09-12 20:32:30 +03:00
To speed up LLM inference and enhance the LLM's perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
Updated 2024-01-23 02:05:47 +03:00
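As a rough illustration of the prompt-compression idea behind this entry (shrink the prompt so fewer tokens reach the model), here is a minimal, self-contained sketch. The `compress_prompt` function and its frequency-based scoring are hypothetical simplifications for illustration only, not this project's actual method or API, and it does not touch the KV-Cache side of the compression.

```python
# Toy prompt compression: keep only the most "informative" tokens so fewer
# tokens reach the model. Inverse word frequency is a stand-in scoring rule;
# a real compressor would use model-based importance scores.
from collections import Counter

def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    """Keep roughly the `keep_ratio` most informative tokens, preserving order."""
    tokens = prompt.split()
    freq = Counter(t.lower() for t in tokens)
    # Rarer tokens are treated as more informative than frequent ones.
    by_rarity = sorted(range(len(tokens)), key=lambda i: freq[tokens[i].lower()])
    keep = set(by_rarity[: max(1, int(len(tokens) * keep_ratio))])
    return " ".join(t for i, t in enumerate(tokens) if i in keep)

if __name__ == "__main__":
    long_prompt = "the quick brown fox jumps over the lazy dog " * 4
    print(compress_prompt(long_prompt, keep_ratio=0.25))
```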
GPU fan control for headless Linux
Updated 2023-05-03 01:05:48 +03:00