alihan/LLMLingua: To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Mirror of https://github.com/microsoft/LLMLingua.git, synced 2024-01-23 02:05:46 +03:00

Latest commit: 3df35280a6 by Microsoft Open Source, "LICENSE committed", 2023-07-06 23:26:06 -07:00

| File | Last commit message | Commit date |
| --- | --- | --- |
| .gitignore | Initial commit | 2023-07-07 06:26:00 +00:00 |
| CODE_OF_CONDUCT.md | CODE_OF_CONDUCT.md committed | 2023-07-06 23:26:06 -07:00 |
| LICENSE | LICENSE committed | 2023-07-06 23:26:06 -07:00 |