mirror of
https://github.com/YerbaPage/LongCodeZip.git
synced 2025-10-22 23:19:46 +03:00
update
21
LICENSE
Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Yuling Shi and contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
61
README.md
@@ -1,5 +1,16 @@
<div align="center">
<img src="assets/logo.png" alt="LongCodeZip Logo" width="200"/>

[arXiv 2510.00446](https://arxiv.org/abs/2510.00446)

[ASE 2025](https://conf.researchr.org/home/ase-2025)

[Python](https://www.python.org/)

[GitHub](https://github.com/YerbaPage/LongCodeZip)

[License: MIT](https://opensource.org/licenses/MIT)

</div>

# LongCodeZip
@@ -130,54 +141,4 @@ Navigate to the `long-code-completion` directory:
```bash
cd long-code-completion
bash run.sh
```

This evaluates LongCodeZip on long-context code completion tasks with various configurations including different target token limits, fine-grained compression ratios, and importance beta values.

**Key Parameters:**
- `--code_compressor_target_token`: Target token budget
- `--code_compressor_fine_ratio`: Fine-grained compression ratio
- `--importance_beta`: Importance weighting parameter
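
To make the three flags above concrete, here is a minimal sketch of how a script might declare them with `argparse`. Only the flag names come from this README; the types, default values, and the `build_parser` helper are assumptions for illustration and may differ from the project's actual scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three compression flags named in the README."""
    parser = argparse.ArgumentParser(
        description="Hypothetical LongCodeZip experiment options"
    )
    # Types and defaults below are illustrative assumptions, not project defaults.
    parser.add_argument("--code_compressor_target_token", type=int, default=2048,
                        help="Target token budget for the compressed context")
    parser.add_argument("--code_compressor_fine_ratio", type=float, default=0.5,
                        help="Fine-grained compression ratio")
    parser.add_argument("--importance_beta", type=float, default=0.5,
                        help="Importance weighting parameter")
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(
    ["--code_compressor_target_token", "4096", "--importance_beta", "0.1"]
)
print(vars(args))
```

Flags that are not passed fall back to the placeholder defaults, so a real run should set all three explicitly.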

### Code Summarization

Navigate to the `module-summarization` directory:

```bash
cd module-summarization
bash run.sh
```

This runs code summarization experiments with fine-grained compression and various beta values for importance weighting.

**Key Parameters:**
- `--code_compressor_target_token`: Target token budget
- `--code_compressor_fine_ratio`: Fine-grained compression ratio
- `--importance_beta`: Importance weighting parameter

## Configuration

Each task can be customized by modifying the respective `run.sh` file or by directly calling the main scripts with custom parameters. Key configuration options include:

- **Model Selection**: Compatible with various code LLMs (default: Qwen2.5-Coder-7B-Instruct)
- **Compression Ratios**: Adjustable compression levels for different use cases
- **Token Budgets**: Configurable target token limits
- **GPU Configuration**: Multi-GPU support for parallel experiments
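
One way to explore several configurations when calling a main script directly is to generate one command per parameter combination. The sketch below is only an illustration: the script name `main.py` and the value grids are hypothetical placeholders, and only the flag names are taken from this README.

```python
import itertools
import shlex
from typing import List

# Hypothetical value grids, chosen only for illustration.
TARGET_TOKENS = [2048, 4096]
FINE_RATIOS = [0.5, 0.8]
BETAS = [0.0, 0.5]


def build_commands(script: str = "main.py") -> List[str]:
    """Return one shell command per (token budget, fine ratio, beta) combination."""
    commands = []
    for tokens, ratio, beta in itertools.product(TARGET_TOKENS, FINE_RATIOS, BETAS):
        argv = [
            "python", script,
            "--code_compressor_target_token", str(tokens),
            "--code_compressor_fine_ratio", str(ratio),
            "--importance_beta", str(beta),
        ]
        # shlex.join quotes arguments safely for a POSIX shell.
        commands.append(shlex.join(argv))
    return commands


for command in build_commands():
    print(command)
```

The printed commands can be pasted into a shell or written into a copy of `run.sh`; substitute the real script path for the placeholder first.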

## Performance

LongCodeZip achieves up to a **5.6× compression ratio** without sacrificing task performance across code completion, summarization, and retrieval tasks. Even with a 0.5B Qwen model as the compressor, it remains competitive.
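
As a reading aid for the 5.6× figure, the snippet below assumes the common definition of compression ratio as original token count divided by compressed token count. This README does not spell out the definition, and the token counts used are illustrative numbers, not measurements from the paper.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Ratio of original to compressed context length (higher means more compression)."""
    if original_tokens <= 0 or compressed_tokens <= 0:
        raise ValueError("token counts must be positive")
    return original_tokens / compressed_tokens


# Illustrative numbers only: an 11,200-token context compressed to 2,000 tokens.
ratio = compression_ratio(original_tokens=11_200, compressed_tokens=2_000)
print(f"{ratio:.1f}x")  # prints "5.6x"
```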

## Citation

If you find this work useful, please cite:

```bibtex
@article{shi2025longcodezip,
  title={LongCodeZip: Compress Long Context for Code Language Models},
  author={Shi, Yuling and Qian, Yichun and Zhang, Hongyu and Shen, Beijun and Gu, Xiaodong},
  journal={arXiv preprint arXiv:2510.00446},
  year={2025},
  doi={10.48550/arXiv.2510.00446}
}
```