packaging

2025-10-22 23:19:46 +03:00 · 2025-10-11 21:33:12 +08:00
parent 60201d365f
commit a391badfe1
37 changed files with 282 additions and 98 deletions
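
Note on the README hunk that follows: the usage example switches its import from `CodeCompressor` to `LongCodeZip`. This page does not show whether the old name is still exported by the package, so code that has to run against both revisions may want to resolve the class defensively. The sketch below is only an illustration under that assumption; `resolve_compressor_class` is a hypothetical helper, not part of the library.

```python
import importlib


def resolve_compressor_class(module_name="longcodezip",
                             candidates=("LongCodeZip", "CodeCompressor")):
    """Return the first attribute in `candidates` that `module_name` exports.

    Hypothetical helper: the class names come from the README diff, but
    whether both exist in any given revision is an assumption.
    Returns None if the module is missing or exports neither name.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Package not installed (or not importable) in this environment.
        return None
    for name in candidates:
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    return None


if __name__ == "__main__":
    cls = resolve_compressor_class()
    print("resolved:", cls.__name__ if cls is not None else None)
```

Putting the new name first means an updated install takes precedence, while older checkouts fall back to the previous class name.
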
--- a/README.md
+++ b/README.md
@@ -21,37 +21,20 @@ LongCodeZip introduces a two-stage code compression framework specifically desig

 The method is plug-and-play and can be integrated with existing code LLMs to achieve significant compression ratios while maintaining or improving task performance.

-## Repository Structure
-
-This repository contains implementations and experiments for three code-related tasks:
-
-```
-LongCodeZip/
-├── repo-qa/                   # Code Retrieval Task
-│   ├── main.py               # Main evaluation script
-│   ├── run.sh                # Experiment runner
-│   ├── code_compressor.py    # Core compression implementation
-│   ├── compute_score.py      # Evaluation metrics
-│   └── ...
-├── long-code-completion/      # Code Completion Task
-│   ├── main.py               # Main evaluation script
-│   ├── run.sh                # Experiment runner
-│   ├── code_compressor.py    # Core compression implementation
-│   ├── utils.py              # Utility functions
-│   └── ...
-├── module-summarization/      # Code Summarization Task
-│   ├── main.py               # Main evaluation script
-│   ├── run.sh                # Experiment runner
-│   ├── code_compressor.py    # Core compression implementation
-│   ├── utils.py              # Utility functions
-│   └── ...
-└── README.md
-```
-
 ## Installation

+You can install directly from the GitHub repository:
+
 ```bash
-pip install -r requirements.txt
+pip install git+https://github.com/YerbaPage/LongCodeZip.git
+```
+
+Or clone and install in development mode:
+
+```bash
+git clone https://github.com/YerbaPage/LongCodeZip.git
+cd LongCodeZip
+pip install -e .
 ```

 ## Quick Demo
@@ -62,36 +45,21 @@ We provide a simple demo (`demo.py`) to help you get started with LongCodeZip:
 python demo.py
 ```

-This demo showcases the core compression functionality by compressing a simple code snippet containing multiple functions (add, quick_sort, search_with_binary_search) based on a query about quick sort. The compressor will:
-1. Rank functions by relevance to the query
-2. Apply fine-grained compression to maximize information density
-3. Generate a compressed prompt suitable for code LLMs
-
-**Example output:**
-```python
-# Original: ~150 tokens
-# Compressed: ~64 tokens (target)
-# Selected: quick_sort function (most relevant to query)
-```
-
-## Core API Usage
-
-LongCodeZip provides a simple and powerful API for compressing long code contexts. Here's how to use it:
-
-### Basic Example
+## Basic Example

 ```python
-from longcodezip import CodeCompressor
+from longcodezip import LongCodeZip

 # Initialize the compressor
-compressor = CodeCompressor(model_name="Qwen/Qwen2.5-Coder-7B-Instruct")
+compressor = LongCodeZip(model_name="Qwen/Qwen2.5-Coder-7B-Instruct")

 # Compress code with a query
 result = compressor.compress_code_file(
-    code=your_code_string,
-    query="What does this function do?",
-    instruction="Answer the question based on the code.",
+    code=<your_code_string>,
+    query=<your_query>,
+    instruction=<your_instruction>,
    rate=0.5,  # Keep 50% of tokens
+    rank_only=False, # Set to True to only rank and select contexts without fine-grained compression
 )

 # Access compressed results
@@ -99,41 +67,6 @@ compressed_code = result['compressed_code']
 compressed_prompt = result['compressed_prompt']  # Full prompt with instruction
 compression_ratio = result['compression_ratio']
 ```
-## Usage
-
-### Quick Start
-
-Each task directory contains a `run.sh` script for easy experimentation. Simply navigate to the desired task directory and run:
-
-```bash
-cd <task_directory>
-bash run.sh
-```
-
-### Code Retrieval (RepoQA)
-
-Navigate to the `repo-qa` directory and run experiments with different compression ratios:
-
-```bash
-cd repo-qa
-bash run.sh
-```
-
-The script will evaluate LongCodeZip on the RepoQA dataset with compression ratios, running experiments in parallel on multiple GPUs.
-
-**Key Parameters:**
-- `--compression-ratio`: Controls the compression level
-- `--model`: Specifies the base LLM model
-- `--backend`: Backend for model inference (vllm)
-
-### Code Completion
-
-Navigate to the `long-code-completion` directory:
-
-```bash
-cd long-code-completion
-bash run.sh
-```

 ## References