Update .gitignore and README for improved project structure and voice management

- Added voice model files and patterns to .gitignore to prevent unnecessary tracking.
- Enhanced README to include details about the new 'uv' package manager for faster dependency management.
- Clarified setup instructions, emphasizing automatic installation of required tools and voice files.
- Updated voice file organization in the documentation to reflect on-demand downloading, improving user understanding of voice availability.
Pierre Bruno
2025-01-24 20:05:52 +01:00
parent 29827d99d5
commit cb039d661e
2 changed files with 18 additions and 66 deletions

.gitignore vendored

@@ -33,4 +33,7 @@ ENV/
 # Project specific
 output*.wav
 *.pth
 *.onnx
+voices/
+voices/*.pt
+voices/**/*.pt

View File

@@ -13,6 +13,7 @@ The project has been updated with:
 - Interactive CLI interface
 - Cross-platform setup scripts
 - Web interface with Gradio
+- Fast package management with uv
 ## Features
@@ -40,7 +41,7 @@ The project has been updated with:
 - Git (for cloning the repository)
 - Internet connection (for initial model download)
 - FFmpeg (required for MP3/AAC conversion):
-- Windows: Automatically installed with pydub
+- Windows: Automatically installed during setup
 - Linux: `sudo apt-get install ffmpeg`
 - macOS: `brew install ffmpeg`
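Since FFmpeg is an external binary rather than a Python dependency, a quick check that it is actually on `PATH` can save a confusing MP3/AAC conversion failure later. A minimal sketch using only the standard library; the helper name `missing_tools` is hypothetical and not part of this repository:

```python
import shutil


def missing_tools(tools=("ffmpeg",), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


# Report anything that still needs to be installed by hand.
for tool in missing_tools():
    print(f"{tool} not found on PATH; see the install commands above")
```

Passing a longer tuple of tool names would cover other external binaries the same way.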
@@ -78,17 +79,16 @@ pydub # For audio format conversion
 ## Setup
+We use the modern `uv` package manager for faster and more reliable dependency management.
 ### Windows
 ```powershell
 # Clone the repository
 git clone https://github.com/PierrunoYT/Kokoro-TTS-Local.git
 cd Kokoro-TTS-Local
-# Run the setup script
+# Run the setup script (will install uv if not present)
 .\setup.ps1
-# Download initial model and voices
-python tts_demo.py --list-voices
 ```
 ### Linux/macOS
@@ -97,58 +97,16 @@ python tts_demo.py --list-voices
 git clone https://github.com/PierrunoYT/Kokoro-TTS-Local.git
 cd Kokoro-TTS-Local
-# Run the setup script
+# Run the setup script (will install uv if not present)
 chmod +x setup.sh
 ./setup.sh
-# Install FFmpeg (if needed)
-# Linux:
-sudo apt-get install ffmpeg
-# macOS:
-brew install ffmpeg
-# Download initial model and voices
-python tts_demo.py --list-voices
``` ```
-### Manual Setup
-If you prefer to set up manually:
-1. Create a virtual environment:
-```bash
-# Windows
-python -m venv venv
-.\venv\Scripts\activate
-# Linux/macOS
-python3 -m venv venv
-source venv/bin/activate
-```
-2. Install dependencies:
-```bash
-python -m pip install --upgrade pip
-pip install -r requirements.txt
-```
-3. Install system dependencies:
-```bash
-# Windows
-# FFmpeg is automatically installed with pydub
-# Linux
-sudo apt-get update
-sudo apt-get install espeak-ng ffmpeg
-# macOS
-brew install espeak ffmpeg
-```
-4. Download initial model and voices:
-```bash
-# This will download the model and voices from Hugging Face
-python tts_demo.py --list-voices
-```
+The setup scripts will:
+1. Install the `uv` package manager if not present
+2. Create a virtual environment
+3. Install all dependencies using `uv`
+4. Install system requirements (espeak-ng, FFmpeg)
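`setup.ps1` and `setup.sh` are not part of this diff, so the following is only a rough Python rendering of steps 1 to 3 listed for the setup scripts. It assumes uv's `uv venv` and `uv pip install -r` subcommands, installs uv from PyPI when it is missing, and reuses the `requirements.txt` name from the removed manual instructions. `plan_setup` is a hypothetical helper, and `uv venv` creates `.venv` by default rather than the `venv/` directory shown in the project tree. Step 4 is left out because it is platform-specific.

```python
import shutil
import sys


def plan_setup(uv_path, requirements="requirements.txt"):
    """Return the setup commands, in order, as argument lists.

    `uv_path` is the result of looking up `uv` on PATH (None if absent).
    Nothing is executed here; the caller decides whether to run the plan.
    """
    commands = []
    if uv_path is None:
        # Step 1: install uv itself (assumption: via pip from PyPI).
        commands.append([sys.executable, "-m", "pip", "install", "uv"])
    # Step 2: create a virtual environment (uv defaults to ./.venv).
    commands.append(["uv", "venv"])
    # Step 3: install dependencies into the environment uv finds in ./.venv.
    commands.append(["uv", "pip", "install", "-r", requirements])
    return commands


# Print the plan instead of running it, so the sketch has no side effects.
for command in plan_setup(shutil.which("uv")):
    print(" ".join(command))
```

To actually execute the plan, each argument list could be passed to `subprocess.run(command, check=True)` from the repository root.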
 ## Usage
@@ -200,17 +158,8 @@ The script will:
 │ ├── output.wav # Default output file
 │ ├── output.mp3 # MP3 converted files
 │ └── output.aac # AAC converted files
-├── voices/ # Voice model files
-│ ├── af_bella/ # American Female - Bella voice
-│ ├── af_nicole/ # American Female - Nicole voice
-│ ├── af_sarah/ # American Female - Sarah voice
-│ ├── af_sky/ # American Female - Sky voice
-│ ├── am_adam/ # American Male - Adam voice
-│ ├── am_michael/ # American Male - Michael voice
-│ ├── bf_emma/ # British Female - Emma voice
-│ ├── bf_isabella/ # British Female - Isabella voice
-│ ├── bm_george/ # British Male - George voice
-│ └── bm_lewis/ # British Male - Lewis voice
+├── voices/ # Voice model files (downloaded on demand)
+│ └── ... # Voice files are downloaded when needed
 ├── venv/ # Python virtual environment
 ├── LICENSE # Apache 2.0 License file
 ├── README.md # Project documentation
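How the on-demand download is implemented is not visible in this diff. One plausible shape, assuming each voice is stored as `voices/<name>.pt` in the `hexgrad/Kokoro-82M` Hugging Face repository (a layout inferred from the new `.gitignore` patterns, not confirmed) and fetched with `hf_hub_download` from the `huggingface_hub` package; `ensure_voice` and `fetch_from_hub` are hypothetical names:

```python
from pathlib import Path

REPO_ID = "hexgrad/Kokoro-82M"  # repository named in the README


def fetch_from_hub(filename, target_dir):
    """Download one file from the Hub into target_dir, keeping its relative path."""
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=target_dir))


def ensure_voice(name, root=".", fetch=fetch_from_hub):
    """Return the local path of a voice file, downloading it only if absent."""
    relative = f"voices/{name}.pt"  # assumed layout, see note above
    local = Path(root) / relative
    if local.exists():
        return local
    return Path(fetch(relative, root))
```

With this shape, calling `ensure_voice("af_bella")` a second time would return the cached file without touching the network.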
@@ -227,7 +176,7 @@ The script will:
 The project uses the Kokoro-82M model from Hugging Face:
 - Repository: [hexgrad/Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M)
 - Model file: `kokoro-v0_19.pth`
-- Voice files: Located in the `voices/` directory
+- Voice files: Located in the `voices/` directory (downloaded automatically when needed)
 - Available voices:
 - American Female: `af_bella`, `af_nicole`, `af_sarah`, `af_sky`
 - American Male: `am_adam`, `am_michael`