alihan/Kokoro-TTS-Local

mirror of https://github.com/PierrunoYT/Kokoro-TTS-Local.git synced 2025-01-27 02:30:25 +03:00

Pierre Bruno 829df3f1ba Add Apache 2.0 LICENSE with copyright notice

2025-01-14 16:08:22 +01:00

.gitignore - Initial commit: Kokoro TTS Local implementation (2025-01-14 15:38:03 +01:00)
LICENSE - Add Apache 2.0 LICENSE with copyright notice (2025-01-14 16:08:22 +01:00)
models.py - Fix: Add plbert module dependency (2025-01-14 15:56:27 +01:00)
README.md - Add interactive CLI and command-line options for custom text input (2025-01-14 15:59:37 +01:00)
requirements.txt - Update to use espeakng-loader and phonemizer-fork (2025-01-14 15:48:44 +01:00)
tts_demo.py - Add interactive CLI and command-line options for custom text input (2025-01-14 15:59:37 +01:00)

README.md

Kokoro TTS Local

A local implementation of the Kokoro Text-to-Speech model.

Current Status

⚠️ WORK IN PROGRESS ⚠️

The project is currently being updated to use better dependency management and improved module loading.

Features

Local text-to-speech synthesis using the Kokoro model
Automatic espeak-ng setup using espeakng-loader
Multiple voice support
Phoneme output support
Interactive CLI for custom text input

Dependencies

torch
phonemizer-fork
transformers
scipy
munch
soundfile
huggingface-hub
espeakng-loader

Setup

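Create a virtual environment and activate it (the activation command shown is for Windows; on Linux or macOS, use source venv/bin/activate instead):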

python -m venv venv
.\venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt
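
After installing, it can help to confirm that every dependency is importable before running the demo. The following is a minimal sketch, not part of the repository. The mapping from pip package names to import names is an assumption (for example, that phonemizer-fork is imported as phonemizer and espeakng-loader as espeakng_loader), and missing_packages is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed mapping from pip package name to import name; the two can differ,
# so verify these against the installed packages.
PACKAGES = {
    "torch": "torch",
    "phonemizer-fork": "phonemizer",
    "transformers": "transformers",
    "scipy": "scipy",
    "munch": "munch",
    "soundfile": "soundfile",
    "huggingface-hub": "huggingface_hub",
    "espeakng-loader": "espeakng_loader",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks up each module without importing it, so it stays fast even for large packages such as torch.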

Usage

Run the demo script with default text:

python tts_demo.py

Or specify your own text:

python tts_demo.py --text "Your custom text here"

You can also choose a different voice:

python tts_demo.py --voice "af" --text "Custom text with specific voice"

If you run without any arguments, you'll be prompted to enter text interactively.

The script will:

Download necessary model files from Hugging Face
Set up espeak-ng automatically
Generate speech from your text
Save the output as 'output.wav'
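
The command-line behavior described above can be sketched with argparse. This is an illustration under stated assumptions, not the actual contents of tts_demo.py: the function names parse_args and resolve_text are hypothetical, and the default voice "af" is taken from the usage example rather than from the script itself.

```python
import argparse

# Assumption: "af" appears in the usage example; the script's real default may differ.
DEFAULT_VOICE = "af"


def parse_args(argv=None):
    """Parse the --text and --voice options described in the Usage section."""
    parser = argparse.ArgumentParser(description="Kokoro TTS demo (sketch)")
    parser.add_argument("--text", help="text to synthesize")
    parser.add_argument("--voice", default=DEFAULT_VOICE, help="voice name")
    return parser.parse_args(argv)


def resolve_text(args, prompt=input):
    """Use --text when it was given; otherwise ask the user interactively."""
    if args.text:
        return args.text
    return prompt("Enter text to synthesize: ").strip()
```

Passing the prompt function as a parameter keeps the interactive path easy to exercise without a terminal, for example resolve_text(parse_args([]), prompt=lambda _: "Hello").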

Project Structure

models.py: Core model loading and speech generation functionality
tts_demo.py: Demo script showing basic usage
requirements.txt: Project dependencies

Model Information

The project uses the Kokoro-82M model from Hugging Face:

Repository: hexgrad/Kokoro-82M
Model file: kokoro-v0_19.pth
Voice files: Located in the voices/ directory of the repository
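
For reference, the files listed above could be fetched with hf_hub_download from the huggingface-hub package. This is a rough sketch, not the code in models.py. The voices/<name>.pt naming scheme is an assumption, and voice_file and download_files are hypothetical helpers.

```python
REPO_ID = "hexgrad/Kokoro-82M"
MODEL_FILE = "kokoro-v0_19.pth"


def voice_file(voice):
    """Build the repository path for a voice file."""
    # Assumption: each voice is stored as voices/<name>.pt in the repository.
    return f"voices/{voice}.pt"


def download_files(voice="af"):
    """Download the model and one voice file; return their local cache paths."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name)
        for name in (MODEL_FILE, voice_file(voice))
    ]
```

hf_hub_download caches files locally, so repeated calls should not download the same file again.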

Contributing

Feel free to contribute by:

Opening issues for bugs or feature requests
Submitting pull requests with improvements
Helping with documentation

License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.