Pierre Bruno c6c4b0b39d Add voice listing functionality
- Add list_available_voices() function to models.py
- Add --list-voices argument to tts_demo.py
- Enable users to view all available voice options
2025-01-14 16:14:48 +01:00

Kokoro TTS Local

A local implementation of the Kokoro Text-to-Speech model.

Current Status

⚠️ WORK IN PROGRESS ⚠️

The project is currently being updated to use better dependency management and improved module loading.

Features

  • Local text-to-speech synthesis using the Kokoro model
  • Automatic espeak-ng setup using espeakng-loader
  • Multiple voice support
  • Phoneme output support
  • Interactive CLI for custom text input

Dependencies

torch
phonemizer-fork
transformers
scipy
munch
soundfile
huggingface-hub
espeakng-loader
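
Before running the demo, it can help to confirm that the packages above are importable in the active environment. The following is a minimal sketch using only the standard library. The import names in REQUIRED_MODULES are assumptions, since a package's install name can differ from its import name (for example, phonemizer-fork is assumed here to be imported as phonemizer), and missing_modules is a hypothetical helper, not part of this project.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed under Dependencies.
# Install names and import names can differ, so adjust as needed.
REQUIRED_MODULES = [
    "torch",
    "phonemizer",        # assumed import name for phonemizer-fork
    "transformers",
    "scipy",
    "munch",
    "soundfile",
    "huggingface_hub",   # assumed import name for huggingface-hub
    "espeakng_loader",   # assumed import name for espeakng-loader
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Report anything that still needs to be installed.
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty list suggests the environment is ready; any names printed point at packages that still need to be installed.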

Setup

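  1. Create a virtual environment and activate it (the activation command shown is for Windows; on Linux or macOS, run source venv/bin/activate instead):
python -m venv venv
.\venv\Scripts\activate
  2. Install dependencies:
pip install -r requirements.txt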

Usage

Run the demo script with default text:

python tts_demo.py

Or specify your own text:

python tts_demo.py --text "Your custom text here"

You can also choose a different voice (pass --list-voices to see the available options):

python tts_demo.py --voice "af" --text "Custom text with specific voice"

If you run without any arguments, you'll be prompted to enter text interactively.
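
The command-line behaviour described above could be wired up roughly as follows. This is an illustrative sketch with argparse, not the actual contents of tts_demo.py. The flag names --text, --voice, and --list-voices come from this page; the default voice "af", the prompt wording, and the helper names build_parser and resolve_text are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags mentioned in this README."""
    parser = argparse.ArgumentParser(description="Kokoro TTS demo (illustrative sketch)")
    parser.add_argument("--text", help="Text to synthesize")
    # The default voice is an assumption made for this sketch.
    parser.add_argument("--voice", default="af", help="Voice to use")
    parser.add_argument("--list-voices", action="store_true",
                        help="List the available voices and exit")
    return parser


def resolve_text(args: argparse.Namespace, prompt=input) -> str:
    """Use --text when given; otherwise ask the user interactively."""
    if args.text is not None:
        return args.text
    return prompt("Enter text to synthesize: ")


# Example: parse an explicit argument list instead of sys.argv.
example_args = build_parser().parse_args(["--text", "Hello", "--voice", "af"])
print(resolve_text(example_args), example_args.voice)
```

Passing the prompt function as a parameter keeps the interactive path easy to exercise without a terminal.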

The script will:

  1. Download necessary model files from Hugging Face
  2. Set up espeak-ng automatically
  3. Generate speech from your text
  4. Save the output as 'output.wav'
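
The final step, writing audio to 'output.wav', can be sketched as below. The Dependencies section lists soundfile and scipy, which are the likelier tools in the real script; this sketch swaps in the standard-library wave module so that it runs without extra packages. The 24 kHz sample rate, the mono 16-bit format, and the save_wav helper are assumptions, and a sine tone stands in for the model's output.

```python
import math
import struct
import wave

SAMPLE_RATE = 24000  # assumed output rate; check the model's actual rate


def save_wav(samples, path="output.wav", sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    frames = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


# Stand-in for synthesized speech: one second of a 440 Hz tone.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
save_wav(tone)
```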

Project Structure

  • models.py: Core model loading and speech generation functionality
  • tts_demo.py: Demo script showing basic usage
  • requirements.txt: Project dependencies

Model Information

The project uses the Kokoro-82M model from Hugging Face:

  • Repository: hexgrad/Kokoro-82M
  • Model file: kokoro-v0_19.pth
  • Voice files: Located in the voices/ directory
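
For reference, fetching these files by hand might look like the sketch below, which uses hf_hub_download from the huggingface_hub package. The repository ID and model file name are taken from the list above. The voice file layout voices/<name>.pt and the helper names are assumptions, and the files may not be present at every revision of the upstream repository.

```python
REPO_ID = "hexgrad/Kokoro-82M"
MODEL_FILE = "kokoro-v0_19.pth"


def voice_filename(voice: str) -> str:
    """Build the path of a voice file inside the repository.

    Assumption: voices are stored as voices/<name>.pt.
    """
    return f"voices/{voice}.pt"


def fetch_model_files(voice: str = "af") -> dict:
    """Download the model and one voice file; return their local cache paths."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download

    return {
        "model": hf_hub_download(repo_id=REPO_ID, filename=MODEL_FILE),
        "voice": hf_hub_download(repo_id=REPO_ID, filename=voice_filename(voice)),
    }
```

Calling fetch_model_files() requires network access on first use; later calls reuse the local Hugging Face cache.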

Contributing

Feel free to contribute by:

  1. Opening issues for bugs or feature requests
  2. Submitting pull requests with improvements
  3. Helping with documentation

License

This project is licensed under the Apache License 2.0.
