34 Commits

Author SHA1 Message Date
Pierre Bruno
cb039d661e Update .gitignore and README for improved project structure and voice management
- Added voice model files and patterns to .gitignore to prevent unnecessary tracking.
- Enhanced README to include details about the new 'uv' package manager for faster dependency management.
- Clarified setup instructions, emphasizing automatic installation of required tools and voice files.
- Updated voice file organization in the documentation to reflect on-demand downloading, improving user understanding of voice availability.
2025-01-24 20:05:52 +01:00
Pierre Bruno
29827d99d5 Remove deprecated voice files from the repository
- Deleted multiple voice files including 'af_bella.pt', 'af_nicole.pt', 'af_sarah.pt', 'af_sky.pt', 'af.pt', 'am_adam.pt', 'am_michael.pt', 'bf_emma.pt', 'bf_isabella.pt', 'bm_george.pt', and 'bm_lewis.pt'.
- This cleanup is part of the ongoing effort to streamline voice management and improve project organization.
2025-01-24 20:04:44 +01:00
Pierre Bruno
23f0ffb7da Enhance setup scripts for TTS environment initialization
- Updated setup.ps1 and setup.sh to check for and install the 'uv' package manager if not already present.
- Modified virtual environment creation to use 'uv' for consistency across platforms.
- Improved activation instructions for the virtual environment in both scripts.
- Added FFmpeg installation for Windows and system dependencies for Linux and macOS to ensure all necessary tools are available for TTS functionality.
- Streamlined dependency installation process to utilize 'uv' for package management.
2025-01-24 20:02:11 +01:00
Pierre Bruno
1fc7602e1b Refactor README to enhance project documentation and clarify file structure
- Streamlined the project structure section by removing redundant details and focusing on key files and their purposes.
- Updated the file structure to include new directories and files, such as the Gradio SSL certificate and various audio output formats.
- Clarified voice categories in the documentation, ensuring accurate representation of available voices.
- Improved overall readability and organization of the README to enhance user understanding of the project layout.
2025-01-22 10:48:23 +01:00
Pierre Bruno
69eb58927a Update README to correct voice categories from African to American
- Changed voice category names in the README from "African" to "American" for clarity and accuracy.
- Ensured that the list of available voices reflects the correct regional designations, enhancing user understanding of voice options.
2025-01-22 10:44:12 +01:00
Pierre Bruno
d79d897014 Update README to include detailed file structure and enhance project documentation
- Added a comprehensive file structure section to the README, outlining the organization of project files and directories.
- Improved clarity on the purpose of key files, such as the Gradio interface, model implementation, and setup scripts, enhancing user understanding of the project layout.
2025-01-21 14:22:08 +01:00
Pierre Bruno
df828f0409 Enhance TTS functionality and improve voice management
- Refactored the TTS generation process to initialize the model globally and load voices dynamically, improving efficiency and usability.
- Introduced a new load_and_validate_voice function to ensure requested voices exist before loading, enhancing error handling.
- Updated generate_tts_with_logs to provide real-time logging during speech generation, including phoneme processing and audio saving.
- Improved audio conversion process with better error handling and temporary file management.
- Set default voice to 'af_bella' in the Gradio interface for improved user experience.
2025-01-16 17:03:54 +01:00
Pierre Bruno
3ae6e74c57 Refactor speech generation process to utilize dynamic module loading
- Updated the generate_speech function to download and import the Kokoro module dynamically from Hugging Face, enhancing flexibility and maintainability.
- Improved error handling during speech generation to provide clearer feedback in case of failures.
2025-01-16 16:46:33 +01:00
Pierre Bruno
07d8ded530 Refactor voice management and enhance TTS demo functionality
- Improved voice file organization by storing them in a local 'voices' directory and ensuring its automatic creation if missing.
- Enhanced load_voice function to download missing voice files, defaulting to 'af_bella' for better usability.
- Updated command-line argument defaults in tts_demo.py to align with new voice management features.
- Enhanced README to clarify voice availability and usage instructions, improving user experience.
2025-01-16 16:41:30 +01:00
Pierre Bruno
49379b98e5 Refactor voice management and enhance TTS demo functionality
- Updated voice file retrieval to store voices in a local 'voices' directory, improving organization and accessibility.
- Implemented automatic creation of the voices directory if it doesn't exist, ensuring smoother user experience.
- Enhanced load_voice function to download missing voice files locally, defaulting to 'af_bella' for better usability.
- Adjusted tqdm import for improved compatibility with Windows consoles and configured it to prevent encoding issues.
- Updated command-line argument defaults in tts_demo.py to reflect changes in voice management.
2025-01-16 16:40:24 +01:00
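The directory creation and download fallback described in this commit can be sketched as follows. This is a minimal illustration, not the repository's load_voice: the name ensure_voice and the injectable fetch callable are hypothetical, and the real code downloads from Hugging Face rather than taking a callback. Only the 'voices' directory name and the 'af_bella' default come from the commit message.

```python
from pathlib import Path
from typing import Callable, Optional

DEFAULT_VOICE = "af_bella"  # default voice named in the commit message


def ensure_voice(
    name: str = DEFAULT_VOICE,
    voices_dir: Path = Path("voices"),
    fetch: Optional[Callable[[str, Path], None]] = None,  # hypothetical downloader hook
) -> Path:
    """Return the local path of a voice file, fetching it if it is missing."""
    # Create the voices directory on first use instead of failing.
    voices_dir.mkdir(parents=True, exist_ok=True)
    target = voices_dir / f"{name}.pt"
    if not target.exists():
        if fetch is None:
            raise FileNotFoundError(f"Voice file not found: {target}")
        # Assumption: the downloader writes the voice data to `target`.
        fetch(name, target)
    return target
```

Passing the downloader in as a parameter keeps the sketch testable offline; a real implementation would call its Hugging Face download routine at that point.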
Pierre Bruno
ff9c696065 Refactor voice downloading logic and update README with available voices
- Enhanced the list_available_voices function to dynamically download voice files from Hugging Face based on the repository contents.
- Updated the fallback voice list to include additional voice options for better user experience.
- Revised README to clearly list available voices by category, improving clarity for users.
2025-01-16 16:35:30 +01:00
Pierre Bruno
65090f811a Enhance model building process and update README for Windows setup
- Added functionality to download additional required files, including config.json, during model initialization in models.py.
- Improved module import process for better clarity and organization.
- Included a test for the phonemizer to ensure functionality.
- Updated README to provide specific instructions for enabling Developer Mode or running Python as Administrator on Windows for optimal performance and symlink support.
2025-01-16 16:32:05 +01:00
Pierre Bruno
3b842a2d93 Refactor voice management and enhance README instructions
- Removed the get_default_voices_path function and replaced it with a list_available_voices function that calls the new list_available_voices function in models.py.
- Introduced get_voices_path in models.py to streamline voice file retrieval across platforms.
- Improved voice downloading logic to ensure availability of voice files from Hugging Face.
- Updated README to include instructions for downloading initial models and voices, enhancing user setup experience.
2025-01-16 16:30:04 +01:00
Pierre Bruno
817bc814a1 Update README to clarify web interface usage and port handling
- Revised instructions for launching the Gradio web interface, emphasizing the need to visit the correct URL.
- Added note regarding automatic port selection if port 7860 is in use, enhancing user experience and troubleshooting.
2025-01-16 16:27:25 +01:00
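The port fallback mentioned in this commit can be illustrated with a small standard-library sketch. It is an assumption about the general idea only: the function name find_free_port and the attempts limit are hypothetical, and the project may simply rely on the web framework's own port handling.

```python
import socket


def find_free_port(start: int = 7860, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first port at or above `start` that can be bound on `host`."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is already in use (or not permitted); try the next one.
                continue
            return port
    raise RuntimeError(f"No free port found in {start}-{start + attempts - 1}")
```

The probe socket is closed before the function returns, so another process could still claim the port before the server binds it; the sketch narrows the search rather than reserving the port.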
Pierre Bruno
f7753ccb62 Enhance Gradio interface and audio conversion capabilities
- Added audio format conversion functionality using pydub, supporting WAV, MP3, and AAC formats.
- Improved error handling for voice directory access and audio conversion processes.
- Updated README to reflect new web interface features and installation requirements, including FFmpeg.
- Enhanced the TTS generation function to utilize the correct Python interpreter across platforms.
- Documented new features in the README, including real-time progress monitoring and network sharing capabilities.
2025-01-16 16:19:31 +01:00
Pierre Bruno
49e19f0c51 Merge remote-tracking branch 'teslanaut/feature/gradio-interface' into gradio-ui 2025-01-16 16:07:54 +01:00
Pierre Bruno
1e875aba99 Enhance TTS demo with voice validation and progress indicators
- Introduced load_and_validate_voice function to ensure requested voice exists before loading.
- Added command-line options for model path, output file, and language code with default values.
- Implemented progress indicators using tqdm for model and voice loading, as well as speech generation.
- Updated default text handling and ensured proper cleanup of resources after execution.
2025-01-15 18:22:54 +01:00
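The validation step named in this commit might look roughly like the sketch below. It is not the repository's load_and_validate_voice: the name validate_voice is hypothetical, the check assumes voices are stored as .pt files in a single directory, and the actual loading of the voice data is omitted.

```python
from pathlib import Path


def validate_voice(name: str, voices_dir: Path) -> Path:
    """Return the path of the requested voice file, or raise if it does not exist."""
    path = voices_dir / f"{name}.pt"
    if not path.is_file():
        # List what is available so the error message is actionable.
        available = sorted(p.stem for p in voices_dir.glob("*.pt"))
        raise ValueError(
            f"Voice '{name}' not found. Available voices: {', '.join(available) or 'none'}"
        )
    return path
```

Failing before the load means a mistyped voice name produces a short message listing valid choices instead of a file-not-found traceback from the loader.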
Pip
1e61270e6c Added Gradio web UI and updated models.py to work with the UI. Tested on Linux Mint and macOS Sequoia; not tested on Windows. 2025-01-15 09:14:18 -08:00
Pierre Bruno
26ee1128a5 Add config.json download in build_model function 2025-01-15 09:18:46 +01:00
Pierre Bruno
34809a8073 Update README with cross-platform setup instructions and scripts 2025-01-14 16:20:48 +01:00
Pierre Bruno
36c5c7f85d Update README with comprehensive technical details and current status 2025-01-14 16:19:15 +01:00
Pierre Bruno
2871b2b47d Update README with voice listing feature and improved documentation 2025-01-14 16:18:10 +01:00
Pierre Bruno
c6c4b0b39d Add voice listing functionality
- Add list_available_voices() function to models.py
- Add --list-voices argument to tts_demo.py
- Enable users to view all available voice options
2025-01-14 16:14:48 +01:00
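A minimal sketch of how a --list-voices flag could be wired to a listing function is shown below. The names list_available_voices and --list-voices come from the commit message; the function body, the --voices-dir option, and the assumption that voices are local .pt files are illustrative only and are not taken from models.py or tts_demo.py.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def list_available_voices(voices_dir: Path) -> List[str]:
    """Return the sorted names of all .pt voice files in voices_dir."""
    if not voices_dir.is_dir():
        return []
    return sorted(p.stem for p in voices_dir.glob("*.pt"))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="TTS demo (sketch)")
    parser.add_argument("--list-voices", action="store_true",
                        help="print the available voices and exit")
    # Hypothetical option, added so the sketch can run without a fixed layout.
    parser.add_argument("--voices-dir", type=Path, default=Path("voices"))
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    if args.list_voices:
        for name in list_available_voices(args.voices_dir):
            print(name)
        return 0
    # Speech generation would go here in a real script.
    return 0
```

Calling `main(["--list-voices", "--voices-dir", "voices"])` prints one voice name per line and returns before any model is loaded.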
Pierre Bruno
829df3f1ba Add Apache 2.0 LICENSE with copyright notice 2025-01-14 16:08:22 +01:00
Pierre Bruno
79f2285a6a Add interactive CLI and command-line options for custom text input 2025-01-14 15:59:37 +01:00
Pierre Bruno
7d32fb6543 Update README with current project status and improved documentation 2025-01-14 15:58:05 +01:00
Pierre Bruno
2d81e92b33 Fix: Add plbert module dependency 2025-01-14 15:56:27 +01:00
Pierre Bruno
9ab9ad1f59 Fix: Reorder module imports to handle dependencies correctly 2025-01-14 15:55:42 +01:00
Pierre Bruno
40ce7ecb4f Fix: Resolve circular imports and improve module loading 2025-01-14 15:54:54 +01:00
Pierre Bruno
f0e8343a7d Update to use espeakng-loader and phonemizer-fork 2025-01-14 15:48:44 +01:00
Pierre Bruno
43fe839629 Update to use official Kokoro implementation 2025-01-14 15:46:50 +01:00
Pierre Bruno
6f38c34998 Update to use espeakng-loader for automatic espeak-ng installation 2025-01-14 15:43:16 +01:00
Pierre Bruno
6745624ca0 Update README: Add current status and call for help 2025-01-14 15:39:25 +01:00
Pierre Bruno
9eb71b699d Initial commit: Kokoro TTS Local implementation 2025-01-14 15:38:03 +01:00