mirror of
https://github.com/maudoin/ollama-voice.git
synced 2024-04-20 16:48:11 +03:00
96c3b0feb9707c54511e075cf7b9236c42321aea
# ollama-voice
Plugs Whisper audio transcription into a local Ollama server and outputs TTS audio responses.
This is just a simple combination of three tools in offline mode:
- Speech recognition: whisper running local models in offline mode
- Large Language Model: ollama running local models in offline mode
- Offline Text To Speech: pyttsx3
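The three tools above chain into a single voice turn: transcribe the recording, send the text to the Ollama server, and speak the reply. A minimal sketch, assuming Ollama's default local port 11434 and its `/api/generate` endpoint; the payload helper uses only the standard library, while the heavyweight whisper/pyttsx3 calls are imported lazily:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama port


def build_payload(prompt: str, model: str = "mistral") -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")


def ask_ollama(prompt: str, model: str = "mistral") -> str:
    """Send a prompt to the local Ollama server and return its reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def assistant_turn(audio_path: str) -> None:
    """One voice turn: transcribe, query the model, speak the answer."""
    import whisper   # heavyweight dependencies imported lazily
    import pyttsx3

    text = whisper.load_model("large-v3").transcribe(audio_path)["text"]
    reply = ask_ollama(text)
    engine = pyttsx3.init()
    engine.say(reply)
    engine.runAndWait()
```

`assistant_turn` is illustrative glue, not the project's actual code; the real `assistant.py` reads its settings from `assistant.yaml` instead of hard-coding them.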
## Prerequisites
The whisper dependencies are set up to run on the GPU, so install CUDA before running `pip install`.
## Running
Install ollama and ensure the server is started locally first (under WSL on Windows), e.g. `curl https://ollama.ai/install.sh | sh`
Download a whisper model and place it in the whisper subfolder (e.g. e5b1a55b89/large-v3.pt)
Configure the assistant.yaml settings. (It is set up to work in French with the ollama mistral model by default...)
Run assistant.py
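For the configuration step above, the exact schema of `assistant.yaml` is defined by the project itself; as a rough illustration only, the settings to adjust are of this kind (all key names below are hypothetical, not the real schema):

```yaml
# Hypothetical illustration only -- check assistant.yaml in the
# repository for the actual key names and structure.
model: mistral                       # Ollama model to query
language: fr                         # transcription / TTS language
whisperModel: whisper/large-v3.pt    # local Whisper checkpoint path
```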
## Todo
- Allow a full conversation with a "press to talk" function between requests
- Process ollama JSON responses in streaming mode to generate speech at the end of each sentence.
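The streaming item above could build on a small sentence buffer. In streaming mode Ollama emits one JSON object per line, each carrying a `response` token; a sketch of accumulating those tokens and releasing a sentence as soon as terminal punctuation arrives, so TTS can start speaking before the full reply is generated (function name is illustrative):

```python
import json
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentences_from_stream(lines: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from Ollama's line-delimited JSON stream.

    Each input line is a JSON object such as
    {"response": " Bonjour", "done": false}. Tokens are buffered until
    a sentence-ending character appears, then the sentence is released.
    """
    buffer = ""
    for line in lines:
        chunk = json.loads(line)
        buffer += chunk.get("response", "")
        while any(p in buffer for p in SENTENCE_END):
            # Split at the earliest sentence terminator present.
            cut = min(buffer.index(p) for p in SENTENCE_END if p in buffer) + 1
            sentence, buffer = buffer[:cut].strip(), buffer[cut:]
            if sentence:
                yield sentence
    if buffer.strip():  # flush any trailing partial sentence
        yield buffer.strip()
```

Each yielded sentence would then be handed to pyttsx3 while the rest of the reply is still streaming in.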