mirror of https://github.com/ggerganov/whisper.cpp.git
synced 2023-11-04 02:52:44 +03:00
whisper : add option to speed up the audio tempo by x2
Use a Phase Vocoder to speed up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. The transcription accuracy is degraded, but for slow to normal speech it still seems to be very good. I think this can be useful for real-time transcription, i.e. the "stream" example.
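To make the "factor of 2" concrete, here is a rough back-of-the-envelope sketch in C. It is not code from this commit: the function names are hypothetical, and the constants (16 kHz input, a hop of 160 samples, a 30 s encoder window) are assumptions hard-coded for illustration. It also assumes the encoder windows simply tile the audio, which ignores any timestamp-based seeking.

```c
#include <assert.h>

/* Assumed constants for illustration only: 16 kHz audio, 10 ms hop,
 * 30 s encoder window. They are not taken from this diff. */
enum {
    SAMPLE_RATE   = 16000,
    HOP_LENGTH    = 160,
    CHUNK_SECONDS = 30
};

/* Hypothetical helper: number of spectrogram frames produced for n_samples.
 * Doubling the tempo halves the effective duration, so roughly half as many
 * frames are produced. */
static int mel_frames(int n_samples, int speed_up) {
    assert(n_samples >= 0);
    int effective = speed_up ? n_samples / 2 : n_samples;
    return effective / HOP_LENGTH;
}

/* Hypothetical helper: number of 30 s encoder windows needed to cover those
 * frames, rounded up. Each window corresponds to one encoder pass. */
static int encoder_windows(int n_samples, int speed_up) {
    const int frames_per_window = CHUNK_SECONDS * SAMPLE_RATE / HOP_LENGTH;
    return (mel_frames(n_samples, speed_up) + frames_per_window - 1) / frames_per_window;
}
```

Under these assumptions, two minutes of audio (1920000 samples) gives 12000 frames and 4 encoder windows without the speed-up, versus 6000 frames and 2 windows with it.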
@@ -202,6 +202,9 @@ extern "C" {
         float thold_ptsum; // timestamp token sum probability threshold (~0.01)
         int max_len; // max segment length in characters
 
+        // [EXPERIMENTAL] speed-up techniques
+        bool speed_up; // speed-up the audio by 2x using Phase Vocoder
+
         const char * language;
 
         struct {