whisper : fixed Beam Search Strategy and exposed whisper_pcm_to_mel_phase_vocoder (#474)

Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>
sandrohanea
2023-02-08 08:01:47 +01:00
committed by GitHub
parent 4dd7119deb
commit 2bfe0ebc0f
2 changed files with 12 additions and 2 deletions

@@ -113,6 +113,16 @@ extern "C" {
    int n_samples,
    int n_threads);
// Convert raw PCM audio to a log mel spectrogram, applying a phase vocoder to speed up the audio x2.
// The resulting spectrogram is stored inside the provided whisper context.
// Returns 0 on success
WHISPER_API int whisper_pcm_to_mel_phase_vocoder(
    struct whisper_context* ctx,
    const float* samples,
    int n_samples,
    int n_threads);
// This can be used to set a custom log mel spectrogram inside the provided whisper context.
// Use this instead of whisper_pcm_to_mel() if you want to provide your own log mel spectrogram.
// n_mel must be 80