Add option to speed up the audio tempo by x2 #143

ggerganov · 2022-11-12T19:20:34Z

This PR uses a Phase Vocoder to speed up the audio tempo by scaling down the frequencies in the frequency domain.
Use the -su or --speed-up command-line argument to enable it:

./main -m ./models/ggml-small.en.bin -f samples/gb0.wav -su

This reduces the computation in the Encoder by a factor of 2.
Transcription accuracy is degraded, but for slow to normal speech it still seems to be very good.
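
As a rough illustration of where the factor of 2 comes from, here is a small Python sketch. It assumes the Encoder runs once per 30-second window of audio (the window length Whisper models use), so halving the duration halves the number of Encoder passes. The function name and the 120-second duration are hypothetical and not taken from this PR.

```python
import math

# Assumption: the Encoder processes audio in fixed 30-second windows.
CHUNK_SECONDS = 30


def encoder_windows(duration_s, speed=1.0):
    """Approximate number of Encoder passes for a clip of duration_s seconds
    after its tempo has been multiplied by `speed`."""
    effective_duration = duration_s / speed
    return math.ceil(effective_duration / CHUNK_SECONDS)


# Hypothetical 120-second recording
normal = encoder_windows(120)           # 4 windows
sped_up = encoder_windows(120, 2.0)     # 2 windows
print(normal, sped_up)
```

The real window count also depends on how the decoder advances through the audio, so this is an estimate rather than an exact figure.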

I think this could be useful for real-time transcription, i.e. the "stream" example.

A similar result can be achieved with ffmpeg's atempo filter:

# speed-up tempo by factor of 2
ffmpeg -i samples/gb0.wav -filter_complex "[0:a]atempo=2.0[a]" -map "[a]" samples/gb0-fast.wav

./main -m ./models/ggml-small.en.bin -f samples/gb0-fast.wav
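
For readers unfamiliar with the technique, below is a minimal pure-Python sketch of a textbook phase vocoder that doubles the tempo while keeping the pitch. It is not the code merged in this PR: the function names, the frame size of 128, the hop of 32, and the naive DFT are all assumptions chosen to keep the example short and dependency-free.

```python
import cmath
import math


def dft_frame(x, start, window):
    """Naive DFT of one windowed frame (non-negative frequency bins only)."""
    n = len(window)
    seg = [x[start + i] * window[i] for i in range(n)]
    return [sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n // 2 + 1)]


def idft_frame(spec, n):
    """Inverse of dft_frame, returning a real frame of even length n."""
    out = []
    for i in range(n):
        acc = spec[0].real
        for k in range(1, n // 2):
            acc += 2 * (spec[k] * cmath.exp(2j * math.pi * k * i / n)).real
        acc += (spec[n // 2] * cmath.exp(1j * math.pi * i)).real
        out.append(acc / n)
    return out


def speed_up(x, rate=2, n=128, hop=32):
    """Time-compress x by an integer `rate` without changing its pitch."""
    # Periodic Hann window; at 75% overlap the squared windows sum to 1.5
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    n_frames = 1 + (len(x) - n) // hop
    frames = [dft_frame(x, f * hop, window) for f in range(n_frames)]
    bins = n // 2 + 1
    # Expected phase advance per hop for each bin's centre frequency
    omega = [2 * math.pi * hop * k / n for k in range(bins)]
    phase = [cmath.phase(c) for c in frames[0]]
    steps = range(0, n_frames - 1, rate)  # read every `rate`-th analysis frame
    out = [0.0] * (len(steps) * hop + n)
    for j, f in enumerate(steps):
        # Keep the analysed magnitude, substitute the accumulated phase
        spec = [cmath.rect(abs(frames[f][k]), phase[k]) for k in range(bins)]
        frame = idft_frame(spec, n)
        for i in range(n):  # overlap-add at the normal hop spacing
            out[j * hop + i] += window[i] * frame[i] / 1.5
        for k in range(bins):
            # Measured phase advance over one hop, wrapped to [-pi, pi]
            d = cmath.phase(frames[f + 1][k]) - cmath.phase(frames[f][k]) - omega[k]
            d -= 2 * math.pi * round(d / (2 * math.pi))
            phase[k] += omega[k] + d
    return out


def tone_energy(sig, freq, sr):
    """Magnitude of the correlation of sig with a tone at freq Hz."""
    return abs(sum(s * cmath.exp(-2j * math.pi * freq * i / sr)
                   for i, s in enumerate(sig)))


# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(2048)]
fast = speed_up(tone)
print(len(tone), len(fast))
```

The output is roughly half as long as the input, and its energy stays at 1000 Hz rather than moving to 2000 Hz as it would with plain resampling. A production implementation would use an FFT instead of the quadratic-time DFT shown here.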


ggerganov merged commit 83c742f into master Nov 13, 2022

ggerganov deleted the tempo branch November 13, 2022 14:25
