Commit Graph

22 Commits

Author SHA1 Message Date
Georgi Gerganov
72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encoder (almost 2x on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Topping1
50b5fe964c Update main.cpp 2022-10-09 23:35:10 -05:00
Georgi Gerganov
4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov
9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov
2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov
8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
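
The two subtitle formats listed above differ mainly in how timestamps are written: SRT separates seconds and milliseconds with a comma, while WebVTT uses a period. A minimal sketch of such a formatter follows, assuming timestamps are held as non-negative milliseconds. The name `format_timestamp` and its parameters are hypothetical and are not taken from the whisper.cpp source.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a non-negative millisecond count as
// HH:MM:SS,mmm (SRT style) or HH:MM:SS.mmm (WebVTT style).
std::string format_timestamp(std::int64_t ms, bool use_comma) {
    const long long hours   = ms / 3600000; ms %= 3600000;
    const long long minutes = ms / 60000;   ms %= 60000;
    const long long seconds = ms / 1000;    ms %= 1000;
    const long long millis  = ms;

    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld%c%03lld",
                  hours, minutes, seconds, use_comma ? ',' : '.', millis);
    return std::string(buf);
}
```

A cue line in either format is then two such timestamps joined by ` --> `.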
Georgi Gerganov
7787b878e1 ref #16, #22 : add "offset" argument
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00
Georgi Gerganov
700898e6ed ref #22 : add option to provide multiple input .wav files 2022-10-05 23:44:10 +03:00
Georgi Gerganov
ce1fe95902 wip : improve makefile 2022-10-05 23:03:46 +03:00
Артём Земляк
495b81b367 Fix: main get n_threads from cli 2022-10-05 09:47:48 +07:00
Артём Земляк
f007e186fe Fix: main get language from cli args 2022-10-05 09:24:53 +07:00
Georgi Gerganov
6814cc9b02 Improve result printing 2022-10-04 23:18:15 +03:00
Georgi Gerganov
eba33adadd Extend C-style API with full inference methods 2022-10-04 23:18:15 +03:00
Georgi Gerganov
6b77124e01 Initial C-style interface for whisper.cpp 2022-10-04 23:18:15 +03:00
Georgi Gerganov
77d929f603 Fix bug in FFT
The FFT routine does not work for odd N.
The solution is to add a DFT and use it when N is odd.
2022-10-02 17:46:21 +03:00
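
The idea in the FFT fix above can be illustrated with a small sketch: a recursive radix-2 FFT halves its input at each level, so it only works while the length is even, and any odd-length sub-problem is handed to a plain O(N^2) DFT. This is an illustrative sketch under those assumptions, not the code from the commit. The names `dft`, `fft`, and `cplx` are chosen here for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(N^2) discrete Fourier transform; valid for any length N.
std::vector<cplx> dft(const std::vector<cplx>& in) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * double(k) * double(t) / double(n);
            sum += in[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 Cooley-Tukey FFT. An odd-length input cannot be
// split into equal even/odd halves, so it falls back to dft().
std::vector<cplx> fft(const std::vector<cplx>& in) {
    const std::size_t n = in.size();
    if (n <= 1) return in;
    if (n % 2 == 1) return dft(in);

    std::vector<cplx> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = in[2 * i];
        odd[i]  = in[2 * i + 1];
    }
    const std::vector<cplx> fe = fft(even);
    const std::vector<cplx> fo = fft(odd);

    const double pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        // Twiddle factor applied to the odd-indexed half.
        const cplx tw = std::polar(1.0, -2.0 * pi * double(k) / double(n)) * fo[k];
        out[k]         = fe[k] + tw;
        out[k + n / 2] = fe[k] - tw;
    }
    return out;
}
```

For a length such as 12, the recursion splits 12 into 6 and then 3, and the length-3 pieces are evaluated by `dft`, so the result matches a direct DFT of the full input up to rounding error.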
Georgi Gerganov
6d654d192a Fix reading of stereo WAV files 2022-10-01 08:41:57 +03:00
Georgi Gerganov
15b49e8baf Bug fix
Longer prompts could cause out-of-bounds access
2022-09-30 20:37:29 +03:00
Georgi Gerganov
3bcdbdfc32 Reduce memory usage even more + better sampling
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes for too long without a timestamp token, we
  force one. This improves transcription for the large model
- Stereo support
- Add "micro-machines.wav" sample
2022-09-30 19:35:27 +03:00
Georgi Gerganov
5877c3578e ref #4 : added transcription timestamps
Can be turned off with "-nt" argument.
Performance has also improved.
2022-09-29 23:09:39 +03:00
Georgi Gerganov
f888c2373d Flash + language support (ref #2)
- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages
2022-09-28 21:07:32 +03:00
Georgi Gerganov
476182e439 Update README.md and simplify usage 2022-09-26 09:36:51 +03:00
Georgi Gerganov
b0a11594ae Initial release 2022-09-25 22:13:49 +03:00