whisper-api is a lightweight asynchronous client for OpenAI’s Whisper API.
Make sure you have ffmpeg installed on your system. For example, on Ubuntu:
$ sudo apt update && sudo apt install ffmpeg
If you use straight.el and general.el, add this to your configuration:
(use-package whisper-api
  :straight (:host github :repo "ileixe/whisper-api")
  :config
  (setq whisper-api-openai-token "YOUR-API-KEY") ;; or use an .authinfo file instead
  :general
  ;; Global bindings for normal, visual and insert states
  (:states '(normal visual insert)
   "C-c w" 'whisper-api-record-dwim
   "C-c c" 'whisper-api-cancel)
  ;; Bindings for the minibuffer keymaps, so that they also work inside the minibuffer.
  ;; (:keymaps '(minibuffer-local-map minibuffer-local-ns-map)
  ;;  "C-c w" 'whisper-api-record-dwim
  ;;  "C-c c" 'whisper-api-cancel)
  )
- Non-blocking recording and transcription.
- Graceful cancellation of recording or pending API requests.
- Supports reading the API key from the user variable or from authinfo.
- Customizable ffmpeg command (default uses PulseAudio, but can be adjusted to ALSA or other sources).
- Set your API key, for example in your init file:
(setq whisper-api-openai-token "sk-...")
If it is not set, the package falls back to your .authinfo file:
machine api.openai.com login apikey password YOUR_API_KEY
- To start recording, run:
M-x whisper-api-record-dwim
The temporary file path is displayed in the minibuffer.
- To stop recording (which sends the audio file to the API asynchronously), run the same command again:
M-x whisper-api-record-dwim
The transcription is then inserted into your current buffer.
- To cancel an active recording or pending API call, run:
M-x whisper-api-cancel
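If you rely on the .authinfo fallback, it can help to confirm that Emacs can read the entry before you start recording. The sketch below uses the built-in auth-source library. The host and login values are taken from the example entry above; whether whisper-api performs its own lookup in exactly this way is an assumption.

```emacs-lisp
;; Sketch: check that auth-source can find the OpenAI entry.
;; Assumes the entry uses host "api.openai.com" and login "apikey",
;; as in the .authinfo example above.
(require 'auth-source)

(if (auth-source-pick-first-password :host "api.openai.com" :user "apikey")
    (message "Found an API key for api.openai.com")
  (message "No matching entry; check ~/.authinfo or ~/.authinfo.gpg"))
```

Evaluate it with M-: or in the scratch buffer. If no entry is found, check the spelling of the machine and login fields in the file.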
- whisper-api-ffmpeg-command
The format string used to invoke ffmpeg. By default it is:
ffmpeg -y -t %d -f pulse -i default -ar 16000 %s
Modify this if you use a different input device. For instance, to record through ALSA:
(setq whisper-api-ffmpeg-command "ffmpeg -y -t %d -f alsa -i default -ar 16000 %s")
- whisper-api-base-url
The base URL for the Whisper API. By default it is set to OpenAI’s endpoint:
https://api.openai.com/v1/audio/transcriptions
To use a local inference endpoint or alternative service, set this value accordingly. For example:
(setq whisper-api-base-url "http://localhost:8000/my-endpoint")
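To see how the whisper-api-ffmpeg-command format string is filled in, the sketch below expands the default value by hand with format. It assumes, based on the position of -t and the trailing %s, that %d receives the maximum recording length in seconds and %s the output file. The values 60 and /tmp/recording.wav are made up for illustration.

```emacs-lisp
;; Sketch: expand the default command template manually.
;; Assumption: %d is the maximum duration in seconds, %s the output file.
(format "ffmpeg -y -t %d -f pulse -i default -ar 16000 %s"
        60 "/tmp/recording.wav")
;; => "ffmpeg -y -t 60 -f pulse -i default -ar 16000 /tmp/recording.wav"
```

Running the resulting command in a terminal is a quick way to test an input device before assigning a new template to whisper-api-ffmpeg-command.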
- The package logs debug messages to the *Messages* buffer, which can help diagnose problems with the recording or the API call.
Enjoy!