ileixe/whisper-api

whisper-api: Asynchronous Speech-to-Text for Emacs

whisper-api is a lightweight asynchronous client for OpenAI’s Whisper API.

Installation

Make sure you have ffmpeg installed on your system. For example, on Ubuntu:

$ sudo apt update && sudo apt install ffmpeg

If you use straight.el and general.el, add this to your configuration:

(use-package whisper-api
  :straight (:host github :repo "ileixe/whisper-api")
  :config
  (setq whisper-api-openai-token "YOUR-API-KEY") ;; or store the key in your .authinfo file
  :general
  ;; Global bindings for normal, visual and insert states
  (:states '(normal visual insert)
           "C-c w" 'whisper-api-record-dwim
           "C-c c" 'whisper-api-cancel)
  ;; Bindings for the minibuffer-keymaps so that they also work inside the minibuffer.
  ;; (:keymaps '(minibuffer-local-map minibuffer-local-ns-map)
  ;;          "C-c w" 'whisper-api-record-dwim
  ;;          "C-c c" 'whisper-api-cancel)
  )
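If you do not use general.el, the same commands can be bound with use-package's built-in :bind keyword. This is a minimal sketch, assuming the two commands named above are interactive and that global bindings are acceptable in your setup; adjust the keys to taste.

```elisp
;; Sketch: same setup without general.el, using use-package's :bind.
(use-package whisper-api
  :straight (:host github :repo "ileixe/whisper-api")
  :bind (("C-c w" . whisper-api-record-dwim)
         ("C-c c" . whisper-api-cancel))
  :config
  ;; Or leave this unset and store the key in your .authinfo file.
  (setq whisper-api-openai-token "YOUR-API-KEY"))
```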

Features

  • Non-blocking recording and transcription.
  • Graceful cancellation of recording or pending API requests.
  • Supports reading the API key from the user variable or from authinfo.
  • Customizable ffmpeg command (default uses PulseAudio, but can be adjusted to ALSA or other sources).

Usage

  1. Set your API key, for example in your init file:

    (setq whisper-api-openai-token "sk-...")

    If it is not set, the package falls back to your .authinfo file:

    machine api.openai.com login apikey password YOUR_API_KEY

  2. To start recording, run:

    M-x whisper-api-record-dwim

    The temporary file path is displayed in the minibuffer.

  3. To stop recording (which sends the audio file to the API asynchronously), run:

    M-x whisper-api-record-dwim (again)

    The transcription is then inserted into your current buffer.

  4. To cancel an active recording or pending API call, run:

    M-x whisper-api-cancel
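To check that the .authinfo entry from step 1 can be found, you can query Emacs's built-in auth-source library directly. This is only a sketch: it assumes the package looks the key up with the same host and login as the entry shown above, which this README does not state explicitly. The function name my/whisper-api-key-found-p is hypothetical and not part of the package.

```elisp
(require 'auth-source)

;; Hypothetical helper, not part of whisper-api.
;; Reports whether an entry matching the .authinfo line from step 1 exists,
;; without echoing the key itself.
(defun my/whisper-api-key-found-p ()
  "Return \"found\" if auth-source has a matching OpenAI entry, else \"not found\"."
  (if (auth-source-pick-first-password :host "api.openai.com" :user "apikey")
      "found"
    "not found"))
```

Evaluate (my/whisper-api-key-found-p) with M-: to run the check. A result of "not found" suggests that the entry is missing or that the file is not listed in the auth-sources variable.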

Customization

  • whisper-api-ffmpeg-command

    The format string used to invoke ffmpeg. By default it is:

    ffmpeg -y -t %d -f pulse -i default -ar 16000 %s

    Modify it if you use a different input device. For ALSA, for instance:

    (setq whisper-api-ffmpeg-command "ffmpeg -y -t %d -f alsa -i default -ar 16000 %s")

  • whisper-api-base-url

    The base URL for the Whisper API. By default it is set to OpenAI’s endpoint:

    https://api.openai.com/v1/audio/transcriptions

    To use a local inference endpoint or alternative service, set this value accordingly. For example:

    (setq whisper-api-base-url "http://localhost:8000/my-endpoint")

  • The package prints debug messages to the *Messages* buffer, which can help diagnose problems with recording or the API call.
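To preview the command line that a customized whisper-api-ffmpeg-command would produce, you can expand the format string by hand with format. In ffmpeg's syntax, -t takes a duration and the last argument is the output file, so %d presumably receives a recording time limit in seconds and %s the temporary file path. The variable name my/ffmpeg-template and the values 30 and /tmp/recording.wav below are made up for illustration; they are not the package's defaults.

```elisp
;; Sketch: expand an ffmpeg format string by hand to inspect the result.
;; `my/ffmpeg-template' is a hypothetical variable, not part of the package.
(defvar my/ffmpeg-template
  "ffmpeg -y -t %d -f alsa -i default -ar 16000 %s")

;; 30 (seconds) and the output path are illustrative values only.
(format my/ffmpeg-template 30 "/tmp/recording.wav")
;; => "ffmpeg -y -t 30 -f alsa -i default -ar 16000 /tmp/recording.wav"
```

Running the resulting command in a shell is a quick way to confirm that the chosen input device works before handing the string to the package.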

Enjoy!
