Releases: argmaxinc/WhisperKitAndroid

v0.3.2

02 Jun 17:24
8a6b737

Contributors:
@ZachNagengast @keith4ever @dylanangus @chen-argmax @flashno @bpkeene

What's Changed

From v0.3.0:

  • Performance improvements
  • Imports the standard HuggingFace tokenizer via C bindings from https://github.com/FL33TW00D/tokenizers-sys
  • Updated example app
    • Automatic model downloads from HuggingFace
    • Transcription from a file or from a microphone recording
    • Selectable compute unit: NPU, GPU, or CPU
    • Thank you to @Acs176 for kickstarting this initiative
  • Publishable Maven repository for Kotlin library
  • Code style rules with auto-formatting via make format

With v0.3.2:

  • Added multilingual model support with PerLayerKVDecoder, which handles diverse encoder output names, int64 token bindings, and timestamp postprocessing. KV-cache logic now lives in the decoder for improved modularity.

  • Various SDK and build improvements: conditional Bazel download, updated API with segments, improved model download/handling, version bumps (0.3.2), and README cleanup.

  • Example app published for open testing: https://play.google.com/store/apps/details?id=com.argmaxinc.whisperax

    • Demo: v0.3.2 example app video
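
The timestamp postprocessing mentioned above is not described in these notes. As a rough illustration only, the sketch below shows how Whisper-style timestamp tokens are commonly turned into segments. It assumes the reference Whisper convention that timestamp tokens sit at the top of the vocabulary and advance in 0.02 s steps. The names `Segment`, `split_segments`, and `timestamp_begin` are hypothetical and are not part of the WhisperKitAndroid API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption (reference Whisper convention, not taken from these notes):
# every token id >= timestamp_begin is a timestamp, one step = 0.02 s.
SECONDS_PER_TIMESTAMP_TOKEN = 0.02


@dataclass
class Segment:
    start: float       # segment start, in seconds
    end: float         # segment end, in seconds
    tokens: List[int]  # text token ids between the two timestamps


def split_segments(tokens: List[int], timestamp_begin: int) -> List[Segment]:
    """Group text tokens enclosed by a start and an end timestamp token.

    Text tokens that appear before the first timestamp are ignored, and a
    trailing segment with no closing timestamp is dropped.
    """
    segments: List[Segment] = []
    start: Optional[float] = None
    text: List[int] = []
    for tok in tokens:
        if tok >= timestamp_begin:
            t = round((tok - timestamp_begin) * SECONDS_PER_TIMESTAMP_TOKEN, 2)
            if start is not None and text:
                # Closing timestamp: emit the finished segment.
                segments.append(Segment(start, t, text))
                start, text = None, []
            else:
                # Opening timestamp (or two timestamps in a row): (re)start.
                start = t
        elif start is not None:
            text.append(tok)
    return segments
```

A caller would pass the decoder's raw token ids together with the tokenizer's first timestamp id, for example `split_segments(ids, timestamp_begin=50364)`. The value 50364 is only illustrative, since the id differs between model variants, and special tokens such as end-of-text should be stripped beforehand.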

v0.1.0

14 Jan 19:41
81fcb02

Beta release for WhisperKitAndroid!

What's Changed

  • Added Linux support (CPU only) for model pipeline execution
  • Test coverage across the Earnings-22 and LibriSpeech datasets
  • Initial C-API and internal runtime
  • whisperkit-cli command-line utility, with a subset of WhisperKit's features
  • Reduced exported symbols, improved build system, and other improvements