More explicit and fine tuning command line arguments #7

@porg

  • I do not know the exact current capabilities of the app.
  • Nor what features you may be willing to implement or make user accessible.
  • If I could sketch out the future of lyrics-transcriber then its manpage would read like this:

Synopsis

lyrics-transcriber [ options ] [ unsynchronized-lyrics-file ] <audio-file>

  • audio-file — Must be supplied. Needed for base operation.
  • unsynchronized-lyrics-file — If supplied, its spelling, punctuation and line wrapping are preserved as-is, and the app's job is only to create the matching timestamps.
    • Helpful for lesser-known languages; aid it further by specifying --language-input.
    • Also useful for strict editorial control in well-supported languages, to get exactly the orthography, punctuation and line wrapping you want.

Optional arguments (multiple values are comma-separated)

  • -i --language-input (one or more: IETF Language Tags)
    • Aid processing by explicitly stating what language(s) and/or regional variant(s) and/or dialect(s) occur(s) in the input.
    • Order of supplied arguments carries no meaning/priority.
  • -v --voices [ <n> | VoiceName1,VoiceName2,…,VoiceNameN ]
    • Aid processing by explicitly stating the number of voices in the input.
    • Either just as a number, or by naming the voices (in order of occurrence in the audio).
  • -V --voice-isolation <n>
    • How aggressively it tries to separate the voice track.
    • 1 = moderate, 99 = extreme.
    • 0 disables voice isolation entirely; use with a voice-only audio-file to speed up processing.
  • -c --correlate-lyrics (one or more services: all, genius, spotify)
    • Correlate initial speech-to-text results against lyrics databases to further improve the results.
  • -k --api-key-openai <key>
    • Required for online features such as --correlate-lyrics.
    • Supply as an argument, as the environment variable API_KEY_OPENAI, or in ~/.conf/lyrics-transcriber.cfg
  • -e --export (one or more of these: all, ass, json, lrc-midico, mp4, srt)
  • -o --language-output (one or more: IETF Language Tags)
    • Machine translations to be included in each of the --export formats
    • All-in-one file formats get all translations integrated in one file
      • JSON
      • MP4 (embedded as subtitle tracks)
    • Lyrics/subtitle files get one file per machine translation
      • e.g. file.en.ass, file.en.lrc, file.fr.ass and file.fr.lrc
        according to the --filename template containing a %language% token.
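
To make the proposed interface concrete, here is a minimal sketch of how the synopsis and options above could be parsed with Python's argparse. It is only an illustration of the proposal, not the app's actual code: the helper names (csv_list, export_list, voices, isolation_level, parse_cli) are hypothetical, the config-file lookup is left out, and reading both positional files into one list is just one way to handle the optional lyrics file.

```python
import argparse
import os

# Export formats named in the proposal ("all" expands to every one of them).
EXPORT_FORMATS = ("ass", "json", "lrc-midico", "mp4", "srt")


def csv_list(value):
    """Split a comma-separated option value into a list of non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def export_list(value):
    """Parse --export, expanding 'all' and rejecting unknown formats."""
    formats = csv_list(value)
    if "all" in formats:
        return list(EXPORT_FORMATS)
    unknown = [f for f in formats if f not in EXPORT_FORMATS]
    if unknown:
        raise argparse.ArgumentTypeError("unknown export format(s): " + ", ".join(unknown))
    return formats


def voices(value):
    """Parse --voices as either a voice count or a list of voice names."""
    return int(value) if value.isdigit() else csv_list(value)


def isolation_level(value):
    """Parse --voice-isolation: 0 disables it, 1 to 99 set the strength."""
    level = int(value)
    if not 0 <= level <= 99:
        raise argparse.ArgumentTypeError("must be between 0 and 99")
    return level


def build_parser():
    parser = argparse.ArgumentParser(prog="lyrics-transcriber")
    parser.add_argument("-i", "--language-input", type=csv_list, default=[])
    parser.add_argument("-v", "--voices", type=voices)
    parser.add_argument("-V", "--voice-isolation", type=isolation_level)
    parser.add_argument("-c", "--correlate-lyrics", type=csv_list, default=[])
    # Falls back to the environment variable; the config file is not handled here.
    parser.add_argument("-k", "--api-key-openai", default=os.environ.get("API_KEY_OPENAI"))
    parser.add_argument("-e", "--export", type=export_list, default=list(EXPORT_FORMATS))
    parser.add_argument("-o", "--language-output", type=csv_list, default=[])
    # One or two files: [unsynchronized-lyrics-file] <audio-file>
    parser.add_argument("files", nargs="+", metavar="file")
    return parser


def parse_cli(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if len(args.files) > 2:
        parser.error("expected [unsynchronized-lyrics-file] <audio-file>")
    args.audio_file = args.files[-1]
    args.lyrics_file = args.files[0] if len(args.files) == 2 else None
    return args
```

For example, parse_cli(["-i", "de-AT,en", "-e", "srt,json", "lyrics.txt", "song.flac"]) would treat lyrics.txt as the unsynchronized lyrics and song.flac as the audio file.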

Filename pattern for the output file(s)

  • --filename "<filename template>"
    • Default: %basename%.%language%.%ext%
    • Variables are wrapped in percent signs "%".
    • Literal character strings like "." or "--" or " karaoke " get inserted as-is.
    • %basename% — The input file's basename, that is the name without the file extension.
    • %language% — The language of the machine translation version.
    • %ext% — The file extension which is to be used for the respective --export file format.
    • %hash% — The hash checksum of the input file.
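
The template rules above could be implemented roughly as follows. This is a sketch under assumptions: render_filename and file_hash are hypothetical names, and SHA-256 is assumed for %hash% because the proposal does not name a checksum algorithm. The hash is computed only when the template actually contains %hash%.

```python
import hashlib
import re
from pathlib import Path

# A token is a word wrapped in percent signs, e.g. %basename%.
TOKEN = re.compile(r"%(\w+)%")


def file_hash(path):
    """Return a hex checksum of the file (SHA-256 is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def render_filename(template, input_path, language, ext):
    """Expand %basename%, %language%, %ext% and %hash% in a filename template."""
    path = Path(input_path)
    values = {"basename": path.stem, "language": language, "ext": ext}

    def substitute(match):
        name = match.group(1)
        if name == "hash":
            return file_hash(path)  # read the file only if the token is used
        if name not in values:
            raise ValueError("unknown template variable: %" + name + "%")
        return values[name]

    # Everything outside the %...% tokens is copied through unchanged.
    return TOKEN.sub(substitute, template)
```

With the default template, render_filename("%basename%.%language%.%ext%", "songs/file.flac", "fr", "ass") gives "file.fr.ass".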

Shorthand arguments

  • -a → --export=all → exports all available file formats (the default)
  • -A → --export=ass
  • -J → --export=json
  • -L → --export=lrc-midico
  • -M → --export=mp4
  • -S → --export=srt
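
One way to support these shorthands is to rewrite them into a single --export option before the real argument parsing happens. The sketch below assumes that approach; SHORTHANDS and expand_shorthands are hypothetical names. Several shorthands are merged into one comma-separated value so that, for instance, -A -S does not end up as two competing --export options.

```python
# Shorthand flag -> export format, as listed in the proposal.
SHORTHANDS = {
    "-a": "all",
    "-A": "ass",
    "-J": "json",
    "-L": "lrc-midico",
    "-M": "mp4",
    "-S": "srt",
}


def expand_shorthands(argv):
    """Replace shorthand flags with one merged --export=<formats> argument."""
    formats = []
    rest = []
    for arg in argv:
        if arg in SHORTHANDS:
            fmt = SHORTHANDS[arg]
            if fmt not in formats:  # ignore repeated shorthands
                formats.append(fmt)
        else:
            rest.append(arg)
    if formats:
        return ["--export=" + ",".join(formats)] + rest
    return rest
```

Note that this simple pass cannot tell a shorthand from an option value that happens to look like one, so a full implementation would need to handle that case.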
