Language selection #2

Closed
@ArtyomZemlyak

Thanks for sharing this implementation.
It is dramatically faster than the PyTorch version on the CPU.

You may already know this, but I found out how to enable recognition of a specific language.
At line 2012 of main.cpp, we can simply put this:

std::vector<whisper_vocab::id> prompt = { vocab.token_sot, vocab.token_lang, vocab.token_task };  

These three tokens are formed here:
https://github.com/openai/whisper/blob/8cf36f3508c9acd341a45eb2364239a3d81458b9/whisper/tokenizer.py#L324-L331

For a one-off use in main.cpp, you can simply hard-code the desired language token index. But for regular users, it would be nice to have an option to specify which language they want in the output.
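To illustrate how a user-supplied language code could be turned into that prompt, here is a minimal sketch. It is not code from main.cpp: `lang_token`, `make_prompt`, `token_id`, and `k_langs` are hypothetical names. It assumes that in the multilingual vocabulary the language tokens come immediately after the start-of-transcript token, in the same order as the LANGUAGES table in tokenizer.py. Only the first few languages are listed, and the ids used in the usage note are assumptions that should be checked against the loaded model.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical stand-in for whisper_vocab::id.
using token_id = int;

// First entries of the LANGUAGES table in openai/whisper tokenizer.py, in order.
// A real option would need the full table; this is shortened for illustration.
constexpr std::array<std::string_view, 8> k_langs = {
    "en", "zh", "de", "es", "ru", "ko", "fr", "ja",
};

// Assumption: language tokens directly follow token_sot in the vocabulary,
// so the id is token_sot + 1 + (position of the code in the table).
std::optional<token_id> lang_token(token_id token_sot, std::string_view code) {
    for (std::size_t i = 0; i < k_langs.size(); ++i) {
        if (k_langs[i] == code) {
            return token_sot + 1 + static_cast<token_id>(i);
        }
    }
    return std::nullopt; // unknown language code
}

// Builds { sot, lang, task }; falls back to { sot } alone for an unknown code.
std::vector<token_id> make_prompt(token_id token_sot, token_id token_task,
                                  std::string_view code) {
    std::vector<token_id> prompt = { token_sot };
    if (const auto lang = lang_token(token_sot, code)) {
        prompt.push_back(*lang);
        prompt.push_back(token_task);
    }
    return prompt;
}
```

For example, assuming a start-of-transcript id of 50258 and a transcribe task id of 50359, `make_prompt(50258, 50359, "ru")` would produce the three tokens for Russian transcription. Passing the start-of-transcript id in as a parameter keeps the helper independent of any particular vocabulary file.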

Labels: enhancement (New feature or request)
