Could this be used for audio synthesis? #24
Comments
FYI, https://github.com/gpt-omni/mini-omni uses the SNAC codec to generate audio.
Thanks for the hint. How about Chinese?
I tried https://github.com/MrWaterZhou/viitor-voice, and it works well :)
Wow, does it support Mandarin and Japanese?
Not yet, but we are working on Mandarin and will release it soon.
Our Chinese model has been updated, so feel free to give it a try! FYI, we've noticed that the 24kHz SNAC model doesn't perform well on Chinese audio, especially with higher-pitched samples. We're currently experimenting with fine-tuning the decoder using Vocos, and so far the results look promising.
For instance, could an LLM output SNAC tokens that are then decoded into audio?
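To make the question concrete, here is a minimal sketch of the bookkeeping an LLM-based pipeline would need: turning hierarchical codec tokens into one flat sequence for the LLM, and splitting generated tokens back into per-level lists for the decoder. It assumes three code levels whose lengths are in a 1:2:4 ratio (which I believe matches the 24kHz SNAC model, but please verify). The interleaving order and the names `flatten_codes` and `unflatten_codes` are hypothetical and not taken from SNAC, mini-omni, or viitor-voice.

```python
from typing import List, Sequence

# Assumption: three code levels with 1, 2 and 4 tokens per coarse frame.
TOKENS_PER_LEVEL = (1, 2, 4)
FRAME_SIZE = sum(TOKENS_PER_LEVEL)  # 7 tokens per coarse frame


def flatten_codes(levels: Sequence[Sequence[int]]) -> List[int]:
    """Interleave hierarchical codes frame by frame into one token stream."""
    if len(levels) != len(TOKENS_PER_LEVEL):
        raise ValueError("expected exactly three code levels")
    n_frames = len(levels[0])
    for level, per_frame in zip(levels, TOKENS_PER_LEVEL):
        if len(level) != n_frames * per_frame:
            raise ValueError("level lengths must be in ratio 1:2:4")
    stream: List[int] = []
    for f in range(n_frames):
        # Emit this frame's tokens from the coarsest level to the finest.
        for level, per_frame in zip(levels, TOKENS_PER_LEVEL):
            stream.extend(level[f * per_frame:(f + 1) * per_frame])
    return stream


def unflatten_codes(stream: Sequence[int]) -> List[List[int]]:
    """Split a flat token stream back into per-level code lists."""
    if len(stream) % FRAME_SIZE != 0:
        raise ValueError("stream length must be a multiple of FRAME_SIZE")
    levels: List[List[int]] = [[] for _ in TOKENS_PER_LEVEL]
    for start in range(0, len(stream), FRAME_SIZE):
        pos = start
        for i, per_frame in enumerate(TOKENS_PER_LEVEL):
            levels[i].extend(stream[pos:pos + per_frame])
            pos += per_frame
    return levels
```

In a real pipeline the lists returned by `unflatten_codes` would be converted to tensors and passed to the codec's decoder to produce a waveform; that step is left out here because it depends on the codec library.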