Could this be used for audio synthesis? #24

MonolithFoundation · 2024-09-18T03:23:40Z

For instance, could an LLM output SNAC tokens that are then decoded into audio?

itsliupeng · 2024-09-25T07:44:43Z

FYI, https://github.com/gpt-omni/mini-omni uses the SNAC codec to generate audio.
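For anyone wondering what the glue between the LLM and the codec might look like, here is a minimal sketch. It assumes the 24 kHz SNAC model emits three code levels whose lengths are in a 1:2:4 ratio and that each level uses a 4096-entry codebook. The frame layout, the per-level ID offsets, and the names `flatten_codes` and `unflatten_codes` are hypothetical choices for illustration, not what mini-omni or any other project actually does.

```python
from typing import List, Sequence

CODEBOOK_SIZE = 4096        # assumed codebook size per level
TOKENS_PER_LEVEL = (1, 2, 4)  # assumed 1:2:4 ratio (coarse, mid, fine)
FRAME = sum(TOKENS_PER_LEVEL)  # 7 tokens per coarse frame


def flatten_codes(levels: Sequence[Sequence[int]]) -> List[int]:
    """Interleave three code levels into one token stream for an LLM.

    Hypothetical layout per frame: [coarse, mid, mid, fine, fine, fine, fine],
    with each level shifted into its own ID range.
    """
    coarse = levels[0]
    for lvl, n in zip(levels, TOKENS_PER_LEVEL):
        if len(lvl) != n * len(coarse):
            raise ValueError("level lengths must follow the 1:2:4 ratio")
    flat: List[int] = []
    for i in range(len(coarse)):
        for level_idx, n in enumerate(TOKENS_PER_LEVEL):
            chunk = levels[level_idx][n * i : n * (i + 1)]
            flat.extend(code + level_idx * CODEBOOK_SIZE for code in chunk)
    return flat


def unflatten_codes(flat: Sequence[int]) -> List[List[int]]:
    """Invert flatten_codes; a trailing incomplete frame is dropped."""
    levels: List[List[int]] = [[], [], []]
    usable = len(flat) - len(flat) % FRAME
    for start in range(0, usable, FRAME):
        pos = start
        for level_idx, n in enumerate(TOKENS_PER_LEVEL):
            for token in flat[pos : pos + n]:
                code = token - level_idx * CODEBOOK_SIZE
                if not 0 <= code < CODEBOOK_SIZE:
                    raise ValueError(f"token {token} is not valid for level {level_idx}")
                levels[level_idx].append(code)
            pos += n
    return levels


# The recovered levels could then be wrapped as (1, T) integer tensors and
# passed to the snac package's model.decode(codes) to obtain a waveform.
```

Giving each level its own ID range lets the decoding side reject tokens that land in the wrong slot, which is useful when the LLM's output is not guaranteed to be well-formed.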

MonolithFoundation · 2024-09-25T09:07:37Z

Thanks for the hint, how about Chinese?

MrWaterZhou · 2024-11-21T10:41:01Z

https://github.com/MrWaterZhou/viitor-voice

I tried, and it works well :)

MonolithFoundation · 2024-11-21T11:45:13Z

Woo, does it support Mandarin and Japanese?

MrWaterZhou · 2024-11-21T13:26:14Z

> Woo, does it support Mandarin and Japanese?

Not yet, but we are working on Mandarin and will release it soon.

MrWaterZhou · 2024-11-28T12:29:28Z

> Woo, does it support Mandarin and Japanese?

Our Chinese model has been updated—feel free to give it a try!
https://github.com/viitor-ai/viitor-voice/tree/main

FYI, we’ve noticed that the 24 kHz SNAC model doesn’t perform well on Chinese audio, especially with higher-pitched samples. We’re currently experimenting with fine-tuning the decoder using Vocos, and so far the results look promising.
