Open
Description
As far as I understand, the perplexity reported for this repo's VQ-VAE is roughly the "effective number of codebook tokens in use".
When only one codebook token is used, the perplexity is 1.
When all codebook tokens are used uniformly, the perplexity equals the codebook size.
So I was wondering: for good output quality, what is the minimum acceptable ratio of perplexity to codebook size?
(I guess this has to be found experimentally. If you have any results related to this question, it would be great to know.)
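To make the two boundary cases above concrete, here is a minimal sketch of how I assume the perplexity is computed: the exponential of the entropy of the empirical code-usage distribution. The function name `codebook_perplexity` and the NumPy implementation are my own illustration, not code from this repo, so the repo's version may differ in details (for example, a small epsilon inside the log).

```python
import numpy as np

def codebook_perplexity(indices, codebook_size):
    """exp(entropy) of the empirical usage distribution over codebook tokens.

    Hypothetical helper for illustration; not taken from this repo.
    """
    # Count how often each codebook index was selected.
    counts = np.bincount(np.asarray(indices).ravel(), minlength=codebook_size)
    probs = counts / counts.sum()
    # Skip unused codes so that 0 * log(0) is treated as 0.
    used = probs[probs > 0]
    entropy = -np.sum(used * np.log(used))
    return float(np.exp(entropy))

K = 512  # assumed codebook size, for illustration only
one_code = np.zeros(1000, dtype=np.int64)  # every position picks token 0
uniform = np.arange(K).repeat(4)           # every token picked equally often

print(codebook_perplexity(one_code, K))     # about 1
print(codebook_perplexity(uniform, K) / K)  # ratio about 1
```

The ratio I am asking about is then `codebook_perplexity(indices, K) / K`, which ranges from `1 / K` (one token used) to 1 (uniform usage).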