-
Many of the required torch ops do not have kernels for FP8. All in all, it will be supported once the frameworks we rely on support it. The latest Nunchaku would be an exception, since it actually provides FP4 kernels, but its support is limited to FLUX.1 because it requires a carefully pre-quantized model - for exactly the reason described above.
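As a rough illustration of that kernel gap (a minimal sketch, not how any particular framework does it; exact op coverage and error types vary by torch version and GPU), storing tensors in `torch.float8_e4m3fn` does not by itself let ops compute in FP8:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# FP8 *storage* is fine: casting to float8_e4m3fn has worked since torch 2.1.
x = torch.randn(8, 8, device=device).to(torch.float8_e4m3fn)
w = torch.randn(8, 8, device=device).to(torch.float8_e4m3fn)

# ...but most ops have no FP8 kernels, so plain compute fails.
try:
    y = x @ w
except (RuntimeError, NotImplementedError) as e:
    print("no fp8 matmul kernel:", e)

# Typical workaround today: keep weights in FP8 for the memory savings,
# upcast to bf16/fp16 just before compute.
y = x.to(torch.bfloat16) @ w.to(torch.bfloat16)
print(y.dtype)  # torch.bfloat16
```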
-
The new 50XX series of cards supports FP4, just as the 40XX series supported FP8. These low-bit formats offer vastly superior performance compared to FP16 or BF16 while incurring minimal quality degradation.
Would it be possible to support these compute types? Even FP8 would nearly double performance over FP16.
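For reference, one way to check whether the hardware itself exposes these formats (a sketch; the capability thresholds are my assumptions: FP8 tensor cores arriving with Ada/Hopper at compute capability 8.9+, FP4 with Blackwell at 10.0+):

```python
import torch

# Rough hardware check (assumed thresholds, see note above).
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    cc = major + minor / 10
    print(f"compute capability: {cc}")
    print("FP8-capable hardware:", cc >= 8.9)
    print("FP4-capable hardware:", cc >= 10.0)
else:
    print("no CUDA device found")
```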