-
Many of the required torch ops do not have kernels for FP8. All in all, it will be supported once the frameworks we rely on support it. The latest Nunchaku would be an exception, since it actually provides FP4 kernels, but its support is limited to FLUX.1 because it requires a carefully pre-quantized model - for exactly the reason described above.
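As a rough illustration of that kernel gap (a minimal sketch, not how any particular framework does it; exact op coverage and error types vary by torch version and GPU), storing tensors in `torch.float8_e4m3fn` does not by itself let ops compute in FP8:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# FP8 *storage* is fine: casting to float8_e4m3fn has worked since torch 2.1.
x = torch.randn(8, 8, device=device).to(torch.float8_e4m3fn)
w = torch.randn(8, 8, device=device).to(torch.float8_e4m3fn)

# ...but most ops have no FP8 kernels, so plain compute fails.
try:
    y = x @ w
except (RuntimeError, NotImplementedError) as e:
    print("no fp8 matmul kernel:", e)

# Typical workaround today: keep weights in FP8 for the memory savings,
# upcast to bf16/fp16 just before compute.
y = x.to(torch.bfloat16) @ w.to(torch.bfloat16)
print(y.dtype)  # torch.bfloat16
```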
-
The new 50XX series of cards supports FP4, just as the 40XX series supported FP8. These low-bit formats offer vastly superior performance compared to FP16 or BF16 while incurring minimal quality degradation.
Would it be possible to support these compute types? Even FP8 would nearly double performance over FP16.
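For reference, one way to check whether the hardware itself exposes these formats (a sketch; the capability thresholds are my assumptions: FP8 tensor cores arriving with Ada/Hopper at compute capability 8.9+, FP4 with Blackwell at 10.0+):

```python
import torch

# Rough hardware check (assumed thresholds, see note above).
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    cc = major + minor / 10
    print(f"compute capability: {cc}")
    print("FP8-capable hardware:", cc >= 8.9)
    print("FP4-capable hardware:", cc >= 10.0)
else:
    print("no CUDA device found")
```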