
Add SD3 fine-tuning scripts #1966


Open
dsocek wants to merge 3 commits into main
Conversation

Contributor

@dsocek dsocek commented May 7, 2025

What does this PR do?

Adds Gaudi-optimized Stable Diffusion 3 and 3.5 (SD3) fine-tuning/training scripts.

Features:

  • Training with Gaudi-optimized attention using the Fused SDPA kernel (see the sketch after this list)
  • Embeddings padded to the Gaudi TPC-optimal size
  • Both LoRA and full-model fine-tuning enabled
  • Updated README with tested examples
  • Fast SD3 training CI test added
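
To illustrate the first item: a minimal sketch of routing attention through Habana's Fused SDPA kernel, which mirrors torch.nn.functional.scaled_dot_product_attention. The helper name is hypothetical and the exact wiring in the PR's scripts may differ.

import torch.nn.functional as F

try:
    # Gaudi-only fused kernel; available via habana_frameworks on HPU systems
    from habana_frameworks.torch.hpex.kernels import FusedSDPA
except ImportError:
    FusedSDPA = None  # not on a Gaudi system; fall back to stock PyTorch

def sdpa(query, key, value, attn_mask=None, dropout_p=0.0):
    # Hypothetical helper: dispatch to the fused Gaudi kernel when available.
    if FusedSDPA is not None:
        # Positional args mirror F.scaled_dot_product_attention:
        # (attn_mask, dropout_p, is_causal)
        return FusedSDPA.apply(query, key, value, attn_mask, dropout_p, False)
    return F.scaled_dot_product_attention(
        query, key, value, attn_mask=attn_mask, dropout_p=dropout_p
    )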

Example of SD3 Training:

Train images:
[Image: dog1_training_images]

Inference images after training SD3 LoRA for 1000 training steps on Gaudi:
prompt="A picture of sks dog in a bucket"
[Image: dog1_inference_g2]
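
For reference, a minimal inference sketch along these lines using optimum-habana's SD3 pipeline. The base model ID, Gaudi config name, and LoRA output path are assumptions, not taken from this PR.

from optimum.habana.diffusers import GaudiStableDiffusion3Pipeline

# Assumed base model and Gaudi config; the LoRA path is the hypothetical
# output directory of a training run like the one above.
pipe = GaudiStableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion",
)
pipe.load_lora_weights("sd3-dreambooth-lora-output")  # hypothetical path
image = pipe(prompt="A picture of sks dog in a bucket").images[0]
image.save("sks_dog_in_bucket.png")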

Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: Deepak Gowda Doddbele Aswatha Narayana <[email protected]>
Co-authored-by: Pavel Evsikov <[email protected]>
@dsocek dsocek requested a review from regisss as a code owner May 7, 2025 00:22
Contributor

@imangohari1 imangohari1 left a comment


Hi Daniel,
Thanks for the work here. I just have some suggestions on this PR, and I am getting an error on one of the examples.
Let me know what you think.
Thanks.

import wandb

# Will error if the minimal version of diffusers is not installed. Remove at your own risks.
check_min_version("0.29.0")
Contributor


Suggested change
check_min_version("0.29.0")
check_min_version("0.32.0")

Contributor Author


Good catch

choices=["no", "fp32", "fp16", "bf16"],
help=(
"Choose prior generation precision between fp32, fp16 and bf16 (bfloat16). Bf16 requires PyTorch >="
" 1.10.and an Nvidia Ampere GPU. Default to fp16 if a GPU is available else fp32."
Contributor


We need to revisit the Nvidia Ampere GPU comment here.

Contributor Author


Nah, we should remove this option.

Contributor Author


We need to keep the option, but we removed the Ampere comment and code, and now default to bf16.
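
A sketch of how the resolved argument could look after this change; the flag name follows the upstream dreambooth scripts, and the final wording in the PR may differ.

import argparse

parser = argparse.ArgumentParser()
# Ampere-specific note dropped; default resolved to bf16 per the discussion above.
parser.add_argument(
    "--prior_generation_precision",
    type=str,
    default="bf16",
    choices=["no", "fp32", "fp16", "bf16"],
    help="Choose prior generation precision between fp32, fp16 and bf16 (bfloat16).",
)
args = parser.parse_args()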



class PromptDataset(Dataset):
"A simple dataset to prepare the prompts to generate class images on multiple GPUs."
Contributor


Suggested change
"A simple dataset to prepare the prompts to generate class images on multiple GPUs."
"A simple dataset to prepare the prompts to generate class images on multiple HPUs."



@dsocek dsocek requested a review from imangohari1 May 8, 2025 19:58
Contributor Author

dsocek commented May 8, 2025

@imangohari1 thanks for the thorough review! We fixed the issue you saw in multi-card training: we forgot to properly pass the training mode during validation. Could you do a final review?

Contributor

@imangohari1 imangohari1 left a comment


@dsocek LGTM but I think test_dreambooth_lora_sd3 should be @slow. @regisss WDYT?

Collaborator

@regisss regisss left a comment


I just left a few comments. Can you also update the table in the README to tick the training column for SD3? Here:

| Stable Diffusion 3 | | <li>Single card</li> | <li>[text-to-image generation](https://github.com/huggingface/optimum-habana/tree/main/examples/stable-diffusion#stable-diffusion-3-and-35-sd3)</li> |

Signed-off-by: Daniel Socek <[email protected]>
Contributor Author

dsocek commented May 12, 2025

@regisss thanks for the review!

Added a new test for full SD3 training and fixed the READMEs per your suggestions.

$ python -m pytest tests/test_diffusers.py -v -s -k "test_dreambooth_sd3"
...
PASSED
======================= 1 passed, 152 deselected in 125.27s (0:02:05) 

The test takes ~2 min on Gaudi2 (G2), so I left it in the fast tests category. LMK if we should move it to slow.

@dsocek dsocek requested a review from regisss May 12, 2025 17:46