feat: enable IP-Adapter (XLabs-AI/flux-ip-adapter-v2) support (#418)

Bluear7878 · lmxyy · web-flow · commit 06b7a51850a3 · 2025-07-24T00:51:14.000-07:00
* feat: support IP-adapter

* FBCache and comfyUI

* fixing conflicts

* update

* update example

* update example

* style: make linter happy

* update

* update ipa test

* add docs and rename IP to ip

* docs: add docs for ipa

* docs: add docs for ipa

* add an example for pulid

* update

* save gpu memory

* change the threshold to 0.8

---------

Co-authored-by: Muyang Li <lmxyy1999@foxmail.com>
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -24,6 +24,7 @@ Check out `DeepCompressor <github_deepcompressor_>`_ for the quantization librar
     usage/attention.rst
     usage/fbcache.rst
     usage/pulid.rst
+    usage/ip_adapter.rst
 
 .. toctree::
     :maxdepth: 1
diff --git a/docs/source/links/huggingface.txt b/docs/source/links/huggingface.txt
@@ -8,3 +8,4 @@
 .. _hf_nunchaku-flux1-dev-int4: https://huggingface.co/mit-han-lab/nunchaku-flux.1-dev/blob/main/svdq-int4_r32-flux.1-dev.safetensors
 .. _hf_depth_anything: https://huggingface.co/LiheYoung/depth-anything-large-hf
 .. _hf_nunchaku_wheels: https://huggingface.co/nunchaku-tech/nunchaku
+.. _hf_ip-adapterv2: https://huggingface.co/XLabs-AI/flux-ip-adapter-v2
diff --git a/docs/source/python_api/nunchaku.models.ip_adapter.diffusers_adapters.flux.rst b/docs/source/python_api/nunchaku.models.ip_adapter.diffusers_adapters.flux.rst
@@ -0,0 +1,7 @@
+nunchaku.models.ip_adapter.diffusers_adapters.flux
+==================================================
+
+.. automodule:: nunchaku.models.ip_adapter.diffusers_adapters.flux
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/source/python_api/nunchaku.models.ip_adapter.diffusers_adapters.rst b/docs/source/python_api/nunchaku.models.ip_adapter.diffusers_adapters.rst
@@ -0,0 +1,12 @@
+nunchaku.models.ip_adapter.diffusers_adapters
+=============================================
+
+.. automodule:: nunchaku.models.ip_adapter.diffusers_adapters
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+.. toctree::
+   :maxdepth: 4
+
+   nunchaku.models.ip_adapter.diffusers_adapters.flux
diff --git a/docs/source/python_api/nunchaku.models.ip_adapter.rst b/docs/source/python_api/nunchaku.models.ip_adapter.rst
@@ -0,0 +1,8 @@
+nunchaku.models.ip_adapter
+==========================
+
+.. toctree::
+   :maxdepth: 4
+
+   nunchaku.models.ip_adapter.diffusers_adapters
+   nunchaku.models.ip_adapter.utils
diff --git a/docs/source/python_api/nunchaku.models.ip_adapter.utils.rst b/docs/source/python_api/nunchaku.models.ip_adapter.utils.rst
@@ -0,0 +1,7 @@
+nunchaku.models.ip_adapter.utils
+================================
+
+.. automodule:: nunchaku.models.ip_adapter.utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/source/python_api/nunchaku.models.rst b/docs/source/python_api/nunchaku.models.rst
@@ -7,4 +7,5 @@ nunchaku.models
    nunchaku.models.transformers
    nunchaku.models.text_encoders
    nunchaku.models.pulid
+   nunchaku.models.ip_adapter
    nunchaku.models.safety_checker
diff --git a/docs/source/usage/ip_adapter.rst b/docs/source/usage/ip_adapter.rst
@@ -0,0 +1,38 @@
+IP Adapter
+==========
+
+Nunchaku supports `IP Adapter <hf_ip-adapterv2_>`_, an adapter that adds image-prompt capability to FLUX.1-dev.
+
+.. literalinclude:: ../../../examples/flux.1-dev-IP-adapter.py
+   :language: python
+   :caption: IP Adapter Example (`examples/flux.1-dev-IP-adapter.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-IP-adapter.py>`__)
+   :linenos:
+
+The IP Adapter integration in Nunchaku follows these main steps:
+
+**Model Initialization**:
+
+- Load a Nunchaku FLUX.1-dev transformer model using :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.from_pretrained`.
+- Initialize the FLUX pipeline with :class:`diffusers.FluxPipeline`, passing the transformer and setting the appropriate precision.
+
+**IP Adapter Loading**:
+
+- Use ``pipeline.load_ip_adapter`` to load the IP Adapter weights and the CLIP image encoder.
+
+  - ``pretrained_model_name_or_path_or_dict``: Hugging Face repo or local path for the IP Adapter weights.
+  - ``weight_name``: Name of the weights file (e.g., ``ip_adapter.safetensors``).
+  - ``image_encoder_pretrained_model_name_or_path``: Name or path of the CLIP image encoder.
+  - Apply the IP Adapter to the pipeline with :func:`~nunchaku.models.ip_adapter.diffusers_adapters.apply_IPA_on_pipe`, specifying the adapter scale and repo ID.
+
+**Caching (Optional)**:
+
+Enable caching for faster inference and reduced memory usage with :func:`~nunchaku.caching.diffusers_adapters.apply_cache_on_pipe`. See :doc:`fbcache` for more details.
+
+**Image Generation**:
+
+- Load the image to be used as the image prompt (IP Adapter reference).
+- Call the pipeline with:
+
+  - ``prompt``: The text prompt for generation.
+  - ``ip_adapter_image``: The reference image (must be RGB).
+- The output image will reflect both the text prompt and the visual style/content of the reference image.
diff --git a/docs/source/usage/pulid.rst b/docs/source/usage/pulid.rst
@@ -1,21 +1,20 @@
 PuLID
 =====
 
-Nunchaku integrates `PuLID <_pulid_paper>`_, a tuning-free identity customization method for text-to-image generation.
+.. image:: https://huggingface.co/datasets/nunchaku-tech/cdn/resolve/main/ComfyUI-nunchaku/workflows/nunchaku-flux.1-dev-pulid.png
+
+Nunchaku integrates `PuLID <paper_pulid_>`_, a tuning-free identity customization method for text-to-image generation.
 This feature allows you to generate images that maintain specific identity characteristics from reference photos.
 
 .. literalinclude:: ../../../examples/flux.1-dev-pulid.py
    :language: python
    :caption: PuLID Example (`examples/flux.1-dev-pulid.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-pulid.py>`__)
    :linenos:
 
-Implementation Overview
------------------------
-
 The PuLID integration follows these key steps:
 
 **Model Initialization** (lines 12-20):
-Load a Nunchaku FLUX.1-dev model using :class:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel`
+Load a Nunchaku FLUX.1-dev model using :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.from_pretrained`
 and initialize the FLUX PuLID pipeline with :class:`~nunchaku.pipeline.pipeline_flux_pulid.PuLIDFluxPipeline`.
 
 **Forward Method Override** (line 22):
diff --git a/examples/flux.1-canny-dev.py b/examples/flux.1-canny-dev.py
@@ -8,7 +8,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-canny-dev/svdq-{precision}_r32-flux.1-canny-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-canny-dev/svdq-{precision}_r32-flux.1-canny-dev.safetensors"
 )
 pipe = FluxControlPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-Canny-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-depth-dev-lora.py b/examples/flux.1-depth-dev-lora.py
@@ -8,7 +8,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-depth-dev/svdq-{precision}_r32-flux.1-depth-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-depth-dev/svdq-{precision}_r32-flux.1-depth-dev.safetensors"
 )
 pipe = FluxControlPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-depth-dev.py b/examples/flux.1-depth-dev.py
@@ -8,7 +8,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-depth-dev/svdq-{precision}_r32-flux.1-depth-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-depth-dev/svdq-{precision}_r32-flux.1-depth-dev.safetensors"
 )
 
 pipe = FluxControlPipeline.from_pretrained(
diff --git a/examples/flux.1-dev-IP-adapter.py b/examples/flux.1-dev-IP-adapter.py
@@ -0,0 +1,43 @@
+import torch
+from diffusers import FluxPipeline
+from diffusers.utils import load_image
+
+from nunchaku import NunchakuFluxTransformer2dModel
+from nunchaku.caching.diffusers_adapters import apply_cache_on_pipe
+from nunchaku.models.ip_adapter.diffusers_adapters import apply_IPA_on_pipe
+from nunchaku.utils import get_precision
+
+precision = get_precision()
+transformer = NunchakuFluxTransformer2dModel.from_pretrained(
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+)
+pipeline = FluxPipeline.from_pretrained(
+    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
+).to("cuda")
+
+pipeline.load_ip_adapter(
+    pretrained_model_name_or_path_or_dict="XLabs-AI/flux-ip-adapter-v2",
+    weight_name="ip_adapter.safetensors",
+    image_encoder_pretrained_model_name_or_path="openai/clip-vit-large-patch14",
+)
+
+apply_IPA_on_pipe(pipeline, ip_adapter_scale=1.1, repo_id="XLabs-AI/flux-ip-adapter-v2")
+
+apply_cache_on_pipe(
+    pipeline,
+    use_double_fb_cache=True,
+    residual_diff_threshold_multi=0.09,
+    residual_diff_threshold_single=0.12,
+)
+
+IP_image = load_image(
+    "https://huggingface.co/datasets/nunchaku-tech/test-data/resolve/main/ComfyUI-nunchaku/inputs/monalisa.jpg"
+)
+
+image = pipeline(
+    prompt="holding a sign saying 'SVDQuant is fast!'",
+    ip_adapter_image=IP_image.convert("RGB"),
+    num_inference_steps=50,
+).images[0]
+
+image.save(f"flux.1-dev-IP-adapter-{precision}.png")
diff --git a/examples/flux.1-dev-cache.py b/examples/flux.1-dev-cache.py
@@ -7,7 +7,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-dev-controlnet-union-pro.py b/examples/flux.1-dev-controlnet-union-pro.py
@@ -15,7 +15,7 @@
 precision = get_precision()
 need_offload = get_gpu_memory() < 36
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
     torch_dtype=torch.bfloat16,
     offload=need_offload,
 )
diff --git a/examples/flux.1-dev-double_cache.py b/examples/flux.1-dev-double_cache.py
@@ -8,7 +8,7 @@
 precision = get_precision()
 
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 
 pipeline = FluxPipeline.from_pretrained(
diff --git a/examples/flux.1-dev-double_cache_offloading.py b/examples/flux.1-dev-double_cache_offloading.py
@@ -8,7 +8,7 @@
 precision = get_precision()
 
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
     offload=True,
 )
 
diff --git a/examples/flux.1-dev-fp16attn.py b/examples/flux.1-dev-fp16attn.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 transformer.set_attention_impl("nunchaku-fp16")  # set attention implementation to fp16
 pipeline = FluxPipeline.from_pretrained(
diff --git a/examples/flux.1-dev-lora.py b/examples/flux.1-dev-lora.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-dev-multiple-lora.py b/examples/flux.1-dev-multiple-lora.py
@@ -7,7 +7,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-dev-offload.py b/examples/flux.1-dev-offload.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors", offload=True
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors", offload=True
 )  # set offload to False if you want to disable offloading
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-dev-pulid.py b/examples/flux.1-dev-pulid.py
@@ -10,7 +10,7 @@
 
 precision = get_precision()
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 
 pipeline = PuLIDFluxPipeline.from_pretrained(
diff --git a/examples/flux.1-dev-qencoder.py b/examples/flux.1-dev-qencoder.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/nunchaku-t5/awq-int4-flux.1-t5xxl.safetensors")
 pipeline = FluxPipeline.from_pretrained(
diff --git a/examples/flux.1-dev-teacache.py b/examples/flux.1-dev-teacache.py
@@ -9,7 +9,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-dev-turing.py b/examples/flux.1-dev-turing.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors",
     offload=True,
     torch_dtype=torch.float16,  # Turing GPUs only support fp16 precision
 )  # set offload to False if you want to disable offloading
diff --git a/examples/flux.1-dev.py b/examples/flux.1-dev.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-fill-dev.py b/examples/flux.1-fill-dev.py
@@ -10,7 +10,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-fill-dev/svdq-{precision}_r32-flux.1-fill-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-fill-dev/svdq-{precision}_r32-flux.1-fill-dev.safetensors"
 )
 pipe = FluxFillPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-Fill-dev", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/flux.1-kontext-dev.py b/examples/flux.1-kontext-dev.py
@@ -6,7 +6,7 @@
 from nunchaku.utils import get_precision
 
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-kontext-dev/svdq-{get_precision()}_r32-flux.1-kontext-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-kontext-dev/svdq-{get_precision()}_r32-flux.1-kontext-dev.safetensors"
 )
 
 pipeline = FluxKontextPipeline.from_pretrained(
diff --git a/examples/flux.1-redux-dev.py b/examples/flux.1-redux-dev.py
@@ -10,7 +10,7 @@
     "black-forest-labs/FLUX.1-Redux-dev", torch_dtype=torch.bfloat16
 ).to("cuda")
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-dev/svdq-{precision}_r32-flux.1-dev.safetensors"
 )
 pipe = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev",
diff --git a/examples/flux.1-schnell.py b/examples/flux.1-schnell.py
@@ -6,7 +6,7 @@
 
 precision = get_precision()  # auto-detect your precision is 'int4' or 'fp4' based on your GPU
 transformer = NunchakuFluxTransformer2dModel.from_pretrained(
-    f"mit-han-lab/nunchaku-flux.1-schnell/svdq-{precision}_r32-flux.1-schnell.safetensors"
+    f"nunchaku-tech/nunchaku-flux.1-schnell/svdq-{precision}_r32-flux.1-schnell.safetensors"
 )
 pipeline = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-schnell", transformer=transformer, torch_dtype=torch.bfloat16
diff --git a/examples/sana1.6b-cache.py b/examples/sana1.6b-cache.py
@@ -5,7 +5,7 @@
 from nunchaku.caching.diffusers_adapters import apply_cache_on_pipe
 
 transformer = NunchakuSanaTransformer2DModel.from_pretrained(
-    "mit-han-lab/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors"
+    "nunchaku-tech/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors"
 )
 pipe = SanaPipeline.from_pretrained(
     "Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers",
diff --git a/examples/sana1.6b.py b/examples/sana1.6b.py
@@ -4,7 +4,7 @@
 from nunchaku import NunchakuSanaTransformer2DModel
 
 transformer = NunchakuSanaTransformer2DModel.from_pretrained(
-    "mit-han-lab/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors"
+    "nunchaku-tech/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors"
 )
 pipe = SanaPipeline.from_pretrained(
     "Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers",
diff --git a/examples/sana1.6b_pag.py b/examples/sana1.6b_pag.py
@@ -4,7 +4,7 @@
 from nunchaku import NunchakuSanaTransformer2DModel
 
 transformer = NunchakuSanaTransformer2DModel.from_pretrained(
-    "mit-han-lab/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors", pag_layers=8
+    "nunchaku-tech/nunchaku-sana/svdq-int4_r32-sana1.6b.safetensors", pag_layers=8
 )
 pipe = SanaPAGPipeline.from_pretrained(
     "Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers",
diff --git a/nunchaku/caching/diffusers_adapters/flux.py b/nunchaku/caching/diffusers_adapters/flux.py
diff --git a/nunchaku/caching/utils.py b/nunchaku/caching/utils.py
diff --git a/nunchaku/csrc/flux.h b/nunchaku/csrc/flux.h
diff --git a/nunchaku/csrc/pybind.cpp b/nunchaku/csrc/pybind.cpp
diff --git a/nunchaku/models/ip_adapter/__init__.py b/nunchaku/models/ip_adapter/__init__.py
diff --git a/nunchaku/models/ip_adapter/diffusers_adapters/__init__.py b/nunchaku/models/ip_adapter/diffusers_adapters/__init__.py
diff --git a/nunchaku/models/ip_adapter/diffusers_adapters/flux.py b/nunchaku/models/ip_adapter/diffusers_adapters/flux.py
diff --git a/nunchaku/models/ip_adapter/utils.py b/nunchaku/models/ip_adapter/utils.py
diff --git a/requirements.txt b/requirements.txt
diff --git a/src/FluxModel.cpp b/src/FluxModel.cpp
diff --git a/src/FluxModel.h b/src/FluxModel.h
diff --git a/tests/flux/test_flux_dev_IPA.py b/tests/flux/test_flux_dev_IPA.py
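
Note for downstream scripts: besides adding IP-Adapter support, this commit moves the quantized-transformer paths in `examples/` from the `mit-han-lab` organization to `nunchaku-tech`, while the T5 encoder path in `examples/flux.1-dev-qencoder.py` stays on `mit-han-lab`. A minimal sketch of how a user script might apply the same rename is shown below. `migrate_model_path` and `MIGRATED_REPOS` are hypothetical names, not part of nunchaku, and the repository list is only what is visible in this diff, so it may be incomplete.

```python
# Hypothetical helper (not part of nunchaku): swap the Hugging Face org in
# "<org>/<repo>/<file>" model paths, mirroring the rename made in examples/.
OLD_ORG = "mit-han-lab"
NEW_ORG = "nunchaku-tech"

# Assumption: only the repositories whose paths change in this diff are migrated.
MIGRATED_REPOS = frozenset(
    {
        "nunchaku-flux.1-dev",
        "nunchaku-flux.1-schnell",
        "nunchaku-flux.1-canny-dev",
        "nunchaku-flux.1-depth-dev",
        "nunchaku-flux.1-fill-dev",
        "nunchaku-flux.1-kontext-dev",
        "nunchaku-sana",
    }
)


def migrate_model_path(path: str) -> str:
    """Return `path` with the org swapped if it names a migrated repository."""
    parts = path.split("/", 2)
    if len(parts) != 3:
        return path  # not in "<org>/<repo>/<file>" form; leave untouched
    org, repo, filename = parts
    if org == OLD_ORG and repo in MIGRATED_REPOS:
        return f"{NEW_ORG}/{repo}/{filename}"
    return path  # e.g. the nunchaku-t5 path, which this diff does not change


if __name__ == "__main__":
    print(migrate_model_path("mit-han-lab/nunchaku-flux.1-dev/svdq-int4_r32-flux.1-dev.safetensors"))
    print(migrate_model_path("mit-han-lab/nunchaku-t5/awq-int4-flux.1-t5xxl.safetensors"))
```

The explicit allow-list is deliberate: a blanket prefix replacement would also rewrite the T5 encoder path, which the diff leaves as is.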
