Support for SDXL refiner #227

ZachNagengast · 2023-08-03T23:44:06Z

This PR provides updates for refiner conversion to the torch2coreml script and inference utilities to the StableDiffusionXLPipeline.

How to test

Conversion (standalone):
python -m python_coreml_stable_diffusion.torch2coreml --convert-text-encoder --convert-vae-decoder --convert-vae-encoder --convert-unet --model-version stabilityai/stable-diffusion-xl-refiner-1.0 --compute-unit ALL --attention-implementation ORIGINAL --bundle-resources-for-swift-cli -o <your_output_path> --xl-version

Conversion (base + refiner):
python -m python_coreml_stable_diffusion.torch2coreml --convert-text-encoder --convert-vae-decoder --convert-vae-encoder --convert-unet --model-version stabilityai/stable-diffusion-xl-base-1.0 --refiner-version stabilityai/stable-diffusion-xl-refiner-1.0 --compute-unit ALL --attention-implementation ORIGINAL --bundle-resources-for-swift-cli -o <your_output_path> --xl-version

CLI Inference:
swift run StableDiffusionSample "a photo of an astronaut riding a horse on mars" --resource-path <your_output_path>/Resources --step-count 50 --save-every 5 --strength 0.5 --output-path <your_output_path> --image <your_input_image_path>.png --xl

Summary of the changes

Conversion script

Update to the latest DiffusionPipeline as it now automatically sets up the pipeline based on model name, so it can share all the same logic with SDXL and SD 1/2 pipelines
SDXL refiner does not have a text_encoder, only text_encoder_2, so this handles either of them being missing
VAE encoder uses fp32 precision for SDXL or else it overflows
Updated the vae encoder input variable name from "z" to "x" to match huggingface/diffusers library
Changed unet time_ids input shape from [12] to [2, {5,6}] to better match huggingface/diffusers
Unet inputs now include time_ids that can either be [2,6] or [2,5] shape to support the refiner's aesthetic_score conditioning

Part of SDXL's micro-conditioning as explained in section 2.2 of
https://huggingface.co/papers/2307.01952. Can be used to
simulate an aesthetic score of the generated image by influencing the position and negative text conditions.

Inference

The existing function plannerRGBShapedArray() was having major issues with 1024-sized images, so I had to add new logic for converting cgimages to a normalized MLShapedArray with the new toRGBShapedArray()
Added configs for all time_ids parameters, although it assumes a square image right now. Will need to be updated to support aspect ratios.
Since there are models out there that still have the time_ids with shape [12], I have both this and the new shapes handled.
Refiner only uses 1 text encoder with shape [1, 77, 1280], so I updated encodePrompt() to recognize this.

Now for the catch:

I noticed that trying to run inference on the CLI wasn't working quite right, and I figured out that it needed the Unet to be float32 precision to work. I'm not sure why this happens, and the refiner works perfectly fine when running through the Diffusers app.

So the question is for this PR - would you prefer to default to float32 for the refiner (adds 4.5gb to the Unet), or disable the CLI for refiner unless the unet is in float32 precision, and give the option as a conversion argument? Or some other path? Open to any ideas of how to best go about it.

Sample outputs:

Base here done with 50 steps, Refiner done with 0.5 strength (25 steps)

Base	Refined

Thank you for your interest in contributing to Core ML Stable Diffusion! Please review CONTRIBUTING.md first. If you would like to proceed with making a pull request, please indicate your agreement to the terms outlined in CONTRIBUTING.md by checking the box below. If not, please go ahead and fork this repo and make your updates.

We appreciate your interest in the project!

Do not erase the below when submitting your pull request:
#########

I agree to the terms outlined in CONTRIBUTING.md

ZachNagengast · 2023-08-04T17:19:24Z

More info on the CLI issue:

Here's a visualization of what the unet is outputting on float16 vs float32 prevision via the CLI swift command. This is a greyscale visualization of the latent channels for comparison.

The expected output should be some prediction of noise in the image (in latent space), but the fp16 output has a large number of repeated values, which of course breaks the diffusion process.

itsjoshpark · 2023-08-09T13:55:21Z

@atiorh this PR is ready for review :)

atiorh · 2023-08-13T05:59:27Z

Thanks for the PR @ZachNagengast! I am traveling this week but will aim to review it asap.

atiorh

Did a cursory pass to kick off some discussions. Thanks for the great work @ZachNagengast!

One more high-level note: Looks like the current behavior is to detect whether the given XL Unet model is a refiner or a base one and run the pipeline accordingly. Could we also handle the cascaded base+refiner workflow similar to the canonical configuration on diffusers? What do you think? We could enforce a reduceMemory-like policy for the base2refiner transition to avoid 2x memory.

atiorh · 2023-08-15T20:20:44Z

swift/StableDiffusion/pipeline/CGImage+vImage.swift

+                    let i = 4 * (y * width + x)
+                    if ptr[i+3] == 0 {
+                        // Alpha mask for controlnets
+                        redChannel[y * width + x] = alphaMaskValue


@vzsg had specified a similar convention in #209 for ControlNet inpainting. Tagging for visibility to ensure we are not moving in a different direction with this potential switch from plannerRGBShapedArray to toRGBShapedArray.

I attempted to replicate it with this new function, and the plannerRGBShapedArray function is still in use for the controlnet conditioning. Would be great to get another eye on it if it's working as intended.

atiorh · 2023-08-15T20:22:18Z

swift/StableDiffusion/pipeline/Encoder.swift

@@ -50,7 +50,7 @@ public struct Encoder: ResourceManaging {
        scaleFactor: Float32,
        random: inout RandomSource
    ) throws -> MLShapedArray<Float32> {
-        let imageData = try image.plannerRGBShapedArray(minValue: -1.0, maxValue: 1.0)
+        let imageData = try image.toRGBShapedArray(minValue: -1.0, maxValue: 1.0)


Could you please summarize the reasons for moving away from plannerRGBShapedArray?

This function is intended to fix a bug in the plannerRGBShapedArray that wasn't working right with 1024x1024. There may be another approach that fixes whatever the bug was, but I opted to just use the CoreGraphics library for simplicity.

Example of the bug (converted to plannerRGBShapedArray then back using CGImage.fromShapedArray):

Wow, thanks @ZachNagengast! Is this for any 1024x1024 or some pixel value-dependent edge case? I might want to file a ticket for this.

In any case, makes sense to workaround it by using your function at the moment

Yea you can try it out just do

let imageData = try image.plannerRGBShapedArray(minValue: -1.0, maxValue: 1.0) let decodedImage = try CGImage.fromShapedArray(imageData)

and inspect the decoded image, it should look off like this for the SDXL pipeline.

atiorh · 2023-08-15T20:22:55Z

swift/StableDiffusion/pipeline/Encoder.swift

@@ -93,7 +93,7 @@ public struct Encoder: ResourceManaging {

    var inputDescription: MLFeatureDescription {
        try! model.perform { model in
-            model.modelDescription.inputDescriptionsByName["z"]!
+            model.modelDescription.inputDescriptionsByName.first!.value


This preserves backwards compatibility so I am not concerned about the rename and thanks for the semantically correct fix :)

No need to update, just appreciating the change

atiorh · 2023-08-15T20:26:41Z

swift/StableDiffusion/pipeline/StableDiffusionPipeline.Configuration.swift

@@ -44,7 +44,19 @@ public struct PipelineConfiguration: Hashable {
    public var encoderScaleFactor: Float32 = 0.18215
    /// Scale factor to use on the latent before decoding
    public var decoderScaleFactor: Float32 = 0.18215
-
+    /// If `originalSize` is not the same as `targetSize` the image will appear to be down- or upsampled.


I am inclined to recommend a dedicated StableDiffusionXLPipeline.Configuration.swift as quite a few things on the input conditioning and pipeline are different. @ZachNagengast & @pcuenca What do you think?

Yea, I'd agree it probably shouldn't be a requirement on the caller to set the scale factor properly, especially when we already know the pipeline is SDXL and can set the default properly. My question would be whether we'd want a new class for this or just setup the defaults somewhere in the pipeline. Or potentially a broader option of just including the config files with the converted models directly, alongside the merges.txt and vocab.json, and using those for the default config.

potentially a broader option of just including the config files with the converted models directly

That would be nice and we can even make that backward compatible. However, it still wouldn't address the argument set mismatch across XL and non-XL right? IMO, the easiest is to create 2 (XL and non-XL) default configs and then add your proposal from above as an extension.

Yea that makes sense to me, and seems to be the approach Diffusers likes to take. My only nitpick is the repeated code, be it can be consolidated a fair amount by committing to the two pipelines and pulling out shared code as needed.

I agree, we can refactor over time :)

I agree with the approach discussed, separate configs do seem necessary to make usage simpler.

atiorh · 2023-08-15T21:17:43Z

python_coreml_stable_diffusion/torch2coreml.py

@@ -1392,6 +1425,13 @@ def parser_spec():
        "If specified, enable unet to receive additional inputs from controlnet. "
        "Each input added to corresponding resnet output."
        )
+    parser.add_argument(


Can we rename this to --unet-compute-precision-float32? Sidebar: I still don't understand why the float32 constraint would only apply to the CLI and not any other Swift program that calls StableDiffusionXLPipeline 🤔 We should discuss this more 🙏

Yea I'll rename it. The CLI thing is still a mystery to me, I'd love to hear if anyone can replicate it or if it's just my machine. I was just getting noise on the output with using fp16, which went away when in fp32 format.

I am still traveling with bad internet so I am yet to convert and test the refiner model.. (hence the slow review). Will confirm once I do!

ZachNagengast · 2023-08-20T20:46:55Z

@atiorh Regarding your note on the base2refiner transition, here are two approaches that could work well:

Include "UnetRefiner.mlmodelc" as a path to look for when loading an xl model, where the refiner unet is included in the folder with the base unet, and intelligently switch between them depending on whether it exists, and whether we need to use reduced memory, as determined by the configuration. Since the encoders/vaes are shared, it would be efficient to keep them in one folder, rather than duplicated for both base and refiner. It would just require 2 unet passes during conversion if the refiner is also desired. The downside is they would then be tightly coupled as one "model" conceptually.
Allow devs to control and swap out the Unet as needed for the pipeline they're aiming for, outside of the normal loadResources call. This would give developers more control over the pipeline, at the expense of a little more legwork upfront.

Maybe both would be useful down the road, what do you think?

atiorh · 2023-08-20T21:10:51Z

@atiorh Regarding your note on the base2refiner transition, here are two approaches that could work well:

Include "UnetRefiner.mlmodelc" as a path to look for when loading an xl model, where the refiner unet is included in the folder with the base unet, and intelligently switch between them depending on whether it exists, and whether we need to use reduced memory, as determined by the configuration. Since the encoders/vaes are shared, it would be efficient to keep them in one folder, rather than duplicated for both base and refiner. It would just require 2 unet passes during conversion if the refiner is also desired. The downside is they would then be tightly coupled as one "model" conceptually.

Allow devs to control and swap out the Unet as needed for the pipeline they're aiming for, outside of the normal loadResources call. This would give developers more control over the pipeline, at the expense of a little more legwork upfront.

Maybe both would be useful down the road, what do you think?

Totally, this is what I had in mind too:

if refiner (UnetRefiner.mlmodelc) doesn't exist, we run a base-only pipeline
if base doesn't exist, we error out unless a starting image was provided?
if both exist, we run base first and refiner second using their dedicated subconfigs (e.g. --step-count, --strength etc.)

pcuenca · 2023-08-22T16:53:57Z

python_coreml_stable_diffusion/torch2coreml.py

@@ -796,9 +827,15 @@ def convert_unet(pipe, args):
            for k, v in sample_unet_inputs.items()
        }

+        if args.xl_version and pipe.config.requires_aesthetics_score and args.unet_support_cli:


Suggested change

if args.xl_version and pipe.config.requires_aesthetics_score and args.unet_support_cli:

if args.xl_version and pipe.config.requires_aesthetics_score and args.unet_compute_precision_float32:

pcuenca

I've just tested this PR and have some high-level overview for now:

I haven't been able to reproduce the unet inference issues in float16 with the CLI. It works for me using both cpuAndGPU and all compute units (although using all results in very slow compilation times, as expected). It'd be really useful to identify an easy reproducible case. Any ideas about how to proceed @ZachNagengast?
The plannerRGBShapedArray issue is caused by row padding applied by vImage for performance reasons. I have a simple fix that I can submit as a separate PR for discussion. (I could submit a PR to this PR if that's preferred, but I think the problem is general enough to deal with separately).
I didn't see any provisions in the Swift portion of the code to deal with absent textEncoder or textEncoder2 modules, so applying the refiner in text2image mode results in an error unless both encoders exist. This is only a problem because the original refiner checkpoints don't include a textEncoder, so a conversion process followed by inference on the output folder would fail.
There was a minor typo in the name of the arg to control float32 conversion of the UNet (it had only been changed in one of two places); I submitted a small suggestion for it.

Other than that, I agree in providing a dedicated method to run both the base and refiner models one after the other, in addition to using them separately.

I'll dive deeper into the code details later :)

Great work @ZachNagengast, this is a really exciting feature!

- Also skip loading model if check_output_correctness is missing, since the model does not require inferencing at conversion time

ZachNagengast · 2023-08-28T20:48:06Z

@pcuenca Just updated the conversion script to bundle the UnetRefiner.mlmodelc with the base unet if specified via --refiner-version. Take a look and let me know if you have any edits, likewise @atiorh. I was blocked for a while due to the model load stage of the convert() coremltools method causing OOM issues with the refiner (not sure what brought that in because it never had that issue until recently). Resolved that by uncommenting the skip_model_load param and making it based on args.check_output_correctness because we don't need to load the converted model for inference unless specified.

Also saw your update to the plannerRGBShapedArray, nice fix. I'll include that back in with the pipeline updates, which should be coming later today.

Regarding the float precision thing - good to know your computer didn't have a problem. Unless anyone else reports a similar issue, I'll just remove that logic for simplicity.

ZachNagengast · 2023-08-28T22:23:51Z

Here's an example refiner + base conversion script. Thinking behind the new parameter was the potential for future refiners not tightly coupled with the base, so this will let you bundle any pre-trained unet via the --refiner-version param as the refiner.

python -m python_coreml_stable_diffusion.torch2coreml --convert-text-encoder --convert-vae-decoder --convert-vae-encoder --convert-unet --model-version stabilityai/stable-diffusion-xl-base-1.0 --refiner-version stabilityai/stable-diffusion-xl-refiner-1.0 --compute-unit ALL --attention-implementation ORIGINAL --bundle-resources-for-swift-cli -o <output_path> --xl-version

ZachNagengast · 2023-08-29T01:28:20Z

@atiorh @pcuenca Update - just pushed some code to run the UnetRefiner and also handle the potential missing models in the bundle. Haven't included any dedicated subconfigs yet, but this latest update uses strength as the fraction for when the refiner should take over. Worth noting that we're diverging a bit from diffusers because they use two separate pipelines for base+refiner whereas this is currently set up to do it all at once. Will update the CLI next but looking for feedback on what's here so far, also whether you'd prefer to have the dedicated subconfigs in this PR or have that come in separately. Thanks! Heres some outputs from my tests:

Strength 1.0	Strength 0.8	Strength 0.5

0/25 refiner steps	5/25 refiner steps	13/25 refiner steps

atiorh · 2023-08-29T18:55:04Z

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift

    var textEncoder2: TextEncoderXLModel

    /// Model used to predict noise residuals given an input, diffusion time step, and conditional embedding
    var unet: Unet
+
+    /// Model used to refine the image, if present
+    var unetRefiner: Unet?


Shall we make unet also an optional since we were talking about the following scenarios:
1-) Only refiner unet provided, we only refine
2-) Only base unet provided, we only run base gen
3-) Both provided, we run full pipeline (loading/unloading base/refiner during the transition)
4-) Neither provided, we error out

If you think (1-) is not relevant, we can discuss 👍

Good question, 1-) might have a catch in the case that someone wants to use the refiner as a base model (in the case of imageToImage mode). If only a refiner is provided then we can assume it's imageToImage mode and require a startingImage as well. It might be useful to include some flexibility to use either a Unet or UnetRefiner for purely imageToImage applications to save disk space.

This question also goes back to torch2coreml.py I think - in the conversion script we could logically separate refiner unets in a similar way to how the controlnets are separate, and discourage using refiner unets as base models. Currently, refiner unets are considered the same as any other sdxl unet unless specifically set by the new argument. I do think there are cases where you'd want to use a "base" model as a refiner as more fine-tuned models/loras come out, and there are also some cases I've seen people use the refiner as a base model, but probably less common overall.

I see! Let's keep it as is for now then. I will test the workflows as is.

atiorh · 2023-09-14T21:35:08Z

python_coreml_stable_diffusion/torch2coreml.py

+        logger.info(f"Converting refiner")
+        del pipe
+        gc.collect()
+        args.model_version = args.refiner_version


@ZachNagengast Could we restore the original model_version after the refiner conversion is done 2 lines down? The name override breaks some filename assumptions in the coremlcompiler calls

Yep! Just pushed

atiorh · 2023-09-15T16:02:39Z

Update: still testing but things are looking great so far!

@ZachNagengast One request: could you please hardcode cpuAndGPU for the SDXL VAE (with a FIXME note) since we are forcing it to float32 for now? Neural Engine does not support float32 and the VAE ends up on the CPU when cpuAndNeuralEngine is selected

atiorh

Tested base+refiner flow and ran into an issue that I think may require some changes, thanks for the continued iterations @ZachNagengast!

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift

atiorh · 2023-09-17T02:28:17Z

@ZachNagengast Tested out the base+refiner and base only workflows, works great! Final things to put in place before I can approve:

unetRefiner chunking support
Lazily loading unetRefiner just in time for iOS
coremlcompiler calls still skip the refiner compilation because the _get_out_path(args, source_name) uses args.model_version instead of args.refiner_version, it is tricky to handle both model versions throughput the script..

ZachNagengast · 2023-09-19T06:05:19Z

Thanks for testing it out @atiorh, just pushed refiner chunking, and the compilation step should work now too, where it will save the files as {base model version}_refiner_chunk#.mlpackage so that they are closely coupled to the base (similar to controlnet). Lazy loading should be working similar to normal unets, and now added an extra step to make sure it's unloaded afterwards.

pcuenca · 2023-09-19T08:44:30Z

swift/StableDiffusion/pipeline/StableDiffusionPipeline.Configuration.swift

-    public var refinerStart: Float = 0.7
+    public var refinerStart: Float = 0.8


What do you think should be the default here? I see 0.7 here, but 0.8 in the examples here

This was based on my own testing, but according to the paper it was specialized on the first 200 steps, so 20% seems reasonable. I noticed fewer "wrong" artifacts show up with 0.8 which seems preferred as a default, whereas I didn't notice much more "refinement" with a higher percentage of refinement steps.

Awesome, thanks for the clarification!

pcuenca · 2023-09-19T08:47:23Z

swift/StableDiffusion/pipeline/StableDiffusionXL+Resources.swift

-        if FileManager.default.fileExists(atPath: urls.unetRefinerURL.path) {
-            unetRefiner = Unet(modelAt: urls.unetRefinerURL, configuration: config)
+        if FileManager.default.fileExists(atPath: urls.unetRefinerChunk1URL.path) &&
+            FileManager.default.fileExists(atPath: urls.unetRefinerChunk2URL.path) {
+            unetRefiner = Unet(chunksAt: [urls.unetRefinerChunk1URL, urls.unetRefinerChunk2URL],
+                               configuration: config)
        } else {
-            unetRefiner = nil
+            unetRefiner = Unet(modelAt: urls.unetRefinerURL, configuration: config)


Shouldn't unetRefiner be nil when it's not there?

Update: after testing I can confirm that the generation loop fails if there's no refiner in the bundle.

Yes, good catch. I copied this from the controlnet logic but missed the first parts.

pcuenca · 2023-09-19T08:52:46Z

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift

-                self.unet.unloadResources()
+                unet.unloadResources()


This is cool. But if reduceMemory is false, I think both the normal unet and the refiner would be loaded while iterating through the unet. I would suggest we always prewarm the refiner so it's unloaded until needed.

This way the UnetRefiner will always lazy-loaded, which makes sense because we're always unloading the base unet here every time. @atiorh FYI, I'll update the logic so that UnetRefiner will always be unloaded initially on launch, not just for iOS. It was touched on before but there might be a use case for a reduceLatency type of setting if the user wants to keep both in memory at all times, but that can be for later.

pcuenca · 2023-09-19T08:53:45Z

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift

@@ -270,7 +270,8 @@ public struct StableDiffusionXLPipeline: StableDiffusionPipelineProtocol {
        }

        if reduceMemory {
-            unetModel.unloadResources()
+            unetRefiner?.unloadResources()


I'd maybe move this line out of the conditional for the same reason above.

pcuenca · 2023-09-19T09:00:48Z

@ZachNagengast Tested out the base+refiner and base only workflows, works great! Final things to put in place before I can approve:

unetRefiner chunking support

Lazily loading unetRefiner just in time for iOS

coremlcompiler calls still skip the refiner compilation because the _get_out_path(args, source_name) uses args.model_version instead of args.refiner_version, it is tricky to handle both model versions throughput the script..

I think we are also missing the hardcoded cpuAndGPU for the VAE, unless I missed that change.

swift/StableDiffusion/pipeline/StableDiffusionXL+Resources.swift

Co-authored-by: Pedro Cuenca <[email protected]>

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift

… from torch2coreml

Allow a custom VAE to be converted.

atiorh

Looks great, thanks for the perseverance and great work @ZachNagengast! Merging now and we can address minor issues in subsequent PRs

ZachNagengast added 2 commits August 3, 2023 15:56

Initial support for SDXL refiner

4cd95e1

Cleanup

7c03b1d

ZachNagengast marked this pull request as draft August 4, 2023 00:44

Add arg for converting Unet in float32 precision

db154eb

ZachNagengast added 2 commits August 7, 2023 20:41

Merge branch 'main' into sdxl-refiner

1d8bfff

Setup scale factor with pipeline in CLI

d3be8b9

ZachNagengast marked this pull request as ready for review August 8, 2023 03:45

atiorh reviewed Aug 15, 2023

View reviewed changes

Update cli arg and future warning

5f9d50c

Merge branch 'main' into sdxl-refiner

fc204ef

pcuenca reviewed Aug 22, 2023

View reviewed changes

pcuenca reviewed Aug 23, 2023

View reviewed changes

pcuenca mentioned this pull request Aug 23, 2023

CGImage to shaped array fix #239

Merged

1 task

ZachNagengast added 4 commits August 25, 2023 11:38

Bundle refiner unet if specified

f4142cf

Merge branch 'main' into sdxl-refiner

3aa0e23

Update script for bundled refiner

58b4f62

- Also skip loading model if check_output_correctness is missing, since the model does not require inferencing at conversion time

Merge branch 'main' into sdxl-refiner

137d0c8

Flip skip_model_load bool

e28f812

ZachNagengast force-pushed the sdxl-refiner branch from ceaa54e to e28f812 Compare August 28, 2023 21:08

ZachNagengast added 2 commits August 28, 2023 16:16

Cleanup

2543c1f

Support bundled UnetRefiner

b211d2d

atiorh reviewed Aug 29, 2023

View reviewed changes

Revert pipeline interface change, add extra logging on pipe load

352f349

atiorh reviewed Sep 14, 2023

View reviewed changes

Reset model_version after refiner conversion

e5724db

Reset model_version before refiner conversion but after pipe init

90864bc

atiorh suggested changes Sep 16, 2023

View reviewed changes

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift Show resolved Hide resolved

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift Show resolved Hide resolved

ZachNagengast added 2 commits September 18, 2023 22:57

Add refiner chunking

e2e2b16

Ensure unets are unloaded for reduceMemory true

7cb53e8

pcuenca reviewed Sep 19, 2023

View reviewed changes

swift/StableDiffusion/pipeline/StableDiffusionXL+Resources.swift Outdated Show resolved Hide resolved

ZachNagengast and others added 3 commits September 19, 2023 08:03

Handle missing UnetRefiner.mlmodelc on pipeline load

84450eb

Co-authored-by: Pedro Cuenca <[email protected]>

Prewarm refiner on load, unload on complete

1dae882

Force cpu_and_gpu for VAE until it can be fixed

ae516e7

ZachNagengast requested review from atiorh and pcuenca September 19, 2023 18:10

Include output dtype of np.float32 for all conversions

efda893

pcuenca reviewed Sep 25, 2023

View reviewed changes

swift/StableDiffusion/pipeline/StableDiffusionXLPipeline.swift Outdated Show resolved Hide resolved

atiorh mentioned this pull request Sep 25, 2023

Xl python inference #261

Merged

pcuenca and others added 6 commits September 25, 2023 20:13

Allow a custom VAE to be converted.

a5f0280

Revert hardcoded reduceMemory

17dec17

Merge branch 'main' into sdxl-refiner

580bcbd

Fix merge

c69deb7

Default chunking arg for --merge-chunks-in-pipeline-model when called…

68c39b3

… from torch2coreml

Merge pull request #2 from pcuenca/custom-vae-version

afdfbcd

Allow a custom VAE to be converted.

atiorh approved these changes Sep 26, 2023

View reviewed changes

atiorh merged commit 94dfc6b into apple:main Sep 26, 2023

atiorh mentioned this pull request Sep 26, 2023

Unsupported type in MLMultiArray when running with Swift CLI #253

Closed

	if args.xl_version and pipe.config.requires_aesthetics_score and args.unet_support_cli:
	if args.xl_version and pipe.config.requires_aesthetics_score and args.unet_compute_precision_float32:

		public var refinerStart: Float = 0.7
		public var refinerStart: Float = 0.8

Support for SDXL refiner #227

Support for SDXL refiner #227

Uh oh!

Conversation

ZachNagengast commented Aug 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How to test

Summary of the changes

Conversion script

Inference

Now for the catch:

Sample outputs:

Uh oh!

ZachNagengast commented Aug 4, 2023

More info on the CLI issue:

Uh oh!

itsjoshpark commented Aug 9, 2023

Uh oh!

atiorh commented Aug 13, 2023

Uh oh!

atiorh left a comment

Choose a reason for hiding this comment

Uh oh!

atiorh Aug 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

atiorh Aug 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ZachNagengast commented Aug 20, 2023

Uh oh!

atiorh commented Aug 20, 2023

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pcuenca left a comment

Choose a reason for hiding this comment

Uh oh!

ZachNagengast commented Aug 28, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ZachNagengast commented Aug 28, 2023

Uh oh!

ZachNagengast commented Aug 29, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ZachNagengast commented Aug 3, 2023 •

edited

Loading

atiorh Aug 15, 2023 •

edited

Loading

atiorh Aug 15, 2023 •

edited

Loading

ZachNagengast commented Aug 28, 2023 •

edited

Loading

ZachNagengast commented Aug 29, 2023 •

edited

Loading

atiorh Sep 14, 2023 •

edited

Loading

atiorh commented Sep 15, 2023 •

edited

Loading

atiorh commented Sep 17, 2023 •

edited

Loading