I was interested in using your published model weights for image generation, so I made an attempt to integrate the model with InvokeAI.
My process went something like this:
- extracted the TextTransformerRoPE class from transformer_rope.py in your repo into my own transformer_rope.py
- followed the example set by your eval_tulip for filtering the checkpoint's state_dict down to that model's keys
- saved that filtered state_dict to https://huggingface.co/keturn/TULIP/blob/main/model.safetensors
- tokenized some text with the standard CLIP-L tokenizer
- ran those tokens through TextTransformerRoPE
- passed the resulting embeddings to a Stable Diffusion (1.x) model
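For reference, the checkpoint-filtering step looked roughly like the sketch below. The `"text."` prefix and the toy keys are assumptions for illustration, not the actual TULIP key names; the real prefix should match whatever eval_tulip uses.

```python
# Hedged sketch of filtering a combined checkpoint down to the text tower.
# The "text." prefix is an assumption -- substitute the prefix the TULIP
# checkpoint actually uses for its text-encoder keys.

def filter_state_dict(state_dict, prefix="text."):
    """Keep only keys under `prefix`, stripping the prefix so the result
    loads directly into the standalone text model."""
    return {k[len(prefix):]: v
            for k, v in state_dict.items()
            if k.startswith(prefix)}

# Toy stand-in for the real checkpoint's state_dict:
full = {
    "visual.conv1.weight": 1,
    "text.token_embedding.weight": 2,
    "text.ln_final.bias": 3,
}
print(filter_state_dict(full))
# {'token_embedding.weight': 2, 'ln_final.bias': 3}
```

The filtered dict is what I saved out to the safetensors file linked above.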
The result: (image attached)
[Update: I corrected a mistake in how I was handling CFG, and it now makes things that look more like images than noise-mush, but there is still no recognizable connection between the prompt and the content.]
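For anyone following along, the per-step combination I was aiming for is standard classifier-free guidance. This is a scalar sketch of the formula only; the real code applies it elementwise to the conditional and unconditional noise-prediction tensors.

```python
def cfg_combine(uncond, cond, scale=7.5):
    """Standard classifier-free guidance: extrapolate from the
    unconditional prediction toward the conditional one by `scale`."""
    return uncond + scale * (cond - uncond)

# With scale=1.0 this reduces to the conditional prediction alone;
# larger scales push further in the prompt's direction.
print(cfg_combine(0.0, 1.0, 7.5))
# 7.5
```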
Clearly I took a wrong turn somewhere along the way. Do you have any guidance for me?