I was interested in using your published model weights for image generation, so I made an attempt to integrate the model with InvokeAI.
My process went something like this:
- extracted the TextTransformerRoPE class from transformer_rope.py in your repo into my own transformer_rope.py
- followed the example set by your eval_tulip for filtering the checkpoint's state_dict down to that model's keys
- saved that filtered state_dict to https://huggingface.co/keturn/TULIP/blob/main/model.safetensors
- tokenized some text with the standard CLIP-L tokenizer
- ran those tokens through TextTransformerRoPE
- passed the resulting embeddings to a Stable Diffusion (1.x) model
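For reference, the checkpoint-filtering step looked roughly like the sketch below. The `"text."` prefix and the toy keys are assumptions for illustration, not the actual TULIP key names; the real prefix should match whatever eval_tulip uses.

```python
# Hedged sketch of filtering a combined checkpoint down to the text tower.
# The "text." prefix is an assumption -- substitute the prefix the TULIP
# checkpoint actually uses for its text-encoder keys.

def filter_state_dict(state_dict, prefix="text."):
    """Keep only keys under `prefix`, stripping the prefix so the result
    loads directly into the standalone text model."""
    return {k[len(prefix):]: v
            for k, v in state_dict.items()
            if k.startswith(prefix)}

# Toy stand-in for the real checkpoint's state_dict:
full = {
    "visual.conv1.weight": 1,
    "text.token_embedding.weight": 2,
    "text.ln_final.bias": 3,
}
print(filter_state_dict(full))
# {'token_embedding.weight': 2, 'ln_final.bias': 3}
```

The filtered dict is what I saved out to the safetensors file linked above.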
The result: (image attached)
[Update: I corrected a mistake in how I was handling CFG, and it now makes things that look more like images than noise-mush, but there is still no recognizable connection between the prompt and the content.]
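For anyone following along, the per-step combination I was aiming for is standard classifier-free guidance. This is a scalar sketch of the formula only; the real code applies it elementwise to the conditional and unconditional noise-prediction tensors.

```python
def cfg_combine(uncond, cond, scale=7.5):
    """Standard classifier-free guidance: extrapolate from the
    unconditional prediction toward the conditional one by `scale`."""
    return uncond + scale * (cond - uncond)

# With scale=1.0 this reduces to the conditional prediction alone;
# larger scales push further in the prompt's direction.
print(cfg_combine(0.0, 1.0, 7.5))
# 7.5
```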
Clearly I took a wrong turn somewhere along the way. Do you have any guidance for me?