Hello,
I was trying to fine-tune this model on the FlickrFA dataset, but the projection heads start from random weights, which drastically reduces the model's performance (I used CLIPTrainer for my training process).
I also have a problem loading the model after pushing it to the Hub.
I was thinking of fine-tuning the projection heads first and then, after a few epochs, adding LoRA adapters to the last two layers of the text and image encoders.
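Roughly, the schedule I have in mind is the following (just a sketch: the projection-head attribute names and LoRA target-module names are assumptions about the wrapper, not the repo's actual API):

from peft import LoraConfig, get_peft_model

# Stage 1: freeze everything, then unfreeze only the projection heads.
for p in clip_model.parameters():
    p.requires_grad = False
for head in (clip_model.visual_projection, clip_model.text_projection):  # assumed attribute names
    for p in head.parameters():
        p.requires_grad = True
# ... train the heads alone for a few epochs ...

# Stage 2: attach LoRA adapters to the attention projections of the
# last two encoder layers and continue training.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed module names
    layers_to_transform=[10, 11],         # last two layers of a 12-layer encoder
)
clip_model = get_peft_model(clip_model, lora_config)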
I also didn't quite understand the purpose of the clip_wrapper in your training notebook. Doesn't it cause issues by setting hidden_size to 1? Wouldn't that mean the loss during training is computed on projection heads of size 1, which seems meaningless?
Could you please clarify this for me and provide guidance on the best way to fine-tune your model on FlickrFA while ensuring the projection heads start with meaningful weights and maintaining good performance?
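For example, I expected to be able to seed the heads from a pretrained CLIP checkpoint instead of random init, something like this (a sketch; it assumes the wrapper's heads have the same shapes as CLIP's 512-dim projections):

from transformers import CLIPModel

# Hypothetical seeding: copy projection weights from the original CLIP
# so the heads don't start from random values. This only works if the
# wrapper's head shapes match CLIP's projections.
pretrained = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_model.visual_projection.load_state_dict(pretrained.visual_projection.state_dict())
clip_model.text_projection.load_state_dict(pretrained.text_projection.state_dict())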
This is my code for loading and fine-tuning the model:
from transformers import EarlyStoppingCallback

# CLIPTrainer comes from the repo's training code.
trainer = CLIPTrainer(
    model=clip_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer,
    # stop early if the eval metric stops improving for 3 evaluations
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
trainer.save_model(args.output_dir)
and this is the code for loading it:
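from transformers import CLIPVisionModel

# Reload just the vision tower from my pushed checkpoint; this call
# produces the warning below.
vision_encoder = CLIPVisionModel.from_pretrained("Setayeshk/Clipfa_finetune")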
Some weights of CLIPVisionModel were not initialized from the model checkpoint at Setayeshk/Clipfa_finetune and are newly initialized: ['vision_model.embeddings.class_embedding', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.0.layer_norm2.bias', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.0.self_attn.out_proj.weight', 'vision_model.encoder ...