System Info

peft 0.13.2
transformers 4.46.3

Who can help?

No response

Information

Tasks

An officially supported task in the examples folder

Reproduction

Here is my code:

When I use the P-Tuning model for inference, I need to run it like this: load the base model first, then use PeftModel.from_pretrained(base_model, CONFIG['output_dir']) to load the P-Tuning output. Is there a way to merge the P-Tuning output to generate a new model?
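A minimal sketch of that loading flow (the base checkpoint name, prompt text, and output directory below are placeholders, not values from the original script):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder values: substitute your actual base checkpoint and P-Tuning output dir.
CONFIG = {"base_model": "Qwen/Qwen2.5-7B-Instruct", "output_dir": "./ptuning_output"}

tokenizer = AutoTokenizer.from_pretrained(CONFIG["base_model"])

# Load the frozen base model first ...
base_model = AutoModelForCausalLM.from_pretrained(CONFIG["base_model"], torch_dtype="auto")

# ... then attach the P-Tuning output on top of it.
model = PeftModel.from_pretrained(base_model, CONFIG["output_dir"])
model.eval()

# Inference has to go through the PeftModel wrapper, which prepends the learned
# virtual tokens to the embedded input on every forward pass.
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```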
Expected behavior

Merge the P-Tuning output into the base model, like a LoRA adapter.
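For reference, this is the kind of merge that exists for LoRA, whose adapter weights can be folded into the base model (paths are placeholders); as the reply below explains, P-Tuning has no equivalent:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# LoRA adapters are weight deltas, so they can be folded into the base weights
# and saved as a standalone checkpoint. Paths are placeholders.
base_model = AutoModelForCausalLM.from_pretrained("path/to/base_model")
lora_model = PeftModel.from_pretrained(base_model, "path/to/lora_output")

merged = lora_model.merge_and_unload()  # folds the low-rank updates into the linear layers
merged.save_pretrained("path/to/merged_model")
```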
As far as I know there is no way to merge P-Tuning virtual tokens with the model and I don't see how this should work either. The nature of this method is that new virtual tokens are prepended to the embedded input, i.e. we're dealing with activations and not weights anymore.
What kind of problem are you trying to solve with merging?
From a quick search it seems that vLLM does not support soft prompting directly.
However I found that it supports passing embeddings when generating. In theory you could use that to pass the learned embeddings to the vLLM instance for inference.
A quick sketch based on your initial code (no guarantees):
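One possible shape for such a sketch: extract the trained virtual-token embeddings with PEFT's get_prompt, prepend them to the embedded prompt, and generate from inputs_embeds. The model name and paths are placeholders, and the vLLM hand-off is only indicated in a comment, since the exact prompt-embeddings input format depends on the vLLM version.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholders for the base checkpoint and the P-Tuning output (CONFIG['output_dir']).
BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"
PTUNING_DIR = "./ptuning_output"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
peft_model = PeftModel.from_pretrained(base_model, PTUNING_DIR)
peft_model.eval()

input_ids = tokenizer("Hello, how are you?", return_tensors="pt").input_ids

with torch.no_grad():
    # 1. The trained virtual-token embeddings, shape (1, num_virtual_tokens, hidden_size).
    virtual_embeds = peft_model.get_prompt(batch_size=1)

    # 2. Embed the real prompt tokens with the base model's input embedding layer.
    token_embeds = base_model.get_input_embeddings()(input_ids)

    # 3. Prepend the virtual tokens -- this is what the PeftModel does internally.
    inputs_embeds = torch.cat([virtual_embeds.to(token_embeds.dtype), token_embeds], dim=1)
    attention_mask = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)

    # Sanity check with transformers: generate directly from the combined embeddings.
    output_ids = base_model.generate(
        inputs_embeds=inputs_embeds,
        attention_mask=attention_mask,
        max_new_tokens=50,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# In principle this combined `inputs_embeds` tensor is what you would pass to a
# vLLM build that accepts prompt embeddings at generation time; check the exact
# input format supported by your vLLM version before relying on this.
```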