Error Using LLama-2 with Fine-Tuned LoRA Adapters: Tensor Size Mismatch in apply_rotary_pos_emb Function #147
Comments
Hi @montygole, did you manage to make it work? I am also trying to use the transformers-interpret library with a fine-tuned Llama-2 model for sequence classification.
Hi @nicolas-richet. No, I didn't get this library to work. Instead I used captum. This is the code I used for layer integrated gradients:
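The snippet itself didn't survive the copy of this thread; below is a minimal sketch of layer integrated gradients with Captum for a sequence-classification model. The checkpoint path, target class index, and baseline choice are placeholders/assumptions, not the original code.

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "path/to/fine-tuned-llama2-seq-cls"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
model.eval()

def forward_func(input_ids, attention_mask):
    # Return the logits so Captum can attribute against the chosen target class index.
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

text = "Example input to explain."
enc = tokenizer(text, return_tensors="pt")
input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

# Baseline: the same sequence with every token replaced by the pad (or eos) token id.
fill_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id
baseline_ids = torch.full_like(input_ids, fill_id)

# Attribute the target class logit to the input embedding layer.
lig = LayerIntegratedGradients(forward_func, model.get_input_embeddings())
attributions, delta = lig.attribute(
    inputs=input_ids,
    baselines=baseline_ids,
    additional_forward_args=(attention_mask,),
    target=1,  # index of the class to explain (assumption)
    return_convergence_delta=True,
)

# One attribution score per token: sum over the embedding dimension.
token_scores = attributions.sum(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
print(list(zip(tokens, token_scores.tolist())))
```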
Hi @montygole, thank you for the code! I was able to run it, but it seems that the attribution score of the first 'begin_of_text' token is much higher than the rest. Did you know of/run into this problem? I tried to set the attention mask of the first token to 0, but I'm not sure this is a correct approach.
Hey @nicolas-richet. I actually had the same problem. I experimented with different […]. I think it makes sense that it would attribute said seemingly meaningless tokens and sequences, because they are present in each training sequence and are important to make the model process inputs properly. Anyways, I have since used the […]. Let me know how it goes! Good luck 😃
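One common way to keep special tokens such as begin_of_text from dominating the scores (an assumption here, not necessarily what was used in this thread) is to leave them unchanged in the baseline: integrated gradients scales each position's attribution by (input − baseline), so positions where the two match receive roughly zero attribution. A sketch, reusing `tokenizer` and `input_ids` from the snippet above:

```python
import torch

# Keep BOS/EOS and other special tokens identical to the input; replace only the
# ordinary tokens with the pad/eos id. Matching positions get ~zero IG attribution.
special_ids = torch.tensor(tokenizer.all_special_ids)
is_special = torch.isin(input_ids, special_ids)
fill_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id
baseline_ids = torch.where(is_special, input_ids, torch.full_like(input_ids, fill_id))
```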
I encountered a runtime error while using the transformers-interpret library with a fine-tuned LLama-2 model that includes LoRA adapters for sequence classification. The error occurs when invoking the SequenceClassificationExplainer and seems related to tensor size mismatches during the rotary positional embedding application.
Code sample:
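The original snippet isn't reproduced in this copy of the issue; a minimal sketch of the kind of setup described, assuming the LoRA adapter is loaded with PEFT on top of a Llama-2 sequence-classification head (base checkpoint, adapter path, and label count are placeholders):

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

base_model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base checkpoint
adapter_path = "path/to/lora-adapter"         # placeholder fine-tuned LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name, num_labels=2
)
model = PeftModel.from_pretrained(base_model, adapter_path)
model.eval()

# The explainer runs layer integrated gradients internally; the RuntimeError from
# apply_rotary_pos_emb (tensor size mismatch) is raised during this call.
explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = explainer("Example text to classify and explain.")
```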
Additional Context:
The error seems to occur in the apply_rotary_pos_emb function, indicating a tensor size mismatch. This might be due to the integration of LoRA adapters with the LLama-2 model. Any help resolving this issue, or guidance on making the LoRA-adapted model compatible with the explainer, would be greatly appreciated.