Generalize T5 modules #5166

Merged: 22 commits merged into main from t5-generalize on Jun 2, 2021

Conversation

@AkshitaB (Contributor) commented Apr 29, 2021

Changes proposed in this pull request:

  • Adding AttentionModule (naming it so instead of just Attention to differentiate it from allennlp.modules.attention.Attention)
  • SelfAttention and T5Attention inherit from this (see the sketch below).

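For context, here is a minimal, hypothetical sketch of the hierarchy this describes: a shared AttentionModule base class that SelfAttention and T5Attention specialize. The constructor arguments, the `scaled` flag, and the simplified relative-position bucketing are illustrative assumptions, not the actual AllenNLP API.

```python
import torch
import torch.nn as nn


class AttentionModule(nn.Module):
    """Generic multi-head attention; subclasses tweak scoring and bias behavior."""

    def __init__(self, hidden_size: int = 512, num_heads: int = 8, scaled: bool = True):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.scaled = scaled
        self.query = nn.Linear(hidden_size, hidden_size, bias=False)
        self.key = nn.Linear(hidden_size, hidden_size, bias=False)
        self.value = nn.Linear(hidden_size, hidden_size, bias=False)
        self.output = nn.Linear(hidden_size, hidden_size, bias=False)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        batch, seq, _ = x.shape
        return x.view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)

    def position_bias(self, scores: torch.Tensor) -> torch.Tensor:
        # The base class adds no positional bias; T5Attention overrides this.
        return torch.zeros_like(scores)

    def forward(self, query, key=None, value=None):
        key = query if key is None else key
        value = key if value is None else value
        q = self._split_heads(self.query(query))
        k = self._split_heads(self.key(key))
        v = self._split_heads(self.value(value))
        scores = torch.matmul(q, k.transpose(-1, -2))
        if self.scaled:
            scores = scores / (self.head_dim ** 0.5)
        weights = (scores + self.position_bias(scores)).softmax(dim=-1)
        context = torch.matmul(weights, v).transpose(1, 2)
        batch, seq, _, _ = context.shape
        return self.output(context.reshape(batch, seq, -1))


class SelfAttention(AttentionModule):
    """BERT-style self-attention: scaled dot product, no relative position bias."""

    def __init__(self, hidden_size: int = 512, num_heads: int = 8):
        super().__init__(hidden_size, num_heads, scaled=True)


class T5Attention(AttentionModule):
    """T5-style attention: unscaled scores plus a learned relative position bias."""

    def __init__(self, hidden_size: int = 512, num_heads: int = 8, num_buckets: int = 32):
        super().__init__(hidden_size, num_heads, scaled=False)
        self.num_buckets = num_buckets
        self.relative_bias = nn.Embedding(num_buckets, num_heads)

    def position_bias(self, scores: torch.Tensor) -> torch.Tensor:
        # Crude stand-in for T5's log-spaced bucketing: clip the relative distance.
        seq_q, seq_k = scores.shape[-2], scores.shape[-1]
        rel = (torch.arange(seq_k, device=scores.device)[None, :]
               - torch.arange(seq_q, device=scores.device)[:, None])
        buckets = rel.clamp(-self.num_buckets // 2, self.num_buckets // 2 - 1) + self.num_buckets // 2
        bias = self.relative_bias(buckets)           # (seq_q, seq_k, heads)
        return bias.permute(2, 0, 1).unsqueeze(0)    # (1, heads, seq_q, seq_k)
```

Usage is the same for both subclasses, e.g. `T5Attention(hidden_size=64, num_heads=4)(torch.randn(2, 10, 64))` returns a `(2, 10, 64)` tensor.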
@AkshitaB marked this pull request as draft April 29, 2021 08:10
@AkshitaB marked this pull request as ready for review May 6, 2021 16:46
@AkshitaB requested review from dirkgr and epwalsh May 6, 2021 16:46
@epwalsh (Member) left a comment

Just a couple questions/comments about organization so far:

  • Could "general attention" just be called "attention"?
  • Could we put the T5Attention in the t5 module?
  • Can we get rid of T5AttentionOld?

@AkshitaB (Contributor, Author) commented May 6, 2021

Just a couple questions/comments about organization so far:

  • Could "general attention" just be called "attention"?
  • Could we put the T5Attention in the t5 module?
  • Can we get rid of T5AttentionOld?

Yes to all three. I'm still debugging an issue with loading pretrained weights and will clean it up in the next commit. This draft was meant to get feedback on the GeneralAttention module itself: does it make sense? Is the logical flow clear enough in terms of readability?

@AkshitaB requested a review from epwalsh May 25, 2021 19:45
@epwalsh (Member) left a comment

I don't have much to say in addition to Dirk's comments. As long as tests pass this LGTM!

@AkshitaB requested a review from epwalsh June 2, 2021 00:17
@epwalsh (Member) left a comment

Just a couple of very small suggestions/questions.

We can probably delete allennlp.modules.transformer.self_attention, right?

@epwalsh (Member) left a comment

💯

@AkshitaB merged commit b0aa1d4 into main Jun 2, 2021
@AkshitaB deleted the t5-generalize branch June 2, 2021 21:24
Abhishek-P pushed a commit to Abhishek-P/allennlp that referenced this pull request Aug 11, 2021
* initial commit

* general self attn

* fixing bugs, adding tests, adding docs

* updating other modules

* refactor

* bug fix

* update changelog

* fix shape

* fix format

* address feedback

* small doc fix

* Update allennlp/modules/transformer/transformer_stack.py

Co-authored-by: Pete <[email protected]>

* remove old file

Co-authored-by: epwalsh <[email protected]>
Co-authored-by: Pete <[email protected]>