
[Gradient checkpointing] Correctly disable find_unused_parameters in Trainer when gradient checkpointing is enabled #13961


Conversation

@patrickvonplaten (Contributor) commented on Oct 11, 2021

What does this PR do?

Following #13657, this PR makes sure that the Trainer uses the new gradient_checkpointing logic to disable the find_unused_parameters argument in DDP.
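For context, the relevant decision looks roughly like this when the Trainer wraps the model in DDP (a minimal sketch of the intended logic, not the exact Trainer code; `args` and `model` stand in for the Trainer's TrainingArguments and model):

```python
import torch.nn as nn

# Sketch: decide find_unused_parameters before wrapping the model in DDP.
# When gradient checkpointing is active it must be False, otherwise DDP
# marks recomputed parameters ready twice and raises
# "Expected to mark a variable ready only once".
if args.ddp_find_unused_parameters is not None:
    # an explicit user override always wins
    find_unused_parameters = args.ddp_find_unused_parameters
else:
    # new API from #13657: query the model property instead of the old
    # config-based gradient checkpointing flag
    find_unused_parameters = not model.is_gradient_checkpointing

model = nn.parallel.DistributedDataParallel(
    model,
    device_ids=[args.local_rank],
    output_device=args.local_rank,
    find_unused_parameters=find_unused_parameters,
)
```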

@sgugger - I don't think many people have switched from self.config._gradient_checkpointing to the new API yet, but for those who have, find_unused_parameters would previously not have been set to False, which leads to hard-to-debug problems like this one: https://discuss.pytorch.org/t/finding-the-cause-of-runtimeerror-expected-to-mark-a-variable-ready-only-once/124428/5
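To illustrate the migration in question, here is a minimal sketch of the old config-based flag versus the new API from #13657 (`bert-base-uncased` is just an example checkpoint):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

# old style (pre-#13657): gradient checkpointing driven by a config attribute
# model.config.gradient_checkpointing = True

# new API from #13657: an explicit method, plus a property the Trainer
# can query to decide find_unused_parameters
model.gradient_checkpointing_enable()
assert model.is_gradient_checkpointing
```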

Not sure if this is worth a patch release or not.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sgugger (Collaborator) left a comment


Thanks a lot for fixing! The check in the Trainer was changed to match an intermediate iteration of the gradient checkpointing PR, and I forgot to adapt it to the final one.

@patrickvonplaten merged commit dca6796 into huggingface:master on Oct 11, 2021
@patrickvonplaten deleted the correct_gradient_checkpointing branch on October 11, 2021 at 15:46