Conversation
allennlp/commands/train.py
Outdated
logger.info("Following parameters are Frozen (without gradient):")
logger.info("FROZEN PARAMETERS (WITHOUT GRADIENT):")
No shouting in the logs please 😄
Ohh cool, will revert it then. I thought I was supposed to capitalize the headings, similar to "CURRENTLY DEFINED PARAMETERS".
allennlp/commands/fine_tune.py
Outdated
for name in nograd_parameter_names:
    logger.info(name)
logger.info("Following parameters are Tunable (with gradient):")
Perhaps consider adding a small test for this - given the same code is in both places, consider pulling it out into a function, which would make testing it easier.
Sure.
Do you mean I should add something like def get_frozen_and_tunable_parameter_names in a shared place like allennlp/common/util.py and use it in both train.py and fine_tune.py? Something like the following?
# allennlp/common/util.py
from typing import List

import torch

def get_frozen_and_tunable_parameter_names(model: torch.nn.Module) -> List:
    frozen_parameter_names = []
    tunable_parameter_names = []
    for name, parameter in model.named_parameters():
        if not parameter.requires_grad:
            frozen_parameter_names.append(name)
        else:
            tunable_parameter_names.append(name)
    return [frozen_parameter_names, tunable_parameter_names]
and the test could be something like:
# assumes `from collections import OrderedDict` and `from allennlp.common import util`
def test_get_frozen_and_tunable_parameter_names(self):
    model = torch.nn.Sequential(OrderedDict([
            ('conv', torch.nn.Conv1d(5, 5, 5)),
            ('linear', torch.nn.Linear(5, 10)),
            ]))
    named_parameters = dict(model.named_parameters())
    named_parameters['linear.weight'].requires_grad_(False)
    named_parameters['linear.bias'].requires_grad_(False)
    frozen_parameter_names, tunable_parameter_names = \
            util.get_frozen_and_tunable_parameter_names(model)
    assert set(frozen_parameter_names) == {'linear.weight', 'linear.bias'}
    assert set(tunable_parameter_names) == {'conv.weight', 'conv.bias'}
and train.py and fine_tune.py would be:
frozen_parameter_names, tunable_parameter_names = \
        util.get_frozen_and_tunable_parameter_names(model)
logger.info("Following parameters are Frozen (without gradient):")
for name in frozen_parameter_names:
    logger.info(name)
logger.info("Following parameters are Tunable (with gradient):")
for name in tunable_parameter_names:
    logger.info(name)
Seems good?
Perfect 👍
- Factor out the code shared between fine_tune and train into a function.
- Add a test for this function.
- Remove capitalization from the frozen/tunable parameter logs.
@DeNeutoy I was busy with something else for the last 2 days ... This should be good now.
Fix issue in no-grad parameters logging as mentioned by @rulai-huajunzeng in #1427. If parameters were already set to requires_grad=False not through the no-grad regex but by other means, they were logged as Tunable instead of Frozen. This PR fixes that. I have made the headings capitalized; they are more distinguishable this way amidst a long list of parameters.
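For readers following along, here is a minimal, hypothetical sketch of the behaviour described above (the function names buggy_partition and fixed_partition are illustrative only, not the actual code in train.py or fine_tune.py): the old logging partitioned parameter names by whether the no-grad regex matched, while the fix partitions by requires_grad itself, so parameters frozen by any means are reported as frozen.

import re
import torch

def buggy_partition(model: torch.nn.Module, nograd_regex: str):
    # Old behaviour (sketch): partition purely by the no-grad regex, so a
    # parameter already frozen elsewhere (requires_grad=False) that does not
    # match the regex ends up in the "tunable" list.
    frozen, tunable = [], []
    for name, _ in model.named_parameters():
        (frozen if re.search(nograd_regex, name) else tunable).append(name)
    return frozen, tunable

def fixed_partition(model: torch.nn.Module):
    # Fixed behaviour (sketch): partition by requires_grad itself, regardless
    # of how the parameter came to be frozen.
    frozen, tunable = [], []
    for name, parameter in model.named_parameters():
        (tunable if parameter.requires_grad else frozen).append(name)
    return frozen, tunable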