Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
[MRG] LRScheduler batch option #626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] LRScheduler batch option #626
Changes from 10 commits
8ed7824
bf46f20
00c9527
6c23cd3
f62b4ba
22b1e24
321cfae
ce235dc
e761d07
f47423b
a85be11
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no expert on lr schedulers, so can't comment on the exact logic that should be applied here. However, what I'm seeing is that the logic might have a "hole", where score wouldn't be defined at all if none of the
if
orelif
matches. In the logic before, score would always be defined.Could it be that we don't actually need
elif epoch
? Previously, we accessednet.history[-2, self.monitor]
, i.e. the second to last row in thehistory
. Therefore, we had to make sure that we're not in the first epoch (at least that's my interpretation of things). Now that we're accessingnet.history[-1, self.monitor]
, we can probably never get an index error here, hence don't need the guard. Would that make sense?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Pushed a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would not be backwards compatible with
TorchCyclicLR
with default arguments. As inTorchCyclicLR
would step every epoch if a user does not pass instep_every='batch'
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the implementation is fine and we just need to restore the
if TorchCyclicLR and isinstance(self.lr_scheduler_, TorchCyclicLR):...
segment?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that
step_every
is a parameter toLRScheduler
, if we keep theisinstance(self.lr_scheduler_, TorchCyclicLR)
piece, we will not be strictly following thestep_every
parameter.We can have a
step_every='auto'
option that special cases theCosineAnnealingLR
andTorchCyclicLR
. Specifically, if the class isCosineAnnealingLR
orTorchCyclicLR
we use batch steps, otherwise we use epoch step. We will still allow thebatch
andepoch
options for custom lr schdulers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would mean introducing the entire concept of an 'auto' option ,and having to be backwards compatible for it in the future. However all it'll do is special case cyclic (for the record, cosine annealing isn't hard coded to work with batches, only cyclic).
This basically means that if we don't want 'auto' to just be a cover up for the cyclic case and instead be meaningful, we will have to add reasonable recommendations for every scheduler we introduce (which will be the default auto options). That will most likely involve having to read the paper from which it was introduced, or it's documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a conundrum. Do you know why some schedulers prefer batch and other epoch? Is it some derived property or more arbitrary?
This is certainly not the way to go. Could we not just use whatever the default is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for certain. Naturally if you wanted your lr to change smoothly you'd do it per batch rather than epoch. I'm just plugging in a scheduler as is written in its documentation.
A scheduler doesn't have a default. It has a step function which the user can use whenever and wherever they'd like. Setting mode
epoch
orbatch
as defaults will break backwards compatibility for anything that isn't intended to use the default mode we choose.Since the Cyclic scheduler is a special case for which skorch hacked a solution, I suggested leaving that solution as is (so that it won't conflict with this feature or have compatibility issues) until skorch makes an update that breaks compatibility, and just get rid of that hack then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points. Maybe it would be worth it to actually break backwards compatibility here for the sake of getting a clean solution. In fact, we could even detect when such a breakage happens and issue a warning to the user, with instructions how to revert to the old behavior. Would that be a compromise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be happy with a deprecation warning with a suggestion of using
step_every='batch'
in the future.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.