The on_grad_computed API receives both net and named_parameters, where named_parameters is set by calling list(self.module_.named_parameters()). The callback can already reach the same parameters through net.module_.named_parameters().
Calling list on the named parameters adds some overhead to every training loop:
```python
import torchvision

m = torchvision.models.densenet201()

%timeit b = list(m.named_parameters())
# 2.47 ms ± 33.1 µs per loop
```
This overhead is incurred on every batch, so an epoch with 500 batches picks up roughly 1.25 extra seconds.
I propose simplifying on_grad_computed by removing named_parameters. This is a fairly small change, but it does break backwards compatibility.
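For illustration, a callback under the proposed signature could read gradients directly from the module. This is only a sketch under the assumption that named_parameters is dropped from the call; the GradNormLogger class name is made up for the example:

```python
from skorch.callbacks import Callback

class GradNormLogger(Callback):
    """Hypothetical callback that inspects gradients without the
    named_parameters argument."""

    def on_grad_computed(self, net, **kwargs):
        # Iterate the module's parameters lazily; nothing needs to be
        # materialized into a list up front.
        total_sq = 0.0
        for name, p in net.module_.named_parameters():
            if p.grad is not None:
                total_sq += p.grad.norm().item() ** 2
        print(f"grad norm: {total_sq ** 0.5:.4f}")
```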
The reason we do it this way is that the receiver does not know how many modules there are, whether the training loop was overridden, and whether there is more than one gradient update per loop for different parameter sets. For example, there might be a generator/discriminator cycle where you have two gradient computation steps with different sets of parameters.
Another approach would be to have a lazy generator that yields the named parameters to everyone who asks, something like this:
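A minimal sketch of that idea, assuming the training loop passes a zero-argument callable (the lazy_named_parameters helper and the notify call shown in the comment are illustrative, not existing skorch API):

```python
def lazy_named_parameters(module):
    """Hypothetical helper: return a callable that yields fresh
    (name, parameter) pairs each time it is called, so nothing is
    materialized unless a callback actually iterates."""
    def generator():
        yield from module.named_parameters()
    return generator

# Sketch of how the training loop could pass it along (illustrative only):
# self.notify('on_grad_computed',
#             named_parameters=lazy_named_parameters(self.module_))
#
# A callback that cares would then iterate it explicitly:
# for name, p in named_parameters():
#     ...
```

Callbacks that ignore the argument would then pay essentially no cost, while callbacks that need the parameters can still get them by calling the generator.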