
Change: refactor skorch for more consistency when adding custom modules etc. #751

Merged
Changes from all commits (33 commits)
108b434
Don't reinitialize uninitialized net bc set_params
Mar 28, 2021
a880987
Move cb_params update, improve comment
Mar 28, 2021
c6e92a5
Simplify initialize_* methods
Mar 29, 2021
0b19538
Add unit test for set_params on uninitialized net
Mar 29, 2021
d6e1bca
Print only when verbose
Mar 30, 2021
c87c932
Remove init code related to likelihood
Mar 30, 2021
0cda3ef
Further clean up of set_params re-initialization
Mar 30, 2021
cff5edb
Add more tests for re-initialization logic
Mar 30, 2021
6c50ef7
Rework logic of creating custom modules/optimizers
Mar 31, 2021
f02738a
Add battery of tests for custom modules/optimizers
Mar 31, 2021
435fc75
Implement changes to make tests pass
Apr 1, 2021
52bdefa
[WIP] Update CHANGES
Apr 1, 2021
be4c035
[WIP] Document an edge case not covered yet
Apr 1, 2021
f20982d
Remove _PYTORCH_COMPONENTS global
Apr 1, 2021
5480b8f
Update documentation reflecting the changes
Apr 2, 2021
7454e85
All optimizers perform updates automatically
Apr 3, 2021
e1d3c2f
Address reviewer comments
Apr 5, 2021
1bec3d3
Fix corner case with pre-initialized modules
Apr 8, 2021
c6fb0aa
Custom modules are set to train/eval mode
Apr 9, 2021
66b2e80
Update docs about train/eval mode
Apr 9, 2021
1547600
Complete docstrings
Apr 10, 2021
44069bc
Complete entries in CHANGES.md
Apr 10, 2021
f6e8647
Merge branch 'master' into changed/refactor-init-more-consistency-cus…
BenjaminBossan Apr 23, 2021
4ca941d
Reviewer comment: Consider virtual params
Apr 24, 2021
1a33aec
Reviewer comment: Docs: No need to return self
Apr 24, 2021
bb4e573
Reviewer comment: Docs: explain NeuralNet.predict
Apr 24, 2021
d787d4a
Reviewer comment: Docs: When not calling super
Apr 24, 2021
a61a4c7
Reviewer comment: get_all_learnable_params
Apr 24, 2021
64b3380
Reviewer comment: facilitate module initialization
Apr 24, 2021
72d37ee
Address reviewer comments:
May 11, 2021
699d93a
Merge branch 'master' into changed/refactor-init-more-consistency-cus…
May 22, 2021
ffdf9fd
Make modules_, criteria_, optimizers_ private
May 22, 2021
b9d5efa
Remove unnecessary *args, **kwargs in doc example
May 22, 2021
6 changes: 4 additions & 2 deletions CHANGES.md
@@ -9,12 +9,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- Added `load_best` attribute to `Checkpoint` callback to automatically load state of the
best result at the end of training
- Added `load_best` attribute to `Checkpoint` callback to automatically load state of the best result at the end of training
- Added a `get_all_learnable_params` method to retrieve the named parameters of all PyTorch modules defined on the net, including of criteria if applicable

### Changed

- Changed the signature of `validation_step`, `train_step_single`, `train_step`, `evaluation_step`, `on_batch_begin`, and `on_batch_end` such that instead of receiving `X` and `y`, they receive the whole batch; this makes it easier to deal with datasets that don't strictly return an `(X, y)` tuple, which is true for quite a few PyTorch datasets; please refer to the [migration guide](https://skorch.readthedocs.io/en/latest/user/FAQ.html#migration-from-0-9-to-0-10) if you encounter problems
- Checking of arguments to `NeuralNet` is now during `.initialize()`, not during `__init__`, to avoid raising false positives for yet unknown module or optimizer attributes
- Modules, criteria, and optimizers that are added to a net by the user are now first class: skorch takes care of setting train/eval mode, moving to the indicated device, and updating all learnable parameters during training (check the [docs](https://skorch.readthedocs.io/en/latest/user/customization.html#initialization-and-custom-modules) for more details)

### Fixed

199 changes: 181 additions & 18 deletions docs/user/customization.rst
@@ -5,26 +5,26 @@ Customization
Customizing NeuralNet
---------------------

:class:`.NeuralNet` and its subclasses like
:class:`.NeuralNetClassifier` are already very flexible as they are
and should cover many use cases by adjusting the provided
parameters. However, this may not always be sufficient for your use
cases. If you thus find yourself wanting to customize
:class:`.NeuralNet`, please follow these guidelines.
Apart from the :class:`.NeuralNet` base class, we provide
:class:`.NeuralNetClassifier`, :class:`.NeuralNetBinaryClassifier`,
and :class:`.NeuralNetRegressor` for typical classification, binary
classification, and regressions tasks. They should work as drop-in
replacements for sklearn classifiers and regressors.

Initialization
^^^^^^^^^^^^^^
The :class:`.NeuralNet` class is a little less opinionated about the
incoming data, e.g. it does not determine a loss function by default.
Therefore, if you want to write your own subclass for a special use
case, you would typically subclass from :class:`.NeuralNet`. The
:func:`~skorch.net.NeuralNet.predict` method returns the same output
as :func:`~skorch.net.NeuralNet.predict_proba` by default, which is
the module output (or the first module output, in case it returns
multiple values).
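
The "first output" rule can be sketched in plain Python (a simplified illustration of the convention, not skorch's actual implementation):

```python
def extract_prediction(module_output):
    """Return the value used for predictions: if the module returns a
    tuple of several values, only the first element is used."""
    if isinstance(module_output, tuple):
        return module_output[0]
    return module_output
```

For instance, if a module returns ``(logits, hidden_state)``, only ``logits`` would feed into ``predict_proba``.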

The method :func:`~skorch.net.NeuralNet.initialize` is responsible for
initializing all the components needed by the net, e.g. the module and
the optimizer. For this, it calls specific initialization methods,
such as :func:`~skorch.net.NeuralNet.initialize_module` and
:func:`~skorch.net.NeuralNet.initialize_optimizer`. If you'd like to
customize the initialization behavior, you should override the
corresponding methods. Following sklearn conventions, the created
components should be set as an attribute with a trailing underscore as
the name, e.g. ``module_`` for the initialized module. Finally, the
method should return ``self``.
:class:`.NeuralNet` and its subclasses are already very flexible as they are and
should cover many use cases by adjusting the provided parameters or by using
callbacks. However, this may not always be sufficient for your use cases. If you
thus find yourself wanting to customize :class:`.NeuralNet`, please follow the
guidelines in this section.

Methods starting with get_*
^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -38,6 +38,31 @@ quite sure, consult their documentation. In general, these methods
are fairly safe to override as long as you make sure to conform to the
same signature as the original.

A short example should serve to illustrate this.
:func:`~skorch.net.NeuralNet.get_loss` is called when the loss is determined.
Below we show an example of overriding :func:`~skorch.net.NeuralNet.get_loss` to
add L1 regularization to our total loss:

.. code:: python

class RegularizedNet(NeuralNet):
def __init__(self, *args, lambda1=0.01, **kwargs):
super().__init__(*args, **kwargs)
self.lambda1 = lambda1

def get_loss(self, y_pred, y_true, X=None, training=False):
loss = super().get_loss(y_pred, y_true, X=X, training=training)
loss += self.lambda1 * sum([w.abs().sum() for w in self.module_.parameters()])
return loss

.. note:: This example also regularizes the biases, which you typically
don't need to do.

It is often a good idea to call ``super`` of the method you override, to make
sure that everything that needs to happen inside that method does happen. If you
don't, you should make sure to take care of everything that needs to happen by
following the original implementation.

Training and validation
^^^^^^^^^^^^^^^^^^^^^^^

@@ -96,3 +121,141 @@ perform some book keeping, like making sure that callbacks are handled
or writing logs to the ``history``. If you do need to override these,
make sure that you perform the same book keeping as the original
methods.

Initialization and custom modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The method :func:`~skorch.net.NeuralNet.initialize` is responsible for
initializing all the components needed by the net, e.g. the module and
the optimizer. For this, it calls specific initialization methods,
such as :func:`~skorch.net.NeuralNet.initialize_module` and
:func:`~skorch.net.NeuralNet.initialize_optimizer`. If you'd like to
customize the initialization behavior, you should override the
corresponding methods. Following sklearn conventions, the created
components should be set as an attribute with a trailing underscore as
the name, e.g. ``module_`` for the initialized module.

A possible modification you may want to make is to add more modules, criteria,
and optimizers to your net. This is possible in skorch by following the
guidelines below. If you do this, your custom modules and optimizers will be
treated as "first class citizens" in skorch land. This means:

1. The parameters of your custom modules are automatically passed to the
optimizer (but you can modify this behavior).
2. skorch takes care of moving your modules to the correct device.
3. skorch takes care of setting the training/eval mode correctly.
4. When a module needs to be re-initialized because ``set_params`` was called,
all modules and optimizers that may depend on it are also re-initialized.
This is for instance important for the optimizer, which must know about the
parameters of the newly initialized module.
5. You can pass arguments to the custom modules and optimizers using the now
familiar double-underscore notation. E.g., you can initialize your net like
this:

.. code:: python

net = MyNet(
module=MyModule,
module__num_units=100,

othermodule=MyOtherModule,
othermodule__num_units=200,
)
net.fit(X, y)
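
How double-underscore arguments are routed to each component can be approximated in plain Python. This hypothetical helper only illustrates the prefix matching; skorch's real ``get_params_for`` operates on the parameters stored on the net:

```python
def params_for_prefix(prefix, kwargs):
    """Collect keyword arguments belonging to one component, e.g. all
    'othermodule__*' entries, and strip the prefix so they can be
    passed to that component's constructor."""
    key = prefix + '__'
    return {name[len(key):]: value
            for name, value in kwargs.items()
            if name.startswith(key)}
```

With the arguments from the example above, ``params_for_prefix('othermodule', {'module__num_units': 100, 'othermodule__num_units': 200})`` yields ``{'num_units': 200}``.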

A word about the distinction between modules and criteria made by skorch:
Typically, criteria are also just subclasses of PyTorch
:class:`~torch.nn.Module`. As such, skorch moves them to CUDA if that is the
indicated device and will even pass parameters of criteria to the optimizers, if
there are any. This can be useful when e.g. training GANs, where you might
implement the discriminator as the criterion (and the generator as the module).

A difference between modules and criteria is that the outputs of modules are
used for generating the predictions and are thus returned by
:func:`~skorch.net.NeuralNet.predict` etc. In contrast, the output of the
criterion is used for calculating the loss and should therefore be a scalar.

skorch assumes that criteria may depend on the modules. Therefore, if a module
is re-initialized, all criteria are also re-initialized, but not vice-versa. On
top of that, the optimizer is re-initialized when either modules or criteria
are changed.
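
This dependency chain can be summarized with a small hypothetical helper (only an illustration of the rule described above, not part of skorch's API):

```python
def groups_to_reinitialize(changed):
    """Return which component groups are re-initialized when one of
    them changes: the changed group itself plus everything downstream
    of it in the chain module -> criterion -> optimizer."""
    chain = ['module', 'criterion', 'optimizer']
    return chain[chain.index(changed):]
```

So a changed criterion triggers re-initialization of criteria and optimizers, while a changed module triggers re-initialization of all three groups.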

So after all this talk, what are the aforementioned guidelines to add your own
modules, criteria, and optimizers? You have to follow these rules:

1. Initialize them during their respective ``initialize_`` methods, e.g. modules
should be set inside :func:`~skorch.net.NeuralNet.initialize_module`.
2. If they have learnable parameters, they should be instances of
:class:`~torch.nn.Module`. Optimizers should be instances of
:class:`~torch.optim.Optimizer`.
3. Their names should end in an underscore. This is true for all attributes that
are created during ``initialize`` and distinguishes them from arguments
passed to ``__init__``. So a name for a custom module could be ``mymodule_``.
4. Inside the initialization method, use :meth:`.get_params_for` (or,
if dealing with an optimizer, :meth:`.get_params_for_optimizer`) to
retrieve the arguments for the constructor of the instance.

Here is an example of how this could look in practice:

.. code:: python

class MyNet(NeuralNet):
def initialize_module(self):
super().initialize_module()

# add an additional module called 'module2_'
params = self.get_params_for('module2')
self.module2_ = Module2(**params)
return self

def initialize_criterion(self):
super().initialize_criterion()

# add an additional criterion called 'other_criterion_'
params = self.get_params_for('other_criterion')
self.other_criterion_ = nn.BCELoss(**params)
return self

def initialize_optimizer(self):
# first initialize the normal optimizer
named_params = self.module_.named_parameters()
args, kwargs = self.get_params_for_optimizer('optimizer', named_params)
self.optimizer_ = self.optimizer(*args, **kwargs)

# next add another optimizer called 'optimizer2_' that is
# only responsible for training 'module2_'
named_params = self.module2_.named_parameters()
args, kwargs = self.get_params_for_optimizer('optimizer2', named_params)
self.optimizer2_ = torch.optim.SGD(*args, **kwargs)
return self

... # additional changes


net = MyNet(
...,
module2__num_units=123,
other_criterion__reduction='sum',
optimizer2__lr=0.1,
)
net.fit(X, y)

# set_params works
net.set_params(optimizer2__lr=0.05)
net.partial_fit(X, y)

# grid search et al. works
search = GridSearchCV(net, {'module2__num_units': [10, 50, 100]}, ...)
search.fit(X, y)

In this example, a new criterion, a new module, and a new optimizer
were added. Of course, additional changes should be made to the net so
that those new components are actually being used for something, but
this example should illustrate how to start. Since the rules outlined
above are followed, we can use grid search on our custom components.

.. note:: In the example above, the parameters of ``module_`` are trained by
``optimizer_`` and the parameters of ``module2_`` are trained by
``optimizer2_``. To conveniently obtain the parameters of all modules,
call the method :func:`~skorch.net.NeuralNet.get_all_learnable_params`.
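
Conceptually, :func:`~skorch.net.NeuralNet.get_all_learnable_params` chains the named parameters of all modules and criteria. A plain-Python approximation could look like this (a hedged sketch; the detail that a parameter shared between components is yielded only once is our assumption):

```python
from itertools import chain

def all_learnable_params(*named_param_lists):
    """Yield (name, parameter) pairs from several components, skipping
    any parameter object that was already yielded (e.g. weights shared
    between a module and a criterion)."""
    seen = set()
    for name, param in chain.from_iterable(named_param_lists):
        if id(param) not in seen:
            seen.add(id(param))
            yield name, param
```

Such a flat iterable of named parameters is what a single optimizer covering all modules would consume.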
130 changes: 0 additions & 130 deletions docs/user/neuralnet.rst
@@ -514,133 +514,3 @@ Those arguments are used to initialize your ``module``, ``criterion``,
etc. They are not fixed because we cannot know them in advance; in
fact, you can define any parameter for your ``module`` or other
components.

All special prefixes are stored in the ``prefixes_`` class attribute
of :class:`.NeuralNet`. Currently, they are:

- ``module``
- ``iterator_train``
- ``iterator_valid``
- ``optimizer``
- ``criterion``
- ``callbacks``
- ``dataset``

Subclassing NeuralNet
---------------------

Apart from the :class:`.NeuralNet` base class, we provide
:class:`.NeuralNetClassifier`, :class:`.NeuralNetBinaryClassifier`,
and :class:`.NeuralNetRegressor` for typical classification, binary
classification, and regressions tasks. They should work as drop-in
replacements for sklearn classifiers and regressors.

The :class:`.NeuralNet` class is a little less opinionated about the
incoming data, e.g. it does not determine a loss function by default.
Therefore, if you want to write your own subclass for a special use
case, you would typically subclass from :class:`.NeuralNet`.

skorch aims at making subclassing as easy as possible, so that it
doesn't stand in your way. For instance, all components (``module``,
``optimizer``, etc.) have their own initialization method
(:meth:`.initialize_module`, :meth:`.initialize_optimizer`,
etc.). That way, if you want to modify the initialization of a
component, you can easily do so.

Additonally, :class:`.NeuralNet` has a couple of ``get_*`` methods for
when a component is retrieved repeatedly. E.g.,
:func:`~skorch.net.NeuralNet.get_loss` is called when the loss is
determined. Below we show an example of overriding
:func:`~skorch.net.NeuralNet.get_loss` to add L1 regularization to our
total loss:

.. code:: python

class RegularizedNet(NeuralNet):
def __init__(self, *args, lambda1=0.01, **kwargs):
super().__init__(*args, **kwargs)
self.lambda1 = lambda1

def get_loss(self, y_pred, y_true, X=None, training=False):
loss = super().get_loss(y_pred, y_true, X=X, training=training)
loss += self.lambda1 * sum([w.abs().sum() for w in self.module_.parameters()])
return loss

.. note:: This example also regularizes the biases, which you typically
don't need to do.

It is possible to add your own criterion, module, or optimizer to your
customized neural net class. You should follow a few rules when you do
so:

1. Set this attribute inside the corresponding method. E.g., when
setting an optimizer, use :meth:`.initialize_optimizer` for that.
2. Inside the initialization method, use :meth:`.get_params_for` (or,
if dealing with an optimizer, :meth:`.get_params_for_optimizer`) to
retrieve the arguments for the constructor.
3. The attribute name should contain the substring ``"module"`` if
it's a module, ``"criterion"`` if a criterion, and ``"optimizer"``
if an optimizer. This way, skorch knows if a change in
parameters (say, because :meth:`.set_params` was called) should
trigger re-initialization.

When you follow these rules, you will make sure that your added
components are amenable to :meth:`.set_params` and hence to things
like grid search.

Here is an example of how this could look like in practice:

.. code:: python

class MyNet(NeuralNet):
def initialize_criterion(self, *args, **kwargs):
super().initialize_criterion(*args, **kwargs)

# add an additional criterion
params = self.get_params_for('other_criterion')
self.other_criterion_ = nn.BCELoss(**params)
return self

def initialize_module(self, *args, **kwargs):
super().initialize_module(*args, **kwargs)

# add an additional module called 'mymodule'
params = self.get_params_for('mymodule')
self.mymodule_ = MyModule(**params)
return self

def initialize_optimizer(self, *args, **kwargs):
super().initialize_optimizer(*args, **kwargs)

# add an additional optimizer called 'optimizer2' that is
# responsible for 'mymodule'
named_params = self.mymodule_.named_parameters()
pgroups, params = self.get_params_for_optimizer('optimizer2', named_params)
self.optimizer2_ = torch.optim.SGD(*pgroups, **params)
return self

... # additional changes


net = MyNet(
...,
other_criterion__reduction='sum',
mymodule__num_units=123,
optimizer2__lr=0.1,
)
net.fit(X, y)

# set_params works
net.set_params(optimizer2__lr=0.05)
net.partial_fit(X, y)

# grid search et al. works
search = GridSearchCV(net, {'mymodule__num_units': [10, 50, 100]}, ...)
search.fit(X, y)

In this example, a new criterion, a new module, and a new optimizer
were added. Of course, additional changes should be made to the net so
that those new components are actually being used for something, but
this example should illustrate how to start. Since the rules outlined
above are being followed, we can use grid search on our customly
defined components.
2 changes: 1 addition & 1 deletion skorch/callbacks/training.py
@@ -532,7 +532,7 @@ def initialize(self):
return self

def named_parameters(self, net):
return net.module_.named_parameters()
return net.get_all_learnable_params()

def filter_parameters(self, patterns, params):
pattern_fns = (
4 changes: 4 additions & 0 deletions skorch/exceptions.py
@@ -12,6 +12,10 @@ class NotInitializedError(SkorchException):
"""


class SkorchAttributeError(SkorchException):
"""An attribute was set incorrectly on a skorch net."""


class SkorchWarning(UserWarning):
"""Base skorch warning."""
