Thoughts on the on_grad_computed API #378

Closed
thomasjpfan opened this issue Oct 29, 2018 · 2 comments

thomasjpfan commented Oct 29, 2018

The `on_grad_computed` API passes both `net` and `named_parameters`, where `named_parameters` is built by calling `list(self.module_.named_parameters())`. The callback can already access these parameters through `net.module_.named_parameters()`.

Calling `list` on the named parameters adds measurable overhead to each training loop:

```python
import torchvision

m = torchvision.models.densenet201()
%timeit b = list(m.named_parameters())
# 2.47 ms ± 33.1 µs per loop
```

This overhead is incurred per batch, so for an epoch with 500 batches it adds roughly 1.24 seconds (500 × 2.47 ms) to that epoch.

I propose simplifying `on_grad_computed` by removing `named_parameters`. It is a fairly small change, but it does break backwards compatibility.
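
Under the proposal, a callback would fetch the parameters itself from the net. A minimal sketch of what that could look like (the callback and the `Fake*` classes below are simplified stand-ins for illustration, not skorch's real classes):

```python
class ParamNameLogger:
    """Hypothetical callback under the proposed simplified signature:
    it receives only `net` and pulls parameters from the module itself."""

    def on_grad_computed(self, net, **kwargs):
        # Iterate lazily; no list() is materialized on every batch.
        return [name for name, _ in net.module_.named_parameters()]


class FakeModule:
    """Stand-in for a torch.nn.Module exposing named_parameters()."""

    def named_parameters(self):
        yield "weight", [1.0, 2.0]
        yield "bias", [0.5]


class FakeNet:
    """Stand-in for a fitted skorch NeuralNet holding a module_."""

    def __init__(self):
        self.module_ = FakeModule()


names = ParamNameLogger().on_grad_computed(FakeNet())
print(names)  # ['weight', 'bias']
```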

ottonemo commented Oct 29, 2018

The reason we do it this way is that the receiver does not know how many modules there are, whether the training loop was overridden, or whether there is more than one gradient update per loop for different parameter sets. For example, in a generator/discriminator cycle you have two gradient computation steps, each over a different set of parameters.
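
That situation can be sketched as follows (the `notify` helper and `FakeModule` class here are simplified stand-ins, not skorch's real dispatch): each gradient step reports its own parameter set, which the callback could not recover from `net.module_.named_parameters()` alone.

```python
def notify(event, named_parameters, log):
    # Stand-in for skorch's callback dispatch: record which
    # parameters each gradient step touched.
    log.append((event, [name for name, _ in named_parameters]))


class FakeModule:
    """Stand-in module exposing named_parameters()."""

    def __init__(self, names):
        self._names = names

    def named_parameters(self):
        for name in self._names:
            yield name, [0.0]


generator = FakeModule(["gen.weight"])
discriminator = FakeModule(["disc.weight"])
log = []

# One GAN-style training iteration: two separate gradient
# computations, each over a different parameter set.
notify("on_grad_computed", discriminator.named_parameters(), log)
notify("on_grad_computed", generator.named_parameters(), log)

print(log)
# [('on_grad_computed', ['disc.weight']), ('on_grad_computed', ['gen.weight'])]
```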

Another approach would be a lazy wrapper that yields the named parameters to anyone who asks, something like this:

```python
class LazyGenerator:
    def __init__(self, fn):
        self.fn = fn      # zero-arg callable returning (name, param) pairs
        self.params = None

    def __iter__(self):
        # Materialize the list on first iteration only, then reuse it.
        if self.params is None:
            self.params = list(self.fn())
        yield from self.params

notify('on_grad_computed', LazyGenerator(lambda: m.named_parameters()))
```
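
A quick check that such a wrapper only materializes the parameter list once, no matter how many callbacks iterate over it (the class is repeated here so the snippet is self-contained):

```python
class LazyGenerator:
    def __init__(self, fn):
        self.fn = fn
        self.params = None

    def __iter__(self):
        if self.params is None:
            self.params = list(self.fn())
        yield from self.params


calls = []

def fake_named_parameters():
    calls.append(1)  # count how often the expensive call actually runs
    return [("weight", [1.0]), ("bias", [0.5])]

lg = LazyGenerator(fake_named_parameters)
first = [name for name, _ in lg]   # triggers the single materialization
second = [name for name, _ in lg]  # reuses the cached list

print(first, second, len(calls))  # ['weight', 'bias'] ['weight', 'bias'] 1
```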

thomasjpfan commented:

I agree, the LazyGenerator idea works well to solve the overhead of calling list, without changing the API.

> For example there might be a generator/discriminator cycle

Setting up skorch to train a GAN would be a good tutorial to write. It should help resolve #295.
