
Add logger to Weights & Biases #607


Merged: 14 commits merged into skorch-dev:master on Apr 5, 2020

Conversation

@borisdayma (Contributor)

Add support for logging through Weights & Biases.

WandbLogger automatically logs metrics, model topology, gradients, and the best trained model.

Sample run: https://app.wandb.ai/borisd13/skorch/runs/1vs6725x?workspace=user-borisd13
You can switch tabs on the left to see compute resources, the model graph, logged files, etc.
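For readers skimming the thread, here is a rough usage sketch of how the callback plugs into a skorch net. The module and the commented-out fit call are hypothetical stand-ins, not code from this PR:

    import torch
    import wandb
    from skorch import NeuralNetClassifier
    from skorch.callbacks import WandbLogger

    class MyModule(torch.nn.Module):
        # minimal stand-in module, just for illustration
        def __init__(self):
            super().__init__()
            self.dense = torch.nn.Linear(20, 2)
            self.softmax = torch.nn.Softmax(dim=-1)

        def forward(self, X):
            return self.softmax(self.dense(X))

    wandb_run = wandb.init()                    # create the W&B run first
    net = NeuralNetClassifier(
        MyModule,
        max_epochs=10,
        callbacks=[WandbLogger(wandb_run)],     # hand the run to the callback
    )
    # net.fit(X, y)  # metrics, gradients and the best model end up in the run above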

@borisdayma (Author)

I'm now working on adding tests. Feel free to chime in if you have any comments.

@BenjaminBossan (Collaborator)

Thanks for this PR.

You've probably seen the TensorBoard and NeptuneLogger callbacks. There we adopt the pattern of passing the actual logger instance to the callback (e.g. the SummaryWriter) instead of using a try: import ... pattern inside the callback. Would it be possible to change WandbLogger in a similar fashion?
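For context, a minimal sketch of that pattern, assuming a bare-bones callback (names and method bodies are illustrative, not the PR's final code):

    from skorch.callbacks import Callback

    class WandbLogger(Callback):
        def __init__(self, wandb_run):
            # The user imports wandb and creates the run themselves; the
            # callback only receives the ready-made run object, mirroring how
            # the TensorBoard callback receives a SummaryWriter.
            self.wandb_run = wandb_run

        def on_epoch_end(self, net, **kwargs):
            # log the plain numeric values from the last history row
            row = net.history[-1]
            self.wandb_run.log({k: v for k, v in row.items()
                                if isinstance(v, (int, float))})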

@borisdayma (Author)

Yes of course, I'll make the changes.

@BenjaminBossan (Collaborator) left a review:

Thank you very much for adding this feature to skorch and implementing my suggestion. There are a few things that need to be changed before we can merge this. Some of them I have commented on. Apart from that:

  1. Please implement unit tests for your new callback. You could have a look at how it's done for the NeptuneLogger.

  2. Please add W&B to the requirements-dev.txt.

  3. For me it's not quite clear how I can test this in practice. I could probably figure it out by digging through the W&B docs, but I think it would be much nicer for me -- and for future users -- if the callback's documentation explained everything that is required. Do I need to start a local server? Or can I use a test server that you provide? Do I need to sign up, or is there a way to use this without an account?

class WandbLogger(Callback):
"""Logs best model and metrics to `Weights & Biases <https://docs.wandb.com/>`_

"Use this callback to automatically log best trained model and all metrics from
BenjaminBossan (Collaborator):

Suggested change
"Use this callback to automatically log best trained model and all metrics from
Use this callback to automatically log best trained model and all metrics from

"""Logs best model and metrics to `Weights & Biases <https://docs.wandb.com/>`_

"Use this callback to automatically log best trained model and all metrics from
your net's history to Weights & Biases after each epoch.
BenjaminBossan (Collaborator):

I think you should use the docstring to help skorch users who are unfamiliar with W&B to get started quickly. E.g., you could specify what package they must install for this to work, i.e. a pip (or conda) instruction. You should also indicate what kind of setup they need to make beforehand (say, starting a local server).

For a nice example, look at the docstring for NeptuneLogger.

>>> import wandb
>>> from skorch.callbacks import WandbLogger
>>> wandb_run = wandb.init()
>>> wandb.config.update({"learning rate": 1e-3, "batch size": 32}) # optional
BenjaminBossan (Collaborator):

Could you indicate what this config update does?

"""

# Record if watch has been called previously (even in another instance)
_watch_called = False
BenjaminBossan (Collaborator):

Is this really used anywhere? If it is, please move this inside initialize and call it watch_called_.

self.wandb_run = wandb_run
self.save_model = save_model
self.keys_ignored = keys_ignored
self.model_path = Path(wandb_run.dir) / 'best_model.pth'
BenjaminBossan (Collaborator):

Please don't set any attributes in __init__ that are not passed by the user. So either allow them to pass the model_path argument (if that makes any sense) or instead set the model_path inside initialize (and call it model_path_).
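A sketch of the convention being asked for, moving the derived attributes (including the watch_called_ flag mentioned above) into initialize; illustrative only, not the merged code:

    from pathlib import Path
    from skorch.callbacks import Callback

    class WandbLogger(Callback):
        def __init__(self, wandb_run, save_model=True, keys_ignored=None):
            # __init__ only stores what the user passed in
            self.wandb_run = wandb_run
            self.save_model = save_model
            self.keys_ignored = keys_ignored

        def initialize(self):
            # derived attributes are set here and get a trailing underscore
            self.watch_called_ = False
            self.model_path_ = Path(self.wandb_run.dir) / 'best_model.pth'
            return self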

"""Automatically log values from the last history step."""
hist = net.history[-1]
keys_kept = filter_log_keys(hist, keys_ignored=self.keys_ignored_)
logged_vals = dict((k, hist[k]) for k in keys_kept if k in hist)
BenjaminBossan (Collaborator):

Suggested change
logged_vals = dict((k, hist[k]) for k in keys_kept if k in hist)
logged_vals = {k: hist[k] for k in keys_kept if k in hist}

wandb Run used to log data.

save_model : bool (default=True)
Saves best trained model.
BenjaminBossan (Collaborator):

Suggested change
Saves best trained model.
Whether to save a checkpoint of the best model.

def __init__(
self,
wandb_run,
save_model=True,
BenjaminBossan (Collaborator):

We already provide a checkpoint callback, I think this functionality is redundant.

borisdayma (Author):

This is to log the trained model to W&B.

BenjaminBossan (Collaborator):

Interesting. How does that work? In this code, I don't see any interaction with W&B:

        # save best model
        if self.save_model and hist['valid_loss_best']:
            model_path = Path(self.wandb_run.dir) / 'best_model.pth'
            with model_path.open('wb') as model_file:
                net.save_params(f_params=model_file)

Is this some code working in the background, or is it simply the fact that the model parameters are stored in wandb_run.dir?

borisdayma (Author):

All files stored in wandb_run.dir are automatically saved.
You can see this in my example run, on the "Files" tab.

thomasjpfan (Member):

Please leave a comment stating that the files in wandb_run.dir are automatically saved in on_epoch_end.
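For illustration, the snippet quoted above with the requested comment added might read as follows (a sketch; the exact wording was left to the PR):

        # Save the best model into wandb_run.dir. Any file written to that
        # directory is picked up and uploaded by W&B automatically, so no
        # explicit upload call is needed here.
        if self.save_model and hist['valid_loss_best']:
            model_path = Path(self.wandb_run.dir) / 'best_model.pth'
            with model_path.open('wb') as model_file:
                net.save_params(f_params=model_file)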

@borisdayma (Author)

Thanks for the comments @BenjaminBossan
They should now be addressed but feel free to add anything.

I am using a new feature (wandb.run.watch) that has not been pushed to PyPI yet, so I'll update requirements-dev.txt with the correct version as soon as it's released.
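For reference, the gradient and topology logging mentioned in the PR description presumably boils down to a call along these lines; the hook used here and the exact signature are assumptions, not necessarily the merged code:

    def on_train_begin(self, net, **kwargs):
        # hook the PyTorch module once so that W&B records gradients and the
        # model topology alongside the logged metrics
        self.wandb_run.watch(net.module_)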

@BenjaminBossan (Collaborator)

@borisdayma Thank you for implementing the suggested changes. I will review this in detail when I have more time, hopefully soon.

@borisdayma (Author) commented Mar 17, 2020

I just need to pin the wandb version as soon as the next release appears.

@borisdayma (Author)

It should now be ready to merge! Feel free to reach out if you have any questions or comments.

@BenjaminBossan (Collaborator) left a review:

This looks good to me. I tested it and the logs are really useful. Thanks for the addition.

I have a minor comment about how to update the docstring. On top of that, if there is the option to run W&B without an account (i.e. locally), it would be nice to mention that. I believe this will reduce the friction for users who want to test this.

@ottonemo @thomasjpfan do you want to take a look at this as well? Otherwise I think we can merge as soon as the docs have been updated.

wandb Run used to log data.

save_model : bool (default=True)
Whether to save a checkpoint of the best model.
BenjaminBossan (Collaborator):

Maybe add an explanation here that the model will be uploaded to the W&B server for convenience.

@borisdayma (Author)

@BenjaminBossan Glad you found it useful!

I updated the docstring and added a comment about logging anonymously (without a W&B account). Feel free to let me know if you want me to change the wording or anything else.

@thomasjpfan (Member) left a review:

Thank you for the PR @borisdayma

... wandb_run = wandb.init(anonymous="allow")

>>> # Log hyper-parameters (optional)
... wandb.config.update({"learning rate": 1e-3, "batch size": 32})
thomasjpfan (Member):

Is there a way to update this using the wandb_run object?

borisdayma (Author):

Yes! I'll update
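Presumably the updated docstring example would then read something like this, using the run object rather than the module-level wandb.config (a sketch, not the final wording):

    >>> # Log hyper-parameters (optional)
    ... wandb_run.config.update({"learning rate": 1e-3, "batch size": 32})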

def __init__(
self,
wandb_run,
save_model=True,
thomasjpfan (Member):

Please leave a comment stating that the files in wandb_run.dir are automatically saved in on_epoch_end.

return mock

@pytest.fixture
def net_fitted(
thomasjpfan (Member):

This fixture is never used and can be removed.

@borisdayma (Author)

All comments should have been implemented in the last commit. Let me know if I missed anything.

@borisdayma (Author)

Just checking whether you want me to add anything else to this PR.

@BenjaminBossan (Collaborator)

@borisdayma From my side, it looks good.

I had requested a re-review from @thomasjpfan since you addressed his comments. Maybe let's give him until the weekend to respond, if he doesn't we can consider it a thumbs up and merge.

@thomasjpfan (Member) left a review:

LGTM. Thank you @borisdayma!

@BenjaminBossan (Collaborator)

Good job everyone, @borisdayma thanks for the contribution and your patience.

@BenjaminBossan merged commit e61f10c into skorch-dev:master on Apr 5, 2020
@borisdayma (Author)

Thanks, I think it will be very useful for fine-tuning. Feel free to reach out if there's any issue related to it.

@BenjaminBossan (Collaborator)

@borisdayma I now notice that conda install -c conda-forge --file requirements-dev.txt fails with the following error:

PackagesNotFoundError: The following packages are not available from current channels:
  - wandb[version='>=0.8.30']

Is there a conda channel for wandb or is pip the only solution?

@borisdayma (Author)

At the moment it is only on pip, but we can just use the new conda interoperability with pip.

@borisdayma (Author)

Let me know if you want me to try it out and suggest a new PR (to update the docs).
I personally use pipenv all the time, so I hadn't noticed this issue with conda.
