
Enhancement/auto device option #600


Merged

Conversation

@BenAjayiObe (Contributor) commented Feb 27, 2020

Context

We would like to modify the to_device function to add the option for users to choose "auto" or None. By selecting "auto", the user allows skorch to automatically detect whether a GPU is available. This means the user is no longer burdened with writing repetitive boilerplate code like the following:

device = "cuda" if torch.cuda.is_available() else "cpu"

By selecting None, we leave all tensors and modules unmodified, so that a user can still customise how they allocate the devices themselves without us interfering.
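A minimal sketch of the proposed semantics (illustrative only; the actual to_device in skorch/utils.py also recurses into containers such as tuples):

import torch

def to_device(X, device):
    # device=None: leave X untouched so the user stays in control of allocation.
    if device is None:
        return X
    # device="auto": use cuda when available, otherwise fall back to cpu.
    if device == "auto":
        device = "cuda" if torch.cuda.is_available() else "cpu"
    return X.to(device)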

Changes

  • CHANGES.md: Added a note about the changes.
  • examples/benchmarks/mnist.py: Updated the MNIST example to use the to_device function.
  • examples/word_language_model/data.py: Updated the language model example to use the to_device function.
  • skorch/net.py: Updated the device parameter description and added to_device where needed.
  • skorch/tests/test_utils.py: Added tests for the new "auto" and None options in to_device.
  • skorch/utils.py: Updated the to_device function to automatically find the fastest device or leave the input unmodified, depending on the argument passed.

Tests

  • Modified existing tests in the TestToDevice class to update the expected value depending on the argument passed to to_device.
  • Added a new test for when a PackedSequence object is passed to to_device, as this object needs special handling: it is a subclass of a namedtuple, but if to_device operated on a PackedSequence the same way it does on a regular tuple, execution would break because some of its attributes aren't tensors (see the sketch below).
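A minimal sketch of that special case, assuming a to_device that otherwise recurses over tuple elements (the structure is illustrative, not the exact skorch code):

from torch.nn.utils.rnn import PackedSequence

def to_device(X, device):
    # PackedSequence must be dispatched before the generic tuple branch:
    # fields such as sorted_indices may be None, so mapping .to(device)
    # over the fields like a plain tuple would fail; its own .to() is safe.
    if isinstance(X, PackedSequence):
        return X.to(device)
    if isinstance(X, tuple):
        return tuple(to_device(x, device) for x in X)
    return X.to(device)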

Pulling Latest Skorch Code
@BenjaminBossan (Collaborator) left a comment

Thanks for this addition; you did a very good job of finding all the instances that need to be changed and of extending the tests. We're almost good to go.

I have a few minor comments that you could address. Apart from those, could you please update the CHANGES.md to include this new addition?

@BenAjayiObe (Contributor, Author) commented Mar 2, 2020

Hi @BenjaminBossan, thank you very much for your comments. I'll action those asap. I just wanted to apologise, as I accidentally submitted this PR prematurely, hence the empty PR description. In addition to actioning your comments, I also have to implement the None feature and its tests, to prevent forced device casting.

This PR should be updated shortly.

@BenAjayiObe BenAjayiObe force-pushed the enhancement/auto-device-option branch from 7f819a9 to 268c22b on March 2, 2020
BenAjayiObe pushed a commit to BenAjayiObe/skorch that referenced this pull request on March 2, 2020:
    *  use the fastest device available
    *  leave the device type unmodified respectively
(skorch-dev#600)
@BenAjayiObe BenAjayiObe force-pushed the enhancement/auto-device-option branch from 268c22b to 64f8a1a, then to 3755ed4, and finally to a1fc5ea on March 2, 2020
@BenjaminBossan (Collaborator)

@BenAjayiObe thanks for adding the device=None option as well, that's great! I have a few minor comments; apart from those, this looks good.

Since it's a more fundamental change, I would like @ottonemo or @thomasjpfan to also have a look at this before merging.

Small comment: it's nice that you rebased your existing commit. However, it makes it harder for me to review the changes, because it's not obvious what has changed since my last review and what has stayed the same. Could you please, if you add more changes, just merge them instead? Then, before merging the whole PR into master, we will squash it into one commit.

@ottonemo (Member) left a comment

Thanks for the PR, this is looking pretty mature already.

I need to address two concerns though. Firstly:

>>> torch.device('auto')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Expected one of cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu device type at start of device string: auto
  1. Obviously, we need to make sure that torch.device(self.device) doesn't appear anywhere in the code (it still does in some places, for example when loading parameters).
  2. (More important) There are many more options than just 'cpu' and 'cuda', so 'auto' is misleading, or at least not well defined. On an AMD platform I would expect 'auto' to select 'hip' or 'opencl' instead of 'cuda', for example.

Secondly, load_params won't work with this change when setting device='auto', since it relies on self.device to point to the target device. Of course, it is possible to check which device is requested in load_params as well, but maybe it would be smarter to do this once during initialization and rely on self.device afterwards.

What are your opinions?

@BenAjayiObe (Contributor, Author) commented Mar 9, 2020

@BenjaminBossan Thanks for your comments. Ah, apologies for obscuring my changes with the rebase. I'll preserve the commit history, as you suggest, in the future. 👍

@BenAjayiObe (Contributor, Author) commented Mar 10, 2020

@ottonemo, thanks for your comments.

On your first point

Obviously, we need to make sure that torch.device(self.device) doesn't appear anywhere in the code

This is a great point, thanks for flagging. I will update the code to ensure that it isn't used anywhere.

there are many more options than just 'cpu' and 'cuda'

I feel that it would be a complex piece of work to programmatically account for which device to use, under which circumstances, for all the devices that PyTorch supports. To me, the original spirit of the issue was to remove the burden of the most common case: checking whether cuda is available.

Perhaps, instead of allowing an "auto" option for self.device, we could specify that if the user enters "cuda", we automatically fall back to "cpu" when "cuda" isn't available. In such circumstances, we could output a warning informing the user that cuda wasn't available and has been replaced by "cpu".
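A rough sketch of this proposal (the helper name resolve_device is purely illustrative):

import warnings

import torch

def resolve_device(device):
    # Fall back from "cuda" to "cpu" when CUDA is unavailable, but warn.
    if device == "cuda" and not torch.cuda.is_available():
        warnings.warn("CUDA is not available; falling back to 'cpu'.")
        return "cpu"
    return device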

I believe this would resolve your second concern as well. Let me know what you think.

@BenjaminBossan (Collaborator)

Perhaps, instead of allowing an "auto" option for self.device, we could specify that if the user enters "cuda", we automatically fall back to "cpu" when "cuda" isn't available. In such circumstances, we could output a warning informing the user that cuda wasn't available and has been replaced by "cpu".

I wouldn't go that way. Even with a warning, this can easily be missed. I prefer a hard error to silently switching the device.

We're not changing the default behavior for anyone, so we should keep this in mind: do we actually make it worse for anyone by introducing this feature? If not, we should probably proceed. We just shouldn't dig ourselves into a hole that we can't (easily) get out of later.

@ottonemo (Member)

there are many more options than just 'cpu' and 'cuda'

I feel that it would be a complex piece of work to programmatically account for which device to use, under which circumstances, for all the devices that PyTorch supports. To me, the original spirit of the issue was to remove the burden of the most common case: checking whether cuda is available.

Perhaps, instead of allowing an "auto" option for self.device, we could specify that if the user enters "cuda", we automatically fall back to "cpu" when "cuda" isn't available. In such circumstances, we could output a warning informing the user that cuda wasn't available and has been replaced by "cpu".

I believe this would resolve your second concern as well. Let me know what you think.

I'm torn on this. In this particular circumstance, I think it would generally be OK to do it the automagic way, since the user can always assert torch.cuda.is_available() beforehand if they want to make sure that CUDA will be used. But I also concur with @BenjaminBossan that turning what used to fail hard into a soft warning is potentially dangerous (even if this is 'just' about training time, naively speaking).

Can we do something like this? (roughly sketching):

class NeuralNet:
    def __init__(self, ..., device='auto_cuda'):
        self.device = get_device(device)

while keeping the self.device handling as-is? I would expect get_device to return 'cuda' for the input 'auto_cuda', with 'cpu' as the fallback, while 'cuda' and all other device variants would just be passed through. This would enable us to build similar fallbacks for other devices in the future, and self.device would point to the actual device the computation is done on.
What do you think?
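A minimal runnable version of the sketch (get_device is hypothetical at this point, not an existing skorch function):

import torch

def get_device(device):
    # 'auto_cuda' resolves to 'cuda' when available, with 'cpu' as the
    # fallback; every other device string is passed through unchanged.
    if device == 'auto_cuda':
        return 'cuda' if torch.cuda.is_available() else 'cpu'
    return device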

@thomasjpfan (Member)

I am -1 on doing this magically without a parameter in NeuralNet. The suggestion by @ottonemo looks good.

@BenAjayiObe (Contributor, Author) commented Mar 12, 2020

@BenjaminBossan , @ottonemo, @thomasjpfan , thank you for your thoughts

What I'm taking away from this is that a user must always initiate the auto feature through deliberate action. I can see the sense in this argument.

Building on your suggestion, @ottonemo: if the intention is to grow skorch's coverage of PyTorch's other devices over time, should we ensure now that the way users interact with the auto feature reflects this?

So, for example, I could imagine a new boolean parameter for NeuralNet(), called something like auto_find_device, that defaults to False. Setting it to True would then ensure that we call your get_device for the devices we currently support. This way, when we extend the list of devices we can auto-find, we just have to add the code to get_device and update the docstring.

Let me know what you think.

@BenjaminBossan (Collaborator)

Can we do something like this? (roughly sketching):

class NeuralNet:
    def __init__(self, ..., device='auto_cuda'):
        self.device = get_device(device)

It won't be quite as easy since we should set the init parameters the same as they're passed (for sklearn compatibility). Therefore, we either

  1. need to introduce device_ for the actual device (with an accompanying initialize_device?) or
  2. provide a property for device that returns get_device(self._device).

The disadvantage of 1. is that many people may already rely on self.device in their custom code, and hence their code could suddenly be subtly wrong.

Option 2. could also have some subtle consequences that I'm not quite sure of (maybe problems with clone or pickle incompatibility).
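For concreteness, a rough sketch of option 2, reusing the hypothetical get_device from above (this is not what was merged):

class NeuralNet:
    # The raw user-supplied value is stored in _device; reading .device
    # resolves it on the fly. As noted above, this may interact badly
    # with clone or pickling.
    def __init__(self, device='auto_cuda'):
        self.device = device  # goes through the setter below

    @property
    def device(self):
        return get_device(self._device)

    @device.setter
    def device(self, value):
        self._device = value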

How about a bit of a different direction: Could we provide a helper function auto_device that does this automatic handling? It would look like this:

net = NeuralNet(..., device=auto_device())  # use cuda if available, else cpu

That could be extended to support new devices.
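A minimal sketch of such a helper (auto_device is hypothetical and only covers the cuda/cpu case discussed in this thread):

import torch

def auto_device():
    # Resolve the device once, at call time; NeuralNet then receives a
    # plain device string, so its init parameters stay sklearn-friendly.
    return 'cuda' if torch.cuda.is_available() else 'cpu'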

So, for example, I could imagine a new boolean parameter for NeuralNet(), called something like auto_find_device

Tbh, this sounds like too much complexity for what should be a simple feature.

It's too bad that this feature turns out to be more nuanced than initially thought. However, the None option should be a good addition without drawbacks.

@BenAjayiObe (Contributor, Author) commented Mar 21, 2020

@BenjaminBossan

It won't be quite as easy since we should set the init parameters the same as they're passed (for sklearn compatibility).

That's a shame. It sounds like it would be sensible to avoid this route then.

Tbh, this sounds like too much complexity for what should be a simple feature.

That's fair

Thanks for your suggestion. I think it would offer a good way to completely decouple the auto device logic from the rest of the code, so future work would be fairly straightforward.

My one concern is whether this route is any less work than just calling cuda.is_available() for the user. For just the cuda vs cpu case, perhaps not. However, as the feature starts to include more device options, I think it's justified.

Another reason I like this is that, at the moment, I'm not sure how I would go about discovering whether some of the other PyTorch-compatible devices are available to me, and I'm sure this is the case for other developers as well. I could see people using skorch for this function alone if it means avoiding googling "how to detect device X on my machine" and going down that rabbit hole.

@BenjaminBossan (Collaborator)

My one concern is whether this route is any less work than just calling cuda.is_available() for the user. For just the cuda vs cpu case, perhaps not. However, as the feature starts to include more device options, I think it's justified.

So maybe we could postpone this functionality and pick it up again once more devices are supported?

As mentioned above, I still think the None option could be useful for some users, and we could go ahead with it. I also think it makes sense that we don't directly cast to a device but instead always use to_device. Thus, to make progress, I would suggest either 1) removing the 'auto' feature from this PR for now, or 2) creating a separate PR with only the mentioned changes.

@BenAjayiObe (Contributor, Author)

So maybe we could postpone this functionality and pick it up again once more devices are supported?

This sounds good to me.

Okay, I'll remove the auto functionality, as you say. It's a shame, but I think it makes sense.

@BenAjayiObe (Contributor, Author)

One point that I would like to highlight is that I'm currently defaulting to the fallback_device when skorch.net.NeuralNet#load_params is called and the user has passed device=None. As the fallback_device is 'cpu' by default and isn't set when called by load_params, I think this keeps it in line with the device default across the rest of the repo.
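A hedged sketch of that rule (the helper name and signature are illustrative):

import torch

def get_map_location(target_device, fallback_device='cpu'):
    # With device=None there is no stored target to restore parameters
    # to, so use the fallback, matching the 'cpu' default elsewhere.
    if target_device is None:
        target_device = fallback_device
    return torch.device(target_device)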

Let me know what you think.

@BenjaminBossan (Collaborator)

Thanks for the changes, this looks reasonable to me. @ottonemo do you have anything to add?

Another nice thing about this PR is the consistent use of to_device within the code base, which I like.

@ottonemo (Member) commented Apr 6, 2020

Thanks @BenAjayiObe for the work you have put into this. I'm sure we will revisit the auto functionality when there is more substance to it than just convenience. Maybe we can combine this with support for XLA.

@BenjaminBossan (Collaborator) left a comment

Good work, thanks a lot

@BenjaminBossan BenjaminBossan merged commit 2adee59 into skorch-dev:master Apr 7, 2020
BenjaminBossan added a commit that referenced this pull request Jul 28, 2022
In #600, we introduced the option to set device=None, which means that
skorch should not move any data to any device. However, this setting
introduced a bug when trying to load the net, as that code didn't
explicitly deal with device=None. With this PR, the bug is fixed.

Implementation

If device=None, we really have no way of knowing what device to map the
parameters to. The most reasonable thing to do is to use the fallback,
which is 'cpu'.