Enhancement/auto device option #600
Conversation
Pulling Latest Skorch Code
Thanks for this addition, you did a very good job of finding all instances that need to be changed and of extending the tests. We're almost good to go.
I have a few minor comments that you could address. Apart from those, could you please update the CHANGES.md to include this new addition?
Hi @BenjaminBossan, thank you very much for your comments. I'll action those asap. I just wanted to apologise, as I've accidentally submitted this PR prematurely, hence the empty MR description. In addition to actioning your comments, I also have to implement the remaining changes. This PR should be updated shortly.
Force-pushed from 7f819a9 to 268c22b, then to 64f8a1a and 3755ed4: use the fastest device available; leave the device type unmodified respectively (skorch-dev#600)
@BenAjayiObe thanks for the addition. Since it's a more fundamental change, I would like @ottonemo or @thomasjpfan to also have a look at this before merging.

Small comment: it's nice that you rebased your existing commit. However, it makes it harder for me to review the changes, because it's not obvious what was changed since my last review and what stayed the same. Could you please, if you add more changes, just merge them? Then, before merging the whole PR into master, we will squash it into one commit.
Thanks for the PR, this is looking pretty mature already.
I need to address two concerns though. Firstly:
```
>>> torch.device('auto')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Expected one of cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu device type at start of device string: auto
```
- Obviously we need to make sure that torch.device(self.device) doesn't appear anywhere in the code (it still does in some places, for example when loading parameters).
- (More important) There are many more options than just 'cpu' and 'cuda', so 'auto' is misleading, or at least not well defined. On an AMD platform, I would expect 'auto' to select 'hip' or 'opencl' instead of 'cuda', for example.
Secondly, load_params won't work with this change when setting device='auto', since it relies on self.device pointing to the target device. Of course, it is possible to check which device is requested in load_params as well, but maybe it would be smarter to do this once during initialization and rely on self.device afterwards.
What are your opinions?
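The "resolve once during initialization" idea could be sketched roughly as follows. This is a hypothetical illustration, not skorch's actual implementation, and the resolve_device name is invented here:

```python
def resolve_device(device):
    """Hypothetical sketch: map 'auto' to a concrete device string once,
    during initialization, so self.device can be relied on afterwards
    (e.g. by load_params)."""
    if device == "auto":
        try:
            import torch  # only needed to probe for CUDA availability
            return "cuda" if torch.cuda.is_available() else "cpu"
        except ImportError:
            # illustration only; skorch itself always has torch installed
            return "cpu"
    return device
```

Resolving once keeps the rest of the code unchanged: everything downstream sees a concrete device string, never 'auto'.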
@BenjaminBossan Thanks for your comments. Ah, apologies for obscuring my changes with the rebase. I'll preserve the commit history, as you say, in the future. 👍
@ottonemo, thanks for your comments. On your first point:
This is a great point, thanks for flagging. I will update the code to ensure that it isn't used anywhere.
I feel that it would be a complex piece of work to programmatically account for which device to use under which circumstances, for all the devices that PyTorch supports. To me, the original spirit of the issue was to remove the burden of the most common case: checking whether CUDA is available. Perhaps, instead of allowing an 'auto' value to switch devices silently, we could emit a warning stating which device was selected. I believe this would resolve your second concern as well. Let me know what you think.
I wouldn't go that way. Even with a warning, this can be easily missed. I prefer a hard error over silently switching the device.

We're not changing the default behavior for anyone, so we should keep this in mind: do we actually make it worse for anyone when we introduce this feature? If not, we should probably proceed. We should only keep in mind that we shouldn't dig ourselves into a hole that we can't get out of (easily) later.
I'm torn on this. In this very circumstance, I think it would be generally OK to do it the automagic way, since the user can always check which device was actually selected. Can we do something like this (roughly sketching), while keeping the current default behaviour?
I am -1 on doing this magically without an explicit parameter.
@BenjaminBossan, @ottonemo, @thomasjpfan, thank you for your thoughts. What I'm taking away from this is that a user must always initiate the auto feature through deliberate action. I can see the sense in this argument.

Building on your suggestion, @ottonemo: if the intention is to grow skorch's coverage of PyTorch's other devices over time, should we ensure now that the way users interact with the auto feature reflects this? For example, I could imagine a new boolean parameter to opt in explicitly. Let me know what you think.
It won't be quite as easy, since we should set the init parameters the same as they're passed (for sklearn compatibility). Therefore, we have to choose between two options.

The disadvantage of option 1 is that many people may already rely on the current behaviour. Option 2 could also have some subtle consequences that I'm not quite sure of. How about a bit of a different direction: could we provide a helper function instead?

```python
net = NeuralNet(..., device=auto_device())  # use cuda if available, else cpu
```

That could be extended to support new devices.
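A minimal sketch of such an auto_device helper might look like the following. This is hypothetical: the function is not part of skorch, and the try/except guard is added here only so the sketch also runs where torch is absent:

```python
def auto_device():
    """Return 'cuda' if a CUDA-capable GPU is available, else 'cpu'.

    Hypothetical helper sketched from the discussion; it could later be
    extended to detect other backends (hip, opencl, xla, ...).
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

Because the helper is called by the user at construction time, the init parameter stays a plain, sklearn-compatible device string.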
Tbh, this sounds like too much complexity for what should be a simple feature. It's too bad that this feature turns out to be more nuanced than initially thought. However, the device=None option still seems like a useful addition.
That's a shame. It sounds like it would be sensible to avoid this route then.
That's fair. Thanks for your suggestion. I think it would offer a good way to completely decouple the auto device logic from the rest of the code, so future work would be fairly straightforward. My one concern is whether this route is any less work than just calling the availability check directly.

Another reason I like this is that, at the moment, I'm not sure how I would go about discovering whether some of the other PyTorch-compatible devices are available to me, and I'm sure this is the case for other developers as well. I could see people using skorch for this function alone, if it means avoiding googling "how to detect device X on my machine" and going down that rabbit hole.
So maybe we could postpone this functionality and pick it up again once more devices are supported? As mentioned above, I still think the device=None option is worth keeping.
This sounds good to me. Okay, I'll remove the auto functionality, as you say. It's a shame, but I think it makes sense.
One point that I would like to highlight is that I'm currently defaulting to the 'cpu' device as the fallback. Let me know what you think.
Thanks for the changes, this looks reasonable to me. @ottonemo, do you have anything to add? Another nice thing about this PR is the consistent use of to_device throughout.
Thanks @BenAjayiObe for the work you have put into this. I'm sure we will revisit the auto functionality when there is more substance to it than just convenience. Maybe we can combine this with support for XLA. |
Good work, thanks a lot
In #600, we introduced the option to set device=None, which means that skorch should not move any data to any device. However, this setting introduced a bug when trying to load the net, as that code didn't explicitly deal with device=None. With this PR, the bug is fixed.

Implementation

If device=None, we really have no way of knowing which device to map the parameters to. The most reasonable thing to do is to use the fallback, which is 'cpu'.
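The described fallback can be sketched in a few lines. The helper name get_map_location is invented here for illustration; skorch's actual loading code differs:

```python
def get_map_location(device):
    """Hypothetical sketch of the fix: when device is None we cannot
    know where to map loaded parameters, so fall back to 'cpu';
    otherwise use the requested device."""
    return "cpu" if device is None else device
```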
Context

We would like to modify the to_device function to add the option for users to choose "auto" or None. By selecting "auto", the user allows skorch to automatically detect whether a GPU is available, so they are no longer burdened with writing repetitive boilerplate code. By selecting None, we leave all tensors and modules unmodified, so that users can still customise how they allocate devices themselves without us interfering.
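The repetitive boilerplate in question is typically the common PyTorch device-selection idiom, something like the following (the try/except guard is added here only so the sketch also runs without torch installed):

```python
try:
    import torch
    # the usual idiom: pick the GPU when one is available, else the CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    # illustration only; skorch itself requires torch
    device = "cpu"
```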
Changes

- CHANGES.md: added a note about the changes.
- examples/benchmarks/mnist.py: updated the MNIST example to use the to_device function.
- examples/word_language_model/data.py: updated the language model example to use the to_device function.
- skorch/net.py: updated the device parameter description and added to_device where needed.
- skorch/tests/test_utils.py: added tests for the new "auto" and None options in to_device.
- skorch/utils.py: updated the to_device function to automatically find the fastest device or leave the input unmodified, depending on the argument passed.

Tests
- Updated class TestToDevice to adjust the expected value depending on the argument passed to to_device.
- Added a test for when a PackedSequence object is passed to to_device, as this object needs special handling. It is a child of a namedtuple; however, if to_device operated on a PackedSequence object the same way it does on a regular tuple, it would break execution, as some of its attributes aren't tensors.