[Bug] constraints.Interval is overzealous in forcing GPU synchronizes #2258

Closed

mrcslws opened this issue Jan 21, 2023 · 3 comments

@mrcslws

mrcslws commented Jan 21, 2023

🐛 Bug

By default, if any model parameters use a constraints.Interval, fetching those parameters forces a GPU synchronization on every forward pass, due to this line:

if max_bound == math.inf or min_bound == -math.inf:

True, this check can be disabled by calling gpytorch.settings.debug._set_state(False), but even with debug enabled it ought to happen only once, not on every transform.

To reproduce

import gpytorch
import torch

device = torch.device("cuda")
interval = gpytorch.constraints.Interval(0, 1).to(device)
raw = torch.tensor([-0.42], device=device)

# Arguably it's okay for this check to occur on the first transform.
# If you choose that fix, then uncomment this line to test your fix.
# constrained = interval.transform(raw)

torch.cuda.set_sync_debug_mode(2)
constrained = interval.transform(raw)
Traceback (most recent call last):
  File "bug.py", line 17, in <module>
    constrained = interval.transform(raw)
  File "/opt/conda/envs/py39/lib/python3.9/site-packages/gpytorch/constraints/constraints.py", line 118, in transform
    if max_bound == math.inf or min_bound == -math.inf:
RuntimeError: called a synchronizing CUDA operation

Expected Behavior

This sync should happen at most once, since lower_bound and upper_bound are constant.

  • Easy fix: add a flag on self recording whether the check has already run, and only perform it on the first call to transform or untransform (a rough sketch of this approach follows below)
  • More elegant fix: perform the check during __init__ (though this would require some refactoring)
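
A minimal sketch of the flag-based approach, using a hypothetical simplified stand-in class and a plain sigmoid squashing rather than gpytorch's actual internals:

import math

import torch


class IntervalSketch(torch.nn.Module):
    """Simplified stand-in for gpytorch.constraints.Interval (names and structure are assumptions)."""

    def __init__(self, lower_bound, upper_bound):
        super().__init__()
        self.register_buffer("lower_bound", torch.as_tensor(lower_bound).float())
        self.register_buffer("upper_bound", torch.as_tensor(upper_bound).float())
        self._bounds_validated = False  # flips to True after the first (and only) check

    def transform(self, tensor):
        if not self._bounds_validated:
            # This comparison is what forces the device-to-host sync, so run it only once.
            if self.upper_bound.max() == math.inf or self.lower_bound.min() == -math.inf:
                raise RuntimeError("Cannot transform with non-finite bounds.")
            self._bounds_validated = True
        # Squash the raw value into (lower_bound, upper_bound); sigmoid is just for illustration.
        return self.lower_bound + (self.upper_bound - self.lower_bound) * torch.sigmoid(tensor)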

System information

gpytorch: 1.9.1
torch: 1.13.1+cu116
OS: Linux

@mrcslws mrcslws added the bug label Jan 21, 2023
@Balandat
Collaborator

Thanks for flagging this. I do think that the right place for this check is __init__; the current design of checking the same condition on every pass is not great. An easy way to do this would be to add a "private" _construct method that does the bulk of the work and then call it from the individual constructors. I can put up a PR for this.
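
A rough sketch of that shape, with hypothetical names and a simplified setup (not the actual gpytorch code):

import math

import torch


class Interval(torch.nn.Module):
    def __init__(self, lower_bound, upper_bound):
        super().__init__()
        lower_bound = torch.as_tensor(lower_bound).float()
        upper_bound = torch.as_tensor(upper_bound).float()
        # The finite-bounds check runs once, at construction time, so transform()
        # never has to compare GPU tensors against math.inf.
        if lower_bound.min() == -math.inf or upper_bound.max() == math.inf:
            raise RuntimeError("Cannot make an Interval directly with non-finite bounds.")
        self._construct(lower_bound, upper_bound)

    def _construct(self, lower_bound, upper_bound):
        # Bulk of the setup, shared by Interval and its subclasses.
        self.register_buffer("lower_bound", lower_bound)
        self.register_buffer("upper_bound", upper_bound)


class GreaterThan(Interval):
    def __init__(self, lower_bound):
        # Bypass Interval.__init__ so the deliberately infinite upper bound is allowed.
        torch.nn.Module.__init__(self)
        self._construct(torch.as_tensor(lower_bound).float(), torch.tensor(math.inf))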

@Balandat
Collaborator

Actually, we should just be able to gate this check on the class of the instance.
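
That is, keep the check in __init__ but skip it for subclasses. A sketch, where the type(self) gating is an assumption about what "gate by the class of the instance" means:

import math

import torch


class Interval(torch.nn.Module):
    def __init__(self, lower_bound, upper_bound):
        super().__init__()
        lower_bound = torch.as_tensor(lower_bound).float()
        upper_bound = torch.as_tensor(upper_bound).float()
        # Only a plain Interval forbids infinite bounds; subclasses such as
        # GreaterThan/LessThan use an infinite bound on purpose, so they skip this.
        if type(self) is Interval and (
            lower_bound.min() == -math.inf or upper_bound.max() == math.inf
        ):
            raise RuntimeError("Cannot make an Interval directly with non-finite bounds.")
        self.register_buffer("lower_bound", lower_bound)
        self.register_buffer("upper_bound", upper_bound)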

Balandat added a commit that referenced this issue Jan 21, 2023
Addresses #2258, avoids repeatedly checking the same condition in each forward pass. Avoids forcing GPU synchronization.
gpleiss pushed a commit that referenced this issue Jan 30, 2023
* Move infinite interval bounds check into Interval constructor

Addresses #2258, avoids repeatedly checking the same condition in each forward pass. Avoids forcing GPU synchronization.

* Remove second obsolete check in forward, some type annotations.
@gpleiss
Member

gpleiss commented Jan 30, 2023

@mrcslws - it looks like #2259 fixes this issue? Re-open if not.

@gpleiss gpleiss closed this as completed Jan 30, 2023