
Fix device assignment in get_device_name for distributed training #3303

Merged 2 commits on Apr 3, 2025
14 changes: 9 additions & 5 deletions sentence_transformers/util.py
@@ -1517,18 +1517,22 @@ def wrapper(self, *args, **kwargs):
     return wrapper
 
 
-def get_device_name() -> Literal["mps", "cuda", "npu", "hpu", "cpu"]:
+def get_device_name() -> str:
     """
     Returns the name of the device where this module is running on.
 
-    It's a simple implementation that doesn't cover cases when more powerful GPUs are available and
-    not a primary device ('cuda:0') or MPS device is available, but not configured properly.
+    This function only supports single device or basic distributed training setups.
+    In distributed mode for cuda device, it uses the rank to assign a specific CUDA device.
 
     Returns:
-        str: Device name, like 'cuda' or 'cpu'
+        str: Device name, like 'cuda:2', 'mps', 'npu', 'hpu', or 'cpu'
     """
     if torch.cuda.is_available():
-        return "cuda"
+        if torch.distributed.is_initialized():
+            local_rank = torch.distributed.get_rank()
+        else:
+            local_rank = int(os.environ.get("LOCAL_RANK", 0))
+        return f"cuda:{local_rank}"
     elif torch.backends.mps.is_available():
         return "mps"
     elif is_torch_npu_available():
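
For reference, a minimal sketch of how the updated device selection behaves under a multi-GPU launch. The two-GPU setup, the torchrun command, and the demo.py script name are illustrative assumptions, not part of this PR; only get_device_name comes from sentence_transformers/util.py.

# Hypothetical demo.py, launched with: torchrun --nproc_per_node=2 demo.py
import os

import torch
from sentence_transformers.util import get_device_name


def main() -> None:
    # Before init_process_group() runs, get_device_name() falls back to the
    # LOCAL_RANK environment variable that torchrun sets for each process.
    print(f"LOCAL_RANK={os.environ.get('LOCAL_RANK')} -> {get_device_name()}")  # e.g. cuda:0 / cuda:1

    if torch.cuda.is_available() and not torch.distributed.is_initialized():
        torch.distributed.init_process_group(backend="nccl")

    # Once the process group is initialized, the rank reported by
    # torch.distributed.get_rank() determines the returned device.
    if torch.distributed.is_initialized():
        print(f"rank {torch.distributed.get_rank()} -> {get_device_name()}")
        torch.distributed.destroy_process_group()


if __name__ == "__main__":
    main()

With this change, each rank reports its own cuda:{rank} device instead of every rank returning the bare "cuda" (i.e. cuda:0).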