Skip to content

Avoid merging if a structure has only one site #4378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 23, 2025

Conversation

kmu
Copy link
Contributor

@kmu kmu commented Apr 23, 2025

Summary

This PR fixes an error that occurs when calling merge_sites on a structure with only one site. For example:

from pymatgen.core import Structure, Lattice
cu = Structure(Lattice.from_parameters(2.5, 2.5, 2.5, 60, 60, 60), species=["Cu"], coords=[[0.0, 0.0, 0.0]])
cu.merge_sites(mode="delete")

This currently raises the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mrok/work/pymatgen/.venv/lib/python3.11/site-packages/pymatgen/core/structure.py", line 4840, in merge_sites
    clusters = fcluster(linkage(squareform((dist_mat + dist_mat.T) / 2)), tol, "distance")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mrok/work/pymatgen/.venv/lib/python3.11/site-packages/scipy/cluster/hierarchy.py", line 1033, in linkage
    n = int(distance.num_obs_y(y))
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mrok/work/pymatgen/.venv/lib/python3.11/site-packages/scipy/spatial/distance.py", line 2742, in num_obs_y
    raise ValueError("The number of observations cannot be determined on "
ValueError: The number of observations cannot be determined on an empty distance matrix.

The root cause is that the distance matrix is a 1×1 array [[0.0]], which leads to a failure in scipy.cluster.hierarchy.linkage.

This PR resolves the issue by skipping the merging step and returning the original structure unchanged when only one site is present.

A unit test is also added to verify this behavior and prevent regressions.

Todos

No more todos.

Checklist

  • Google format doc strings added. Check with ruff.
  • Type annotations included. Check with mypy.
  • Tests added for new features/fixes.
  • If applicable, new classes/functions/modules have duecredit @due.dcite decorators to reference relevant papers by DOI (example)

Tip: Install pre-commit hooks to auto-check types and linting before every commit:

pip install -U pre-commit
pre-commit install

@kmu kmu requested review from shyuep and mkhorton as code owners April 23, 2025 13:50
@shyuep shyuep merged commit 040b2c2 into materialsproject:master Apr 23, 2025
41 of 43 checks passed
@shyuep
Copy link
Member

shyuep commented Apr 23, 2025

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants