Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

structure.to_ase_atoms() modifies structure.properties in place #4289

Closed
wolearyc opened this issue Feb 11, 2025 · 5 comments · Fixed by #4297
Closed

structure.to_ase_atoms() modifies structure.properties in place #4289

wolearyc opened this issue Feb 11, 2025 · 5 comments · Fixed by #4297
Labels

Comments

@wolearyc
Copy link
Contributor

wolearyc commented Feb 11, 2025

Python version

3.12.4

Pymatgen version

2025.1.24

Operating system version

MacOS

Current behavior

Problems arise when converting Structure's with spacegroup information embedded in structure.properties into ase.Atoms objects.

MINIMAL EXAMPLE

import json
from pymatgen.core import Structure

structure = Structure.from_dict( # apologies for the hardcoded dictionary
    {
        "@module": "pymatgen.core.structure",
        "@class": "Structure",
        "charge": 0,
        "lattice": {
            "matrix": [
                [4.256387634803149, 4.314320171475424e-32, -6.507506952342294e-32],
                [3.3969381290161766e-32, 4.256387634803149, -2.3941644265810876e-18],
                [-1.1858272430092808e-32, -2.3941644265810356e-18, 4.256387634803149],
            ],
            "pbc": (True, True, True),
            "a": 4.256387634803149,
            "b": 4.256387634803149,
            "c": 4.256387634803149,
            "alpha": 90.0,
            "beta": 90.0,
            "gamma": 90.0,
            "volume": 77.11227544547246,
        },
        "properties": {
            "spacegroup": {"number": 225, "setting": 1},
            "unit_cell": "conventional",
        },
        "sites": [
            {
                "species": [{"element": "Mg", "occu": 1}],
                "abc": [0.0, 0.0, 0.0],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 0,
                },
                "label": "Mg",
                "xyz": [0.0, 0.0, 0.0],
            },
            {
                "species": [{"element": "Mg", "occu": 1}],
                "abc": [3.03275477652898e-48, 0.5, 0.5],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 0,
                },
                "label": "Mg",
                "xyz": [1.1055554430034493e-32, 2.1281938174015744, 2.1281938174015744],
            },
            {
                "species": [{"element": "Mg", "occu": 1}],
                "abc": [0.5, 1.8441828973042882e-35, 0.5],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 0,
                },
                "label": "Mg",
                "xyz": [
                    2.1281938174015744,
                    -1.1970822132904962e-18,
                    2.1281938174015744,
                ],
            },
            {
                "species": [{"element": "Mg", "occu": 1}],
                "abc": [0.5, 0.5, 3.414593957670284e-35],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 0,
                },
                "label": "Mg",
                "xyz": [
                    2.1281938174015744,
                    2.1281938174015744,
                    -1.1970822132905761e-18,
                ],
            },
            {
                "species": [{"element": "O", "occu": 1}],
                "abc": [0.5, 0.5, 0.5],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 1,
                },
                "label": "O",
                "xyz": [2.1281938174015744, 2.1281938174015744, 2.1281938174015744],
            },
            {
                "species": [{"element": "O", "occu": 1}],
                "abc": [0.5, 5.727981906470981e-49, -1.1457103901908975e-48],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 1,
                },
                "label": "O",
                "xyz": [
                    2.1281938174015744,
                    2.1571600857377124e-32,
                    -3.2537534761711473e-32,
                ],
            },
            {
                "species": [{"element": "O", "occu": 1}],
                "abc": [2.3421825782682186e-48, 0.5, -1.976290217125112e-35],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 1,
                },
                "label": "O",
                "xyz": [
                    1.6984690645080894e-32,
                    2.1281938174015744,
                    -1.1970822132905438e-18,
                ],
            },
            {
                "species": [{"element": "O", "occu": 1}],
                "abc": [2.1774700692994772e-48, 1.8441828973042882e-35, 0.5],
                "properties": {
                    "selective_dynamics": [False, False, False],
                    "spacegroup_kinds": 1,
                },
                "label": "O",
                "xyz": [
                    -5.9291362150463945e-33,
                    -1.1970822132905178e-18,
                    2.1281938174015744,
                ],
            },
        ],
    }
)
print(structure.properties) # note the space group is represented by a dictionary
structure.to_json()  # works

structure.to_ase_atoms()   # convert to ase atoms
print(structure.properties) # uh oh! to_ase_atoms has a side effect!
structure.to_json()            # this throws an exception, as ase's `Spacegroup` object cannot be converted into json

OUTPUT

{'spacegroup': {'number': 225, 'setting': 1}, 'unit_cell': 'conventional'}
{'spacegroup': Spacegroup(225, setting=1), 'unit_cell': 'conventional'}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 137
    135 structure.to_ase_atoms()  # convert to ase atoms
    136 print(structure.properties)  # uh oh! to_ase_atoms has a side effect!
--> 137 structure.to_json()  # this throws an exception, as ase's `Spacegroup` object cannot be converted into json

File ~/Documents/ff-pipe/.venv/lib/python3.12/site-packages/monty/json.py:239, in MSONable.to_json(self)
    235 def to_json(self) -> str:
    236     """
    237     Returns a json string representation of the MSONable object.
    238     """
--> 239     return json.dumps(self, cls=MontyEncoder)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/encoder.py:200, in JSONEncoder.encode(self, o)
    196         return encode_basestring(o)
    197 # This doesn't pass the iterator directly to ''.join() because the
    198 # exceptions aren't as detailed.  The list call should be roughly
    199 # equivalent to the PySequence_Fast that ''.join() would do.
--> 200 chunks = self.iterencode(o, _one_shot=True)
    201 if not isinstance(chunks, (list, tuple)):
    202     chunks = list(chunks)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/encoder.py:258, in JSONEncoder.iterencode(self, o, _one_shot)
    253 else:
    254     _iterencode = _make_iterencode(
    255         markers, self.default, _encoder, self.indent, floatstr,
    256         self.key_separator, self.item_separator, self.sort_keys,
    257         self.skipkeys, _one_shot)
--> 258 return _iterencode(o, 0)

File ~/Documents/ff-pipe/.venv/lib/python3.12/site-packages/monty/json.py:669, in MontyEncoder.default(self, o)
    667     d = self._update_name_object_map(o)
    668 else:
--> 669     raise TypeError(
    670         f"Object of type {o.__class__.__name__} is not JSON serializable"
    671     )
    673 if "@module" not in d:
    674     d["@module"] = str(o.__class__.__module__)

TypeError: Object of type Spacegroup is not JSON serializable

Expected Behavior

The second call to to_json should not raise an exception. In addition, Structure.to_ase_atoms() should be side effect free.

Minimal example

See above.

Relevant files to reproduce this bug

No response

@wolearyc wolearyc added the bug label Feb 11, 2025
@benrich37
Copy link
Contributor

The minimal example also throws an error for me. It looks like this bug is occurring on lines 205 - 215 of pymatgen.io.ase

       # Atoms.info <---> Structure.properties
        if properties := structure.properties:
            atoms.info =  properties # This links values in atoms.info to the corresponding values in structure.properties

        # Regenerate Spacegroup object from `.todict()` representation
        if isinstance(atoms.info.get("spacegroup"), dict):
            atoms.info["spacegroup"] = Spacegroup( # So now here atoms.info["spacegroup"] is actually being changed by changing structure.properties["spacegroup"]
                atoms.info["spacegroup"]["number"],
                setting=atoms.info["spacegroup"].get("setting", 1),
            )

The following is a quick change that fixes the error raised by your example, but I have not yet ran the test-checks with this change implemented.

        # Atoms.info <---> Structure.properties
        if properties := structure.properties:
            # Assign a copy to unlink the keys in atoms.info to structure.properties
            atoms.info = properties.copy()

        # Regenerate Spacegroup object from `.todict()` representation
        if isinstance(atoms.info.get("spacegroup"), dict):
            atoms.info["spacegroup"] = Spacegroup(
                atoms.info["spacegroup"]["number"],
                setting=atoms.info["spacegroup"].get("setting", 1),
            )

I know people tend to like to keep thinks light, so it might be better to just make "spacegroup" an exception to the properties set on atoms as links to avoid the in-place changing.

        # Atoms.info <---> Structure.properties
        # Skip the following keys who's values in atoms.info are subject to change
        # to avoid in-place modification of the structure object.
        keys_to_skip = ["spacegroup"]
        # Walrus assign properties from structure.properties with truthy values and not in keys_to_skip
        if properties := {k: v for k, v in structure.properties.items() if v and k not in keys_to_skip}:
            atoms.info = properties

        # Regenerate Spacegroup object from `.todict()` representation
        if isinstance(structure.properties.get("spacegroup"), dict):
            atoms.info["spacegroup"] = Spacegroup(
                structure.properties["spacegroup"]["number"],
                setting=structure.properties["spacegroup"].get("setting", 1),
            )

@Andrew-S-Rosen
Copy link
Member

Happy to review a PR for this!

@wolearyc
Copy link
Contributor Author

wolearyc commented Feb 15, 2025

Thanks for the fix @benrich37! I would have opened a PR, but was unfortunately pretty swamped this week :)

My vote is for your first solution where properties is simply copied. We could perhaps use deepcopy here to handle the case where atoms.info includes collections, etc. I think this solution has simpler control flow and, critically, does not require us to anticipate all possible 'special' keys in Structure.properties. I personally don't think Atoms objects generated by to_ase_atoms() should ever share memory with original Structure objects, and I think using deepcopy will move us closer to that ideal.

@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented Feb 15, 2025

As the person who wrote much of the ASE converter, I wholeheartedly agree with you that nothing should be mutated! That was an oversight on my part. I also agree that a deepcopy would be wise.

@wolearyc
Copy link
Contributor Author

wolearyc commented Feb 19, 2025

Just submitted a PR - turns out there was a similar issue with get_structure, so I fixed that too and added a couple tests. Thanks @Andrew-S-Rosen for your work on the converter! It makes my life quite a bit easier :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants