Description
Description
There is an issue with the calculation of formula_units in the PhononBSDOSDoc class within the file atomate2.common.schemas.phonons. The current implementation is as follows:
formula_units = (
structure.composition.num_atoms
/ structure.composition.reduced_composition.num_atoms
)
The problem arises because the reduced_composition in pymatgen does not always reduce formulas to the minimum number of elements. Instead, it reduces them to the "most stable molecule" as defined in pymatgen.core.composition -> Composition.special_formulas.
This leads, for these elements, to an incorrect formula_units, which is half of the expected value.
In case the structure is composed of only one atom, the formula unit is non-integer.
To Reproduce
Taking, for instance, this structure in xsf format:
CRYSTAL
PRIMVEC
0.00000000000000 1.56088551189537 1.56088551189537
1.56088551189537 0.00000000000000 1.56088551189537
1.56088551189537 1.56088551189537 0.00000000000000
PRIMCOORD
1 1
7 0.00000000000000 0.00000000000000 0.00000000000000
this code snippet
from pymatgen.core import Structure
from pymatgen.core.composition import Composition
structure = Structure.from_file('N-fcc.xsf')
print(f"Composition: {structure.composition} with {structure.composition.num_atoms} atoms\n\
Reduced composition: {structure.composition.reduced_composition} with {structure.composition.reduced_composition.num_atoms} atoms.\n\
formula_units: {structure.composition.num_atoms/structure.composition.reduced_composition.num_atoms}")
provides:
Composition: N1 with 1.0 atoms
Reduced composition: N2 with 2.0 atoms.
formula_units: 0.5
Expected behavior
The expected behavior would be to have a reduced composition N1 and so 1 atom. The formula_units would then be 1.
Proposed solution
Either to discuss the behavior of get_reduced_formula_and_factor at the pymatgen level (https://github.com/materialsproject/pymatgen/blob/4f536fc6c1798740a0f1402b0456675011706d75/src/pymatgen/core/composition.py#L428)
or change the definition of formula_units to
from pymatgen.core.ion import Ion
formula_units = (
structure.composition.num_atoms
/ Ion(structure.composition).reduced_composition.num_atoms
)
We are encountering the same problem with the outputdoc of the ABINIT phonon workflow, and we think we should find a common solution for all the atomate2 workflows.