Releases: materialsproject/pymatgen
v2025.6.14
- Treat LATTICE_CONSTRAINTS as is for INCARs.
- PR #4425 JDFTXOutfileSlice.trajectory revision by @benrich37
Major changes:
- feature 1: JDFTXOutfileSlice.trajectory is now initialized with frame_properties set
-- JOutStructure.properties filled with relevant data for frame_properties
-- More properties added to JOutStructure.site_properties
Todos
- Remove class attributes in JOutStructure now redundant to data stored in JOutStructure.properties and JOutStructure.site_properties
- PR #4431 Single source of truth for POTCAR directory structure by @esoteric-ephemera
Modifies the pymatgen CLI to use the same POTCAR library directory structure as in pymatgen.io.vasp.inputs to close #4430. Possibly breaking from the CLI side (the directory structure will change)
Pinging @mkhorton since #4424 was probably motivated by similar concerns?
- PR #4433 Speed up symmetry functions with faster is_periodic_image algorithm by @kavanase
I noticed that in some of our doped testing workflows, SpacegroupAnalyzer.get_primitive_standard_structure() is one of the main bottlenecks (as to be expected). One of the dominant cost factors here is the usage of is_periodic_image, which can be expensive for large structures due to many np.allclose() calls.
This PR implements a small change to instead use an equivalent (but faster) pure Python loop, which also breaks early if the tolerance is exceeded.
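As a rough illustration of the idea (not the code merged in the PR), an early-exit periodic-image check could look like the sketch below. The function name is_periodic_image_sketch, its default tolerance, and the assumption that all three axes are periodic are illustrative choices.

```python
def is_periodic_image_sketch(fcoords1, fcoords2, tolerance=1e-8):
    """Return True if two fractional coordinates differ by a lattice vector.

    Hypothetical helper for illustration; assumes periodicity along every axis.
    """
    for a, b in zip(fcoords1, fcoords2):
        diff = a - b
        # Distance of the difference from its nearest integer.
        if abs(diff - round(diff)) > tolerance:
            return False  # break early: this axis already fails
    return True


same_site = is_periodic_image_sketch([0.1, 0.2, 0.3], [1.1, -0.8, 2.3])
different_site = is_periodic_image_sketch([0.1, 0.2, 0.3], [0.6, 0.2, 0.3])
```

The loop stops at the first axis that exceeds the tolerance, so most non-matching pairs are rejected after a single subtraction instead of a full array comparison.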
In my test case, this reduced the time spent on is_periodic_image (and thus SpacegroupAnalyzer.get_primitive_standard_structure()) from 35s to 10s.
- PR #4432 Fingerprint sources by @JaGeo
Add correct papers to tanimoto fingerprints
- PR #4061 Fix branch directory check in io.vasp.outputs.get_band_structure_from_vasp_multiple_branches by @DanielYang59
Summary
- Fix branch directory check in io.vasp.outputs.get_band_structure_from_vasp_multiple_branches, to fix #4060
- Improve unit test (waiting for data, I don't have experience with "VASP multi-branch bandstructure calculation")
- PR #4409 Packmol constraints by @davidwaroquiers
Added possibility to set individual constraints in packmol.
Added some sanity checks.
Added unit tests.
- PR #4428 Fixes a bug in NanoscaleStability.plot_one_stability_map and plot_all_stability_map. by @kmu
Major changes:
- Replaced incorrect ax.xlabel() and ax.ylabel() calls with correct ax.set_xlabel() and ax.set_ylabel().
- Added ax.legend() to plot_all_stability_map so that labels passed via ax.plot(..., label=...) are displayed.
- Added test_plot() to test_surface_analysis.py.
- PR #4424 Add additional name mappings for new LDA v64 potcars by @mkhorton
As title.
- PR #4426 Fix uncertainty as int for EnergyAdjustment by @DanielYang59
- Avoid == or != for possible float comparison
- Fix EnergyAdjustment being unable to generate its repr when uncertainty is an int:
    from pymatgen.entries.computed_entries import EnergyAdjustment
    print(EnergyAdjustment(10, uncertainty=0))
Gives:
    Traceback (most recent call last):
      File "/Users/yang/developer/pymatgen/test_json.py", line 25, in <module>
        print(EnergyAdjustment(10, uncertainty=0))
      File "/Users/yang/developer/pymatgen/src/pymatgen/entries/computed_entries.py", line 108, in __repr__
        return f"{type(self).__name__}({name=}, {value=:.3}, {uncertainty=:.3}, {description=}, {generated_by=})"
        ^^^^^^^^^^^^^^^^^
    ValueError: Precision not allowed in integer format specifier
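The error can be reproduced without pymatgen, since it comes from applying a precision format spec to an int. The sketch below shows the failure and one possible workaround (coercing to float before formatting). The helper name format_sig3 is hypothetical, and this is not necessarily how the PR fixes it.

```python
def format_sig3(value):
    """Format a number with the ".3" spec, accepting ints as well as floats.

    Hypothetical helper: coerces to float because the ".3" precision
    is rejected for int values.
    """
    return f"{float(value):.3}"


zero = 0  # an int, like uncertainty=0 in the report above
try:
    f"{zero:.3}"
    error_message = ""
except ValueError as exc:
    error_message = str(exc)

formatted_zero = format_sig3(zero)
```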
- PR #4421 Cache Lattice property (lengths/angles/volume) for much faster Structure.as_dict by @DanielYang59
Summary
- lengths/angles/volume of Lattice would now be cached, related to #4385
- verbosity in as_dict of PeriodicSite/Lattice now explicitly requires literal 0 or 1 to be consistent with docstring, instead of checking if verbosity > 0 (currently in grace period, only warning issued) https://github.com/materialsproject/pymatgen/blob/34608d0b92166e5fc4a9dd52ed465ae7dccfa525/src/pymatgen/core/lattice.py#L904-L905
Cache frequently used Lattice properties
Currently lengths/angles/volume are not cached even though they are used frequently; for example, accessing all lattice-parameter-related properties leads to lengths/angles being recalculated repeatedly: https://github.com/materialsproject/pymatgen/blob/34608d0b92166e5fc4a9dd52ed465ae7dccfa525/src/pymatgen/core/lattice.py#L475-L524
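To illustrate why caching helps here, the following is a minimal sketch using functools.cached_property on a stand-in class. CachedLatticeSketch and its evaluation counter are hypothetical and are not pymatgen's Lattice. The sketch also assumes the matrix is never mutated after construction.

```python
import math
from functools import cached_property


class CachedLatticeSketch:
    """Hypothetical stand-in for a lattice with cached derived properties."""

    def __init__(self, matrix):
        # Stored as nested tuples so the cached values cannot go stale.
        self.matrix = tuple(tuple(float(x) for x in row) for row in matrix)
        self.n_length_evals = 0  # only here to make the caching visible

    @cached_property
    def lengths(self):
        # Computed on first access, then served from the instance cache.
        self.n_length_evals += 1
        return tuple(math.sqrt(sum(x * x for x in row)) for row in self.matrix)

    @property
    def a(self):
        return self.lengths[0]

    @property
    def b(self):
        return self.lengths[1]

    @property
    def c(self):
        return self.lengths[2]


lattice = CachedLatticeSketch([[3, 0, 0], [0, 4, 0], [0, 0, 5]])
abc = (lattice.a, lattice.b, lattice.c)
```

Accessing a, b and c triggers a single evaluation of lengths rather than three.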
structure.as_dict now around 8x faster
Before (1000 structure, each has 10-100 atoms):
    Total time: 3.29617 s
    File: create_dummp_json_structure.py
    Function: generate_and_save_structures at line 34

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
        34                                           @profile
        35                                           def generate_and_save_structures(n, output_dir):
        36         1         20.0     20.0      0.0      os.makedirs(output_dir, exist_ok=True)
        37
        38      1001        222.0      0.2      0.0      for i in range(n):
        39      1000     291789.0    291.8      8.9          structure = generate_dummy_structure()
        40      1000        583.0      0.6      0.0          filename = f"structure_{i:04d}.json.gz"
        41      1000       1942.0      1.9      0.1          filepath = os.path.join(output_dir, filename)
        42
        43      2000     224549.0    112.3      6.8          with gzip.open(filepath, "wb") as f:
        44      1000    2761900.0   2761.9     83.8              dct = structure.as_dict()
        45      1000      15163.0     15.2      0.5              f.write(orjson.dumps(dct))
Now:
    Total time: 0.949622 s
    File: create_dummp_json_structure.py
    Function: generate_and_save_structures at line 34

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
        34                                           @profile
        35                                           def generate_and_save_structures(n, output_dir):
        36         1         37.0     37.0      0.0      os.makedirs(output_dir, exist_ok=True)
        37
        38      1001        195.0      0.2      0.0      for i in range(n):
        39      1000     286696.0    286.7     30.2          structure = generate_dummy_structure()
        40      1000        511.0      0.5      0.1          filename = f"structure_{i:04d}.json.gz"
        41      1000       1843.0      1.8      0.2          filepath = os.path.join(output_dir, filename)
        42
        43      2000     214130.0    107.1     22.5          with gzip.open(filepath, "wb") as f:
        44      1000     431677.0    431.7     45.5              dct = structure.as_dict()
        45      1000      14533.0     14.5      1.5              f.write(orjson.dumps(dct))
Also note lattice (the performance bottleneck) is not used in the dict for site: https://github.com/materialsproject/pymatgen/blob/34608d0b92166e5fc4a9dd52ed465ae7dccfa525/src/pymatgen/core/structure.py#L2856-L2857
So we could modify as_dict to control whether lattice would be generated at all
This could reduce the runtime slightly so I guess it's not worth the effort:
    Total time: 0.867376 s
- PR #4391 Add custom as_dict/from_dict method for proper initialization of attributes of IcohpCollection by @naik-aakash
Currently IcohpCollection instance is not serialized correctly, thus I added custom from_dict and as_dict methods here.
v2025.5.28
- PR #4411 Add orjson as required dependency as default JSON handler when custom encoder/decoder is not needed by @DanielYang59
- PR #4417 adding radd dunder method to Volumetric data + test_outputs by @wladerer
- PR #4418 JDFTXOutfileSlice Durability Improvement by @benrich37
Major changes:
- feature 1: Improved durability of JDFTXOutfileSlice._from_out_slice method (less likely to error out on unexpected termination)
-- So long as one step of electronic minimization has started on an out file slice, parsing shouldn't error out
- fix 1: Allow partially dumped eigstats
- fix 2: Added missing optical band gap dumped by eigstats
- fix 3: Protect the final JOutStructure in initializing a JOutStructures with a try/except block
- fix 4: Detect if positions were only partially dumped and revert to data from init_structure in JOutStructure
- fix 5: Prevent partially dumped matrices from being used in initializing a JOutStructure
Todos
- feature 1: Ensure parse-ability as long as a JDFTXOutfileSlice.infile can be initialized
- PR #4419 Fix Molecule.get_boxed_structure when reorder=False by @gpetretto
- PR #4416 JDFTXInfile Comparison Methods by @benrich37
Major changes:
- feature 1: Convenience methods for comparing JDFTXInfile objects
-- JDFTXInfile.is_comparable_to
--- Returns True if at least one tag is found different
--- Optional arguments exclude_tags, exclude_tag_categories, ensure_include_tags to ignore certain tags in the comparison
---- exclude_tag_categories defaults to ["export", "restart", "structure"] as "export" and "restart" are very rarely pertinent to comparability, "structure" as subtags of this category are generally the one thing being intentionally changed in comparisons (ie different local minima or a slab with/without an adsorbate)
-- JDFTXInfile.get_filtered_differing_tags
--- What is used in JDFTXInfile.is_comparable_to to get filtered differing tags between JDFTXInfile objects
--- Convenient as a "verbose" alternative to JDFTXInfile.is_comparable_to
-- AbstractTag.is_equal_to and AbstractTag._is_equal_to
--- Used in tag comparison for finding differing tags
--- AbstractTag._is_equal_to is an abstract method that must be implemented for each AbstractTag inheritor
- feature 2: Default JDFTXInfile object pymatgen.io.jdftx.inputs.ref_infile
-- Initialized from reading default JDFTx settings from pymatgen.io.jdftx.jdftxinfile_default_inputs.default_inputs: dict
-- Used in JDFTXInfile.get_differing_tags_from for tags in self missing from other that are identical to the default setting
- fix 1: Re-ordered contents of JDFTXInfile to follow the order: magic methods -> class methods / transformation methods -> validation methods -> properties -> private methods
- fix 2: Checking for 'selective_dynamics' in site_properties for a Structure passed in JDFTXInfile.from_structure (used if selective_dynamics argument left as None)
Todos
- feature 1: Add examples to documentation on how to properly use new comparison methods
- feature 2: Improve the mapping of TagContainers to their default values
-- The current implementation of comparison for tags to default values only works if the tag as written exactly matches the full default value - at the very least the missing subtags of a partially filled TagContainer needs to be filled with the default values before comparing to the full default value
-- Some subtags also change depending on the other subtags present for a particular tag (ie convergence threshold depending on algorithm specified for 'fluid-minimize'), so an improved mapping for dynamic default values needs to be implemented
- PR #4413 JDFTXOutputs.bandstructure: BandStructure by @benrich37
Major changes:
- feature 1: Added 'kpts' storable variable to JDFTXOutputs
-- Currently only able to obtain from the 'bandProjections' file
- feature 2: Added bandstructure attribute to JDFTXOutputs
-- Standard pymatgen BandStructure object
-- Request-able as a store_var, but functions slightly differently
--- Ensures 'eigenvals' and 'kpts' are in store_vars and then is deleted
-- Initialized if JDFTXOutputs has successfully stored at least 'kpts' and 'eigenvals'
-- Fills projections field if also has stored 'bandProjections'
- feature 3: Added wk_list to JDFTXOutputs
-- List of weights for each k-point
-- Currently doesn't have a use, but will be helpful for ElecData initializing in crawfish
Todos
- feature 1: Add reading 'kpts' from the standalone 'kPts' file dumped by JDFTx
- feature 2: Outline how we might initialize BandStructureSymmLine(s) for calculations with explicitly defined 'kpoint' tags, as using 'kpoint's instead of kpoint-folding is most likely an indicator of a band-structure calculation
- PR #4415 speed-up Structure instantiation by @danielzuegner
This PR speeds up the instantiation of Structure objects by preventing hash collisions in the lru_cache of get_el_sp and increasing its maxsize. The issue is that currently Element objects are hashed to the same value as the integer atomic numbers (e.g., Element[H] maps to the same hash as int(1)). This forces the lru_cache to perform an expensive __eq__ comparison between the two, which reduces the performance of instantiating many Structure objects. Also here we increase the maxsize of get_el_sp's lru_cache to 1024 for further performance improvements.
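The general mechanism can be shown with a plain dict, which (like any hash-based cache) only falls back to __eq__ when two keys share a hash. The classes below are hypothetical stand-ins rather than pymatgen's Element, and the "salted" hash is just one way to avoid the collision, not necessarily the one used in the PR.

```python
class ElementLike:
    """Hypothetical element whose hash equals its atomic number."""

    eq_calls = 0  # counts how often __eq__ is invoked for this class

    def __init__(self, z):
        self.z = z

    def __hash__(self):
        return self.z  # collides with hash(int(z))

    def __eq__(self, other):
        type(self).eq_calls += 1
        return isinstance(other, ElementLike) and self.z == other.z


class SaltedElementLike(ElementLike):
    """Same element, but hashed together with a tag so it no longer collides with ints."""

    eq_calls = 0

    def __hash__(self):
        return hash(("element", self.z))


# A hash-based cache that already holds integer atomic numbers as keys.
cache = {z: f"int-{z}" for z in range(1, 119)}

cache.get(ElementLike(1))        # same hash as the key 1, so __eq__ is consulted
cache.get(SaltedElementLike(1))  # different hash, so no comparison is needed
```

After the two lookups, ElementLike.eq_calls is 1 while SaltedElementLike.eq_calls stays at 0, which is the saving described above, multiplied over many lookups.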
This reduces time taken to instantiate 100,000 Structure objects from 31 seconds to 8.7s (avoid hash collisions) to 6.1s (also increase maxsize to 1024).
- PR #4410 JDFTx Inputs - boundary value checking by @benrich37
Major changes:
- feature 1: Revised boundary checking for input tags
-- Added a validate_value_bounds method to AbstractTag, that by default always returns True, ""
-- Added an alternate AbstractNumericTag that inherits AbstractTag to implement validate_value_bounds properly
--- Changed boundary storing to the following fields
---- ub and lb
----- Can either be None, or some value to indicate an upper or lower bound
---- ub_incl and lb_incl
----- If True, applies >= instead of > in comparative checks on upper and lower bounds
-- Switched inheritance of FloatTag and IntTag from AbstractTag to AbstractNumericTag
-- Implemented validate_value_bounds for TagContainer to dispatch checking for contained subtags
-- Added a method validate_boundaries to JDFTXInfile to run validate_value_bounds on all contained tags and values
-- Added validate_value_boundaries argument for initialization methods of JDFTXInfile, which will run validate_boundaries after initializing JDFTXInfile but before returning when True
--- Note that this is explicitly disabled when initializing a JDFTXInfile from the input summary in a JDFTXOutfileSlice - boundary values may exist internally in JDFTx for non-inclusive bounded tags as the default values, but cannot be passed in the input file. For this reason, errors on boundary checking must be an easily disabled feature for the construction and manipulation of a JDFTXInfile, but out-of-bounds values must never be written when writing a file for passing to JDFTx.
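A minimal sketch of how the ub/lb and ub_incl/lb_incl fields described above could drive a bounds check is given below. NumericBoundsSketch is a hypothetical stand-in, not the AbstractNumericTag class itself. The method signature and the error messages are assumptions. Only the field names and the (bool, str) return shape follow the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NumericBoundsSketch:
    """Hypothetical holder of optional lower/upper bounds for a numeric tag."""

    lb: Optional[float] = None  # None means "no lower bound"
    ub: Optional[float] = None  # None means "no upper bound"
    lb_incl: bool = True        # True: value == lb is allowed
    ub_incl: bool = True        # True: value == ub is allowed

    def validate_value_bounds(self, value: float) -> Tuple[bool, str]:
        """Return (is_valid, message); message is empty when the value is valid."""
        if self.lb is not None:
            above = value >= self.lb if self.lb_incl else value > self.lb
            if not above:
                op = ">=" if self.lb_incl else ">"
                return False, f"{value} must be {op} {self.lb}"
        if self.ub is not None:
            below = value <= self.ub if self.ub_incl else value < self.ub
            if not below:
                op = "<=" if self.ub_incl else "<"
                return False, f"{value} must be {op} {self.ub}"
        return True, ""


# A tag whose minimum exists internally but may not be passed explicitly.
strictly_positive = NumericBoundsSketch(lb=0.0, lb_incl=False)
at_bound = strictly_positive.validate_value_bounds(0.0)
inside = strictly_positive.validate_value_bounds(0.5)
```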
Todos
- feature 1
-- Implement some way boundary checking can run when adding tags to a pre-existing JDFTXInfile object
--- boundary checking is currently only run when initializing from a pre-existing collection of input tags
--- writing this into JDFTXInfile.__setitem__ is too extreme as it would require adding an attribute to JDFTXInfile to allow disabling the check
--- the better solution would be to implement a more obvious user-friendly method for reading in additional inputs so that the user doesn't need to learn how to properly write out the dictionary representation of complicated tag containers.
-- Fill out reference tags for other unimplemented boundaries
- PR #4408 to_jdftxinfile method for JDFTXOutfile by @benrich37
Major changes:
- feature 1: Method to_jdftxinfile for JDFTXOutfile(Slice)
-- Uses internal JDFTXInfile and Structure to create a new JDFTXInfile object that can be run to restart a calculation
- feature 2: Method strip_structure_tags for JDFTXInfile
-- Strips all structural tags from a JDFTXInfile for creating equivalent JDFTXInfile objects with updated associated structures
- fix 1: Changing 'nAlphaAdjustMax' shared tag from a FloatTag to an IntTag
- fix 2: Adding an optional minval field for certain FloatTags which prevents writing error-raising values
-- Certain tag options in JDFTx can internally be the minimum value, but trying to pass the minimum value will raise an error
Todos
- feature 1: Testing for the to_jdftxinfile
-- I know the function works from having used it, but I haven't written in an explicit test for it yet.
- fix 2: Look through JDFTx source code and identify all the numeric tag value boundaries and add them to the FloatTag. This will likely require generalizing how boundaries are tested as a quick glance (see here) shows there are tags that actually do use the >= operator
- Unrelated: Reduce bloat in outputs module
-- Remove references to deprecated fields
-- Begin phasing out redundant fields
--- i.e. JDFTXOutfile.lattice redundant to JDFTXOutfile.structure.lattice.matrix
-- Generalize how optimization logs are stored in outputs module objects
--- Fields like grad_K are part of a broad group of values that can be logged for an optimization step, and...
v2025.5.2
v2025.5.1
- lxml is now used for faster Vasprun parsing.
- Minor bug fix for MPRester.get_entries summary_data for larger queries.
- New JSON for ptable with better value/unit handling (also slightly faster) (@DanielYang59)
- Handle missing trailing newline in ICOHPLIST.lobster (@alibh95)
- Updated MVLSlabSet with MPSurfaceSet parameters from atomate1 (@abhardwaj73)
v2025.4.24
- Structure now has a calc_property method that enables one to get a wide range of elasticity, EOS, and phonon properties using matcalc. Requires matcalc to be installed.
- Bug fix and expansion of pymatgen.ext.matproj.MPRester. Now property_data is always consistent with the returned entry in get_entries. Summary data, which is not always consistent but is more comprehensive, can be obtained via a summary_data kwarg.
- PR #4378 Avoid merging if a structure has only one site by @kmu
This PR fixes an error that occurs when calling merge_sites on a structure with only one site. For example:
- PR #4372 Reapply update to ptable vdw radii CSV source and JSON with CRC handbook by @DanielYang59
- Update ptable vdw radii CSV source, to fix #4370
- Revert #4345 and apply changes to CSV
vdw radii data source:
John R. Rumble, ed., CRC Handbook of Chemistry and Physics, 105th Edition (Internet Version 2024), CRC Press/Taylor & Francis, Boca Raton, FL.
If a specific table is cited, use the format: "Physical Constants of Organic Compounds," in CRC Handbook of Chemistry and Physics, 105th Edition (Internet Version 2024), John R. Rumble, ed., CRC Press/Taylor & Francis, Boca Raton, FL.
v2025.4.20
- Updated perturb method to be in parity for Structure and Molecule.
- PR #4226 Fix file existence check in ChargemolAnalysis to verify directory instead. by @lllangWV
- PR #4324 GibbsComputedStructureEntry update to handle float temperature values by @slee-lab
- PR #4303 Fix mcl kpoints by @dgaines2
Fixed errors in two of the k-points for the MCL reciprocal lattice (according to Table 16 in Setyawan-Curtarolo 2010)
M2 and D1 aren't included in the recommended k-point path, but third-party software that plots k-point paths using pymatgen labelled M2 in the path instead of M1 due to it being the "same" k-point.
- PR #4344 Update "electron affinities" in periodic_table.json by @DanielYang59
- PR #4365 Python 3.13 support by @DanielYang59
v2025.4.19
- MPRester.get_entries and get_entries_in_chemsys now supports property_data. inc_structure, conventional_only and
- PR #4367 fix perturb bug that displaced all atoms equally by @skasamatsu
- PR #4361 Replace pybtex with bibtexparser by @DanielYang59
- PR #4362 fix(MVLSlabSet): convert DIPOL vector to pure Python list before writing INCAR by @atulcthakur
- PR #4363 Ensure actual_kpoints_weights is list[float] and add test by @kavanase
- PR #4345 Fix inconsistent "Van der waals radius" and "Metallic radius" in core.periodic_table.json by @DanielYang59
- PR #4212 Deprecate PymatgenTest, migrate tests to pytest from unittest by @DanielYang59
v2025.4.17
- Bug fix for list based searches in MPRester.
v2025.4.16
- Major new feature and breaking change: Legacy MP API is no longer supported. Pymatgen also no longer supports mp-api in the backend. Instead, Pymatgen's MPRester now has nearly 100% feature parity with mp-api's document searches. One major difference is that pymatgen's MPRester will follow the documented REST API end points exactly, i.e., users just need to refer to https://api.materialsproject.org/docs for the exact field names.
- PR #4360 Speed up Vasprun parsing by @kavanase
- PR #4343 Drop duplicate iupac_ordering entries in core.periodic_table.json by @DanielYang59
- PR #4348 Remove deprecated grain boundary analysis by @DanielYang59
- PR #4357 Fix circular import of SymmOp by @DanielYang59
v2025.4.10
- Parity with MPRester.materials.summary.search in MPResterBasic.
- PR #4355 Fix round-trip constraints handling of AseAtomsAdaptor by @yantar92
- src/pymatgen/io/ase.py (AseAtomsAdaptor.get_structure): When no explicit constraint is given for a site in ASE Atoms object, use "T T T" selective dynamics (no constraint). The old code is plain wrong.
- tests/io/test_ase.py (test_back_forth): Add new test case.
Fixes #4354.
Thanks to @yaoyi92 for reporting!
- PR #4352 Replace to_dict with as_dict by @DanielYang59
- Deprecate to_dict in favor of as_dict, to close #4351
- Updated contributing.md to note preferred naming convention
- Regenerate docs
The recently added JDFTx IOs have a relatively short grace period of 6 months, and others have one year
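For readers unfamiliar with this kind of grace period, the sketch below shows a generic deprecation alias built on the standard warnings module. SerializableSketch is a hypothetical class, and pymatgen's own deprecation machinery may differ. The sketch only shows the old name forwarding to the new one while warning.

```python
import warnings


class SerializableSketch:
    """Hypothetical class keeping a deprecated to_dict alias during a grace period."""

    def __init__(self, payload):
        self.payload = payload

    def as_dict(self):
        # Preferred name going forward.
        return {"payload": self.payload}

    def to_dict(self):
        # Old name: still works, but warns and forwards to as_dict.
        warnings.warn(
            "to_dict is deprecated, use as_dict instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.as_dict()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    legacy = SerializableSketch(42).to_dict()
```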
- PR #4342 Correct Mn "ionic radii" in core.periodic_table.json by @DanielYang59
- Correct Mn "ionic radii" in core.periodic_table.json
Our csv parser should copy the high spin ionic radii to the base entry:
https://github.com/materialsproject/pymatgen/blob/4c7892f5c9dcc51a1389b3ad2ada77632989a13e/dev_scripts/update_pt_data.py#L84-L87
- PR #4341 Remove "Electrical resistivity" for Se as "high" by @DanielYang59
Summary
- Remove "Electrical resistivity" for Se as "high", to fix #4312
Currently the data for Electrical resistivity of Se is "high" (with 10^-8 Ω m as the unit), and our parser would interpret it to:
    from pymatgen.core import Element
    print(Element.Se.electrical_resistivity)  # 1e-08 m ohm
This is the only data as "high" in periodic_table.json AFAIK.
After this, it would be None with a warning:
    /Users/yang/developer/pymatgen/debug/test_elements.py:3: UserWarning: No data available for electrical_resistivity for Se
      print(Element.Se.electrical_resistivity)
    None
- PR #4334 Updated Potentials Class in FEFF io to consider radius by @CharlesCardot
Changed the Potentials class to consider the same radius that is used in the Atoms class. This is necessary to avoid a situation where the radius is small enough to only have a subset of the unique elements in a structure, but all the elements have potentials defined, which causes FEFF to fail when run.
Major changes:
- fix 1: Previous behavior: When creating a FEFFDictset using the feff.io tools, the potentials class defined a potential for every unique element in a structure, while the atoms class defined an atom coordinate for every atom in a radius around the absorbing atom. If the radius was defined to be small, only a subset of the unique atoms in the structure would be included in the atom class. New behavior: Updated the potentials class to only create potentials for atoms inside the radius specified when creating a FEFFDictset. Without this, a too small radius for a structure with unique elements outside of that radius would cause FEFF to fail, given that there was a Potential defined for an element that did not exist in the Atoms flag.
Todos
None, the work is complete
- PR #4068 Fix monty imports, enhance test for Outcar parser to cover uncompressed format by @DanielYang59
Summary
- PR #4331 Optimized cube file parsing in from_cube for improved speed by @OliMarc
Summary
Major Changes:
This PR enhances the from_cube function in io.common.VolumetricData to significantly improve performance. When processing large .cube files, the original implementation took minutes to read and set Pymatgen objects. The optimized version incorporates several key improvements: file reading is now handled with readlines() instead of multiple readline() calls, reducing I/O operations. Voxel data parsing has been rewritten to use NumPy vectorized parsing instead of loops, making numerical processing faster. Atom site parsing has been improved by replacing the loop-based readline() approach with list comprehensions. Additionally, volumetric data parsing now leverages np.fromstring() instead of converting lists to NumPy arrays.
- PR #4329 Add protostructure and prototype functions from aviary by @CompRhys
Adds functions to get protostructure labels from spglib, moyo and aflow-sym. This spares users who wish to use this functionality from needing to download aviary to use these functions.
- PR #4321 [Breaking] from_ase_atoms constructor for (I)Structure/(I)Molecule returns the corresponding type by @DanielYang59
Summary
- Fix from_ase_atoms for Molecule, to close #4320
- Add tests
- PR #4296 Make dict representation of SymmetrizedStructure MSONable by @DanielYang59
Summary
- Make dict representation of SymmetrizedStructure MSONable, to fix #3018
- Unit test
- PR #4323 Tweak POSCAR / XDATCAR to accommodate malformed files by @esoteric-ephemera
Related to this matsci.org issue: sometimes the XDATCAR can be malformed because fortran uses fixed format floats when printing. In those cases, there's no space between coordinates:
    Direct configuration= 2
    -0.63265286-0.11227753 -0.15402785
    -0.12414874 -0.01213420 -0.28106824
    ...
In some cases, this issue is reparable (a negative sign separates coordinates). This PR implements a fix when it is reparable and adds a few Trajectory-like convenience features to Xdatcar
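One way such a repair could work is to re-insert a space before any minus sign that directly follows a digit. The sketch below is an illustration only. The helper name split_fused_coords and the choice to raise on unrepairable lines are assumptions, not the logic implemented in the PR.

```python
import re

# A minus sign directly after a digit marks the start of a new coordinate;
# the minus of an exponent such as "1e-05" follows a letter and is left untouched.
_FUSED_MINUS = re.compile(r"(?<=\d)-")


def split_fused_coords(line, n_expected=3):
    """Split a coordinate line whose values may be fused by fixed-width printing.

    Hypothetical helper: raises ValueError when the line cannot be repaired,
    e.g. when two positive numbers are fused with nothing separating them.
    """
    tokens = _FUSED_MINUS.sub(" -", line).split()
    if len(tokens) != n_expected:
        raise ValueError(f"cannot repair coordinate line: {line!r}")
    return [float(tok) for tok in tokens]


repaired = split_fused_coords("-0.63265286-0.11227753 -0.15402785")
untouched = split_fused_coords("-0.12414874 -0.01213420 -0.28106824")
```

Well-formed lines pass through unchanged, so the same helper can be applied to every coordinate line.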