Speed up Vasprun parsing some more #4360

Merged 5 commits on Apr 15, 2025

Conversation

kavanase
Contributor

Me again.
From further profiling and experimenting, I found I could speed up _parse_vasp_array by using NumPy's parse-from-string function. _parse_vasp_array is one of the main bottlenecks when parsing with parse_dos=True (default), parse_eigen=True (default) and/or parse_projected_eigen=True (False by default).
For example, parsing a SOC defect supercell vasprun via doped with these updates (with parse_projected_eigen=True to get eigenvalues/magnetisation) reduces the parsing time from ~8.5 s to ~4.8 s.

All changes here should be covered by tests already in the codebase.
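
For readers unfamiliar with the idea, here is a minimal sketch of the kind of change described above. It is not the code from this PR: it assumes that the "parse from string function" is np.fromstring in text mode (sep=" "), and the row data and both helper names are hypothetical stand-ins for the text of the array rows in a vasprun.xml file.

```python
import numpy as np

# Hypothetical stand-ins for the whitespace-separated rows of a vasprun.xml array.
rows = ["0.5 0.25 0.125", "1.5 -2.0 4.25"]


def parse_rows_python(rows):
    """Baseline: convert every token with float() in a Python-level loop."""
    return np.array([[float(tok) for tok in row.split()] for row in rows])


def parse_rows_fromstring(rows):
    """Join the rows into one string and let NumPy parse it in a single call."""
    flat = np.fromstring(" ".join(rows), dtype=float, sep=" ")
    # The result is one-dimensional, so the row structure has to be restored.
    return flat.reshape(len(rows), -1)


slow = parse_rows_python(rows)
fast = parse_rows_fromstring(rows)
```

Both helpers return the same 2 x 3 array; the second one needs a string join and a reshape, which is what the review comments below address.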

@shyuep
Member

shyuep commented Apr 15, 2025

Thanks, but I don't think we need string concatenation here. I believe np.loadtxt can handle the text without it.

@shyuep
Member

shyuep commented Apr 15, 2025

Example:

In [1]: import numpy as np

In [3]: np.loadtxt(["1 2", "3 4"])
Out[3]:
array([[1., 2.],
       [3., 4.]])

@shyuep
Member

shyuep commented Apr 15, 2025

Even better, no reshaping needed.
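
A slightly fuller sketch of this suggestion, under the assumption that the rows arrive as a list of strings (the sample data is hypothetical, and this is not the merged code): np.loadtxt accepts the list directly and returns a two-dimensional array, so neither a join nor a reshape is required. Passing ndmin=2 keeps a single-row input two-dimensional as well.

```python
import numpy as np

# Hypothetical rows, one string per line of the array.
lines = ["1 2 3", "4 5 6"]

# np.loadtxt treats each list element as one line of text.
arr = np.loadtxt(lines)

# With a single line, loadtxt would otherwise return a 1D array;
# ndmin=2 preserves the (rows, columns) layout.
single = np.loadtxt(["7 8 9"], ndmin=2)
```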

@kavanase
Contributor Author

Ah yes! Good points. Done ⬆️

@shyuep shyuep merged commit e714e2b into materialsproject:master Apr 15, 2025
43 checks passed