Commit f84ee0f

add model code and checkpoint license types and URLs to all model YAML files
- `datasets.yml` add missing descriptions and fix license formats
- `dataset-schema.d.ts` and `model-schema.d.ts` add new license definitions
- `+page.svelte` fix license mapping
1 parent 7c0b089 commit f84ee0f
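
The `license` block this commit adds to each model YAML file carries four keys in every hunk shown here: `code`, `code_url`, `checkpoint` and `checkpoint_url`. A minimal sketch of a check for that shape follows. The helper `check_license_block` is hypothetical and not part of the repository. The required-key list is inferred from the YAML hunks only, since the actual definitions in `model-schema.d.ts` are not shown in this diff. Treating the literal string `missing` as an allowed URL placeholder is likewise an assumption, based on the eSEN files.

```python
from urllib.parse import urlparse

# Keys seen under `license:` in the model YAML hunks of this commit
# (inferred from the diff, not taken from model-schema.d.ts).
REQUIRED_LICENSE_KEYS = ("code", "code_url", "checkpoint", "checkpoint_url")


def check_license_block(metadata: dict) -> list[str]:
    """Return a list of problems found in a model's `license` block (hypothetical helper)."""
    license_block = metadata.get("license")
    if not isinstance(license_block, dict):
        return ["missing `license` block"]

    problems = []
    for key in REQUIRED_LICENSE_KEYS:
        value = license_block.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"license.{key} is missing or empty")
        elif key.endswith("_url") and value != "missing":
            # Assumption: `missing` is an accepted placeholder for unreleased checkpoints.
            parsed = urlparse(value)
            if parsed.scheme != "https" or not parsed.netloc:
                problems.append(f"license.{key} is not an https URL: {value}")
    return problems


# Values copied from the models/alignn/alignn.yml hunk.
alignn = {
    "license": {
        "code": "MIT",
        "code_url": "https://github.com/usnistgov/alignn/blob/408bb6e996/LICENSE.rst",
        "checkpoint": "CC-BY-4.0",
        "checkpoint_url": "https://figshare.com/articles/dataset/Matbench_Discovery_v1_0_0/22715158?file=41233560",
    }
}
print(check_license_block(alignn))  # prints []
```

The function works on an already-parsed mapping, so it can be paired with whichever YAML loader the project uses.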

40 files changed with 348 additions and 28 deletions

contributing.md

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -146,14 +146,14 @@ To submit a new model to this benchmark and add it to our leaderboard, please cr
146146
phonons:
147147
kappa_103:
148148
pred_file: models/<model_dir>/<yyyy-mm-dd>-kappa-103-<values-of-dist|fmax|symprec>.json.gz
149-
pred_file_url: https://ndownloader.figshare.com/files/<figshare_id>
149+
pred_file_url: https://figshare.com/files/<figshare_id>
150150
geo_opt: # only applicable if the model performed structure relaxation
151151
pred_file: models/<model_dir>/<yyyy-mm-dd>-wbm-geo-opt-<optimizer>.json.gz # should contain the models relaxed structures as ASE Atoms or pymatgen Structures, and separate columns for material_id and energies/forces/stresses at each relaxation step
152-
pred_file_url: https://ndownloader.figshare.com/files/<figshare_id>
152+
pred_file_url: https://figshare.com/files/<figshare_id>
153153
struct_col: <column_name_of_material_ids_in_relaxed_structures>
154154
discovery:
155155
pred_file: models/<model_dir>/<yyyy-mm-dd>-<model_name>-wbm-IS2RE.csv.gz # should contain the models energy predictions for the WBM test set
156-
pred_file_url: https://ndownloader.figshare.com/files/<figshare_id>
156+
pred_file_url: https://figshare.com/files/<figshare_id>
157157
pred_col: e_form_per_atom_<model_name>
158158
```
159159
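
The hunk above replaces `https://ndownloader.figshare.com/files/<figshare_id>` with `https://figshare.com/files/<figshare_id>` in the submission template. A small sketch of the same rewrite applied to an arbitrary URL is given below. The function name `to_figshare_files_url` is hypothetical, and the sketch only mirrors the textual substitution seen in the diff. It makes no claim about how Figshare resolves either form.

```python
import re

# Matches only the host/path form that the diff replaces.
NDOWNLOADER_FILE_URL = re.compile(
    r"^https://ndownloader\.figshare\.com/files/(?P<file_id>[^/\s]+)$"
)


def to_figshare_files_url(url: str) -> str:
    """Rewrite an ndownloader.figshare.com file URL to the figshare.com/files form."""
    match = NDOWNLOADER_FILE_URL.match(url)
    if match is None:
        return url  # leave every other URL untouched
    return f"https://figshare.com/files/{match['file_id']}"


# Example value taken from the models/alphanet/alphanet-mptrj.yml hunk.
print(to_figshare_files_url("https://ndownloader.figshare.com/files/52870784"))
# prints https://figshare.com/files/52870784
```

URLs in other forms, such as `https://figshare.com/ndownloader/files/53298683` in `alchembert.yml`, pass through unchanged, matching what the commit does.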

data/datasets.yml

Lines changed: 28 additions & 16 deletions
@@ -5,7 +5,9 @@ MP 2022:
55
n_structures: 154_719
66
open: true
77
date_created: 2022-10-28
8-
license: CC BY 4.0
8+
license: CC-BY-4.0
9+
doi: https://doi.org/10.1063/1.4812323
10+
description: Entire Materials Project dataset from October 2022.
911
params:
1012
method: DFT
1113
code: VASP
@@ -19,12 +21,14 @@ MPtrj:
1921
title: MPtrj
2022
url: https://figshare.com/articles/dataset/23713842
2123
download_url: https://figshare.com/files/41619375
22-
doi: https://doi.org/10.6084/m9.figshare.23713842
24+
doi: https://doi.org/10.1038/s42256-023-00716-3
25+
2326
n_structures: 1_580_395
2427
n_materials: 145_923
2528
open: true
2629
date_created: 2023-07-19
2730
license: MIT
31+
description: Materials Project DFT relaxation trajectories cleaned of unrealistic energies and forces, and filtered for less redundancy using pymatgen's StructureMatcher to contain only about every 10th ionic step (1.5M structures). Originally created for training CHGNet.
2832
params:
2933
method: DFT
3034
code: VASP
@@ -41,13 +45,14 @@ MPF:
4145
title: MPF.2021.2.8
4246
url: https://figshare.com/articles/dataset/19470599
4347
download_url: https://figshare.com/ndownloader/articles/19470599/versions/3
44-
doi: https://doi.org/10.6084/m9.figshare.19470599.v3
48+
doi: https://doi.org/10.1038/s43588-022-00349-3
4549
n_structures: 188_349
4650
n_materials: 62_783
4751
open: true
4852
date_created: 2022-03-05
4953
date_added: 2022-03-05
50-
license: CC BY 4.0
54+
license: CC-BY-4.0
55+
description: Materials Project Force dataset used to train the M3GNet model reported in https://arxiv.org/abs/2202.02450.
5156
params:
5257
method: DFT
5358
code: VASP
@@ -65,11 +70,12 @@ MP Graphs:
6570
title: Graphs of MP 2019
6671
url: https://figshare.com/articles/dataset/8097992
6772
download_url: https://figshare.com/ndownloader/articles/8097992/versions/2
68-
doi: https://doi.org/10.6084/m9.figshare.8097992.v2
73+
doi: https://doi.org/10.1021/acs.chemmater.9b01294
6974
n_structures: 133_420
7075
open: true
7176
date_created: 2019-06-06
7277
license: MIT
78+
description: Contains 133,420 graph-target pairs for Materials Project structures which were used to train the MEGNet formation energy model.
7379
params:
7480
method: DFT
7581
code: VASP
@@ -91,7 +97,8 @@ GNoME:
9197
n_materials: 6_000_000
9298
open: false
9399
date_created: 2023-11-29 # https://github.com/google-deepmind/materials_discovery/commit/a701b9529
94-
license: Apache 2.0
100+
license: Apache-2.0
101+
description: Google DeepMind's Graph Networks for Materials Exploration dataset containing millions of crystal structures generated with symmetry aware partial substitutions (SAPS) and their DFT-calculated energies, forces and stresses. Aimed at large-scale materials discovery.
95102
params:
96103
method: DFT
97104
code: VASP
@@ -110,7 +117,8 @@ MatterSim:
110117
pressure_range: 0-1000 GPa
111118
open: false
112119
date_created: 2024-05-08
113-
license: Unreleased
120+
license: unreleased
121+
description: Large-scale materials simulation dataset from Microsoft Research containing DFT-calculated properties across a wide range of temperatures and pressures. Aimed at training robust universal interatomic potentials that remain accurate even far from equilibrium.
114122
params:
115123
method: DFT
116124
code: VASP
@@ -129,7 +137,8 @@ Alex:
129137
n_materials: 3_100_000
130138
open: true
131139
date_created: 2023-04-22
132-
license: CC BY 4.0
140+
license: CC-BY-4.0
141+
description: Large collection of DFT structure relaxation trajectories with energies, forces, stresses for ~3M materials. Aimed at training ML potentials for materials discovery.
133142
params:
134143
method: DFT
135144
code: VASP
@@ -164,7 +173,8 @@ OMat24:
164173
n_structures: 100_824_585
165174
open: true
166175
date_created: 2024-10-16
167-
license: CC BY 4.0
176+
license: CC-BY-4.0
177+
description: Open Materials 2024 dataset from Meta's FAIRchem containing over 100M structures derived from applying perturbations to Alexandria structures. Aimed at training foundation models for materials science.
168178
params:
169179
method: DFT
170180
code: VASP
@@ -207,7 +217,8 @@ sAlex:
207217
n_structures: 10_447_765
208218
open: true
209219
date_created: 2024-10-16
210-
license: CC BY 4.0
220+
license: CC-BY-4.0
221+
description: Subsampled version of the Alexandria dataset, containing approximately 10 million structures filtered to remove structure prototype overlap with Matbench Discovery's WBM test set. See https://github.com/janosh/matbench-discovery/blob/5f02f790e1/matbench_discovery/structure/prototype.py#L104-L193.
211222
params:
212223
method: DFT
213224
code: VASP
@@ -230,7 +241,8 @@ sAlex Validation:
230241
n_structures: 553_218
231242
open: true
232243
date_created: 2024-10-16
233-
license: CC BY 4.0
244+
license: CC-BY-4.0
245+
description: Validation subset of the subsampled Alexandria dataset, containing ~500k structures for evaluating model performance during training. Filtered to remove structure prototype overlap with WBM.
234246
params:
235247
method: DFT
236248
code: VASP
@@ -252,7 +264,7 @@ OpenLAM:
252264
n_structures: 162_507_178
253265
open: true
254266
date_created: 2025-02-17
255-
license: CC BY 4.0
267+
license: CC-BY-4.0
256268
params:
257269
method: DFT
258270
code: VASP
@@ -283,7 +295,7 @@ OC20:
283295
n_structures: 133_934_018
284296
open: true
285297
date_created: 2020-10-01
286-
license: CC BY 4.0
298+
license: CC-BY-4.0
287299
params:
288300
method: DFT
289301
code: VASP
@@ -304,7 +316,7 @@ NOMAD:
304316
n_materials: 4_335_728
305317
open: true
306318
date_created: 2019-06-01
307-
license: CC BY 4.0
319+
license: CC-BY-4.0
308320
params:
309321
method: [DFT, ML]
310322
code: Various
@@ -325,7 +337,7 @@ AFLOW:
325337
n_materials: 3_530_330 # unsure how number of materials and structures differ for AFLOW
326338
open: true
327339
date_created: 2012-06-01
328-
license: Open
340+
license: CC-BY-4.0
329341
params:
330342
method: DFT
331343
code: VASP
@@ -347,7 +359,7 @@ OQMD:
347359
n_structures: 1_226_781
348360
open: true
349361
date_created: 2014-04-03
350-
license: CC BY 4.0
362+
license: CC-BY-4.0
351363
params:
352364
method: DFT
353365
code: VASP
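
Most of the license edits in `data/datasets.yml` are pure spelling changes: `CC BY 4.0` becomes `CC-BY-4.0`, `Apache 2.0` becomes `Apache-2.0`, and `Unreleased` becomes `unreleased`. The sketch below captures just those renames as a lookup table. The names `LICENSE_RENAMES` and `normalize_license` are hypothetical. The AFLOW change from `Open` to `CC-BY-4.0` is deliberately left out, because it changes the stated license rather than its spelling.

```python
# Spelling-only renames observed in the data/datasets.yml hunks.
LICENSE_RENAMES = {
    "CC BY 4.0": "CC-BY-4.0",
    "Apache 2.0": "Apache-2.0",
    "Unreleased": "unreleased",
}


def normalize_license(value: str) -> str:
    """Map an old-style license string to the hyphenated form; pass others through."""
    stripped = value.strip()
    return LICENSE_RENAMES.get(stripped, stripped)


for old in ("CC BY 4.0", "Apache 2.0", "Unreleased", "MIT"):
    print(f"{old!r} -> {normalize_license(old)!r}")
```

Values already in the target form, such as `MIT`, are returned as they are.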

models/alchembert/alchembert.yml

Lines changed: 6 additions & 0 deletions
@@ -18,6 +18,12 @@ paper: https://chemrxiv.org/engage/chemrxiv/article-details/67540a28085116a133a6
1818
pr_url: https://github.com/janosh/matbench-discovery/pull/187
1919
checkpoint_url: https://figshare.com/ndownloader/files/53298683
2020

21+
license:
22+
code: GPL-3.0
23+
code_url: https://gitee.com/liuxiaotong15/alchemBERT/blob/master/LICENSE
24+
checkpoint: CC-BY-4.0
25+
checkpoint_url: https://figshare.com/articles/dataset/28690583?file=53298683
26+
2127
targets: E
2228
train_task: RS2RE
2329
test_task: IS2RE

models/alignn/alignn.yml

Lines changed: 6 additions & 0 deletions
@@ -27,6 +27,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/85
2727
# ALIGNN model 2023-02-07-alignn-checkpoint.pth trained on mp_computed_structure_entries
2828
checkpoint_url: https://figshare.com/files/40344436
2929

30+
license:
31+
code: MIT
32+
code_url: https://github.com/usnistgov/alignn/blob/408bb6e996/LICENSE.rst
33+
checkpoint: CC-BY-4.0
34+
checkpoint_url: https://figshare.com/articles/dataset/Matbench_Discovery_v1_0_0/22715158?file=41233560
35+
3036
requirements:
3137
ase: 3.22.0
3238
dgl-cu111: 0.6.1

models/alignn_ff/alignn-ff.yml

Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,7 @@ authors:
2424
- name: Francesca Tavazza
2525
affiliation: National Institute of Standards and Technology
2626
orcid: https://orcid.org/0000-0002-5602-180X
27+
trained_by:
2728
- name: Philipp Benner
2829
affiliation: Bundesanstalt für Materialforschung und -prüfung BAM
2930
orcid: https://orcid.org/0000-0002-0912-8137
@@ -37,6 +38,12 @@ pypi: https://pypi.org/project/alignn
3738
pr_url: https://github.com/janosh/matbench-discovery/pull/37
3839
checkpoint_url: https://github.com/usnistgov/alignn/blob/461b35fe/alignn/ff/alignnff_wt10/best_model.pt
3940

41+
license:
42+
code: MIT
43+
code_url: https://github.com/usnistgov/alignn/blob/408bb6e996/LICENSE.rst
44+
checkpoint: MIT
45+
checkpoint_url: https://github.com/usnistgov/alignn/blob/408bb6e996/LICENSE.rst
46+
4047
requirements:
4148
ase: 3.22.0
4249
dgl-cu111: 0.6.1

models/alphanet/alphanet-mptrj.yml

Lines changed: 7 additions & 1 deletion
@@ -44,7 +44,13 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/216
4444
paper: https://arxiv.org/abs/2501.07155
4545
doi: https://doi.org/10.48550/arXiv.2501.07155
4646
# checkpoint page: https://github.com/zmyybc/AlphaNet/blob/243fe71cb96/README.md#pretrained-models
47-
checkpoint_url: https://ndownloader.figshare.com/files/52870784
47+
checkpoint_url: https://figshare.com/files/52870784
48+
49+
license:
50+
code: GPL-3.0
51+
code_url: https://github.com/zmyybc/AlphaNet/blob/243fe71cb/LICENSE
52+
checkpoint: CC-BY-4.0
53+
checkpoint_url: https://figshare.com/articles/dataset/mp-0225-2_ckpt/28560176?file=52870784
4854

4955
openness: OSOD
5056
train_task: S2EFS

models/bowsr/bowsr.yml

Lines changed: 6 additions & 0 deletions
@@ -27,6 +27,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/85
2727
# see https://github.com/materialsvirtuallab/maml/blob/43619d4fb/maml/apps/bowsr/model/megnet.py#L27
2828
checkpoint_url: https://github.com/materialsvirtuallab/maml/raw/43619d4fb/maml/apps/bowsr/model/model_files/megnet/formation_energy.hdf5
2929

30+
license:
31+
code: BSD-3-Clause
32+
code_url: https://github.com/materialsvirtuallab/maml/blob/50c61ea45f/LICENSE
33+
checkpoint: BSD-3-Clause
34+
checkpoint_url: https://github.com/materialsvirtuallab/maml/raw/50c61ea45f/LICENSE
35+
3036
requirements:
3137
maml: 2022.9.20
3238
pymatgen: 2022.10.22

models/cgcnn/cgcnn+p.yml

Lines changed: 11 additions & 0 deletions
@@ -16,13 +16,24 @@ authors:
1616
url: https://hennig.mse.ufl.edu
1717
1818
orcid: https://orcid.org/0000-0003-4933-7686
19+
trained_by:
20+
- name: Janosh Riebesell
21+
affiliation: University of Cambridge, Lawrence Berkeley National Laboratory
22+
23+
orcid: https://orcid.org/0000-0001-5233-3462
1924

2025
repo: https://github.com/JasonGibsonUfl/Augmented_CGCNN
2126
doi: https://doi.org/10.1038/s41524-022-00891-8
2227
paper: https://arxiv.org/abs/2202.13947
2328
pr_url: https://github.com/janosh/matbench-discovery/pull/85
2429
checkpoint_url: https://api.wandb.ai/files/janosh/matbench-discovery/tx6cepg6/checkpoint.pth
2530

31+
license:
32+
code: MIT
33+
code_url: https://github.com/CompRhys/aviary/blob/3238fb415/LICENSE
34+
checkpoint: MIT
35+
checkpoint_url: https://github.com/janosh/matbench-discovery/blob/7c0b089e7/license
36+
2637
requirements:
2738
aviary: https://github.com/CompRhys/aviary/releases/tag/v0.1.0
2839
torch: 1.11.0

models/cgcnn/cgcnn.yml

Lines changed: 11 additions & 0 deletions
@@ -12,6 +12,11 @@ authors:
1212
- name: Jeffrey C. Grossman
1313
affiliation: Massachusetts Institute of Technology
1414
url: https://dmse.mit.edu/people/jeffrey-c-grossman
15+
trained_by:
16+
- name: Janosh Riebesell
17+
affiliation: University of Cambridge, Lawrence Berkeley National Laboratory
18+
19+
orcid: https://orcid.org/0000-0001-5233-3462
1520

1621
repo: https://github.com/CompRhys/aviary
1722
doi: https://doi.org/10.1103/PhysRevLett.120.145301
@@ -20,6 +25,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/85
2025
# submission used an ensemble of 10 models, URL is just the first checkpoint
2126
checkpoint_url: https://api.wandb.ai/files/janosh/matbench-discovery/ykh764i5/checkpoint.pt
2227

28+
license:
29+
code: MIT
30+
code_url: https://github.com/CompRhys/aviary/blob/3238fb415/LICENSE
31+
checkpoint: MIT
32+
checkpoint_url: https://github.com/janosh/matbench-discovery/blob/7c0b089e7/license
33+
2334
requirements:
2435
aviary: https://github.com/CompRhys/aviary/releases/tag/v0.1.0
2536
torch: 1.11.0

models/chgnet/chgnet-0.3.0.yml

Lines changed: 6 additions & 0 deletions
@@ -38,6 +38,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/85
3838
# checkpoint URL copied from https://github.com/CederGroupHub/chgnet/blob/d55f185199fddc9/chgnet/pretrained/0.3.0/chgnet_0.3.0_e29f68s314m37.pth.tar
3939
checkpoint_url: https://github.com/CederGroupHub/chgnet/raw/refs/heads/main/chgnet/pretrained/0.3.0/chgnet_0.3.0_e29f68s314m37.pth.tar
4040

41+
license:
42+
code: BSD-3-Clause
43+
code_url: https://github.com/CederGroupHub/chgnet/blob/d55f185199f/LICENSE
44+
checkpoint: BSD-3-Clause
45+
checkpoint_url: https://github.com/CederGroupHub/chgnet/blob/d55f185199f/LICENSE
46+
4147
requirements:
4248
torch: 1.11.0
4349
ase: 3.22.0

models/deepmd/dpa3-v1-mptrj.yml

Lines changed: 6 additions & 0 deletions
@@ -35,6 +35,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/192
3535
# checkpoints reported in https://github.com/deepmodeling/deepmd-kit/discussions/4682
3636
checkpoint_url: https://bohrium-api.dp.tech/ds-dl/matbench-submit-DPA3mptraj-ictz-v2.zip
3737

38+
license:
39+
code: LGPL-3.0
40+
code_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
41+
checkpoint: LGPL-3.0
42+
checkpoint_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
43+
3844
openness: OSOD
3945
train_task: S2EFS
4046
test_task: IS2RE-SR

models/deepmd/dpa3-v1-openlam.yml

Lines changed: 6 additions & 0 deletions
@@ -35,6 +35,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/192
3535
# checkpoints reported in https://github.com/deepmodeling/deepmd-kit/discussions/4682
3636
checkpoint_url: https://bohrium-api.dp.tech/ds-dl/dpa3openlam-74ng-v3.zip
3737

38+
license:
39+
code: LGPL-3.0
40+
code_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
41+
checkpoint: LGPL-3.0
42+
checkpoint_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
43+
3844
openness: OSCD
3945
train_task: S2EFS
4046
test_task: IS2RE-SR

models/deepmd/dpa3-v2-mptrj.yml

Lines changed: 6 additions & 0 deletions
@@ -35,6 +35,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/222
3535
# checkpoints reported in https://github.com/deepmodeling/deepmd-kit/discussions/4682
3636
checkpoint_url: https://figshare.com/files/52989056
3737

38+
license:
39+
code: LGPL-3.0
40+
code_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
41+
checkpoint: LGPL-3.0
42+
checkpoint_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
43+
3844
openness: OSOD
3945
train_task: S2EFS
4046
test_task: IS2RE-SR

models/deepmd/dpa3-v2-openlam.yml

Lines changed: 6 additions & 0 deletions
@@ -35,6 +35,12 @@ pr_url: https://github.com/janosh/matbench-discovery/pull/222
3535
# checkpoints reported in https://github.com/deepmodeling/deepmd-kit/discussions/4682
3636
checkpoint_url: https://figshare.com/files/52989059
3737

38+
license:
39+
code: LGPL-3.0
40+
code_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
41+
checkpoint: LGPL-3.0
42+
checkpoint_url: https://github.com/deepmodeling/deepmd-kit/blob/70bc6d89/LICENSE
43+
3844
openness: OSCD
3945
train_task: S2EFS
4046
test_task: IS2RE-SR

models/eSEN/eSEN-30m-mp.yml

Lines changed: 6 additions & 0 deletions
@@ -34,6 +34,12 @@ pypi: https://pypi.org/project/fairchem-core
3434
pr_url: https://github.com/janosh/matbench-discovery/pull/226
3535
checkpoint_url: missing # unreleased, Xiang Fu will notify when available
3636

37+
license:
38+
code: MIT
39+
code_url: https://github.com/FAIR-Chem/fairchem/blob/aa160789e1/LICENSE.md
40+
checkpoint: unreleased
41+
checkpoint_url: missing
42+
3743
requirements:
3844
fairchem-core: 1.7.0
3945

models/eSEN/eSEN-30m-oam.yml

Lines changed: 6 additions & 0 deletions
@@ -34,6 +34,12 @@ pypi: https://pypi.org/project/fairchem-core
3434
pr_url: https://github.com/janosh/matbench-discovery/pull/226
3535
checkpoint_url: missing # unreleased, Xiang Fu will notify when available
3636

37+
license:
38+
code: MIT
39+
code_url: https://github.com/FAIR-Chem/fairchem/blob/aa160789e1/LICENSE.md
40+
checkpoint: unreleased
41+
checkpoint_url: missing
42+
3743
requirements:
3844
fairchem-core: 1.7.0
3945
