
Commit 6f06b4c

mv site/src/routes/(about-the-data->data)
+layout.svelte move /home link into nav
fix wrong TPR definition in preprint

1 parent 5d28a57 · commit 6f06b4c

20 files changed: +13 −16 lines changed

data/wbm/eda.py
+4 −4

```diff
@@ -28,7 +28,7 @@
 """

 module_dir = os.path.dirname(__file__)
-about_data_page = f"{ROOT}/site/src/routes/about-the-data"
+data_page = f"{ROOT}/site/src/routes/data"


 # %% load MP training set
@@ -54,15 +54,15 @@

 # %%
 for dataset, count_mode, elem_counts in all_counts:
-    elem_counts.to_json(f"{about_data_page}/{dataset}-element-counts-{count_mode}.json")
+    elem_counts.to_json(f"{data_page}/{dataset}-element-counts-{count_mode}.json")


 # %% export element counts by WBM step to JSON
 df_wbm["step"] = df_wbm.index.str.split("-").str[1].astype(int)
 assert df_wbm.step.between(1, 5).all()
 for batch in range(1, 6):
     count_elements(df_wbm[df_wbm.step == batch].formula).to_json(
-        f"{about_data_page}/wbm-element-counts-{batch=}.json"
+        f"{data_page}/wbm-element-counts-{batch=}.json"
     )

 # export element counts by arity (how many elements in the formula)
@@ -71,7 +71,7 @@

 for arity, df_mp in df_wbm.groupby(df_wbm[comp_col].map(len)):
     count_elements(df_mp.formula).to_json(
-        f"{about_data_page}/wbm-element-counts-{arity=}.json"
+        f"{data_page}/wbm-element-counts-{arity=}.json"
     )

```
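The `{batch=}` and `{arity=}` inside these f-strings use Python 3.8+'s self-documenting expression syntax: the trailing `=` renders both the variable name and its value, so each exported filename encodes which batch or arity it contains. A minimal standalone illustration (not from the repo):

```python
batch = 3
arity = 2

# a trailing `=` inside an f-string expands to "name=value",
# embedding the variable name in the generated filename
print(f"wbm-element-counts-{batch=}.json")  # wbm-element-counts-batch=3.json
print(f"wbm-element-counts-{arity=}.json")  # wbm-element-counts-arity=2.json
```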
scripts/model_figs/per_element_errors.py
+1 −1

```diff
@@ -50,7 +50,7 @@
 # %% compute number of samples per element in training set
 # counting element occurrences not weighted by composition, assuming model don't learn
 # much more about iron and oxygen from Fe2O3 than from FeO
-counts_path = f"{ROOT}/site/src/routes/about-the-data/mp-element-counts-occurrence.json"
+counts_path = f"{ROOT}/site/src/routes/data/mp-element-counts-occurrence.json"
 df_elem_err = pd.read_json(counts_path, typ="series")
 train_count_col = "MP Occurrences"
 df_elem_err = df_elem_err.reset_index(name=train_count_col).set_index("index")
```

site/src/lib/Nav.svelte
+3 −2

```diff
@@ -12,9 +12,10 @@
 </script>

 <nav {style}>
-  {#each routes as href, idx}
+  {#each routes as route, idx}
+    {@const [title, href] = Array.isArray(route) ? route : [route, route]}
     {#if idx > 0}<strong>&bull;</strong>{/if}
-    <a {href} aria-current={is_current(href)} class="link">{href}</a>
+    <a {href} aria-current={is_current(href)} class="link">{title}</a>
   {/each}
 </nav>

```
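The added `{@const}` line lets `routes` mix plain href strings (where the link title equals the href) with `[title, href]` pairs. The same normalization, sketched in Python purely for illustration (hypothetical helper, not part of the repo):

```python
def normalize_route(route):
    # a route is either a plain href string (title == href) or a
    # (title, href) pair, mirroring the Svelte expression
    # `Array.isArray(route) ? route : [route, route]`
    if isinstance(route, (list, tuple)):
        title, href = route
    else:
        title = href = route
    return title, href

print(normalize_route("/models"))        # ('/models', '/models')
print(normalize_route(("/about", "/")))  # ('/about', '/')
```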

site/src/routes/+layout.svelte
+3 −7

```diff
@@ -19,8 +19,8 @@

   $: description = {
     '/': `Benchmarking machine learning energy models for materials discovery.`,
-    '/about-the-data': `Details about provenance, chemistry and energies in the benchmark's train and test set.`,
-    '/about-the-data/tmi': `Too much information on the benchmark's data.`,
+    '/data': `Details about provenance, chemistry and energies in the benchmark's train and test set.`,
+    '/data/tmi': `Too much information on the benchmark's data.`,
     '/api': `API docs for the Matbench Discovery PyPI package.`,
     '/contribute': `Steps for contributing a new model to the benchmark.`,
     '/models': `Details on each model sortable by metrics.`,
@@ -70,14 +70,10 @@

 <Toc {headingSelector} breakpoint={1250} minItems={3} />

-{#if url !== `/`}
-  <a href="/" aria-label="Back to index page">&laquo; home</a>
-{/if}
-
 <GitHubCorner href={repository} />

 <main>
-  <Nav routes={routes.filter((route) => route != `/changelog`)} />
+  <Nav routes={[[`/about`, `/`], ...routes.filter((route) => route != `/changelog`)]} />

   <slot />

```

File renamed without changes.

site/src/routes/preprint/+page.md
+2 −2

```diff
@@ -235,7 +235,7 @@ The results for M3GNet and MACE depart from the trend that F1 is rank-correlated
 Of all models, M3GNet achieves the highest true positive rate (TPR) but an unusually low true negative rate (TNR).
 A similar trend is seen for MACE. @fig:rolling-mae-vs-hull-dist-models provides a visual understanding of this observation.
 M3GNet and MACE have the lowest rolling mean of the absolute errors (rolling MAE) as a function of hull distance for materials above the convex hull (see right half of plot) but incur comparably large errors for materials below the hull (left half of plot).
-Since $\text{TPR} = \frac{\text{TN}}{\text{TN} + \text{FP}}$, lower error for energies above the hull increases both TN and decreases FP, resulting in the high TPR values observed.
+Since $\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}}$, lower error for energies above the hull increases both TN and decreases FP, resulting in the high TPR values observed.

 The reason CGCNN+P achieves better regression metrics than CGCNN but is still worse as a classifier becomes apparent from @fig:hist-clf-pred-hull-dist-models by noting that the CGCNN+P histogram is more sharply peaked at the 0 hull distance stability threshold.
 This causes even small errors in the predicted convex hull distance to be large enough to invert a classification.
@@ -449,7 +449,7 @@ BOWSR has the largest median error, while Voronoi RF has the largest IQR. Note t

 > @label:fig:hist-clf-pred-hull-dist-models Distribution of model-predicted hull distance colored by stability classification. Models are sorted from top to bottom by F1 score. The thickness of the red and yellow bands shows how often models misclassify as a function of how far away from the convex hull they place a material. While CHGNet and M3GNet perform almost equally well overall, these plots reveal that they do so via different trade-offs. M3GNet commits fewer false negatives but more false positives predictions compared to CHGNet. In a real discovery campaign, false positives have a higher opportunity cost than false negatives since they result in wasted DFT relaxations or even synthesis time in the lab. A false negative by contrast is just one missed opportunity out of many. This observation is also reflected in the higher TPR and lower TNR of M3GNet vs CHGNet in @fig:metrics-table, as well as the lower rolling MAE for CHGNet vs M3GNet on the stable side (left half) of @fig:rolling-mae-vs-hull-dist-models and vice-versa on the unstable side (right half).

-Note the CGCNN+P histogram is more strongly peaked than CGCNN's which agrees better with the actual DFT ground truth [distribution of hull distances](/about-the-data#--target-distribution) in our test set. This explains why CGCNN+P performs better as a regressor, but also reveals how it can perform simultaneously worse as a classifier. By moving predictions closer to the stability threshold at 0 eV/atom above the hull, even small errors are significant enough to tip a classification over the threshold.
+Note the CGCNN+P histogram is more strongly peaked than CGCNN's which agrees better with the actual DFT ground truth [distribution of hull distances](/data#--target-distribution) in our test set. This explains why CGCNN+P performs better as a regressor, but also reveals how it can perform simultaneously worse as a classifier. By moving predictions closer to the stability threshold at 0 eV/atom above the hull, even small errors are significant enough to tip a classification over the threshold.

 ## Measuring extrapolation performance from WBM batch robustness

```
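The corrected formula (and the TNR the surrounding prose contrasts it with) can be sanity-checked numerically from confusion-matrix counts. Helper names below are illustrative, not from the repo:

```python
def true_positive_rate(tp: int, fn: int) -> float:
    # TPR (recall/sensitivity): fraction of truly stable materials
    # that the model correctly classifies as stable
    return tp / (tp + fn)

def true_negative_rate(tn: int, fp: int) -> float:
    # TNR (specificity): fraction of truly unstable materials
    # that the model correctly classifies as unstable
    return tn / (tn + fp)

# a model that rarely misses stable materials has high TPR...
print(true_positive_rate(tp=90, fn=10))  # 0.9
# ...even while over-predicting stability drives its TNR down
print(true_negative_rate(tn=60, fp=40))  # 0.6
```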
