Enable NanoPET for atomic-basis spherical targets #527

jwa7 · 2025-03-21T12:59:28Z

Extends NanoPET to enable predictions of spherical targets expressed on an atomic basis. Examples are the electron density decomposed on an auxiliary basis, and the Hamiltonian on the coupled atomic orbital basis.

Overview

- Introduces a new target type "atomic_basis_spherical" to allow learning of spherical targets on an atomic basis set.
- Only NanoPET is supported at present.
- Targets with 1 component axis are supported. This covers, for example, the electron density on a basis and the Hamiltonian/density matrix on a coupled basis.
- Per-pair targets can be either permutationally symmetrized or unsymmetrized. In both cases, only unique atom pairs are predicted.

NanoPET architecture details

Re-uses the existing infrastructure for mapping last layer features to the spherical component heads
Infrastructural changes were required as follows:
- For per-atom targets:
  - Last-layer PET features are sliced just before being passed through the output layer. Slicing is needed to extract the samples of the correct type based on the "center_type" the output block corresponds to.
- For per-pair targets:
  - Again, the last-layer features are sliced according to the "first_atom_type" and "second_atom_type" the output block corresponds to. In the case of permutationally-symmetrized targets, the "s2_pi" key is also used to do this slicing.
  - Pre-last layer transformations are a little more complicated than for per-atom targets. Both the PET node and edge features are used. These are passed through separate heads before being combined in a single tensor that represents the last-layer features for the whole per-pair quantity.
  - Targets can be either permutationally symmetrized or not. In the former case, the PET edge features are symmetrized, so the last-layer features (if outputted) carry the relevant metadata ("s2_pi" etc.) in the samples labels rather than in the keys, as slicing into blocks only occurs at the output layer.
  - Only unique samples are predicted for per-pair targets. Practically this means that triangular off-site blocks in atom type are predicted where "first_atom_type" < "second_atom_type". For blocks where "first_atom_type" == "second_atom_type", samples are triangular in atom index such that "first_atom" <= "second_atom".
A new file "nanopet/modules/samples.py" has been created to handle the construction of samples for the features and outputs (which are in general different for these targets, per-pair ones in particular). These can stay here for now, and perhaps later be moved to metatomic.
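
The unique-pair convention described above can be sketched in plain Python. This is only an illustration, not the code in "nanopet/modules/samples.py": the function name `unique_pair_samples` is hypothetical, and the sketch assumes a single system and ignores cutoffs, neighbor lists and periodic images.

```python
from itertools import product
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def unique_pair_samples(atom_types: List[int]) -> Dict[Pair, List[Pair]]:
    """Group the unique atom pairs of one system by
    (first_atom_type, second_atom_type). Hypothetical helper."""
    blocks: Dict[Pair, List[Pair]] = {}
    for i, j in product(range(len(atom_types)), repeat=2):
        type_i, type_j = atom_types[i], atom_types[j]
        if type_i > type_j:
            # covered by the block with the two types swapped
            continue
        if type_i == type_j and i > j:
            # same-type block: triangular in atom index
            continue
        blocks.setdefault((type_i, type_j), []).append((i, j))
    return blocks


# Toy system: atoms 0 and 1 have type 1, atom 2 has type 8
blocks = unique_pair_samples([1, 1, 8])
```

For this toy system there is no (8, 1) block, and the (1, 1) block contains (0, 1) but not (1, 0).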

Other metatrain infrastructure details

- The dataloader has been modified to use join_kwargs={"different_keys": "union"}, as this is required for targets on an atomic basis where different systems have different atom types (and therefore keys).
- "per_atom.py" has been modified to not perform a sum over samples, also for per-pair targets.
- "augmentation.py" has been cleaned up a bit, specifically with regard to how the blocks are split along samples.
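
To see why a union of keys is needed when batching, here is a rough sketch with plain dictionaries. It does not use the metatensor API: `join_union` is a hypothetical stand-in, keys are reduced to a single "center_type" integer, and blocks are reduced to lists of (system, atom) samples.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, int]  # (system, atom)


def join_union(tensors: List[Dict[int, List[Sample]]]) -> Dict[int, List[Sample]]:
    """Join toy per-system tensors along samples, keeping every key that
    appears in at least one of them. Hypothetical helper."""
    joined: Dict[int, List[Sample]] = {}
    for tensor in tensors:
        for center_type, samples in tensor.items():
            joined.setdefault(center_type, []).extend(samples)
    return joined


# System 0 contains types 1 and 8; system 1 contains types 1 and 6
system_0 = {1: [(0, 0), (0, 1)], 8: [(0, 2)]}
system_1 = {1: [(1, 1), (1, 2)], 6: [(1, 0)]}
batch = join_union([system_0, system_1])
```

The joined batch has blocks for types 1, 6 and 8, even though neither system contains all three. Requiring identical keys would fail here, and an intersection would silently drop the type 6 and type 8 blocks.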

Contributor (creator of pull-request) checklist

Tests updated (for new features and bugfixes)?
Documentation updated (for new features)?
Issue referenced (for PRs that solve an issue)?

Reviewer checklist

CHANGELOG updated with public API or any other important changes?

📚 Documentation preview 📚: https://metatrain--527.org.readthedocs.build/en/527/


frostedoyster

Just checked the model part for now, I will check the data augmentation and infrastructure parts later

docs/src/advanced-concepts/fitting-atomic-basis-spherical-targets.rst

src/metatrain/experimental/nanopet/model.py

src/metatrain/experimental/nanopet/modules/augmentation.py

src/metatrain/utils/data/target_info.py


jwa7 · 2025-04-16T14:35:01Z

@Luthaf @frostedoyster @ppegolo here's an update - ready for review when you're ready!

On my side, still to do (but shouldn't affect review in the meantime):

Write some tests
Update the changelog
Fix the model export torchscript error - help appreciated, I can't seem to figure it out!

src/metatrain/utils/augmentation.py

src/metatrain/utils/data/get_dataset.py

Luthaf

I'm not sure how much I understand the changes to PET, so I'll let someone else check this part.

One question is how much work would it be to port this to NativePET?

Luthaf · 2025-04-17T14:35:26Z

src/metatrain/experimental/nanopet/model.py

+            if self.atomic_basis_target_info[output_name][
+                "type"
+            ] == "atomic_basis_spherical" and self.atomic_basis_target_info[
+                output_name
+            ]["sample_kind"].startswith("per_pair"):


This is not great to read. Maybe you could extract two variables and change this to something like if atomic_basis_is_spherical and sample_is_per_pair
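
A minimal sketch of what this suggestion could look like, written as a standalone function so it runs on its own. The dictionary contents, the target names and the "per_atom" value are assumptions made for illustration; only the "type" and "sample_kind" fields and the "per_pair" prefix come from the diff above.

```python
from typing import Dict

# Hypothetical stand-in for ``self.atomic_basis_target_info``
atomic_basis_target_info: Dict[str, Dict[str, str]] = {
    "hamiltonian": {
        "type": "atomic_basis_spherical",
        "sample_kind": "per_pair_sym",
    },
    "density": {
        "type": "atomic_basis_spherical",
        "sample_kind": "per_atom",  # assumed value, not taken from the PR
    },
}


def is_per_pair_atomic_basis(output_name: str) -> bool:
    """Same condition as in the diff, split into two named booleans."""
    info = atomic_basis_target_info[output_name]
    atomic_basis_is_spherical = info["type"] == "atomic_basis_spherical"
    sample_is_per_pair = info["sample_kind"].startswith("per_pair")
    return atomic_basis_is_spherical and sample_is_per_pair
```

Looking up the target info once also removes the repeated `self.atomic_basis_target_info[output_name]` indexing that makes the original condition wrap over five lines.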

Luthaf · 2025-04-17T14:35:46Z

src/metatrain/experimental/nanopet/model.py

+                # symmetrize the PET edge features and pass through its head
+                if (
+                    self.atomic_basis_target_info[output_name]["sample_kind"]
+                    == "per_pair_sym"


why is this a different kind of sample?

Because in this case the PET features are still one whole block (i.e. we haven't sliced by s2_pi/first_atom_type/second_atom_type yet), and because it is symmetrized the normal samples aren't complete in info: we have "duplicated" samples that carry the index s2_pi=+/-1

Luthaf · 2025-04-17T14:36:56Z

src/metatrain/experimental/nanopet/modules/samples.py

+from metatensor.torch.atomistic import System
+
+
+def get_samples(


documentation please!

I was following suit on what is in concatenate_structures in structure.py in the same directory! :D But sure I can add

Luthaf · 2025-04-17T14:37:42Z

src/metatrain/experimental/nanopet/modules/samples.py

+    If ``include_atom_type=True``, the atom types are prepended dimensions, either
+    corresponding to "center_type" if ``n_center=1`` or ["first_atom_type",
+    "second_atom_type"] if ``n_center=2``.


There is no longer an n_center parameter

Luthaf · 2025-04-17T14:38:27Z

src/metatrain/experimental/nanopet/modules/samples.py

+                torch.zeros(len(first_atom_type), dtype=torch.int32).reshape(
+                    -1, 1
+                ),  # s2_pi = 0


Suggested change

-                torch.zeros(len(first_atom_type), dtype=torch.int32).reshape(
-                    -1, 1
-                ),  # s2_pi = 0
+                # s2_pi = 0
+                torch.zeros((len(first_atom_type), 1), dtype=torch.int32),

Luthaf · 2025-04-17T14:39:58Z

src/metatrain/experimental/nanopet/modules/samples.py

+# ===== Slicing PET node/edge features for an atomic basis =====
+
+
+def samples_for_atomic_basis_per_atom(


docs please

Luthaf · 2025-04-17T14:40:57Z

src/metatrain/experimental/nanopet/trainer.py

+                # target keys.
+                for target_name in targets.keys():
+                    if predictions[target_name].keys != targets[target_name].keys:
+                        # TODO: use `metatensor.filter_blocks` once PR #XXX is available


Is this metatensor/metatensor#885? I can make a release =)

Yes! Thanks :)
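
For context, the kind of filtering the TODO refers to can be sketched with plain dictionaries. This is not the trainer code and not the metatensor function: `filter_to_target_keys` is a hypothetical name, and it assumes that predictions should simply be restricted to the keys present in the targets.

```python
from typing import Dict, List, Tuple

Key = Tuple[int, ...]


def filter_to_target_keys(
    predictions: Dict[Key, List[float]],
    targets: Dict[Key, List[float]],
) -> Dict[Key, List[float]]:
    """Keep only the predicted blocks whose key also appears in the targets.
    Hypothetical helper operating on toy dict-based tensors."""
    return {key: block for key, block in predictions.items() if key in targets}


# The model predicts blocks for types 1, 6 and 8,
# but this batch only has targets for types 1 and 8
predictions = {(1,): [0.1, 0.2], (6,): [0.3], (8,): [0.4]}
targets = {(1,): [0.0, 0.0], (8,): [0.5]}
filtered = filter_to_target_keys(predictions, targets)
```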

Luthaf · 2025-04-17T14:42:00Z

src/metatrain/utils/augmentation.py

+        # First, build the indices that split the block samples by system
+        split_indices: List[int] = []
+
+        if target_type == "spherical":


what's the difference between target_type == "spherical" and target_type == "atomic_basis_spherical"?

Atomic basis spherical targets don't have all the atomic samples in a given block, only those corresponding to the block's atom types. The splitting along the samples axis needs more care than for pure spherical targets
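
A small sketch of the extra care this needs: in a per-type block, some systems of the batch may contribute no samples at all, so the split has to produce empty slices for them. The helper below is hypothetical (it is not the function in "augmentation.py") and assumes the block's samples are already sorted by system index.

```python
from typing import List


def split_indices_by_system(sample_systems: List[int], n_systems: int) -> List[int]:
    """Return n_systems + 1 boundaries such that system A owns the rows
    [boundaries[A], boundaries[A + 1]) of the block. Hypothetical helper."""
    counts = [0] * n_systems
    for system in sample_systems:
        counts[system] += 1
    boundaries = [0]
    for count in counts:
        boundaries.append(boundaries[-1] + count)
    return boundaries


# "system" column of one block in a batch of 3 systems;
# system 1 has no atoms of this block's type
boundaries = split_indices_by_system([0, 0, 2], n_systems=3)

rows = ["row0", "row1", "row2"]
per_system = [rows[boundaries[a] : boundaries[a + 1]] for a in range(3)]
```

Here system 1 gets an empty slice, whereas for a pure spherical target every system would contribute at least one sample to each block.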

Luthaf · 2025-04-17T14:43:00Z

src/metatrain/utils/augmentation.py

+    s2_pi: int,
+) -> List[int]:
+    """
+    Finds the indices that splits a TensorBlock along the samples axis by system index.


Is this an implementation of metatensor/metatensor#627?

Luthaf · 2025-04-17T14:44:22Z

src/metatrain/utils/data/target_info.py

@@ -30,6 +30,8 @@ def __init__(
        self.is_scalar = False
        self.is_cartesian = False
        self.is_spherical = False
+        self.is_atomic_basis_spherical_per_atom = False
+        self.is_atomic_basis_spherical_per_pair = False


Would be nice to have a definition of what each of them means

Sure can do

jwa7 added 7 commits March 21, 2025 11:24

Allow reading of traget type 'atomic_basis_spherical'

66b214d

Missing TargetInfo attribute

580036c

Allow rotational augmentation of targets on atomic spherical basis (single component)

e635b48

use {}_to_device and {}_to_dtype functions in validation batching

826263b

Modify loss to allow for empty blocks

c9d14cb

Allow multi-output training for atomic basis targets

cf17a19

Stubs for documentation

7a1179b

frostedoyster reviewed Mar 27, 2025

frostedoyster reviewed Mar 28, 2025

src/metatrain/experimental/nanopet/modules/augmentation.py (outdated)

src/metatrain/utils/data/target_info.py (outdated)

jwa7 and others added 2 commits April 3, 2025 16:48

Use DiskDataset.

4c2942b

Co-authored-by: Filippo Bigi <[email protected]> Co-authored-by: Paolo Pegolo <paolo.pegolo.epfl.ch>

Linter on docs.

130b110

Co-authored-by: Paolo Pegolo <[email protected]>

jwa7 force-pushed the pet_atomic_basis branch from 04d0e2f to 130b110 Compare April 4, 2025 14:07

jwa7 added 7 commits April 8, 2025 09:04

Merge branch 'main' into pet_atomic_basis

014cdf1

Predict whole matrix, part 1

716bde1

Treat matrices as a single target

3779229

Update docs

4c52ce8

Format, lint

06bf314

Remove uncoupled basis targets for now

4f3d879

Modify per_atom.py

fe6f26d

jwa7 requested review from Luthaf, frostedoyster and ppegolo April 16, 2025 14:24

Merge branch 'main' into pet_atomic_basis

696a62d

jwa7 marked this pull request as ready for review April 16, 2025 14:35

ppegolo reviewed Apr 17, 2025

src/metatrain/utils/augmentation.py

ppegolo reviewed Apr 17, 2025

src/metatrain/utils/data/get_dataset.py (outdated)

Paolo review comments

a1e06d5

jwa7 self-assigned this Apr 17, 2025

Luthaf reviewed Apr 17, 2025
