Residual Scaling (normalization strategy) #319

Open
@PortillaS-Predictia

Description

Is your feature request related to a problem? Please describe.

Following Watt-Meyer et al. (2023, 2024; ACE and ACE2), variables are normalized using a residual scaling approach, such that predicting outputs equal to the inputs would result in each variable contributing equally to the loss function.

This should be straightforward, since anemoi-datasets already computes the tendencies statistics needed for this normalization strategy. However, the reference formulas (Appendix H) use the standard deviation of the tendencies of the mean-std normalized fields, whereas anemoi-datasets computes these statistics for the unnormalized fields.

Describe the solution you'd like

See our proposed solution. This is an implementation of the reference formulas (Appendix H), using the tendencies statistics computed by anemoi-datasets. We reworked the reference formulas so that the statistics of the unnormalized fields are used instead:

Let $a$ be the target field, $a' = a(t+1) - a(t)$ its tendency, and $a_{\textup{ff}}=\frac{a-\mu(a)}{\sigma(a)}$ the mean-std normalized field (or full-field normalized, using their terminology). Then, the residual scaling is

$$a_{\textup{res}}=\frac{a_{\textup{ff}}}{\sigma_{\textup{res}}(a)}, \qquad \sigma_{\textup{res}}(a) = \frac{\sigma(a'_{\textup{ff}})}{\eta_{a\in\mathbf{T}}\left(\sigma(a'_{\textup{ff}})\right)},$$

where $\sigma(a'_{\textup{ff}})$ is the standard deviation of the tendency of the mean-std normalized field (not of the unnormalized field), and $\eta$ is the geometric mean of this quantity over all targeted fields $\mathbf{T}$ (hereafter, we omit the $a\in\mathbf{T}$ for clarity). Notice that these are just the reference equations from the paper, with $\eta$ denoting the geometric mean.

Now, we want to rewrite these equations in terms of the statistics of the unnormalized field $a$ (instead of the normalized field $a_{\textup{ff}}$). Since $\mu(a)$ and $\sigma(a)$ are constants, we have

$$\sigma(a'_{\textup{ff}})=\sigma\left(a_{\textup{ff}}(t+1)-a_{\textup{ff}}(t)\right)=\sigma\left(\frac{a(t+1) - \mu(a)}{\sigma(a)} - \frac{a(t) - \mu(a)}{\sigma(a)}\right)=\frac{\sigma\left(a(t+1)-a(t)\right)}{\sigma(a)}= \frac{\sigma(a')}{\sigma(a)},$$

which depends only on the unnormalized field $a$. Then, we rewrite the residual scaling as

$$a_{\textup{res}} = \frac{a_{\textup{ff}}}{\sigma_{\textup{res}}(a)} = \frac{(a-\mu(a))/\sigma(a)}{\sigma(a'_{\textup{ff}})/\eta\left(\sigma(a'_{\textup{ff}})\right)} = \eta\left(\sigma(a'_{\textup{ff}})\right) \cdot \frac{a-\mu(a)}{\sigma(a)\cdot\sigma(a'_{\textup{ff}})}.$$

Using the relation $\sigma(a'_{\textup{ff}})=\frac{\sigma(a')}{\sigma(a)}$, we have $\eta\left(\sigma(a'_{\textup{ff}})\right)=\eta\left(\frac{\sigma(a')}{\sigma(a)}\right)$ and

$$\frac{a-\mu(a)}{\sigma(a)\cdot\sigma(a'_{\textup{ff}})}=\frac{a-\mu(a)}{\sigma(a)\cdot\frac{\sigma(a')}{\sigma(a)}}=\frac{a-\mu(a)}{\sigma(a')}.$$

Thus,

$$a_{\textup{res}} = \eta\left(\frac{\sigma(a')}{\sigma(a)}\right)\cdot\frac{a-\mu(a)}{\sigma(a')},$$

which depends only on the unnormalized field $a$ (which is what anemoi-datasets actually computes the tendencies statistics for). Written as an affine transform $a_{\textup{res}} = \mathbf{mul}\cdot a + \mathbf{add}$, the residual scaling consists of multiplying by $\mathbf{mul}=\frac{\eta\left(\frac{\sigma(a')}{\sigma(a)}\right)}{\sigma(a')}$ and adding $\mathbf{add}=-\frac{\mu(a)\cdot\eta\left(\frac{\sigma(a')}{\sigma(a)}\right)}{\sigma(a')}$.
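
The derivation above can be checked numerically. The following is a minimal sketch using only the Python standard library; the field names (`t2m`, `msl`), the random-walk data, and the helper functions are hypothetical stand-ins for illustration, not part of anemoi-datasets or the proposed implementation. It verifies the identity $\sigma(a'_{\textup{ff}})=\sigma(a')/\sigma(a)$ and that, after residual scaling, the tendency of every field has the same standard deviation (equal to $\eta$).

```python
import math
import random
import statistics


def tendency(series):
    """One-step differences a(t+1) - a(t)."""
    return [nxt - cur for cur, nxt in zip(series, series[1:])]


def random_walk(n, step, start, rng):
    """Synthetic time series standing in for a field (illustrative only)."""
    values = [start]
    for _ in range(n - 1):
        values.append(values[-1] + rng.gauss(0.0, step))
    return values


rng = random.Random(0)
# Hypothetical fields with very different magnitudes and variability.
fields = {
    "t2m": random_walk(500, step=0.5, start=280.0, rng=rng),
    "msl": random_walk(500, step=80.0, start=101325.0, rng=rng),
}

# Statistics of the UNNORMALIZED fields.
mu = {k: statistics.fmean(v) for k, v in fields.items()}
sigma = {k: statistics.pstdev(v) for k, v in fields.items()}
sigma_tend = {k: statistics.pstdev(tendency(v)) for k, v in fields.items()}

# Identity: sigma(a'_ff) == sigma(a') / sigma(a).
identity_holds = {}
for k, v in fields.items():
    a_ff = [(x - mu[k]) / sigma[k] for x in v]
    identity_holds[k] = math.isclose(
        statistics.pstdev(tendency(a_ff)), sigma_tend[k] / sigma[k], rel_tol=1e-9
    )

# Geometric mean of the ratios, via the logarithmic definition.
log_ratios = [math.log(sigma_tend[k] / sigma[k]) for k in fields]
eta = math.exp(statistics.fmean(log_ratios))

# Apply a_res = mul * a + add and measure the std of its tendency.
res_tend_std = {}
for k, v in fields.items():
    mul = eta / sigma_tend[k]
    add = -mu[k] * mul
    a_res = [mul * x + add for x in v]
    res_tend_std[k] = statistics.pstdev(tendency(a_res))

print(identity_holds)
print(eta, res_tend_std)
```

Since $\sigma(a'_{\textup{res}}) = \mathbf{mul}\cdot\sigma(a') = \eta$ for every field, a prediction equal to the input leaves a residual of the same scale in each variable, which is the stated goal of the normalization.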

In our implementation, the geometric mean $\eta\left(\frac{\sigma(a')}{\sigma(a)}\right)$ is accumulated iteratively within the main for loop, and both $\mathbf{add}$ and $\mathbf{mul}$ are multiplied by it afterwards (notice that we use the logarithmic definition of the geometric mean, i.e. the exponential of the mean of the logarithms). If the tendencies' stdev doesn't exist in the statistics dictionary (because the tendencies statistics weren't computed during the creation of the dataset), then the code falls back to using the stdev instead. In that case every ratio $\frac{\sigma(a')}{\sigma(a)}$ is 1, so $\eta=1$ and the formulae above reduce to a mean-std normalization.
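
As an illustration of the loop described above, here is a minimal sketch under stated assumptions. The function name `residual_scaling` and the plain-dict layout of the statistics (`mean`, `stdev`, `tendency_stdev`) are hypothetical and do not reflect the actual anemoi data structures or the code in the proposed solution; the fallback is applied per variable here, which is also an assumption.

```python
import math


def residual_scaling(mean, stdev, tendency_stdev, targets):
    """Return ({name: mul}, {name: add}) such that a_res = mul * a + add.

    All arguments are hypothetical plain dicts keyed by variable name.
    """
    mul, add = {}, {}
    log_sum = 0.0
    for name in targets:
        # Fall back to the plain stdev when no tendency stdev is available.
        sigma_t = tendency_stdev.get(name, stdev[name])
        # Unscaled factors; the geometric mean is applied after the loop.
        mul[name] = 1.0 / sigma_t
        add[name] = -mean[name] / sigma_t
        # Accumulate log(sigma(a') / sigma(a)) for the geometric mean.
        log_sum += math.log(sigma_t / stdev[name])

    # Logarithmic definition of the geometric mean.
    eta = math.exp(log_sum / len(targets))
    for name in targets:
        mul[name] *= eta
        add[name] *= eta
    return mul, add


# Toy statistics (made-up numbers, for illustration only).
mean = {"a": 10.0, "b": 0.0}
stdev = {"a": 2.0, "b": 4.0}
tendency_stdev = {"a": 0.5, "b": 4.0}

mul, add = residual_scaling(mean, stdev, tendency_stdev, ["a", "b"])

# With no tendency statistics, the result is a mean-std normalization.
mul_fb, add_fb = residual_scaling(mean, stdev, {}, ["a", "b"])
```

In the toy example the ratios are 0.25 and 1, so the geometric mean is 0.5 and `mul["a"]` is approximately 1. In the fallback call every ratio is 1, hence `mul_fb[name] == 1 / stdev[name]` and `add_fb[name] == -mean[name] / stdev[name]`.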

Additionally, we added the tendencies' stdev (and its ratio to the stdev) to the output of the inspect command in anemoi-datasets.

Describe alternatives you've considered

No response

Additional context

No response

Organisation

Predictia Intelligent Data Solutions - DestinationEarth393

Metadata

    Labels

    enhancement: New feature or request
