Multiple Datasets III - Configure time coverage #232

Open
JPXKQX opened this issue Apr 7, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@JPXKQX
Member

JPXKQX commented Apr 7, 2025

Add the ability to define time coverage individually per dataset/source, both in input and output.

Goal

Train a forecasting model that:

  • Takes as input: ERA [t, t-6h] + CERRA [t, t-1h, t-2h]
  • Produces as output: CERRA [t+1h, t+2h, t+3h]
  • No rollout
  • No diagnostics
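
One way to picture the goal above is as a set of time offsets relative to the reference time t, declared separately for each dataset and each role. The sketch below is only an illustration: the `time_coverage` layout, the lowercase dataset keys, and the `window_span` helper are hypothetical and are not taken from the project's config schema.

```python
from datetime import timedelta

H = timedelta(hours=1)

# Hypothetical layout: offsets relative to the reference time t,
# grouped by role (input/output) and then by dataset name.
time_coverage = {
    "input": {
        "era": [-6 * H, 0 * H],             # ERA [t-6h, t]
        "cerra": [-2 * H, -1 * H, 0 * H],   # CERRA [t-2h, t-1h, t]
    },
    "output": {
        "cerra": [1 * H, 2 * H, 3 * H],     # CERRA [t+1h, t+2h, t+3h]
    },
}


def window_span(coverage):
    """Return the earliest and latest offset used by any dataset in any role."""
    offsets = [
        offset
        for datasets in coverage.values()
        for dataset_offsets in datasets.values()
        for offset in dataset_offsets
    ]
    return min(offsets), max(offsets)


earliest, latest = window_span(time_coverage)
print(earliest, latest)
```

The span from the earliest to the latest offset is the amount of data that one training sample has to cover around t, which is useful when deciding which reference times are usable.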

Scope:

  • Config: Extend dataset config to allow specifying custom temporal offsets for each dataset.
  • Data module: Implement logic in datamodule/data handler to fetch time slices accordingly.
  • Training: Adapt the loss function so that it is computed over all output time steps.
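
The data module and training items above could be sketched roughly as follows. This is a toy illustration with NumPy arrays, not the project's data handler: `fetch_steps` and `multi_step_mse` are hypothetical names, the 6-hourly and 1-hourly frequencies are assumptions made for the example, and the unweighted mean over steps is just one possible way to combine per-step losses.

```python
import numpy as np


def fetch_steps(data, start_h, frequency_h, t_h, offsets_h):
    """Return the records of `data` valid at t_h + offset for each offset (in hours).

    `data` has time as its first axis, starts at hour `start_h`, and has one
    record every `frequency_h` hours.
    """
    indices = []
    for off in offsets_h:
        rel = t_h + off - start_h
        if rel % frequency_h != 0:
            raise ValueError(f"{off:+d}h does not fall on the {frequency_h}h time axis")
        i = rel // frequency_h
        if not 0 <= i < data.shape[0]:
            raise IndexError(f"{off:+d}h is outside the dataset's time range")
        indices.append(i)
    return data[indices]


def multi_step_mse(pred, target):
    """Mean squared error per output step, and its unweighted mean over all steps."""
    per_step = ((pred - target) ** 2).reshape(pred.shape[0], -1).mean(axis=1)
    return per_step.mean(), per_step


# Toy datasets whose values equal their valid hour, so the slices are easy to check.
era = np.arange(10, dtype=float)[:, None] * 6.0   # assumed 6-hourly
cerra = np.arange(60, dtype=float)[:, None]       # assumed 1-hourly

t_h = 12
era_in = fetch_steps(era, 0, 6, t_h, [-6, 0])
cerra_in = fetch_steps(cerra, 0, 1, t_h, [-2, -1, 0])
cerra_out = fetch_steps(cerra, 0, 1, t_h, [1, 2, 3])

pred = cerra_out + 1.0  # stand-in for a model prediction
loss, per_step = multi_step_mse(pred, cerra_out)
```

Keeping the per-step values alongside the mean makes it easy to weight lead times differently later, if that turns out to be needed.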

Notes

  • A step toward native support for temporal interpolation, delayed/coupled inputs, multi-step output, ...
  • Requires tight integration with the metadata and config schema (see New Metadata Schema #229).
@JPXKQX JPXKQX added the enhancement New feature or request label Apr 7, 2025
@JPXKQX JPXKQX added this to the Multiple datasets milestone Apr 7, 2025
@eliott-lumet

eliott-lumet commented May 5, 2025

Hello, for the spatial downscaling task (#233) it would also be useful to support training a model that:

  • Takes as input: ERA [t]
  • Produces as output: CERRA [t]

@mchantry
Member

mchantry commented May 5, 2025

Thanks @eliott-lumet, that's helpful feedback. Downscaling is planned for this milestone, through a combination of this issue and #233.

@sabrinawahl

For regional models an additional feature would be to have a model that takes global/boundary data at future/predicted time steps as input:

  • Global data input at t, t+3h
  • Regional data input at t, t-3h, ...
  • Regional data output at t+3h
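
With offsets that reach both backward and forward in time, one practical question is which reference times t have every required step available in every dataset. The sketch below shows one possible check; `valid_reference_times` is a hypothetical helper, the 3-hourly time axes are assumptions, and the regional input is limited to t and t-3h for the example (the list above leaves further past steps open).

```python
def valid_reference_times(available, offsets):
    """Return sorted reference times t (hours) for which every t + offset exists.

    available: dataset name -> set of hours with data
    offsets:   dataset name -> list of hour offsets needed (inputs and outputs)
    """
    candidates = set.union(*available.values())
    return sorted(
        t
        for t in candidates
        if all(
            t + off in available[name]
            for name, dataset_offsets in offsets.items()
            for off in dataset_offsets
        )
    )


# Assumed toy time axes: both datasets 3-hourly over hours 0..24.
available = {
    "global": set(range(0, 25, 3)),
    "regional": set(range(0, 25, 3)),
}

# Global input at t and t+3h; regional input at t-3h and t, regional output at t+3h.
offsets = {
    "global": [0, 3],
    "regional": [-3, 0, 3],
}

valid = valid_reference_times(available, offsets)
print(valid)
```

The first and last hours drop out because the regional dataset lacks t-3h at the start and both datasets lack t+3h at the end.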


4 participants