fix: unable to create cutout when dataset vars are ordered differently #341

Open
wants to merge 1 commit into
base: main

@veech veech commented May 24, 2025

Description

When attempting to create a limited area cutout using a dataset definition like this:

dataset: 
  cutout:
    - era5-ml88-conus-0p25-2020010100-2020013123-1h-v2.zarr
    - join:
        - era5-ml137t49-geopotential-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-ml137t49-sp-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-ml137t49-temperature-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-ml137t49-uwind-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-ml137t49-vorticity-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-ml137t49-vwind-0p25-2020010100-2020013123-1h-v1.zarr
        - era5-sfc-all-0p25-2020010100-2020013123-1h-v1.zarr

I would get the error:

ValueError: Incompatible variables: 

This comes from here.

Even though the datasets contain the same variables, it must be that the variables end up in a different order once the boundary datasets are joined.
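
To illustrate the suspected cause, here is a minimal sketch of how an order-sensitive comparison can reject two variable lists that hold the same names. The function name and the variable lists are hypothetical and chosen only for illustration; this is not the check used in the library.

```python
# Hypothetical illustration: NOT the library's actual compatibility check.
def same_variables_different_order(a, b):
    """Return True when both lists hold the same names but in a different order."""
    return list(a) != list(b) and sorted(a) == sorted(b)


# Made-up variable lists standing in for the cutout and the joined boundary data.
cutout_vars = ["2t", "tp", "sp"]
joined_vars = ["sp", "2t", "tp"]

# A plain list comparison is order-sensitive, so these are reported as different.
print(cutout_vars == joined_vars)
# The helper shows that only the order differs, not the set of names.
print(same_variables_different_order(cutout_vars, joined_vars))
```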


FussyDuck commented May 24, 2025

CLA assistant check
All committers have signed the CLA.

@mchantry mchantry moved this to Reviewers needed in Anemoi-dev Jun 4, 2025
@floriankrb
Member

If the variables are not in the same order, open_dataset rightly complains that the data cannot be merged.

To avoid this, the variables can be reordered by passing a list to select: https://anemoi.readthedocs.io/projects/datasets/en/latest/datasets/using/selecting.html

This can be tested with: ds = open_dataset(dataset, select=["2t", "tp"])
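
As a rough sketch of how the list passed to select might be built, the helper below takes the variable order of a reference dataset and checks that the other dataset provides every name. The helper name is hypothetical, and the commented-out usage assumes the variable names of an opened dataset are available as a list.

```python
# Hypothetical helper: builds the list to pass as `select=...`.
def select_in_reference_order(reference, other):
    """Return the reference variable order, after checking `other` has every name."""
    available = set(other)
    missing = [name for name in reference if name not in available]
    if missing:
        raise ValueError(f"Variables missing from the other dataset: {missing}")
    # Extra variables in `other` are dropped; the reference order is kept.
    return list(reference)


select_list = select_in_reference_order(["2t", "tp"], ["tp", "2t"])
print(select_list)

# Assumed usage, following the call shown in the comment above:
# ds = open_dataset(dataset, select=select_list)
```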

--
Alternatively, you may want to try https://anemoi.readthedocs.io/projects/datasets/en/latest/datasets/using/matching.html#using-matching to adjust the variables automatically when doing a concat: ds = open_dataset(concat=[dataset1, dataset2], adjust="variables")
But I don't think it applies directly to your case, since it seems that each dataset contains only one variable.
