Skip to content

Move to stanio appears to break arviz converter #693

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tillahoffmann opened this issue Sep 7, 2023 · 10 comments
Closed

Move to stanio appears to break arviz converter #693

tillahoffmann opened this issue Sep 7, 2023 · 10 comments

Comments

@tillahoffmann
Copy link
Contributor

The refactor in #681 to move data processing into a dedicated package appears to break the arviz.data.io_cmdstanpy.CmdStanPyConverter. In particular, f6a9e24 removed the stan_vars_cols property from cmdstanpy.stanfit.metadata.InferenceMetadata which the converter accesses here. Not sure where to best address the discrepancy, but thought I'd provide a heads up.

@mitzimorris
Copy link
Member

thanks - I worked on this a long time ago - I could try to find a fix.

@WardBrian
Copy link
Member

The bug should probably be filed against ArViz, since they're using API internals:

No guarantees are made about backwards compatibility between minor versions and refactors are expected. If you find yourself needing something exposed here, please open an issue requesting it be added to the public API.

@ahartikainen
Copy link
Contributor

Yes, this is ArviZ bug and we need to add issue there. We probably want to add similar fixes as we have had before (against version of cmdstanpy).

Is there any way to access this information in cmdstanpy side?

@WardBrian
Copy link
Member

It depends what you mean by "this information". For certain contexts, fit.column_names is probably a replacement

@WardBrian
Copy link
Member

Although, if Arviz wants to handle tuple parameters, using stan_variables or the data_xr method directly is probably easier

@OriolAbril
Copy link
Contributor

We can try to find a workaround. I'll try to open a PR to ArviZ when I have time.

On a different but related topic, I also wanted to test the waters on updating the xarray converter in cmdstanpy. I think it is great (and actually better) to have the converters in the ppl interfaces so they are always "aligned" with the ppl output instead of needing to support multiple versions at all time because they are a different library.

However, I can't recommend the current xarray converter because there is no way for the user to define the dimensions and coordinates, which to me is like getting the worst of both worlds. Using xarray kind of forces you to use labels for dimensions and coordinates instead of positional indexes, but you are unable to define them so you get partially arbitrary dimensions and coordinates still based on positional indexing.

I can help support that and also wanted to propose depending on arviz-base to take advantage of the scaffolding it will provide. arviz-base is still experimental, but will be a very small python library with only xarray as a dependency, with mostly helper functions to convert ppl outputs to xarray. There already is an initial release but I still say will because it depends on xarray-datatree package too, which is a package where xarray devs are developing a new data structure with czi funding. Once datatree development stabilizes it will be moved into the main xarray codebase. (datatree will be similar to inferencedata but significanly more flexible)

For the inferencedata/datatree converter, it could either also be in cmdstanpy or be in arviz and limit itself to using data_xr and organizing/sorting the variables into the different groups. I think this is much less important

@WardBrian
Copy link
Member

I would definitely be in favor of improvements to the data_xr method, especially if they are helpful to Arviz.

I’d also be happy to have Arviz as a test dependency after that is sorted out to make sure we don’t inadvertently break the converter in the future (assuming it is using only public APIs like the xarray converter).

As for actual runtime dependencies, I think I would currently say wait-and-see. We don’t currently require xarray as a dependency, so taking that would be bigger than arviz_base for us

@OriolAbril
Copy link
Contributor

We don’t currently require xarray as a dependency, so taking that would be bigger than arviz_base for us

optional is perfectly fine too, I was mostly thinking about switching xarray or arviz-base

@WardBrian
Copy link
Member

I think it would be fine if the optional dependencies were xarray and arviz-base.

My assumption is users who want to use xarray likely have arviz in their environment, anyway

@WardBrian
Copy link
Member

Since arviz is handling this (arviz-devs/arviz#2276, arviz-devs/arviz#2278) I'm going to close this issue.

@OriolAbril I encourage you to open a new issue or PR about improvements to data_xr if we can make arviz' life better by adding useful functionality here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants