Implement I/O for parquet files #307

Open
@niksirbi

Description

Is your feature request related to a problem? Please describe.
This is mainly to facilitate data exchange with @roaldarbol's animovement R package, which represents data in a tidy dataframe. See the previous discussion on Zulip.

Describe the solution you'd like
This would necessitate implementing two new I/O functions:

  • load_poses.from_tidy_df()
  • save_poses.to_tidy_df()

The tidy dataframe could be a pandas version of the table that animovement uses as its primary data structure.
After that, we can rely on pandas' existing DataFrame.to_parquet method and read_parquet function for the file I/O.
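
To make the proposal concrete, here is a minimal sketch of what the two conversions might look like. It assumes a dataset with a `position` variable over (time, space, keypoints, individuals) and a `confidence` variable over (time, keypoints, individuals). The column names of the tidy table are an assumption here, not the confirmed animovement schema, and the function bodies are illustrative rather than a proposed implementation.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumed identifier columns of the tidy table (not the confirmed animovement schema).
INDEX_COLS = ["time", "individuals", "keypoints"]


def to_tidy_df(ds: xr.Dataset) -> pd.DataFrame:
    """Flatten a poses dataset into one row per (time, individual, keypoint)."""
    # Split the "space" dimension into separate variables named "x" and "y".
    wide = ds["position"].to_dataset(dim="space")
    wide["confidence"] = ds["confidence"]
    # Dimensions become index levels, which reset_index turns into columns.
    return wide.to_dataframe().reset_index()


def from_tidy_df(df: pd.DataFrame) -> xr.Dataset:
    """Rebuild the dataset from a table shaped like the output of to_tidy_df."""
    wide = df.set_index(INDEX_COLS).to_xarray()
    # Stack whichever spatial columns are present back into a "space" dimension.
    space_cols = [c for c in ("x", "y", "z") if c in wide.data_vars]
    position = wide[space_cols].to_array(dim="space")
    return xr.Dataset(
        {
            "position": position.transpose(
                "time", "space", "keypoints", "individuals"
            ),
            "confidence": wide["confidence"].transpose(
                "time", "keypoints", "individuals"
            ),
        }
    )


# Toy dataset: 3 frames, 2D space, 2 keypoints, 2 individuals.
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {
        "position": (
            ("time", "space", "keypoints", "individuals"),
            rng.random((3, 2, 2, 2)),
        ),
        "confidence": (
            ("time", "keypoints", "individuals"),
            rng.random((3, 2, 2)),
        ),
    },
    coords={
        "time": [0, 1, 2],
        "space": ["x", "y"],
        "keypoints": ["nose", "tail"],
        "individuals": ["id_0", "id_1"],
    },
)

tidy = to_tidy_df(ds)
restored = from_tidy_df(tidy)
```

With the data in this shape, `tidy.to_parquet("poses.parquet")` and `pd.read_parquet("poses.parquet")` would cover the file I/O, provided a parquet engine such as pyarrow is installed.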

Describe alternatives you've considered
We could also consider wrapping the above functions into load_poses.from_animovement_file and save_poses.to_animovement_file, which would do both things:

  • read/write parquet from/to a pandas DataFrame (using the aforementioned native pandas functions)
  • convert between a tidy dataframe and a movement xarray dataset (using the two new functions proposed above).

This is similar to how we handle DeepLabCut dataframes and files.

Additional context
Being able to convert movement datasets into a 2D "tidy" format opens up the option of saving them to formats optimised for tables. Having the dataset in this form (where every variable is a column) also makes it easier to use certain plotting libraries, like seaborn.

Metadata

Assignees: No one assigned
Type: No type
Project status: 🚧 In Progress
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests