
Keeping track of Polars DataTypes for Polars schemas support #1422


Open
AndriiG13 opened this issue Nov 13, 2023 · 0 comments

This issue is for keeping track of more niche Polars datatypes during the 'adding basic Polars support' effort #1064. As far as I understand, for these datatypes we should first consider if we want to support them at all and then if we want to support built-in checks for them. The considerations below came out of adding built-in checks to Polars schemas #1421, which is part of the larger basic support effort.

These considerations are limited to my experience and research when implementing built-in checks, so any additions/feedback is much appreciated.

Unknown Datatype:

I feel that supporting this would go against the point of having a schema in the first place.

Object Datatype:

This is for wrapping arbitrary Python objects. I believe (correct me if I'm wrong) that this is supported for pandas schemas. As for implementing checks, this type may be too general for us to know which checks to support. Additionally, running some comparisons with this type (like equality) resulted in a Polars exception: cannot coerce datatypes: ComputeError(ErrString('failed to determine supertype of object and i32')).

Decimal Datatype:

Supporting this would be useful, but the datatype is currently labeled 'experimental' by Polars. So we should decide whether to implement support for it now or wait until it loses the 'experimental' label. Further, there are some issues when creating a DataFrame with this type; see the related issues.

Array Datatype:

This differs from the List datatype in that Array values in a column have a fixed length.
Equality comparison for this type was not working until recently (see issue). So if we want to support checks for it, we need to make sure that the user has at least the minimum required Polars version.
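
A minimal sketch of what such a version gate could look like. The helper names `parse_version` and `supports_array_checks` are hypothetical, and the minimum version is deliberately left as a parameter because the exact release that fixed Array equality is not stated here. It only assumes that the installed version is available as a string (Polars exposes it as `pl.__version__`).

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '0.19.13' -> (0, 19, 13).

    Suffixes such as 'rc1' are ignored; a missing patch component counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def supports_array_checks(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the required minimum.

    `minimum` is a placeholder supplied by the caller, not a known Polars release.
    """
    return parse_version(installed) >= parse_version(minimum)


# Hypothetical usage inside the check backend:
#   import polars as pl
#   if not supports_array_checks(pl.__version__, MIN_POLARS_FOR_ARRAY_CHECKS):
#       raise NotImplementedError("Array checks need a newer Polars version")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "0.9.0" sorting after "0.19.0".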

Struct Datatype:

I did not find a way to compare a struct type column to a Python dictionary (or other Python types), which would make adding check support difficult. We should investigate this further if we want to support checks for Structs.
