Add polars engine dtypes #1465

Merged

FilipAisot
Contributor

@FilipAisot FilipAisot commented Jan 23, 2024

Description

This PR adds the following Polars support to Pandera:

Accomplished:

  • Added Polars type support to Pandera.
  • Implemented the Polars engine.
  • Introduced Polars utilities for detecting incoercible items.
  • Added unit tests for the above implementations.
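
To make the "incoercible items" idea concrete, here is a minimal, library-independent sketch in plain Python. It is not the code from this PR and does not use the Polars API; the function name `find_incoercible` and its signature are assumptions made for illustration. The idea is simply to attempt the cast per item and report the positions that fail.

```python
from typing import Any, Callable, List, Tuple


def find_incoercible(
    values: List[Any], caster: Callable[[Any], Any]
) -> List[Tuple[int, Any]]:
    """Return (index, value) pairs for items that `caster` cannot convert.

    Hypothetical helper for illustration only; not part of Pandera or Polars.
    """
    failures = []
    for index, value in enumerate(values):
        if value is None:
            # Assumption: missing values stay missing and count as coercible.
            continue
        try:
            caster(value)
        except (TypeError, ValueError):
            failures.append((index, value))
    return failures


raw = ["1", "2", "abc", None, "4.5"]
print(find_incoercible(raw, int))  # [(2, 'abc'), (4, '4.5')]
```

A columnar engine would normally do this in a vectorized way instead of looping, but the reported result (which rows could not be cast) is the same kind of information.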

Additional Information on Category Type:

  • The Category type is optional. It is implemented as a logical type on top of polars.Utf8, with additional logic to replicate the behavior of the Pandera Category type.
  • This addition is needed because Polars has no built-in equivalent of the Pandera Category type; the logical type keeps the Polars engine compatible with the rest of Pandera.
  • Because this behavior is not available in the core Polars library, the logical layer has to provide it.
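
As a rough illustration of "a logical type over string storage", the sketch below uses plain Python lists of strings in place of a polars.Utf8 column. The class name `CategorySketch` and its methods are hypothetical and are not the PR's implementation; they only show the two pieces of extra logic such a type needs: a membership check and a coercion step that rejects unknown values.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class CategorySketch:
    """Hypothetical logical type: string storage plus a fixed set of categories."""

    categories: FrozenSet[str]

    def check(self, values: Iterable[Optional[str]]) -> List[bool]:
        # Assumption: None is a missing value and always passes the check.
        return [
            v is None or (isinstance(v, str) and v in self.categories)
            for v in values
        ]

    def coerce(self, values: Iterable[Any]) -> List[Optional[str]]:
        # Physical step: convert everything to strings (the storage type).
        coerced = [None if v is None else str(v) for v in values]
        # Logical step: reject strings that are not declared categories.
        invalid = [v for v in coerced if v is not None and v not in self.categories]
        if invalid:
            raise ValueError(f"values not in categories: {invalid}")
        return coerced


size = CategorySketch(frozenset({"small", "large"}))
print(size.check(["small", "huge", None]))  # [True, False, True]
```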

Additional Information on Decimal Type:

  • The polars.Decimal type is currently considered unstable, so it was not used to implement this type.
  • Instead, this type is a standard polars.Float64 with a logical layer that makes it behave like a pandera.Decimal type.
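
The following sketch shows one way a logical layer could impose decimal semantics on float storage. It uses only Python's standard `decimal` module rather than Polars, and the helper names `round_to_scale` and `fits_decimal` as well as the `precision`/`scale` parameters are assumptions for illustration, not the PR's actual code.

```python
from decimal import ROUND_HALF_EVEN, Decimal, InvalidOperation


def round_to_scale(value: float, scale: int) -> Decimal:
    """Round a float to `scale` decimal places (hypothetical helper)."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 -> Decimal('0.01')
    # Going through str() avoids carrying binary float noise into the Decimal.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


def fits_decimal(value: float, precision: int, scale: int) -> bool:
    """True if `value`, rounded to `scale` places, needs at most `precision` digits."""
    try:
        rounded = round_to_scale(value, scale)
    except InvalidOperation:
        # Raised for infinities and for values too large for the decimal context.
        return False
    if not rounded.is_finite():
        return False  # NaN passes through quantize, so reject it here
    return len(rounded.as_tuple().digits) <= precision


print(fits_decimal(123.456, precision=5, scale=2))  # True
print(fits_decimal(123.456, precision=4, scale=2))  # False
```

The trade-off of a float-backed decimal is that values are only as exact as Float64 allows; the logical layer can validate precision and scale, but it cannot restore digits the float never stored.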

Todos

  • Implement support for Array, List, and Struct types.
  • Test Decimal and Category types.

@FilipAisot FilipAisot force-pushed the polars-dev-add-engine-dtypes branch from e6a5a6f to f46fa9f Compare January 28, 2024 18:13
Collaborator

@cosmicBboy cosmicBboy left a comment

Great work @FilipAisot and thanks for the review @AndriiG13 🚀

Comment on lines +388 to +390
###############################################################################
# Nested types
###############################################################################
Collaborator

Is this a placeholder comment for nested types? https://docs.pola.rs/py-polars/html/reference/datatypes.html#nested

Will this be implemented in a follow-up PR?

Contributor Author

@FilipAisot FilipAisot Feb 19, 2024

Yes, I wanted to get the fundamentals in first; I will follow up with this now.

@cosmicBboy cosmicBboy merged commit edd85db into unionai-oss:polars-dev Feb 19, 2024
cosmicBboy pushed a commit that referenced this pull request Feb 20, 2024
* Add polars engine and dtypes.

Signed-off-by: filipAisot <[email protected]>

* Add polars dependency.

Signed-off-by: filipAisot <[email protected]>

* Fix polars tests for polars >= 0.20.0

Signed-off-by: filipAisot <[email protected]>

* Fix polars engine. Add unittests for equivalence checks.

Signed-off-by: filipAisot <[email protected]>

---------

Signed-off-by: filipAisot <[email protected]>
cosmicBboy pushed a commit that referenced this pull request Feb 21, 2024
cosmicBboy pushed a commit that referenced this pull request Feb 23, 2024
cosmicBboy pushed a commit that referenced this pull request Feb 23, 2024
cosmicBboy added a commit that referenced this pull request Mar 8, 2024
* add stub classes

Signed-off-by: Niels Bantilan <[email protected]>

* Add polars engine dtypes (#1465)

* Add polars engine and dtypes.

Signed-off-by: filipAisot <[email protected]>

* Add polars dependency.

Signed-off-by: filipAisot <[email protected]>

* Fix polars tests for polars >= 0.20.0

Signed-off-by: filipAisot <[email protected]>

* Fix polars engine. Add unittests for equivalence checks.

Signed-off-by: filipAisot <[email protected]>

---------

Signed-off-by: filipAisot <[email protected]>

* implement polars backend methods

Signed-off-by: cosmicBboy <[email protected]>

* implement methods for polars backend

Signed-off-by: cosmicBboy <[email protected]>

* implement data type coercion, strictness logic

Signed-off-by: cosmicBboy <[email protected]>

* implement add_missing_columns

Signed-off-by: cosmicBboy <[email protected]>

* implement core check methods

Signed-off-by: cosmicBboy <[email protected]>

* add dataframe model and components for polars

Signed-off-by: cosmicBboy <[email protected]>

* revert model component FieldInfo

Signed-off-by: cosmicBboy <[email protected]>

* fix core unit test regressions

Signed-off-by: cosmicBboy <[email protected]>

* implement generic DataFrameModel

Signed-off-by: cosmicBboy <[email protected]>

* move extract config logic into class definition

Signed-off-by: cosmicBboy <[email protected]>

* implement generic DataFrameModel

Signed-off-by: cosmicBboy <[email protected]>

* polars DataFrameModel uses new dataframe model api

Signed-off-by: cosmicBboy <[email protected]>

* simplify FieldInfo: decouple framework-specific model component

Signed-off-by: cosmicBboy <[email protected]>

* remove unused types

Signed-off-by: cosmicBboy <[email protected]>

* add more polars tests, clean-up pandas/polars api and backends

Signed-off-by: cosmicBboy <[email protected]>

* add more container and component checks

Signed-off-by: cosmicBboy <[email protected]>

* add polars component tests

Signed-off-by: cosmicBboy <[email protected]>

---------

Signed-off-by: Niels Bantilan <[email protected]>
Signed-off-by: filipAisot <[email protected]>
Signed-off-by: cosmicBboy <[email protected]>
Co-authored-by: FilipAisot <[email protected]>
cosmicBboy pushed a commit that referenced this pull request Mar 11, 2024
cosmicBboy added a commit that referenced this pull request Mar 11, 2024
cosmicBboy pushed a commit that referenced this pull request Mar 11, 2024
cosmicBboy added a commit that referenced this pull request Mar 11, 2024
cosmicBboy pushed a commit that referenced this pull request Mar 15, 2024
cosmicBboy added a commit that referenced this pull request Mar 15, 2024
max-raphael pushed a commit to max-raphael/pandera that referenced this pull request Jan 24, 2025
max-raphael pushed a commit to max-raphael/pandera that referenced this pull request Jan 24, 2025
3 participants