Best way to check whether at least one of several columns is present #767
Unanswered
adityaguru149 asked this question in Q&A
Hi @adityaguru149, good question!
You can mark the interchangeable columns as Optional and add a dataframe-level check:

    from typing import Optional

    import pandas as pd
    import pandera as pa


    class DFSchema(pa.SchemaModel):
        id1: pa.typing.Series[str]
        id2: Optional[pa.typing.Series[str]]
        id3: Optional[pa.typing.Series[str]]
        data: pa.typing.Series[float]

        # private attributes can contain arbitrary metadata;
        # at least one of these optional columns must be present
        _column_options = {"id2", "id3"}

        @pa.dataframe_check(
            # error keyword arg gives you custom error messages
            error=f"does not contain at least one of {_column_options}"
        )
        def atleast_one_from_column_options_present_check(cls, df: pd.DataFrame) -> bool:
            columns_found = cls._column_options.intersection(df.columns)
            return len(columns_found) > 0

If none of these columns is present, validation fails and the error summary shows the custom message.
Question about pandera
Use case: the user provides a group_by column, and the code needs to group by that column (at least one column is user-supplied; the rest can be considered fixed) and then aggregate on another column.
Example: group by id1 and either id2 or id3, then aggregate on data.
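To make the use case concrete, here is a minimal pandas-only sketch. The function name `aggregate_by`, its parameter names, and the choice of `sum` as the aggregation are assumptions for illustration; the question does not say which aggregation is used.

```python
import pandas as pd


def aggregate_by(df, user_column, fixed_column="id1", value_column="data",
                 allowed=("id2", "id3")):
    """Group by the fixed column plus one user-supplied column, then sum.

    Hypothetical helper: names and the sum aggregation are illustrative only.
    """
    if user_column not in allowed:
        raise ValueError(f"{user_column!r} is not one of {list(allowed)}")
    if user_column not in df.columns:
        raise ValueError(f"column {user_column!r} is missing from the DataFrame")
    # as_index=False keeps the grouping keys as ordinary columns
    return df.groupby([fixed_column, user_column], as_index=False)[value_column].sum()


df = pd.DataFrame(
    {
        "id1": ["a", "a", "b"],
        "id3": ["x", "x", "y"],
        "data": [1.0, 2.0, 3.0],
    }
)
result = aggregate_by(df, "id3")
print(result)
```

This only fails when the groupby is attempted; the point of the question is to catch the missing column earlier, at schema validation time.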
This is very similar to this issue in pydantic.
At present, I have coded it as follows (def atleast_one_from_column_options_present_check).
Is there a better method, such as pandera checks or decorators?
How do I show the column_options (none of which are present) in the error message?
Can this be taken up as a feature request, adding it as a generic decorator function that can be used on schemas or schema models?