[SPARK-52751][PYTHON][CONNECT] Don't eagerly validate column name in dataframe['col_name'] #51400

Open: 2 commits to merge into base branch master

@zhengruifeng (Contributor) commented Jul 8, 2025

What changes were proposed in this pull request?

Don't eagerly validate column name in dataframe['col_name']

Why are the changes needed?

To save an ANALYZE RPC. An invalid column name now fails the query on the Connect server side instead of being checked up front on the client.

Does this PR introduce any user-facing change?

Yes. df['bad_column'] is no longer rejected eagerly; it will fail at analysis or execution.
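
The difference between eager and deferred validation can be sketched with a toy model. This is only an illustration under stated assumptions, not PySpark's actual implementation: `ToyServer`, `ToyDataFrame`, and the `eager` parameter are hypothetical names, and the counter stands in for ANALYZE round trips.

```python
class ToyServer:
    """Stands in for a Connect server; counts analyze round trips."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.analyze_calls = 0

    def analyze(self):
        # Each call models one ANALYZE RPC that returns the schema.
        self.analyze_calls += 1
        return self.columns

    def execute(self, selected):
        # The server resolves column names itself when the plan runs.
        missing = [c for c in selected if c not in self.columns]
        if missing:
            raise ValueError(f"cannot resolve column(s): {missing}")
        return selected


class ToyDataFrame:
    """Hypothetical client-side frame; `eager` toggles up-front validation."""

    def __init__(self, server, eager):
        self.server = server
        self.eager = eager

    def __getitem__(self, name):
        if self.eager:
            # Eager path: validate on the client, costing one analyze call.
            if name not in self.server.analyze():
                raise ValueError(f"cannot resolve column(s): {[name]}")
        # Deferred path: hand back the unresolved reference without any call.
        return name

    def select(self, *names):
        # Any bad name surfaces here, when the server processes the plan.
        return self.server.execute(list(names))
```

In this sketch, `ToyDataFrame(server, eager=False)["bad_column"]` returns without contacting the server, and the error only appears once `select` sends the plan for execution.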

How was this patch tested?

Updated tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@zhengruifeng zhengruifeng force-pushed the test_fail_col branch 2 times, most recently from d7c67da to 5405e02 Compare July 10, 2025 05:35
@zhengruifeng zhengruifeng changed the title [WIP] Delay column name validation [SPARK-52751][PYTHON][CONNECT] Don't eagerly validate column name in dataframe['col_name'] Jul 10, 2025
@zhengruifeng zhengruifeng marked this pull request as ready for review July 10, 2025 10:18
@xinrong-meng (Member) commented

I'm wondering whether this introduces a user-facing change in the error type or message?

@zhengruifeng (Contributor, Author) commented

> I'm wondering if there is a user-facing error type/message change introduced?

It is a kind of behavior change; let me add a flag.
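
Gating the behavior behind a flag might look roughly like the sketch below. The flag name, its default, and the `get_column` helper are all hypothetical: the real configuration key and default are not stated in this thread.

```python
# Hypothetical flag name, for illustration only; not the real config key.
EAGER_FLAG = "example.eagerColumnValidation"


def get_column(name, conf, fetch_schema):
    """Return a column reference, validating it first only if the flag is on.

    `conf` is a plain dict of string settings and `fetch_schema` is a callable
    that models the extra round trip to the server.
    """
    # Assumed default of "false" (deferred validation); the actual default
    # chosen in the PR is not given here.
    if conf.get(EAGER_FLAG, "false") == "true":
        if name not in fetch_schema():
            raise KeyError(name)
    return name
```

With the flag unset, `get_column("bad_column", {}, fetch_schema)` returns without calling `fetch_schema`; with it set to `"true"`, the same call raises immediately.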

test

test

fix test

fix
add flag

add flag

lint

lint