feat(pyspark): support `partition_by` key for PySpark file writes #8900
My team came across this and needs to use `partition_by`. Is this already being worked on? If not, we'd love to open a PR and give it a shot.

It is not being worked on; please give it a go!
Adds the `partitionBy` argument to the `create_table` method in the pyspark backend to enable partitioned table creation. Fixes ibis-project#8900

Threw up this PR (#10850) that adds … Looks like for the PySpark backend, partitioning is available for …

Override the `to_parquet` method in the pyspark backend to leverage `pyspark.sql.DataFrameWriter` to enable partitioning and other kwargs. Fixes ibis-project#8900
Is your feature request related to a problem?
I can't specify a partition key.
Describe the solution you'd like
I'd like to be able to specify a partition key. Basically, PySpark takes a `partitionBy` key. Need to figure out whether we just want to alias `partition_by` and pass it to the write method as `partitionBy`. Need to also verify that the read path works; the current test works using a `/*/*` wildcard pattern, and PySpark may require resolving that to a set of paths to read.

What version of ibis are you running?
8
What backend(s) are you using, if any?
PySpark
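The aliasing idea floated in the issue body (pass `partition_by` through to the write method as `partitionBy`) can be sketched as a small kwarg-translation step. The alias table below is an assumption for illustration, not an existing ibis mapping:

```python
# Sketch: alias ibis-style snake_case write options to PySpark's camelCase
# DataFrameWriter.parquet parameters. The alias table is illustrative only.
_WRITE_KWARG_ALIASES = {"partition_by": "partitionBy"}

def translate_write_kwargs(kwargs):
    """Rename any aliased keys, leaving everything else untouched."""
    return {_WRITE_KWARG_ALIASES.get(k, k): v for k, v in kwargs.items()}
```

With this in place, a call could forward options as `df.write.parquet(path, **translate_write_kwargs(opts))`, since `DataFrameWriter.parquet` accepts `partitionBy` and `mode` keywords directly.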