feat: add col to partition-spec #731
Walkthrough
This change refactors how partition columns are handled across the codebase. Partition column names are no longer passed explicitly as parameters to methods and constructors. Instead, the partition column is now encapsulated within PartitionSpec, and consumers read it from there.
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant PartitionSpec
participant DataRange/Column/OtherConsumers
Caller->>PartitionSpec: Construct with (column, format, spanMillis)
Caller->>DataRange/Column/OtherConsumers: Call methods (no explicit partitionColumn)
DataRange/Column/OtherConsumers->>PartitionSpec: Access .column as needed
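The flow in the diagram can be sketched in Scala roughly as follows. This is a minimal illustration, not the PR's actual code: the field names follow the diagram's `(column, format, spanMillis)`, while `DateRange`, `whereClauses`, `PartitionSpecSketch`, and the `ds` / `yyyy-MM-dd` values are hypothetical names and values chosen for the example.

```scala
// Hypothetical sketch: field names follow the sequence diagram, not the real class.
final case class PartitionSpec(column: String, format: String, spanMillis: Long)

// Hypothetical consumer. Before the refactor, a method like whereClauses would
// have taken the partition column as an extra argument; here it reads spec.column.
final case class DateRange(start: String, end: String, spec: PartitionSpec) {
  // Builds SQL filter clauses using the column carried by the spec.
  def whereClauses: Seq[String] = Seq(
    s"${spec.column} >= '$start'",
    s"${spec.column} <= '$end'"
  )
}

object PartitionSpecSketch {
  // Assumed example values: a daily partition on a column named "ds".
  val dailySpec: PartitionSpec =
    PartitionSpec("ds", "yyyy-MM-dd", 24L * 60 * 60 * 1000)
}
```

With this shape, callers construct the spec once and pass it along; no call site needs to know the partition column name separately.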
@@ -1133,27 +1132,11 @@ object Extensions {
result
}

// mutationsOnSnapshot table appends default values for mutation_ts and is_before column on the snapshotTable
unused
thx for doing this!
groupByServingInfo.setBatchEndDate(nextDay)
groupByServingInfo.setGroupBy(groupByConf)
groupByServingInfo.setKeyAvroSchema(groupBy.keySchema.toAvroSchema("Key").toString(true))
groupByServingInfo.setSelectedAvroSchema(groupBy.preAggSchema.toAvroSchema("Value").toString(true))
groupByServingInfo.setDateFormat(tableUtils.partitionFormat)
shouldn't this come from tableUtils.partitionSpec?
it's the same thing essentially
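One way the two values could coincide is if the spec is built from the same format string that `partitionFormat` exposes. The sketch below only illustrates that assumption; `TableUtilsSketch` is a hypothetical stand-in and does not reflect how `tableUtils` is actually implemented.

```scala
// Hypothetical sketch: field names follow the PR's sequence diagram.
final case class PartitionSpec(column: String, format: String, spanMillis: Long)

// Hypothetical stand-in for tableUtils. Assumes the spec is derived from the
// same column and format values that the class exposes directly.
final class TableUtilsSketch(val partitionColumn: String, val partitionFormat: String) {
  // Assumed daily span; the real span may be configured differently.
  val partitionSpec: PartitionSpec =
    PartitionSpec(partitionColumn, partitionFormat, 24L * 60 * 60 * 1000)
}
```

Under that assumption, `partitionFormat` and `partitionSpec.format` return the same string, so either could feed `setDateFormat`; reading from the spec would just keep a single source of truth.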
## Summary

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

## Summary by CodeRabbit

- **New Features**
  - Added optional fields for partition format and partition interval to query definitions, allowing greater flexibility in specifying partitioning behavior.
- **Refactor**
  - Simplified partition specification usage across the platform by consolidating partition column, format, and interval into a single object.
  - Updated multiple interfaces and methods to derive partition column and related metadata from the unified partition specification, reducing explicit parameter passing.
  - Streamlined class and method signatures to improve consistency and maintainability.
  - Removed deprecated partition specs and adjusted related logic to use the updated partition specification format.
  - Enhanced SQL clause generation to internally use partition specification details, removing the need to pass partition column explicitly.
  - Adjusted data generation and query construction logic to rely on the updated partition specification model.
  - Simplified construction and usage of partition specifications in data processing and metadata components.
  - Improved handling of partition specs in Spark-related utilities and jobs for consistency.
- **Chores**
  - Updated tests and internal utilities to align with the new partition specification structure.
  - Reduced test data volume in join tests to optimize test runtime and resource usage.

---------

Co-authored-by: Thomas Chow <[email protected]>