-
Notifications
You must be signed in to change notification settings - Fork 0
Etsy Patina / MMX shading reverts and other fixes #544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ush/etsy_nobt_shading
WalkthroughThis PR updates dependency management across multiple modules. Shaded versions of Bigtable and SnakeYAML libraries are replaced with their standard Maven artifacts in both BUILD files and source code. Import statements in several Scala and Java files are updated accordingly. Additionally, a new Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GroupByFetcher
participant FetchContext
participant ResponseHandler
Client->>GroupByFetcher: Streaming request
GroupByFetcher->>FetchContext: isTilingEnabled?
alt Tiling enabled
GroupByFetcher->>ResponseHandler: Build tile key & process tiled data
else Tiling disabled
GroupByFetcher->>ResponseHandler: Process raw events
end
Possibly related PRs
Suggested reviewers
Poem
Warning Review ran into problems🔥 ProblemsGitHub Actions and Pipeline Checks: Resource not accessible by integration - https://docs.github.com/rest/actions/workflow-runs#list-workflow-runs-for-a-repository. Please grant the required permissions to the CodeRabbit GitHub App under the organization or repository settings. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
💤 Files with no reviewable changes (2)
⏰ Context from checks skipped due to timeout of 90000ms (17)
🔇 Additional comments (3)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cloud_gcp/BUILD.bazel (1)
89-128
: Consider cleanup of unused shaded library definitionsThe shaded library definitions remain in the file but are no longer referenced in shared_deps. Consider removing them if no longer needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (12)
cloud_gcp/BUILD.bazel
(2 hunks)cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/BigTableKVStoreImpl.scala
(1 hunks)cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/DataprocSubmitter.scala
(1 hunks)cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/GcpApiImpl.scala
(1 hunks)cloud_gcp/src/test/scala/ai/chronon/integrations/cloud_gcp/BigTableKVStoreTest.scala
(1 hunks)online/src/main/java/ai/chronon/online/ThriftDecoder.java
(1 hunks)online/src/main/scala/ai/chronon/online/fetcher/FetchContext.scala
(1 hunks)online/src/main/scala/ai/chronon/online/fetcher/GroupByFetcher.scala
(1 hunks)online/src/main/scala/ai/chronon/online/fetcher/GroupByResponseHandler.scala
(1 hunks)spark/BUILD.bazel
(1 hunks)spark/src/main/scala/ai/chronon/spark/Driver.scala
(1 hunks)tools/build_rules/cloud_gcp_embedded/BUILD
(0 hunks)
💤 Files with no reviewable changes (1)
- tools/build_rules/cloud_gcp_embedded/BUILD
🧰 Additional context used
🧬 Code Definitions (3)
online/src/main/scala/ai/chronon/online/fetcher/GroupByFetcher.scala (1)
online/src/main/scala/ai/chronon/online/fetcher/FetchContext.scala (1)
isTilingEnabled
(20-24)
cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/BigTableKVStoreImpl.scala (2)
api/src/main/scala/ai/chronon/api/Builders.scala (1)
Query
(41-66)online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1)
Range
(163-175)
online/src/main/scala/ai/chronon/online/fetcher/GroupByResponseHandler.scala (1)
online/src/main/scala/ai/chronon/online/fetcher/FetchContext.scala (1)
isTilingEnabled
(20-24)
⏰ Context from checks skipped due to timeout of 90000ms (16)
- GitHub Check: streaming_tests
- GitHub Check: streaming_tests
- GitHub Check: spark_tests
- GitHub Check: spark_tests
- GitHub Check: join_tests
- GitHub Check: non_spark_tests
- GitHub Check: join_tests
- GitHub Check: groupby_tests
- GitHub Check: groupby_tests
- GitHub Check: fetcher_tests
- GitHub Check: fetcher_tests
- GitHub Check: analyzer_tests
- GitHub Check: analyzer_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: bazel_config_tests
- GitHub Check: non_spark_tests
🔇 Additional comments (14)
online/src/main/java/ai/chronon/online/ThriftDecoder.java (1)
20-24
: Thrift imports updated.
Using vendored imports fromai.chronon.api.thrift
is consistent with dependency changes.online/src/main/scala/ai/chronon/online/fetcher/FetchContext.scala (1)
20-24
: LGTM! Good method implementation.The new
isTilingEnabled
method safely handles null flagStore and properly checks the TILING_ENABLED flag. This centralizes tiling feature flag logic.online/src/main/scala/ai/chronon/online/fetcher/GroupByFetcher.scala (1)
89-89
: LGTM! Consistent usage of new method.Changed to use centralized
fetchContext.isTilingEnabled
method instead of direct flag access.online/src/main/scala/ai/chronon/online/fetcher/GroupByResponseHandler.scala (1)
115-115
: LGTM! Good refactoring.Using the centralized
fetchContext.isTilingEnabled
method for consistency.cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/DataprocSubmitter.scala (1)
12-12
: Import updated to standard library version.Changed from shaded to standard SnakeYAML import, aligning with the PR's goal to revert shading.
spark/src/main/scala/ai/chronon/spark/Driver.scala (1)
55-55
: Import updated to standard library version.Switched from shaded to standard SnakeYAML import, maintaining consistency with other files.
cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/BigTableKVStoreImpl.scala (1)
25-34
: Updated Bigtable imports to standard library.All Bigtable-related imports now use standard version instead of shaded version, aligning with PR objective to revert BT jar shading.
cloud_gcp/src/main/scala/ai/chronon/integrations/cloud_gcp/GcpApiImpl.scala (2)
14-18
: Updated Bigtable imports to standard library.Imports now reference standard Bigtable library instead of shaded version.
26-35
: Flag store implementation enables tiling.This implementation ensures tiling is always enabled when the TILING_ENABLED flag is checked.
spark/BUILD.bazel (1)
36-36
: Dependency updated to standard library.Replaced shaded_snakeyaml dependency with standard Maven artifact, matching import changes.
cloud_gcp/BUILD.bazel (3)
16-16
: Standard Bigtable artifact replacing shaded versionDirect Maven artifact now used instead of the
:shaded_bigtable
dependency, matching the PR goal of moving away from shading logic for BT jars.
30-30
: Standard SnakeYAML artifact replacing shaded versionDirect Maven artifact now used instead of
:shaded_snakeyaml
, aligning with dependency updates.
33-33
: Added reload4j dependencyNew logging dependency added to shared deps.
cloud_gcp/src/test/scala/ai/chronon/integrations/cloud_gcp/BigTableKVStoreTest.scala (1)
13-19
: Updated import statements for standard Bigtable librariesImport paths updated from shaded to standard versions, consistent with BUILD file changes.
## Summary We decided to push the bulk of the shading logic into a script on the Etsy side. Shading our BT jars was painful as they were shading things again so we decided to roll that back. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by Etsy. Also made a couple of other updates: * Update the thrift decoder class to use our vendored thrift -> A couple of follow up AIs on the thrift front would be to: a) [add thrift version guards while publishing locally](https://linear.app/zipline-ai/issue/ZIP-651/add-thrift-version-guards-while-publishing-locally) b) revisit using shading instead of vendoring our thrift (more long term) * Fetcher changes to use flag store to detect tiling / not for now - I have a Flink PR upcoming that yanks out the untiled implementation and can put up a follow up to drop the flag store references to use untiled and default to tiled across the board. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Updated dependency imports to use the official versions of Bigtable, YAML, and Thrift libraries, ensuring improved stability and compatibility. - **New Features** - Introduced a tiling configuration check for streaming requests, providing more flexible data handling. - Added a method to check if tiling is enabled in the FetchContext class. - **Chores** - Streamlined build configurations by removing deprecated dependencies and leveraging direct Maven artifacts, enabling clients to manage specific dependency versions. - Introduced a new flag store for managing tiling functionality during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary We decided to push the bulk of the shading logic into a script on the Etsy side. Shading our BT jars was painful as they were shading things again so we decided to roll that back. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by Etsy. Also made a couple of other updates: * Update the thrift decoder class to use our vendored thrift -> A couple of follow up AIs on the thrift front would be to: a) [add thrift version guards while publishing locally](https://linear.app/zipline-ai/issue/ZIP-651/add-thrift-version-guards-while-publishing-locally) b) revisit using shading instead of vendoring our thrift (more long term) * Fetcher changes to use flag store to detect tiling / not for now - I have a Flink PR upcoming that yanks out the untiled implementation and can put up a follow up to drop the flag store references to use untiled and default to tiled across the board. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Updated dependency imports to use the official versions of Bigtable, YAML, and Thrift libraries, ensuring improved stability and compatibility. - **New Features** - Introduced a tiling configuration check for streaming requests, providing more flexible data handling. - Added a method to check if tiling is enabled in the FetchContext class. - **Chores** - Streamlined build configurations by removing deprecated dependencies and leveraging direct Maven artifacts, enabling clients to manage specific dependency versions. - Introduced a new flag store for managing tiling functionality during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ixes (#544) ## Summary We decided to push the bulk of the shading logic into a script on the our clients side. Shading our BT jars was painful as they were shading things again so we decided to roll that back. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by our clients. Also made a couple of other updates: * Update the thrift decoder class to use our vendored thrift -> A couple of follow up AIs on the thrift front would be to: a) [add thrift version guards while publishing locally](https://linear.app/zipline-ai/issue/ZIP-651/add-thrift-version-guards-while-publishing-locally) b) revisit using shading instead of vendoring our thrift (more long term) * Fetcher changes to use flag store to detect tiling / not for now - I have a Flink PR upcoming that yanks out the untiled implementation and can put up a follow up to drop the flag store references to use untiled and default to tiled across the board. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Updated dependency imports to use the official versions of Bigtable, YAML, and Thrift libraries, ensuring improved stability and compatibility. - **New Features** - Introduced a tiling configuration check for streaming requests, providing more flexible data handling. - Added a method to check if tiling is enabled in the FetchContext class. - **Chores** - Streamlined build configurations by removing deprecated dependencies and leveraging direct Maven artifacts, enabling clients to manage specific dependency versions. - Introduced a new flag store for managing tiling functionality during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ixes (#544) ## Summary We decided to push the bulk of the shading logic into a script on the our clients side. Shading our BT jars was painful as they were shading things again so we decided to roll that back. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by our clients. Also made a couple of other updates: * Update the thrift decoder class to use our vendored thrift -> A couple of follow up AIs on the thrift front would be to: a) [add thrift version guards while publishing locally](https://linear.app/zipline-ai/issue/ZIP-651/add-thrift-version-guards-while-publishing-locally) b) revisit using shading instead of vendoring our thrift (more long term) * Fetcher changes to use flag store to detect tiling / not for now - I have a Flink PR upcoming that yanks out the untiled implementation and can put up a follow up to drop the flag store references to use untiled and default to tiled across the board. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Updated dependency imports to use the official versions of Bigtable, YAML, and Thrift libraries, ensuring improved stability and compatibility. - **New Features** - Introduced a tiling configuration check for streaming requests, providing more flexible data handling. - Added a method to check if tiling is enabled in the FetchContext class. - **Chores** - Streamlined build configurations by removing deprecated dependencies and leveraging direct Maven artifacts, enabling clients to manage specific dependency versions. - Introduced a new flag store for managing tiling functionality during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ixes (#544) ## Summary We decided to push the bulk of the shading logic into a script on the our clients side. Shading our BT jars was painful as they were shading things again so we decided to roll that baour clients. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by our clients. Also made a couple of other updates: * Update the thrift decoder class to use our vendored thrift -> A couple of follow up AIs on the thrift front would be to: a) [add thrift version guards while publishing locally](https://linear.app/zipline-ai/issue/ZIP-651/add-thrift-version-guards-while-publishing-locally) b) revisit using shading instead of vendoring our thrift (more long term) * Fetcher changes to use flag store to detect tiling / not for now - I have a Flink PR upcoming that yanks out the untiled implementation and can put up a follow up to drop the flag store references to use untiled and default to tiled across the board. ## Cheour clientslist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Updated dependency imports to use the official versions of Bigtable, YAML, and Thrift libraries, ensuring improved stability and compatibility. - **New Features** - Introduced a tiling configuration cheour clients for streaming requests, providing more flexible data handling. - Added a method to cheour clients if tiling is enabled in the FetchContext class. - **Chores** - Streamlined build configurations by removing deprecated dependencies and leveraging direct Maven artifacts, enabling clients to manage specific dependency versions. - Introduced a new flag store for managing tiling functionality during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
We decided to push the bulk of the shading logic into a script on the Etsy side. Shading our BT jars was painful as they were shading things again so we decided to roll that back. We still have the two forked bazel targets (cloud_gcp_lib and cloud_gcp_embedded_lib) with the embedded one being used by Etsy.
Also made a couple of other updates:
Checklist
Summary by CodeRabbit