-
Notifications
You must be signed in to change notification settings - Fork 0
Bazel migration for cloud_gcp module #240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Caution Review failedThe pull request is closed. WalkthroughThis pull request introduces comprehensive updates to build configurations across multiple projects, focusing on Scala library definitions, dependency management, and testing infrastructure. The changes span several Bazel build files, Maven repository configurations, and test files, with a primary emphasis on standardizing dependency declarations, updating library versions, and enhancing test suite configurations. Changes
Sequence DiagramsequenceDiagram
participant Build Config
participant Maven Repo
participant Scala Libraries
participant Test Suites
Build Config->>Maven Repo: Update artifact versions
Maven Repo->>Scala Libraries: Provide new dependencies
Scala Libraries->>Test Suites: Configure test dependencies
Test Suites->>Build Config: Validate configuration
Possibly related PRs
Suggested Reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (9)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
cloud_gcp/BUILD.bazel (3)
1-37
: Clean up logging dependencies and add version constraints.
- Remove redundant logging deps - keep either reload4j or log4j-slf4j-impl
- Add version constraints for critical deps
39-79
: Reduce dependency duplication.
- Remove deps already included in ":lib"
- Keep only test-specific dependencies here
81-99
: Document the need for JVM flags.Add comments explaining why each module needs to be opened.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (5)
WORKSPACE
(1 hunks)cloud_gcp/BUILD.bazel
(1 hunks)spark/BUILD.bazel
(2 hunks)tools/build_rules/dependencies/maven_repository.bzl
(2 hunks)tools/build_rules/dependencies/spark_repository.bzl
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: fetcher_spark_tests
- GitHub Check: join_spark_tests
- GitHub Check: mutation_spark_tests
- GitHub Check: table_utils_delta_format_spark_tests
- GitHub Check: other_spark_tests
🔇 Additional comments (6)
tools/build_rules/dependencies/spark_repository.bzl (1)
16-17
: LGTM! Verify Apache Curator compatibility.Apache Curator 5.5.0 is a solid choice for distributed coordination.
WORKSPACE (1)
35-37
:⚠️ Potential issueReconsider downgrading rules_jvm_external.
Downgrading from 6.6 to 4.5 could introduce compatibility issues with newer dependencies.
✅ Verification successful
Version downgrade appears safe
The codebase only uses core rules_jvm_external features that are stable across versions. No version 6.6-specific functionality detected.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for any version-specific rules_jvm_external features in use rg "rules_jvm_external.*6\.6"Length of output: 31
Script:
#!/bin/bash # Check maven_install usage rg "maven_install\(" -A 5 # Check for rules_jvm_external specific rules and configs rg "load\(.*rules_jvm_external" -A 2Length of output: 3385
tools/build_rules/dependencies/maven_repository.bzl (1)
21-21
: LGTM! Mockito version update.Update to mockito-core 5.12.0 aligns with modern testing practices.
spark/BUILD.bazel (2)
4-4
: Verify public visibility requirement.Consider using more restrictive visibility if possible.
✅ Verification successful
Public visibility is appropriate
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for direct dependencies on spark/lib rg "//spark:lib" --type bazelLength of output: 164
24-25
: Verify logging configuration.Log4j API and Core added, but log4j2.properties removed. Ensure logging is configured programmatically.
✅ Verification successful
Logging is properly configured programmatically
Complete Log4j2 configuration found in
SparkSessionBuilder.configureLogging()
with console appender and pattern layout.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Look for programmatic log4j configuration rg "LogManager\.getContext|ConfigurationBuilder" --type scalaLength of output: 347
Script:
#!/bin/bash # Get full context of logging configuration rg -B 5 -A 10 "ConfigurationBuilderFactory" --type scalaLength of output: 2750
Script:
#!/bin/bash # Find where configureLogging is called rg "configureLogging" --type scalaLength of output: 220
cloud_gcp/BUILD.bazel (1)
81-137
: Verify test failures are not Bazel-specific.PR mentions failing tests. Confirm failures occur in both sbt and Bazel builds.
# Google Cloud | ||
"com.google.cloud:google-cloud-bigquery:2.42.0", | ||
"com.google.cloud:google-cloud-bigtable:2.41.0", | ||
"com.google.cloud:google-cloud-pubsub:1.131.0", | ||
"com.google.cloud:google-cloud-dataproc:4.52.0", | ||
"com.google.cloud.bigdataoss:gcs-connector:hadoop3-2.2.6", | ||
"com.google.cloud.bigdataoss:gcsio:3.0.3", | ||
"com.google.cloud.bigdataoss:util-hadoop:3.0.0", | ||
"com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.41.0", | ||
"com.google.cloud:google-cloud-bigtable-emulator:0.178.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Update gcs-connector version in maven_repository.bzl
Version mismatch found: gcs-connector is defined as hadoop3-2.2.6 in maven_repository.bzl but hadoop3-2.2.26 is used in build.sbt. Update maven_repository.bzl to use version hadoop3-2.2.26 for consistency.
🔗 Analysis chain
Verify Google Cloud dependencies compatibility.
Comprehensive set of Google Cloud dependencies added. Ensure version compatibility across the stack.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for existing Google Cloud API usage
rg "com\.google\.cloud\." --type scala
Length of output: 7932
Script:
#!/bin/bash
# Check build.sbt for version declarations
cat build.sbt
Length of output: 21640
cloud_gcp/BUILD.bazel
Outdated
deps = [ | ||
":lib", | ||
":test-lib", | ||
"//api:thrift", | ||
"//api:lib", | ||
"//online:lib", | ||
"//spark:lib", | ||
"//tools/build_rules/spark:spark-exec", | ||
# Libraries | ||
scala_artifact("org.scala-lang.modules:scala-java8-compat"), | ||
scala_artifact("org.scala-lang.modules:scala-collection-compat"), | ||
maven_artifact("com.google.cloud:google-cloud-bigtable"), | ||
maven_artifact("com.google.cloud:google-cloud-dataproc"), | ||
maven_artifact("com.google.cloud.bigdataoss:gcs-connector"), | ||
maven_artifact("com.google.cloud.bigdataoss:gcsio"), | ||
maven_artifact("com.google.cloud.bigdataoss:util-hadoop"), | ||
maven_artifact("com.google.cloud:google-cloud-bigtable-emulator"), | ||
maven_artifact("com.google.api:api-common"), | ||
maven_artifact("com.google.api.grpc:proto-google-cloud-dataproc-v1"), | ||
scala_artifact("com.google.cloud.spark:spark-bigquery-with-dependencies"), | ||
maven_artifact("com.google.api:gax"), | ||
maven_artifact("com.google.protobuf:protobuf-java"), | ||
# Testing | ||
scala_artifact("org.scalatest:scalatest-matchers-core"), | ||
scala_artifact("org.scalatest:scalatest-core"), | ||
scala_artifact("org.scalatest:scalatest"), | ||
scala_artifact("org.scalatest:scalatest-flatspec"), | ||
scala_artifact("org.scalatest:scalatest-funsuite"), | ||
scala_artifact("org.scalatest:scalatest-shouldmatchers"), | ||
scala_artifact("org.scalactic:scalactic"), | ||
scala_artifact("org.scalatestplus:mockito-3-4"), | ||
scala_artifact("org.mockito:mockito-scala"), | ||
maven_artifact("org.mockito:mockito-core"), | ||
maven_artifact("org.scalatest:scalatest-compatible"), | ||
maven_artifact("junit:junit"), | ||
maven_artifact("com.novocode:junit-interface"), | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove duplicated dependencies.
Dependencies already exist in test-lib. Just depend on ":test-lib".
deps = [
":lib",
":test-lib",
- "//api:thrift",
- "//api:lib",
- # ... remove all other deps
],
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
deps = [ | |
":lib", | |
":test-lib", | |
"//api:thrift", | |
"//api:lib", | |
"//online:lib", | |
"//spark:lib", | |
"//tools/build_rules/spark:spark-exec", | |
# Libraries | |
scala_artifact("org.scala-lang.modules:scala-java8-compat"), | |
scala_artifact("org.scala-lang.modules:scala-collection-compat"), | |
maven_artifact("com.google.cloud:google-cloud-bigtable"), | |
maven_artifact("com.google.cloud:google-cloud-dataproc"), | |
maven_artifact("com.google.cloud.bigdataoss:gcs-connector"), | |
maven_artifact("com.google.cloud.bigdataoss:gcsio"), | |
maven_artifact("com.google.cloud.bigdataoss:util-hadoop"), | |
maven_artifact("com.google.cloud:google-cloud-bigtable-emulator"), | |
maven_artifact("com.google.api:api-common"), | |
maven_artifact("com.google.api.grpc:proto-google-cloud-dataproc-v1"), | |
scala_artifact("com.google.cloud.spark:spark-bigquery-with-dependencies"), | |
maven_artifact("com.google.api:gax"), | |
maven_artifact("com.google.protobuf:protobuf-java"), | |
# Testing | |
scala_artifact("org.scalatest:scalatest-matchers-core"), | |
scala_artifact("org.scalatest:scalatest-core"), | |
scala_artifact("org.scalatest:scalatest"), | |
scala_artifact("org.scalatest:scalatest-flatspec"), | |
scala_artifact("org.scalatest:scalatest-funsuite"), | |
scala_artifact("org.scalatest:scalatest-shouldmatchers"), | |
scala_artifact("org.scalactic:scalactic"), | |
scala_artifact("org.scalatestplus:mockito-3-4"), | |
scala_artifact("org.mockito:mockito-scala"), | |
maven_artifact("org.mockito:mockito-core"), | |
maven_artifact("org.scalatest:scalatest-compatible"), | |
maven_artifact("junit:junit"), | |
maven_artifact("com.novocode:junit-interface"), | |
], | |
deps = [ | |
":lib", | |
":test-lib", | |
], |
WORKSPACE
Outdated
strip_prefix = "rules_jvm_external-6.6", | ||
sha256 = "3afe5195069bd379373528899c03a3072f568d33bd96fe037bd43b1f590535e7", | ||
url = "https://github.com/bazel-contrib/rules_jvm_external/releases/download/6.6/rules_jvm_external-6.6.tar.gz" | ||
strip_prefix = "rules_jvm_external-4.5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why downgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was originally downgrading so "group:artifact:version" specification works for dependency versions not starting with number but looks like in the new 6.6 package we need to specify using "group:artifact:packaging:version" format. Added a comment with code reference and we no longer need this change.
cloud_gcp/BUILD.bazel
Outdated
"//spark:lib", | ||
"//tools/build_rules/spark:spark-exec", | ||
# Libraries | ||
scala_artifact("org.scala-lang.modules:scala-java8-compat"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out shared deps into a common
array and re-use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactoring nit - but lgtm
## Summary Migrated `cloud_gcp` module to Bazel from sbt. Below are still pending and will fix them in a separate PR * Few tests are failing, this is actually a general problem across the project and working on it in parallel * cloud_gcp_submitter target logic also need to be migrated ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependency Updates** - Downgraded `rules_jvm_external` from version 6.6 to 4.5 - Updated Mockito version to 5.12.0 - Added new Google Cloud service libraries (BigQuery, Bigtable, PubSub, Dataproc) - Added Apache Curator artifact - **Build Configuration** - Updated library visibility settings - Modified logging dependencies in Spark library - Enhanced Maven repository with additional artifacts - **Development Improvements** - Expanded testing and dependency management capabilities <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Migrated `cloud_gcp` module to Bazel from sbt. Below are still pending and will fix them in a separate PR * Few tests are failing, this is actually a general problem across the project and working on it in parallel * cloud_gcp_submitter target logic also need to be migrated ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependency Updates** - Downgraded `rules_jvm_external` from version 6.6 to 4.5 - Updated Mockito version to 5.12.0 - Added new Google Cloud service libraries (BigQuery, Bigtable, PubSub, Dataproc) - Added Apache Curator artifact - **Build Configuration** - Updated library visibility settings - Modified logging dependencies in Spark library - Enhanced Maven repository with additional artifacts - **Development Improvements** - Expanded testing and dependency management capabilities <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Migrated `cloud_gcp` module to Bazel from sbt. Below are still pending and will fix them in a separate PR * Few tests are failing, this is actually a general problem across the project and working on it in parallel * cloud_gcp_submitter target logic also need to be migrated ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependency Updates** - Downgraded `rules_jvm_external` from version 6.6 to 4.5 - Updated Mockito version to 5.12.0 - Added new Google Cloud service libraries (BigQuery, Bigtable, PubSub, Dataproc) - Added Apache Curator artifact - **Build Configuration** - Updated library visibility settings - Modified logging dependencies in Spark library - Enhanced Maven repository with additional artifacts - **Development Improvements** - Expanded testing and dependency management capabilities <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Migrated `cloud_gcp` module to Bazel from sbt. Below are still pending and will fix them in a separate PR * Few tests are failing, this is actually a general problem across the project and working on it in parallel * cloud_gcp_submitter target logic also need to be migrated ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependency Updates** - Downgraded `rules_jvm_external` from version 6.6 to 4.5 - Updated Mockito version to 5.12.0 - Added new Google Cloud service libraries (BigQuery, Bigtable, PubSub, Dataproc) - Added Apache Curator artifact - **Build Configuration** - Updated library visibility settings - Modified logging dependencies in Spark library - Enhanced Maven repository with additional artifacts - **Development Improvements** - Expanded testing and dependency management capabilities <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Migrated `cloud_gcp` module to Bazel from sbt. Below are still pending and will fix them in a separate PR * Few tests are failing, this is actually a general problem across the project and working on it in parallel * cloud_gcp_submitter target logic also need to be migrated ## Cheour clientslist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependency Updates** - Downgraded `rules_jvm_external` from version 6.6 to 4.5 - Updated Moour clientsito version to 5.12.0 - Added new Google Cloud service libraries (BigQuery, Bigtable, PubSub, Dataproc) - Added Apache Curator artifact - **Build Configuration** - Updated library visibility settings - Modified logging dependencies in Spark library - Enhanced Maven repository with additional artifacts - **Development Improvements** - Expanded testing and dependency management capabilities <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Migrated
cloud_gcp
module to Bazel from sbt. Below are still pending and will fix them in a separate PRChecklist
Summary by CodeRabbit
Release Notes
New Dependencies
Build Configuration
Testing Improvements