Skip to content

Bazel migration for api module #183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jan 8, 2025
Merged

Bazel migration for api module #183

merged 10 commits into from
Jan 8, 2025

Conversation

kumar-zlai
Copy link
Contributor

@kumar-zlai kumar-zlai commented Jan 8, 2025

Summary

Migrated api module to Bazel from sbt. Started with api module as it doesn't have any dependency on other modules. will work on rest of the modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they are no longer needed and we only support scala 2.12 version.

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

Release Notes

  • Build System

    • Upgraded to Bazel 6.4.0
    • Introduced comprehensive build rules for Maven and Spark dependencies
    • Added utility functions for artifact and repository management
  • Scala Conversions

    • Replaced standard library Scala Java conversions with custom ScalaJavaConversions
    • Enhanced collection conversion methods with improved null handling
    • Standardized conversion imports across project modules
  • Dependency Management

    • Added support for Scala 2.12.18
    • Configured repositories for Maven, Spark, and other core libraries
    • Implemented flexible artifact and JAR library management
  • Thrift Support

    • Added custom Thrift Java file generation
    • Implemented package name replacement for generated files
    • Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and code consistency across the Chronon project.

Copy link
Contributor

coderabbitai bot commented Jan 8, 2025

Walkthrough

This pull request introduces significant updates to the Chronon project's build system and dependency management. Key changes include the addition of a specific Bazel version in the .bazeliskrc file, updates to the .gitignore file to manage build artifacts, and the creation of a new WORKSPACE file that specifies dependencies for Scala, Java, JVM, and Protobuf. Additionally, several utility functions for handling Maven artifacts and dependencies have been introduced across various build rule files.

Changes

File/Group Change Summary
.bazeliskrc Added Bazel version 6.4.0 configuration
.gitignore Added /bazel-* entry, restored /project/**/metals.sbt.lock
WORKSPACE New workspace declaration, added dependencies for Scala, Java, JVM, and Protobuf
ScalaJavaConversions.scala Refactored collection conversion methods, introduced toJava and toScala
tools/build_rules/common.bzl Added functions for managing JAR dependencies
tools/build_rules/dependencies/all_repositories.bzl Introduced repository management functions
tools/build_rules/dependencies/defs.bzl Enhanced handling of versioned artifacts and repository definitions
tools/build_rules/dependencies/load_dependencies.bzl Added function to load Maven dependencies
tools/build_rules/dependencies/maven_repository.bzl Defined a new Maven repository for managing artifacts
tools/build_rules/dependencies/spark_repository.bzl Introduced a new repository configuration for Spark dependencies
tools/build_rules/jar_library.bzl Added function for creating Java libraries with dependencies
tools/build_rules/maven_artifact.bzl Defined functions for managing Maven artifacts
tools/build_rules/prelude_bazel Introduced default rules and functions for BUILD files
tools/build_rules/spark/BUILD Created a new BUILD file for Spark-related dependencies
tools/build_rules/thrift/thrift.bzl Added functionality for generating Java files from Thrift definitions
tools/build_rules/utils.bzl Introduced several utility functions for list management

Possibly related PRs

Suggested Reviewers

  • nikhil-zlai
  • piyush-zlai
  • varant-zlai
  • chewy-zlai

Poem

🏗️ Bazel's dance of code and might,
Dependencies aligned just right,
Conversions smooth as silk above,
A build system crafted with love!
🚀 Chronon rises, build complete!

Warning

Review ran into problems

🔥 Problems

GitHub Actions: Resource not accessible by integration - https://docs.github.com/rest/actions/workflow-runs#list-workflow-runs-for-a-repository.

Please grant the required permissions to the CodeRabbit GitHub App under the organization or repository settings.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 997ec6e and b55e922.

📒 Files selected for processing (25)
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/DataStreamBuilder.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/GroupByTest.scala (5 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/LocalDataLoaderTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/LocalTableExporterTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/MigrationCompareTest.scala (3 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/MockKVStore.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/StagingQueryTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala (4 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala (6 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/udafs/ApproxDistinctTest.scala (3 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/udafs/HistogramTest.scala (2 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/udafs/NullnessCountersAggregatorTest.scala (3 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/udafs/UDAFSQLUsageTest.scala (2 hunks)
✅ Files skipped from review due to trivial changes (10)
  • spark/src/test/scala/ai/chronon/spark/test/LocalDataLoaderTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/StagingQueryTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/udafs/UDAFSQLUsageTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/udafs/ApproxDistinctTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/udafs/HistogramTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/udafs/NullnessCountersAggregatorTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/MigrationCompareTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/LocalTableExporterTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/MockKVStore.scala
  • spark/src/test/scala/ai/chronon/spark/test/GroupByTest.scala
🚧 Files skipped from review as they are similar to previous changes (15)
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala
  • online/src/main/scala/ai/chronon/online/DataStreamBuilder.scala
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala
  • spark/src/main/scala/ai/chronon/spark/Join.scala
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: join_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: fetcher_spark_tests

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Nitpick comments (13)
tools/build_rules/utils.bzl (1)

4-5: Use isinstance() pattern instead of type()

Replace with return isinstance(x, type([])) for better type checking.

online/src/main/java/ai/chronon/online/JavaResponse.java (1)

38-42: Consider moving null check to ScalaJavaConversions

Would improve reusability across all conversion calls.

api/src/main/scala/ai/chronon/api/ScalaJavaConversions.scala (1)

8-13: Consider using Option instead of null

While the null checks are good, using Option would be more idiomatic in Scala.

Example for map conversion:

-  def toJava[K, V](map: Map[K, V]): java.util.Map[K, V] = {
-    if (map == null) {
-      null
-    } else {
-      map.asJava
-    }
+  def toJava[K, V](map: Option[Map[K, V]]): Option[java.util.Map[K, V]] = {
+    map.map(_.asJava)

Also applies to: 16-21, 24-29, 32-37

online/src/main/java/ai/chronon/online/JavaExternalSourceHandler.java (1)

41-41: Remove TODO comment since the migration is complete.

The TODO about deprecating ScalaVersionSpecificCollectionsConverter is now obsolete.

tools/build_rules/thrift/thrift.bzl (1)

63-67: Use Bazel's Java rules for jar creation

Prefer java_common APIs over shell commands for portability.

tools/build_rules/jar_library.bzl (1)

4-4: Fix typo in comment

"compatability" → "compatibility"

Apply this diff:

-DEFAULT_PROVIDED_REPO = "maven"  # For backwards compatability
+DEFAULT_PROVIDED_REPO = "maven"  # For backwards compatibility
tools/build_rules/common.bzl (1)

4-12: LGTM! But consider adding docstring.

The jar() function correctly handles JAR dependency strings.

Add docstring explaining parameters and return value.

tools/build_rules/dependencies/load_dependencies.bzl (1)

6-8: Add fallback Maven repositories.

Single repository URL is risky.

Add mirrors and enterprise repositories if available.

tools/build_rules/maven_artifact.bzl (1)

5-6: Consider enhancing character replacement.

Add handling for additional Maven coordinate characters like @ and $.

-    return coord.replace(":", "_").replace(".", "_").replace("-", "_")
+    return coord.replace(":", "_").replace(".", "_").replace("-", "_").replace("@", "_").replace("$", "_")
api/BUILD.bazel (3)

4-20: Add version constraints for critical dependencies.

Consider pinning versions for jackson and slf4j dependencies.


22-41: Add test coverage configuration.

Consider adding Jacoco coverage reporting.


48-59: Consider separating generated and hand-written sources.

Split into separate targets for better maintainability.

WORKSPACE (1)

18-30: Consider upgrading Java version.

Java 8 is aging. Consider moving to a newer LTS version.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between db26196 and a20e103.

📒 Files selected for processing (77)
  • .bazeliskrc (1 hunks)
  • .gitignore (1 hunks)
  • WORKSPACE (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/row/MapColumnAggregator.scala (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/row/StatsGenerator.scala (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/windowing/Resolution.scala (1 hunks)
  • api/BUILD.bazel (1 hunks)
  • api/src/main/scala-2.11/scala/util/ScalaVersionSpecificCollectionsConverter.scala (0 hunks)
  • api/src/main/scala-2.13/scala/util/ScalaVersionSpecificCollectionsConverter.scala (0 hunks)
  • api/src/main/scala/ai/chronon/api/Builders.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/DataType.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/Extensions.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/ScalaJavaConversions.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/ThriftJsonCodec.scala (1 hunks)
  • api/src/test/scala/ai/chronon/api/test/ExtensionsTest.scala (1 hunks)
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala (1 hunks)
  • online/src/main/java/ai/chronon/online/JavaExternalSourceHandler.java (3 hunks)
  • online/src/main/java/ai/chronon/online/JavaRequest.java (3 hunks)
  • online/src/main/java/ai/chronon/online/JavaResponse.java (3 hunks)
  • online/src/main/java/ai/chronon/online/JavaSeriesStatsResponse.java (2 hunks)
  • online/src/main/java/ai/chronon/online/JavaStatsRequest.java (1 hunks)
  • online/src/main/java/ai/chronon/online/JavaStatsResponse.java (2 hunks)
  • online/src/main/scala-2.11/ai/chronon/online/ScalaVersionSpecificCatalystHelper.scala (0 hunks)
  • online/src/main/scala-2.13/ai/chronon/online/ScalaVersionSpecificCatalystHelper.scala (0 hunks)
  • online/src/main/scala/ai/chronon/online/AvroCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/DataStreamBuilder.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/JoinCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/Metrics.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/SparkInternalRowConversions.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/TileCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala (1 hunks)
  • online/src/test/scala/ai/chronon/online/test/DataStreamBuilderTest.scala (1 hunks)
  • online/src/test/scala/ai/chronon/online/test/stats/DriftMetricsTest.scala (1 hunks)
  • orchestration/src/main/scala/ai/chronon/orchestration/utils/RelevantLeftForJoinPart.scala (1 hunks)
  • service_commons/src/main/java/ai/chronon/service/ApiProvider.java (2 hunks)
  • spark/src/main/scala/ai/chronon/spark/Analyzer.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/BootstrapInfo.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Extensions.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/LogFlattenerJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/LoggingSchema.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/StagingQuery.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/CompareMetrics.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Expressions.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/ChainingFetcherTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/TestUtils.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/LogBootstrapTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/TableBootstrapTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala (1 hunks)
  • tools/build_rules/common.bzl (1 hunks)
  • tools/build_rules/dependencies/all_repositories.bzl (1 hunks)
  • tools/build_rules/dependencies/defs.bzl (1 hunks)
  • tools/build_rules/dependencies/load_dependencies.bzl (1 hunks)
  • tools/build_rules/dependencies/maven_repository.bzl (1 hunks)
  • tools/build_rules/dependencies/spark_repository.bzl (1 hunks)
  • tools/build_rules/jar_library.bzl (1 hunks)
  • tools/build_rules/maven_artifact.bzl (1 hunks)
  • tools/build_rules/prelude_bazel (1 hunks)
  • tools/build_rules/spark/BUILD (1 hunks)
  • tools/build_rules/thrift/thrift.bzl (1 hunks)
  • tools/build_rules/utils.bzl (1 hunks)
💤 Files with no reviewable changes (4)
  • online/src/main/scala-2.13/ai/chronon/online/ScalaVersionSpecificCatalystHelper.scala
  • online/src/main/scala-2.11/ai/chronon/online/ScalaVersionSpecificCatalystHelper.scala
  • api/src/main/scala-2.13/scala/util/ScalaVersionSpecificCollectionsConverter.scala
  • api/src/main/scala-2.11/scala/util/ScalaVersionSpecificCollectionsConverter.scala
✅ Files skipped from review due to trivial changes (41)
  • .bazeliskrc
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala
  • spark/src/main/scala/ai/chronon/spark/LogFlattenerJob.scala
  • online/src/test/scala/ai/chronon/online/test/stats/DriftMetricsTest.scala
  • aggregator/src/main/scala/ai/chronon/aggregator/row/StatsGenerator.scala
  • spark/src/main/scala/ai/chronon/spark/Analyzer.scala
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala
  • online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
  • online/src/main/scala/ai/chronon/online/TileCodec.scala
  • tools/build_rules/prelude_bazel
  • online/src/main/java/ai/chronon/online/JavaStatsRequest.java
  • api/src/test/scala/ai/chronon/api/test/ExtensionsTest.scala
  • spark/src/main/scala/ai/chronon/spark/StagingQuery.scala
  • api/src/main/scala/ai/chronon/api/ThriftJsonCodec.scala
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala
  • online/src/test/scala/ai/chronon/online/test/DataStreamBuilderTest.scala
  • spark/src/main/scala/ai/chronon/spark/Join.scala
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala
  • spark/src/test/scala/ai/chronon/spark/test/TestUtils.scala
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala
  • api/src/main/scala/ai/chronon/api/DataType.scala
  • spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala
  • online/src/main/scala/ai/chronon/online/Metrics.scala
  • .gitignore
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala
  • spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala
  • spark/src/main/scala/ai/chronon/spark/Extensions.scala
  • aggregator/src/main/scala/ai/chronon/aggregator/windowing/Resolution.scala
  • spark/src/main/scala/ai/chronon/spark/stats/CompareMetrics.scala
  • spark/src/main/scala/ai/chronon/spark/BootstrapInfo.scala
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Expressions.scala
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala
  • spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala
🧰 Additional context used
📓 Learnings (1)
service_commons/src/main/java/ai/chronon/service/ApiProvider.java (1)
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ApiProvider.java:6-6
Timestamp: 2024-12-03T04:04:33.809Z
Learning: The import `scala.util.ScalaVersionSpecificCollectionsConverter` in `service/src/main/java/ai/chronon/service/ApiProvider.java` is correct and should not be flagged in future reviews.
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: join_spark_tests
  • GitHub Check: scala_compile_fmt_fix
🔇 Additional comments (30)
tools/build_rules/utils.bzl (1)

17-29: Consider recursion depth limit

Add validation for max_depth to prevent stack overflow.

online/src/main/java/ai/chronon/online/JavaSeriesStatsResponse.java (1)

36-36: LGTM! Clean conversion update

Consistent with the new standardized approach.

Also applies to: 42-42

online/src/main/java/ai/chronon/online/JavaStatsResponse.java (1)

45-45: LGTM! Consistent conversion

Aligned with the new pattern.

online/src/main/java/ai/chronon/online/JavaResponse.java (1)

48-48: LGTM! Clean conversion

Consistent with new pattern.

online/src/main/java/ai/chronon/online/JavaRequest.java (1)

20-20: LGTM: Clean migration to new conversion utilities

The switch to ScalaJavaConversions aligns with the PR's goal of removing version-specific logic.

Also applies to: 41-41, 55-55

spark/src/main/scala/ai/chronon/spark/LoggingSchema.scala (1)

26-26: LGTM: Consistent import update

Import change aligns with the standardization effort.

service_commons/src/main/java/ai/chronon/service/ApiProvider.java (1)

6-6: LGTM: Consistent conversion update

Clean migration to new conversion utilities.

Also applies to: 55-55

online/src/main/java/ai/chronon/online/JavaExternalSourceHandler.java (1)

41-41: LGTM! Migration to ScalaJavaConversions looks good.

The conversion changes align with the Bazel migration objectives.

Also applies to: 52-52

aggregator/src/main/scala/ai/chronon/aggregator/row/MapColumnAggregator.scala (1)

23-23: LGTM! Import change is consistent.

The wildcard import of ScalaJavaConversions aligns with the codebase-wide standardization.

orchestration/src/main/scala/ai/chronon/orchestration/utils/RelevantLeftForJoinPart.scala (1)

11-11: LGTM! Import change is consistent.

The wildcard import matches the standardization pattern.

online/src/main/scala/ai/chronon/online/JoinCodec.scala (1)

31-31: LGTM! Import change is consistent.

The wildcard import aligns with the standardization effort.

online/src/main/scala/ai/chronon/online/AvroCodec.scala (1)

32-32: LGTM!

The import change aligns with the centralized Scala-Java conversions approach.

online/src/main/scala/ai/chronon/online/SparkInternalRowConversions.scala (1)

30-30: LGTM!

The import change is appropriate for the collection conversions used in this file.

spark/src/test/scala/ai/chronon/spark/test/bootstrap/LogBootstrapTest.scala (1)

40-40: LGTM!

The import change maintains consistency with the project-wide migration to centralized conversions.

api/src/main/scala/ai/chronon/api/Extensions.scala (1)

22-22: LGTM! Verify collection conversions.

The import change aligns with the Bazel migration.

✅ Verification successful

Custom ScalaJavaConversions implementation verified

The import change is safe. The custom implementation provides complete Scala-Java collection conversions with additional null safety.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify no remaining scala.util.ScalaJavaConversions imports
rg "scala\.util\.ScalaJavaConversions" 

Length of output: 40


Script:

#!/bin/bash
# Find custom ScalaJavaConversions implementation
ast-grep --pattern 'object ScalaJavaConversions'

# Find collection conversion usage patterns
rg "toScala|toJava" -A 2

Length of output: 85700

api/src/main/scala/ai/chronon/api/Builders.scala (1)

21-21: LGTM! Import aligned with Extensions.scala.

Consistent with the module-wide migration to custom collection conversions.

spark/src/test/scala/ai/chronon/spark/test/bootstrap/TableBootstrapTest.scala (1)

34-34: LGTM! Test imports aligned with core changes.

Consistent with the api module changes.

spark/src/test/scala/ai/chronon/spark/test/ChainingFetcherTest.scala (1)

47-47: LGTM! Test imports aligned with core changes.

Consistent with the api module changes.

tools/build_rules/dependencies/all_repositories.bzl (1)

1-18: Code looks good

tools/build_rules/dependencies/spark_repository.bzl (1)

14-21: Verify version compatibility.

Multiple dependencies with different versions.

✅ Verification successful

Dependencies are properly configured

The codebase consistently uses Apache DataSketches with no version conflicts or security vulnerabilities. Yahoo DataSketches packages are likely transitive Spark dependencies.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for version compatibility issues
for pkg in "apache-curator" "kryo" "sketches-core" "memory" "sketches-hive" "datasketches-java" "datasketches-memory"; do
  gh api graphql -f query='{
    securityVulnerabilities(first: 5, ecosystem: MAVEN, package: "'$pkg'") {
      nodes {
        advisory {
          severity
          publishedAt
        }
        vulnerableVersionRange
      }
    }
  }'
done

Length of output: 2907


Script:

#!/bin/bash
# Check for DataSketches usage patterns
rg -A 2 "import.*datasketches" 

Length of output: 4695

tools/build_rules/maven_artifact.bzl (2)

36-41: LGTM with version compatibility note.

Verify Scala version compatibility with dependencies.


8-34: Add version validation.

Consider adding version validation to prevent incompatible dependencies.

api/BUILD.bazel (1)

43-46: LGTM!

tools/build_rules/spark/BUILD (1)

95-99: LGTM!

WORKSPACE (4)

1-4: LGTM!


32-42: LGTM!


44-60: LGTM!


62-77: Add dependency verification.

Consider adding dependency verification step.

tools/build_rules/dependencies/maven_repository.bzl (2)

78-78: Compatibility risk: Kafka client version

Kafka 3.7.1 is newer than current date (January 2025).


132-133: Missing dependency overrides

Empty overrides block could lead to transitive dependency conflicts.

Comment on lines +100 to +104
thrift_binary = select({
"@platforms//os:linux": "/usr/local/bin/thrift",
"@platforms//os:macos": "/opt/homebrew/bin/thrift",
"//conditions:default": "/usr/local/bin/thrift",
}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid hardcoding thrift binary paths

Hardcoded paths reduce portability; make thrift_binary configurable or use a toolchain.

Comment on lines +41 to +55
replace_command = """
shopt -s globstar
for input_file in {input_path}/**/*.java
do
output_file={output_path}/$(basename $input_file)
sed 's/org.apache.thrift/ai.chronon.api.thrift/g' $input_file > $output_file
done
""".format(input_path=input_directory.path, output_path=output_directory.path)

ctx.actions.run_shell(
inputs=[input_directory],
outputs=[output_directory],
command=replace_command,
progress_message="Replacing package names in input Java files for %s" % input_directory.short_path,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Specify bash shell for compatibility

Script uses bash-specific features; set shell="/bin/bash" in ctx.actions.run_shell.

Apply this diff:

             ctx.actions.run_shell(
                 inputs=[input_directory],
                 outputs=[output_directory],
                 command=replace_command,
+                shell="/bin/bash",
                 progress_message="Replacing package names in input Java files for %s" % input_directory.short_path,
             )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
replace_command = """
shopt -s globstar
for input_file in {input_path}/**/*.java
do
output_file={output_path}/$(basename $input_file)
sed 's/org.apache.thrift/ai.chronon.api.thrift/g' $input_file > $output_file
done
""".format(input_path=input_directory.path, output_path=output_directory.path)
ctx.actions.run_shell(
inputs=[input_directory],
outputs=[output_directory],
command=replace_command,
progress_message="Replacing package names in input Java files for %s" % input_directory.short_path,
)
replace_command = """
shopt -s globstar
for input_file in {input_path}/**/*.java
do
output_file={output_path}/$(basename $input_file)
sed 's/org.apache.thrift/ai.chronon.api.thrift/g' $input_file > $output_file
done
""".format(input_path=input_directory.path, output_path=output_directory.path)
ctx.actions.run_shell(
inputs=[input_directory],
outputs=[output_directory],
command=replace_command,
shell="/bin/bash",
progress_message="Replacing package names in input Java files for %s" % input_directory.short_path,
)

Comment on lines +17 to +18
def exclude(org, name):
return "@maven//:" + org + ":" + name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix Maven coordinate formatting in exclude().

The exclude() function incorrectly formats Maven coordinates.

-    return "@maven//:" + org + ":" + name
+    return struct(group = org, artifact = name)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def exclude(org, name):
return "@maven//:" + org + ":" + name
def exclude(org, name):
return struct(group = org, artifact = name)

Comment on lines +10 to +22
def load_all_dependencies():
for repository in all_repositories:
maven_install(
name = repository.name,
artifacts = repository.artifacts,
repositories = _repository_urls,
fetch_sources = True,
duplicate_version_warning = "error",
fail_if_repin_required = True,
resolve_timeout = 5000,
maven_install_json = None,
excluded_artifacts = repository.excluded_artifacts,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for failed downloads.

The function needs retry logic.

Add retry mechanism with exponential backoff.

Comment on lines +101 to +107
java_binary(
name = "spark",
main_class = "None", #hack
runtime_deps = [
"//third_party/java/spark:spark-exec",
],
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace temporary hack.

Implement proper main class configuration.


maven_repository = repository(
name = "maven",
pinned = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Security risk: Unpinned dependencies

Setting pinned = False allows dependency version float, risking supply chain attacks.

Set pinned = True and specify SHA-256 hashes.

Comment on lines 44 to 47
"org.slf4j:slf4j-api:1.7.30",
"org.slf4j:slf4j-log4j12:1.7.30",
"org.apache.logging.log4j:log4j-core:2.14.1",
"org.apache.logging.log4j:log4j-api:2.14.1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Version conflict: Multiple logging frameworks

Mixing SLF4J 1.7.30 with Log4j 2.14.1 and Logback 1.2.3.

Standardize on a single logging framework to avoid runtime conflicts.

Comment on lines 48 to 52
"com.fasterxml.jackson.core:jackson-core:2.12.5",
"com.fasterxml.jackson.core:jackson-annotations:2.12.5",
"com.fasterxml.jackson.core:jackson-databind:2.12.5",
"com.fasterxml.jackson.module:jackson-module-scala_2.12:2.12.5",
"com.fasterxml.jackson.module:jackson-module-scala_2.13:2.12.5",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Security vulnerability: Jackson version

Jackson 2.12.5 has known vulnerabilities.

Upgrade to latest 2.15.x version.

Comment on lines 121 to 129
"org.mongodb.scala:mongo-scala-driver_2.12:4.11.1",
"org.mongodb.scala:mongo-scala-bson_2.12:4.11.1",
"org.mongodb:mongodb-driver-core:4.11.1",
"org.mongodb:bson:4.11.1",
"org.mongodb:mongodb-driver-sync:4.11.1",
"org.mongodb:mongodb-driver-reactivestreams:4.11.1",
"org.reactivestreams:reactive-streams:1.0.4",
"org.mongodb:mongodb-driver-reactivestreams:4.11.1",
"org.reactivestreams:reactive-streams:1.0.4",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate dependencies

Lines 128-129 duplicate lines 126-127.

-        "org.mongodb:mongodb-driver-reactivestreams:4.11.1",
-        "org.reactivestreams:reactive-streams:1.0.4",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"org.mongodb.scala:mongo-scala-driver_2.12:4.11.1",
"org.mongodb.scala:mongo-scala-bson_2.12:4.11.1",
"org.mongodb:mongodb-driver-core:4.11.1",
"org.mongodb:bson:4.11.1",
"org.mongodb:mongodb-driver-sync:4.11.1",
"org.mongodb:mongodb-driver-reactivestreams:4.11.1",
"org.reactivestreams:reactive-streams:1.0.4",
"org.mongodb:mongodb-driver-reactivestreams:4.11.1",
"org.reactivestreams:reactive-streams:1.0.4",
"org.mongodb.scala:mongo-scala-driver_2.12:4.11.1",
"org.mongodb.scala:mongo-scala-bson_2.12:4.11.1",
"org.mongodb:mongodb-driver-core:4.11.1",
"org.mongodb:bson:4.11.1",
"org.mongodb:mongodb-driver-sync:4.11.1",
"org.mongodb:mongodb-driver-reactivestreams:4.11.1",
"org.reactivestreams:reactive-streams:1.0.4",

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
devnotes.md (2)

Line range hint 109-144: Add bazel test coverage info

Include info about test coverage reports with bazel.

Add:

+To generate test coverage:
+```shell
+bazel coverage //api:api-test
+```
🧰 Tools
🪛 LanguageTool

[style] ~150-~150: Consider removing “of” to be more concise
Context: ...` directory. ## Using sbt ### To build all of the Chronon artifacts locally (builds all t...

(ALL_OF_THE)


183-188: Add bazel clean command

Include clean command for troubleshooting.

Add:

+To clean bazel artifacts:
+```shell
+bazel clean
+```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between a20e103 and 5034b75.

📒 Files selected for processing (2)
  • api/BUILD.bazel (1 hunks)
  • devnotes.md (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/BUILD.bazel
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: join_spark_tests
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix
🔇 Additional comments (1)
devnotes.md (1)

41-41: LGTM: Java version update

Update to corretto-17 aligns with modern Java practices.

Comment on lines +18 to +23
### Install latest version of thrift

Thrift is a dependency for compile. The latest version 0.14 is very new - feb 2021, and incompatible with hive metastore. So we force 0.13.
Thrift is a dependency for compile. The latest version is 0.22 jan 2025.

```shell
brew tap cartman-kai/thrift
brew install [email protected]
brew install thrift
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add version pinning for thrift

Specify the exact version to ensure consistent builds across environments.

-brew install thrift
+brew install [email protected]

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

@piyush-zlai piyush-zlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a stab at this! Dropped a few questions / comments

devnotes.md Outdated
@@ -39,7 +38,7 @@ python3 -m pip install -U tox build
* ```asdf install``` (see `.tool-versions` for required runtimes and versions)


> NOTE: Use scala `2.12.18` and java `corretto-8` for the OSS Chronon distribution.
> NOTE: Use scala `2.12.18` and java `corretto-17` for the OSS Chronon distribution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oss chronon is still java 8 I think..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's true. Updated the documentation accordingly.

@@ -130,9 +131,22 @@ sbt dependencyBrowseGraph
sbt dependencyBrowseTree
```

#### Using bazel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe drop a few more common incantations? Like clean / compile of the entire repo and sub-modules?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some instructions for bazel clean to do a fresh build. Planning to add more instructions in the coming PR's as we add support for more modules

import scala.util.ScalaJavaConversions.ListOps
import scala.util.ScalaJavaConversions.MapOps
import ai.chronon.api.ScalaJavaConversions._
import ai.chronon.api.ScalaJavaConversions._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

import scala.util.ScalaJavaConversions.JMapOps
import scala.util.ScalaJavaConversions.MapOps
import ai.chronon.api.ScalaJavaConversions._
import ai.chronon.api.ScalaJavaConversions._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

"org.scala-lang.modules:scala-collection-compat_2.12:2.6.0",
"org.scala-lang.modules:scala-collection-compat_2.13:2.6.0",
"org.scala-lang.modules:scala-parser-combinators_2.13:2.4.0",
"org.scala-lang.modules:scala-java8-compat_2.12:1.0.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets go with java11

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it for scala-java8-compat ? there doesn't seem to be one specific to java 11 or 17 but it seems to work for even newer versions of java.

artifacts = [
"org.scala-lang.modules:scala-collection-compat_2.12:2.6.0",
"org.scala-lang.modules:scala-collection-compat_2.13:2.6.0",
"org.scala-lang.modules:scala-parser-combinators_2.13:2.4.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2.13 or 2.12?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed 2.13 as we only need 2.12. also cleaned up unused dependencies.

"com.google.protobuf:protobuf-java:3.21.7",

# Flink
versioned_artifacts("1.19.0", [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.17.0 here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now as we don't need it for api module. will add the right version later when we need it for other modules

provided = True,
artifacts = [
# Spark artifacts - for scala 2.12
"org.apache.spark:spark-sql_2.12:3.5.4",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.5.1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -15,13 +15,12 @@ export CHRONON_API=$CHRONON_OS/api/py
alias materialize="PYTHONPATH=$CHRONON_API:$PYTHONPATH $CHRONON_API/ai/chronon/repo/compile.py"
```

### Install specific version of thrift
### Install latest version of thrift
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything to call out in this file regarding intellij setup with bazel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been running the build and test commands on the command line for these initial changes. nothing at this point but will look into the intellij setup if that makes it any easier going forward.

@@ -26,8 +26,8 @@ import org.slf4j.LoggerFactory

import scala.collection.Seq
import scala.util.Failure
import scala.util.ScalaJavaConversions.ListOps
import scala.util.ScalaJavaConversions.MapOps
import ai.chronon.api.ScalaJavaConversions._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double imported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import scala.util.ScalaJavaConversions.JMapOps
import ai.chronon.api.ScalaJavaConversions._
import ai.chronon.api.ScalaJavaConversions._
import ai.chronon.api.ScalaJavaConversions._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

triple imported

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,134 @@
load("@rules_jvm_external//:specs.bzl", "maven")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to what piyush mentioned - we can minify this set to just the ones we need. The OSS PR pulls for multiple versions of spark & scala. we have only one version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified the dependency artifacts and will add more as we migrate other modules

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
devnotes.md (2)

41-41: Make Java version requirements more prominent

Consider moving this critical version information to the top of the prerequisites section.


Line range hint 109-197: Improve documentation structure

Consider reorganizing the build and test sections to clearly separate sbt and Bazel instructions.

+## Build Tools
+
+We support two build tools:
+1. sbt (current)
+2. Bazel (migration in progress)
+
+### Building with sbt
+[existing sbt section]
+
+### Building with Bazel
+[existing Bazel section]
🧰 Tools
🪛 LanguageTool

[style] ~150-~150: Consider removing “of” to be more concise
Context: ...` directory. ## Using sbt ### To build all of the Chronon artifacts locally (builds all t...

(ALL_OF_THE)

tools/build_rules/dependencies/maven_repository.bzl (1)

14-23: Consider simplifying test dependencies

Multiple test frameworks increase complexity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 5034b75 and 997ec6e.

📒 Files selected for processing (51)
  • WORKSPACE (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/row/MapColumnAggregator.scala (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/row/StatsGenerator.scala (1 hunks)
  • aggregator/src/main/scala/ai/chronon/aggregator/windowing/Resolution.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/DataType.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/ThriftJsonCodec.scala (1 hunks)
  • api/src/test/scala/ai/chronon/api/test/ExtensionsTest.scala (1 hunks)
  • devnotes.md (5 hunks)
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/AvroCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/DataStreamBuilder.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/JoinCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/Metrics.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/SparkInternalRowConversions.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/TileCodec.scala (1 hunks)
  • online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala (1 hunks)
  • online/src/test/scala/ai/chronon/online/test/DataStreamBuilderTest.scala (1 hunks)
  • online/src/test/scala/ai/chronon/online/test/stats/DriftMetricsTest.scala (1 hunks)
  • orchestration/src/main/scala/ai/chronon/orchestration/utils/RelevantLeftForJoinPart.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Analyzer.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/BootstrapInfo.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Extensions.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/LogFlattenerJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/LoggingSchema.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/StagingQuery.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/CompareMetrics.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Expressions.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/ChainingFetcherTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/TestUtils.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/LogBootstrapTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/TableBootstrapTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala (1 hunks)
  • tools/build_rules/dependencies/maven_repository.bzl (1 hunks)
  • tools/build_rules/dependencies/spark_repository.bzl (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (49)
  • online/src/test/scala/ai/chronon/online/test/stats/DriftMetricsTest.scala
  • aggregator/src/main/scala/ai/chronon/aggregator/windowing/Resolution.scala
  • online/src/main/scala/ai/chronon/online/AvroCodec.scala
  • online/src/main/scala/ai/chronon/online/SparkInternalRowConversions.scala
  • online/src/main/scala/ai/chronon/online/Metrics.scala
  • spark/src/main/scala/ai/chronon/spark/StagingQuery.scala
  • online/src/main/scala/ai/chronon/online/TileCodec.scala
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/DerivationTest.scala
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala
  • online/src/test/scala/ai/chronon/online/test/DataStreamBuilderTest.scala
  • spark/src/main/scala/ai/chronon/spark/utils/MockApi.scala
  • api/src/main/scala/ai/chronon/api/ThriftJsonCodec.scala
  • spark/src/test/scala/ai/chronon/spark/test/GroupByUploadTest.scala
  • spark/src/main/scala/ai/chronon/spark/Extensions.scala
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Expressions.scala
  • online/src/main/scala/ai/chronon/online/JoinCodec.scala
  • spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala
  • spark/src/main/scala/ai/chronon/spark/LoggingSchema.scala
  • spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
  • spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala
  • aggregator/src/main/scala/ai/chronon/aggregator/row/MapColumnAggregator.scala
  • spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/ChainingFetcherTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/SchemaEvolutionTest.scala
  • spark/src/main/scala/ai/chronon/spark/stats/ConsistencyJob.scala
  • aggregator/src/main/scala/ai/chronon/aggregator/row/StatsGenerator.scala
  • tools/build_rules/dependencies/spark_repository.bzl
  • spark/src/main/scala/ai/chronon/spark/LogFlattenerJob.scala
  • spark/src/main/scala/ai/chronon/spark/GroupBy.scala
  • spark/src/main/scala/ai/chronon/spark/stats/CompareJob.scala
  • spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/TableBootstrapTest.scala
  • spark/src/test/scala/ai/chronon/spark/test/TestUtils.scala
  • api/src/main/scala/ai/chronon/api/DataType.scala
  • spark/src/main/scala/ai/chronon/spark/Analyzer.scala
  • api/src/test/scala/ai/chronon/api/test/ExtensionsTest.scala
  • spark/src/main/scala/ai/chronon/spark/BootstrapInfo.scala
  • spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala
  • orchestration/src/main/scala/ai/chronon/orchestration/utils/RelevantLeftForJoinPart.scala
  • hub/src/main/scala/ai/chronon/hub/handlers/TimeSeriesHandler.scala
  • online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
  • spark/src/test/scala/ai/chronon/spark/test/bootstrap/LogBootstrapTest.scala
  • spark/src/main/scala/ai/chronon/spark/Join.scala
  • spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala
  • spark/src/main/scala/ai/chronon/spark/TableUtils.scala
  • online/src/main/scala/ai/chronon/online/DataStreamBuilder.scala
  • spark/src/main/scala/ai/chronon/spark/stats/CompareMetrics.scala
  • WORKSPACE
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: join_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: scala_compile_fmt_fix
🔇 Additional comments (7)
devnotes.md (3)

18-23: Restore version pinning for Thrift

Version pinning ensures consistent builds across environments.

-brew install thrift
+brew install [email protected]

134-144: Add more common test scenarios

Include examples for frequently used test patterns like running all tests with specific memory settings.


183-197: Add IntelliJ setup instructions for Bazel

Document the steps needed to set up IntelliJ with Bazel.

tools/build_rules/dependencies/maven_repository.bzl (4)

6-6: Security risk: Set pinned=True


8-10: LGTM: Scala dependencies

Consistent Scala 2.12 usage.


11-12: LGTM: Apache Commons

Latest stable versions used.


26-30: ⚠️ Potential issue

Update dependency versions

  • Jackson 2.12.5 has vulnerabilities
  • SLF4J 1.7.30 and Gson 2.8.6 are outdated
-        "org.slf4j:slf4j-api:1.7.30",
-        "org.slf4j:slf4j-log4j12:1.7.30",
-        "com.fasterxml.jackson.core:jackson-core:2.12.5",
-        "com.fasterxml.jackson.core:jackson-databind:2.12.5",
-        "com.google.code.gson:gson:2.8.6",
+        "org.slf4j:slf4j-api:2.0.9",
+        "org.slf4j:slf4j-log4j12:2.0.9",
+        "com.fasterxml.jackson.core:jackson-core:2.15.3",
+        "com.fasterxml.jackson.core:jackson-databind:2.15.3",
+        "com.google.code.gson:gson:2.10.1",

Likely invalid or redundant comment.

Copy link
Contributor

@nikhil-zlai nikhil-zlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

@kumar-zlai kumar-zlai merged commit 98fb0d3 into main Jan 8, 2025
10 checks passed
@kumar-zlai kumar-zlai deleted the kumarteja/bazel-migration branch January 8, 2025 20:43
tchow-zlai pushed a commit that referenced this pull request Jan 9, 2025
## Summary

Migrated `api` module to Bazel from sbt. Started with `api` module as it
doesn't have any dependency on other modules. will work on rest of the
modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated
documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they
are no longer needed and we only support scala 2.12 version.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Build System**
	- Upgraded to Bazel 6.4.0
	- Introduced comprehensive build rules for Maven and Spark dependencies
	- Added utility functions for artifact and repository management

- **Scala Conversions**
- Replaced standard library Scala Java conversions with custom
`ScalaJavaConversions`
	- Enhanced collection conversion methods with improved null handling
	- Standardized conversion imports across project modules

- **Dependency Management**
	- Added support for Scala 2.12.18
	- Configured repositories for Maven, Spark, and other core libraries
	- Implemented flexible artifact and JAR library management

- **Thrift Support**
	- Added custom Thrift Java file generation
	- Implemented package name replacement for generated files
	- Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and
code consistency across the Chronon project.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
Co-authored-by: Thomas Chow <[email protected]>
@coderabbitai coderabbitai bot mentioned this pull request Jan 10, 2025
4 tasks
kumar-zlai added a commit that referenced this pull request Apr 25, 2025
## Summary

Migrated `api` module to Bazel from sbt. Started with `api` module as it
doesn't have any dependency on other modules. will work on rest of the
modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated
documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they
are no longer needed and we only support scala 2.12 version.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Build System**
	- Upgraded to Bazel 6.4.0
	- Introduced comprehensive build rules for Maven and Spark dependencies
	- Added utility functions for artifact and repository management

- **Scala Conversions**
- Replaced standard library Scala Java conversions with custom
`ScalaJavaConversions`
	- Enhanced collection conversion methods with improved null handling
	- Standardized conversion imports across project modules

- **Dependency Management**
	- Added support for Scala 2.12.18
	- Configured repositories for Maven, Spark, and other core libraries
	- Implemented flexible artifact and JAR library management

- **Thrift Support**
	- Added custom Thrift Java file generation
	- Implemented package name replacement for generated files
	- Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and
code consistency across the Chronon project.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
kumar-zlai added a commit that referenced this pull request Apr 29, 2025
## Summary

Migrated `api` module to Bazel from sbt. Started with `api` module as it
doesn't have any dependency on other modules. will work on rest of the
modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated
documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they
are no longer needed and we only support scala 2.12 version.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Build System**
	- Upgraded to Bazel 6.4.0
	- Introduced comprehensive build rules for Maven and Spark dependencies
	- Added utility functions for artifact and repository management

- **Scala Conversions**
- Replaced standard library Scala Java conversions with custom
`ScalaJavaConversions`
	- Enhanced collection conversion methods with improved null handling
	- Standardized conversion imports across project modules

- **Dependency Management**
	- Added support for Scala 2.12.18
	- Configured repositories for Maven, Spark, and other core libraries
	- Implemented flexible artifact and JAR library management

- **Thrift Support**
	- Added custom Thrift Java file generation
	- Implemented package name replacement for generated files
	- Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and
code consistency across the Chronon project.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
chewy-zlai pushed a commit that referenced this pull request May 15, 2025
## Summary

Migrated `api` module to Bazel from sbt. Started with `api` module as it
doesn't have any dependency on other modules. will work on rest of the
modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated
documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they
are no longer needed and we only support scala 2.12 version.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Build System**
	- Upgraded to Bazel 6.4.0
	- Introduced comprehensive build rules for Maven and Spark dependencies
	- Added utility functions for artifact and repository management

- **Scala Conversions**
- Replaced standard library Scala Java conversions with custom
`ScalaJavaConversions`
	- Enhanced collection conversion methods with improved null handling
	- Standardized conversion imports across project modules

- **Dependency Management**
	- Added support for Scala 2.12.18
	- Configured repositories for Maven, Spark, and other core libraries
	- Implemented flexible artifact and JAR library management

- **Thrift Support**
	- Added custom Thrift Java file generation
	- Implemented package name replacement for generated files
	- Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and
code consistency across the Chronon project.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
chewy-zlai pushed a commit that referenced this pull request May 16, 2025
## Summary

Migrated `api` module to Bazel from sbt. Started with `api` module as it
doesn't have any dependency on other modules. will work on rest of the
modules in coming PR's.

Both bazel, sbt builds and tests are working at this point. Updated
documentation with build and test instructions using bazel.

Also cleaned up version specific Scala to Java conversions logic as they
are no longer needed and we only support scala 2.12 version.

## Cheour clientslist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Build System**
	- Upgraded to Bazel 6.4.0
	- Introduced comprehensive build rules for Maven and Spark dependencies
	- Added utility functions for artifact and repository management

- **Scala Conversions**
- Replaced standard library Scala Java conversions with custom
`ScalaJavaConversions`
	- Enhanced collection conversion methods with improved null handling
	- Standardized conversion imports across project modules

- **Dependency Management**
	- Added support for Scala 2.12.18
	- Configured repositories for Maven, Spark, and other core libraries
	- Implemented flexible artifact and JAR library management

- **Thrift Support**
	- Added custom Thrift Java file generation
	- Implemented paour clientsage name replacement for generated files
	- Created utility for generating Thrift Java libraries

These changes improve build reproducibility, dependency management, and
code consistency across the Chronon project.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants