Connect drift metrics computation in Spark with Hub for serving to frontend #95
Conversation
Walkthrough

The changes in this pull request primarily focus on enhancing dependency management and restructuring various components within the application. Key modifications include the introduction of a new variable for Circe libraries.

Changes

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant JoinController
    participant MonitoringModelStore
    User->>JoinController: GET /api/v1/joins
    JoinController->>MonitoringModelStore: list(offset, limit)
    MonitoringModelStore-->>JoinController: List of joins
    JoinController-->>User: JSON response with joins
```
Possibly related PRs
Suggested reviewers
Poem
Actionable comments posted: 22
🧹 Outside diff range and nitpick comments (56)
hub/app/controllers/Paginate.scala (1)
8-10: Consider adding input validation and documentation.

The method could benefit from:
- Input validation for negative offset/limit values
- Documentation for the type parameter and parameters
- Handling of edge cases when offset exceeds the sequence length
Consider this improvement:
```diff
+ /** Paginates a sequence of elements
+   * @tparam T the type of elements in the sequence
+   * @param results the sequence to paginate
+   * @param offset the starting position (must be non-negative)
+   * @param limit the maximum number of elements to return (must be positive)
+   * @return a subsequence of elements based on the pagination parameters
+   */
  def paginateResults[T](results: Seq[T], offset: Int, limit: Int): Seq[T] = {
+   require(offset >= 0, "offset must be non-negative")
+   require(limit > 0, "limit must be positive")
+   require(limit <= maxLimit, s"limit must not exceed $maxLimit")
    results.slice(offset, offset + limit)
  }
```

hub/app/module/ModelStoreModule.scala (3)
13-13: Update variable name to reflect current architecture.

The variable name `dynamoDBMonitoringStore` is outdated since we're moving away from DynamoDB to an in-memory store.

```diff
- val dynamoDBMonitoringStore = new MonitoringModelStore(awsApiImpl)
+ val monitoringStore = new MonitoringModelStore(awsApiImpl)
```
9-16: Architectural concerns with AWS dependency.

According to the PR objectives, we're moving away from DynamoDB to an InMemoryKV store. However, this module still creates an AWS dependency. This seems inconsistent with the stated architectural direction.
Consider:
- Creating a separate module for in-memory store configuration
- Using an interface/trait to abstract the storage implementation
- Providing a configuration flag to switch between storage implementations
Example approach:
```scala
trait StorageBackend
class InMemoryStorage extends StorageBackend
class AwsStorage(awsApi: AwsApiImpl) extends StorageBackend

class ModelStoreModule(environment: Environment, configuration: Configuration) extends AbstractModule {
  override def configure(): Unit = {
    val storageType = configuration.get[String]("storage.type")
    val backend = storageType match {
      case "memory" => new InMemoryStorage()
      case "aws"    => new AwsStorage(new AwsApiImpl(configuration.get[Configuration]("aws").underlying))
      case _        => throw new IllegalArgumentException(s"Unsupported storage type: $storageType")
    }
    bind(classOf[StorageBackend]).toInstance(backend)
    bind(classOf[MonitoringModelStore]).toInstance(new MonitoringModelStore(backend))
  }
}
```
9-9: Unused constructor parameter.

The `environment` parameter is not used in the module. Consider removing it if it's not needed.

```diff
- class ModelStoreModule(environment: Environment, configuration: Configuration)
+ class ModelStoreModule(configuration: Configuration)
```

hub/app/module/DriftStoreModule.scala (3)
10-16: Document lifecycle implications of eager singleton.

While using `asEagerSingleton()` is appropriate for in-memory storage, please add documentation explaining:
- Why eager initialization is necessary
- Memory implications
- Startup time considerations
Consider adding a comment like:
```diff
 override def configure(): Unit = {
   // TODO swap to concrete api impl in a follow up
+  // Using eager singleton to ensure immediate initialization of the in-memory store
+  // and prevent concurrent initialization issues during requests
   bind(classOf[DriftStore]).toProvider(classOf[DriftStoreProvider]).asEagerSingleton()
 }
```
13-13: Create a tracking issue for the TODO.

The TODO comment indicates a pending task to implement a concrete API.

Would you like me to create a GitHub issue to track this TODO for the concrete API implementation?
18-22: Enhance provider robustness with validation and logging.

Consider adding validation and logging to make the provider more robust and debuggable.

```diff
 class DriftStoreProvider @Inject() (kvStore: KVStore) extends Provider[DriftStore] {
+  private val logger = play.api.Logger(getClass)
+
   override def get(): DriftStore = {
+    logger.info("Initializing DriftStore with KVStore")
+    require(kvStore != null, "KVStore cannot be null")
     new DriftStore(kvStore)
   }
 }
```

docker-init/demo/README.md (1)
1-9: Add prerequisites and environment setup details.

While the Docker launch instructions are clear, the documentation would benefit from additional context:
- Required prerequisites (Docker, minimum versions)
- Expected initial setup (repository clone, environment variables if any)
- System requirements (minimum memory, disk space)
Consider adding a prerequisites section:
```diff
+# Prerequisites
+
+- Docker Engine 20.10.0 or higher
+- Docker Compose v2.0.0 or higher
+- Minimum 16GB RAM recommended
+- At least 10GB free disk space
+
 To populate the observability demo data:
```

hub/conf/application.conf (1)
32-32: Consider adding configuration for in-memory store limits and persistence strategy.

While using an in-memory KV store addresses the immediate performance concerns, consider:
- Memory consumption limits to prevent OOM issues
- Data persistence strategy across service restarts
- Scalability implications in a production environment
Consider adding configuration parameters for:
```diff
+# In-memory store configuration
+inmemory.store {
+  # Maximum memory allocation (e.g., 75% of available heap)
+  max.memory = "75%"
+  # Persistence strategy (none, file, redis)
+  persistence.strategy = "none"
+  # Backup interval if persistence enabled
+  backup.interval = "15m"
+}
```

hub/app/controllers/ModelController.scala (1)
7-7: Architectural change: transition to in-memory storage.

The switch to `MonitoringModelStore` aligns with the PR objective to use in-memory storage for better performance. However, we should consider the data persistence implications. Consider:
- Implementing a persistence strategy for the in-memory store
- Adding monitoring for memory usage (see the sketch after this list)
- Documenting the maximum data size limitations
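A minimal sketch of the memory-monitoring point (the helper name and 80% threshold are hypothetical, not part of this PR):

```scala
import org.slf4j.LoggerFactory

// Hypothetical helper: logs JVM heap usage so operators can spot when the in-memory
// store is approaching its ceiling (the 80% warning threshold is illustrative).
object HeapUsageMonitor {
  private val logger = LoggerFactory.getLogger(getClass)

  def logHeapUsage(warnThreshold: Double = 0.8): Unit = {
    val runtime   = Runtime.getRuntime
    val usedBytes = runtime.totalMemory() - runtime.freeMemory()
    val maxBytes  = runtime.maxMemory()
    val ratio     = usedBytes.toDouble / maxBytes
    if (ratio > warnThreshold)
      logger.warn(f"In-memory store heap usage at ${ratio * 100}%.1f%% ($usedBytes of $maxBytes bytes)")
    else
      logger.info(f"Heap usage: ${ratio * 100}%.1f%%")
  }
}
```

A check like this could run on a schedule or after each bulk write; it only surfaces the problem, so documenting a hard data-size limit is still worthwhile.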
hub/app/controllers/JoinController.scala (3)
3-8: Consider adding error handling and logging imports.

The controller might benefit from explicit error handling and logging capabilities for better observability and debugging.

Consider adding:

```diff
+import play.api.Logger
+import scala.util.{Try, Success, Failure}
```
11-13: Enhance documentation for Join entities.

The documentation could be more descriptive about what Join entities represent and their significance in the context of drift metrics.

Consider expanding the documentation:

```diff
- * Controller for the Zipline Join entities
+ * Controller for managing Zipline Join entities, which represent the relationships
+ * between different data sources used in drift metric computations.
+ *
+ * @see Related documentation about drift metrics and data joining strategy
```
24-40: Consider implementing caching for performance.

Given that this endpoint serves drift metrics data, which might be computationally expensive, consider implementing caching.
You could:
- Use Play's built-in caching mechanism
- Implement HTTP caching headers
- Consider implementing ETags for efficient caching
Example implementation:
```scala
def list(offset: Option[Int], limit: Option[Int]): Action[AnyContent] = {
  Action.async { implicit request =>
    // Generate cache key based on pagination parameters
    val cacheKey = s"joins_${offset}_${limit}"
    cache.getOrElseUpdate(cacheKey, duration = 5.minutes) {
      // Existing implementation wrapped in Future
      Future.successful {
        // ... existing code ...
      }
    }
  }
}
```

hub/app/controllers/SearchController.scala (2)
20-21: Enhance API documentation for clarity.

While the documentation updates correctly reflect the transition to joins, consider adding:
- Search behavior details (e.g., case sensitivity, partial matches)
- Example search terms
- Response format description
```diff
 /**
- * Powers the /api/v1/search endpoint. Returns a list of joins
- * @param term - Search term to search for (currently we only support searching join names)
+ * Powers the /api/v1/search endpoint. Returns a list of joins matching the search criteria
+ * @param term - Search term to search for (case-sensitive partial match on join names)
+ * @param offset - For pagination. We skip over offset entries before returning results (default: 0)
+ * @param limit - Number of elements to return (default: 10, max: configured limit)
+ * @return JSON response containing matched joins with format: {"offset": number, "joins": Join[]}
  */
```
43-46: Consider performance and reliability improvements.

While the current implementation is functional and aligns with the learning note about future enhancements, consider:

- Performance: The current string matching (`contains`) might not scale well with large datasets
- Reliability: Add error handling for potential store failures
```diff
 private def searchRegistry(term: String): Seq[Join] = {
-  val joins = monitoringStore.getJoins
-  joins.filter(j => j.name.contains(term))
+  try {
+    val joins = monitoringStore.getJoins
+    // TODO: Consider adding index-based search for better performance
+    joins.filter(j => j.name.toLowerCase.contains(term.toLowerCase))
+  } catch {
+    case e: Exception =>
+      logger.error(s"Failed to search joins: ${e.getMessage}", e)
+      Seq.empty
+  }
 }
```

docker-init/Dockerfile (4)
46-46: Consider optimizing Spark binary copying.

Instead of copying all Spark binaries, consider copying only the essential ones needed for your specific use case (e.g., `spark-submit`, `spark-shell`). This would reduce the image size.

```diff
-COPY --from=spark-source /opt/bitnami/spark/bin /opt/spark/bin
+COPY --from=spark-source /opt/bitnami/spark/bin/spark-submit /opt/spark/bin/
+COPY --from=spark-source /opt/bitnami/spark/bin/spark-shell /opt/spark/bin/
```
Line range hint 1-2: Pin the base image version for reproducibility.

Using the `latest` tag for the base image can lead to inconsistent builds over time. Consider pinning to a specific version.

```diff
-FROM base_image:latest AS build_env
+FROM base_image:1.x.x AS build_env  # Replace with specific version
```
Line range hint 1-58: Consider adding Spark resource configuration.

Given that this Dockerfile supports drift metrics computation in Spark and bulk HTTP calls, consider adding environment variables for Spark resource configuration (memory, cores) to optimize performance for your specific workload.
Example configuration to add:
```diff
+# Set default Spark configuration
+ENV SPARK_DRIVER_MEMORY=4g
+ENV SPARK_EXECUTOR_MEMORY=4g
+ENV SPARK_WORKER_CORES=2
```
Update NumPy and Python versions for enhanced security
Based on the verification results:
- NumPy 1.22.0 is no longer the most secure version, with 2.1.3 being the latest stable release
- All known security vulnerabilities are patched in versions newer than 1.22.0
- Python 3.8 is not required for newer NumPy versions and is approaching end-of-life
Recommended changes:
- Upgrade to Python 3.11+ which offers better security features and longer support
- Update NumPy to the latest stable version (2.1.3)
- Update the corresponding dependencies in requirements.txt
🔗 Analysis chain
Line range hint 11-15: Verify if Python/NumPy versions are still optimal for security.

While Python 3.8 was chosen for numpy 1.22.0 compatibility due to security concerns, please verify if:
- There are newer Python versions that support secure numpy versions
- numpy 1.22.0 is still the recommended version for security
Let me verify Python version compatibility with newer NumPy versions to provide a complete assessment.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check for latest numpy versions and their Python compatibility
curl -s https://pypi.org/pypi/numpy/json | jq -r '.releases | keys[]' | sort -V | tail -n 5

# Check for any security advisories
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "numpy") {
    nodes {
      advisory { summary severity publishedAt }
      vulnerableVersionRange
      firstPatchedVersion { identifier }
    }
  }
}'
```

Length of output: 1457
Script:
```bash
#!/bin/bash
# Check Python version in Dockerfile
rg -U "FROM python:|RUN.*python" docker-init/Dockerfile

# Check numpy version in requirements
fd requirements.txt --exec cat {} \; || fd setup.py --exec cat {} \;
```

Length of output: 565
online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (2)
10-11: Remove or clarify the "hacky" comment.

The comment suggests this is a temporary solution, but the code is being integrated into the main workflow. Either remove the comment if this is intended to be a permanent solution, or document the planned replacement approach.
52-56: Consider using a JSON library for safer serialization.

Manual JSON string construction is error-prone and potentially unsafe. Consider using a lightweight JSON library that's compatible with Spark serialization.
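A minimal sketch of that approach, assuming circe (which this PR already pulls in elsewhere); the payload shape and names below are hypothetical, not the store's actual request type:

```scala
import io.circe.generic.auto._
import io.circe.syntax._

// Hypothetical payload shape; the real HTTPKVStore request type may differ.
case class BulkPutPayload(dataset: String, keyBase64: String, valueBase64: String, tsMillis: Long)

object JsonSerializationSketch {
  // circe handles quoting and escaping, so no JSON is assembled by hand
  def toJson(rows: Seq[BulkPutPayload]): String = rows.asJson.noSpaces

  def main(args: Array[String]): Unit = {
    val rows = Seq(BulkPutPayload("drift_summaries", "a2V5", "dmFsdWU=", 1700000000000L))
    println(toJson(rows))
  }
}
```

Whichever library is used, the encoder instances should be serializable (or created lazily on the executors) so the store still works when shipped inside Spark tasks.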
hub/app/model/Model.scala (3)
8-13: Convert implementation notes into proper documentation.

These numbered comments appear to outline a planned feature pipeline for metadata and tile processing. Consider converting these implementation notes into proper ScalaDoc documentation that explains the overall workflow, each step's purpose, and their relationships.
Example structure:
```scala
/**
 * Metadata Processing Pipeline
 *
 * The system processes metadata and tiles through the following stages:
 * 1. Metadata Upload: Processes join to mapping of group-by columns
 * 2. Configuration Fetching: Retrieves join configuration and column listings
 * ...
 */
```
59-59: Update or remove outdated comment.

The comment "Currently search only covers joins" seems outdated given the introduction of both Search and List responses for joins.
59-62: Consider consolidating duplicate response models.

`SearchJoinResponse` and `ListJoinResponse` have identical structures. Consider:

- Using a single response type if they serve the same purpose
- Adding documentation to clarify the distinction if they serve different purposes
- Using a type parameter for consistency: `PagedResponse[T](offset: Int, items: Seq[T])`
Example refactor:
```scala
case class PagedResponse[T](offset: Int, items: Seq[T])

type ListModelResponse  = PagedResponse[Model]
type ListJoinResponse   = PagedResponse[Join]
type SearchJoinResponse = PagedResponse[Join]
```

hub/test/controllers/JoinControllerSpec.scala (5)
1-18: Consider organizing imports for better maintainability.

The test class setup looks good, but consider grouping related imports together (e.g., Play Framework, Circe, testing utilities) for better readability.
```diff
 package controllers

-import controllers.MockJoinService.mockJoinRegistry
-import io.circe._
-import io.circe.generic.auto._
-import io.circe.parser._
-import model.ListJoinResponse
-import org.mockito.Mockito._
-import org.scalatest.EitherValues
-import org.scalatestplus.play._
-import play.api.http.Status.BAD_REQUEST
-import play.api.http.Status.OK
-import play.api.mvc._
-import play.api.test.Helpers._
-import play.api.test._
-import store.MonitoringModelStore
+// Play Framework
+import play.api.mvc._
+import play.api.test._
+import play.api.test.Helpers._
+import play.api.http.Status.{BAD_REQUEST, OK}
+
+// Circe
+import io.circe._
+import io.circe.generic.auto._
+import io.circe.parser._
+
+// Testing
+import org.scalatest.EitherValues
+import org.scalatestplus.play._
+import org.mockito.Mockito._
+
+// Application
+import controllers.MockJoinService.mockJoinRegistry
+import model.ListJoinResponse
+import store.MonitoringModelStore
```
20-26: Update comment to reflect current implementation.

The comment mentions "DynDB store" but the class is now called `MonitoringModelStore`. Update the comment to maintain consistency with the current implementation.

```diff
- // Create a mocked DynDB store
+ // Create a mocked monitoring model store
  val mockedStore: MonitoringModelStore = mock(classOf[MonitoringModelStore])
```
27-64: Enhance test coverage with additional scenarios.

While the current tests cover basic functionality, consider adding:
- Edge cases:
- Empty result set
- Maximum limit scenarios
- Boundary conditions for offset/limit
- Verification of mock interactions
- Validation of total count in pagination response
Example implementation:
"handle empty results correctly" in { when(mockedStore.getJoins).thenReturn(Seq.empty) val result = controller.list(None, None).apply(FakeRequest()) status(result) mustBe OK val response = decode[ListJoinResponse](contentAsString(result)).right.value response.items mustBe empty } "respect maximum limit" in { when(mockedStore.getJoins).thenReturn(mockJoinRegistry) val result = controller.list(None, Some(1000)).apply(FakeRequest()) status(result) mustBe OK val response = decode[ListJoinResponse](contentAsString(result)).right.value response.items.length mustBe controller.maxLimit // Assuming maxLimit is defined }
54-56: Extract magic numbers into named constants.

The test uses magic numbers (25, 20) for offset and limit. Consider extracting these into named constants for better maintainability and clarity.
```diff
+ private val TestOffset = 25
+ private val TestLimit = 20
+
  "send results in a paginated fashion correctly" in {
    when(mockedStore.getJoins).thenReturn(mockJoinRegistry)
-   val startOffset = 25
-   val number = 20
+   val startOffset = TestOffset
+   val number = TestLimit
```
40-41: Verify mock interactions.

The tests set up mock expectations but don't verify the number of times the mock was called. Add verification to ensure the mock is used as expected.
"send valid results on a correctly formed request" in { when(mockedStore.getJoins).thenReturn(mockJoinRegistry) val result = controller.list(None, None).apply(FakeRequest()) + verify(mockedStore, times(1)).getJoins
Also applies to: 52-53
hub/test/controllers/SearchControllerSpec.scala (4)
43-49: Consider making test data verification more maintainable.

The test correctly verifies the response type and content, but relies on hardcoded values for verification. Consider using values derived from `mockJoinRegistry` to make the test more maintainable.

```diff
- items.map(_.name.toInt).toSet mustBe Set(1, 10, 11, 12, 13, 14, 15, 16, 17, 18)
+ val expectedItems = mockJoinRegistry.map(_.name.toInt).filter(_.toString.contains("1")).take(controller.defaultLimit).toSet
+ items.map(_.name.toInt).toSet mustBe expectedItems
```
Line range hint 54-69: Document the pagination test logic.

The pagination test case could benefit from clearer documentation explaining the expected results and the test data structure.

```diff
  "send results in a paginated fashion correctly" in {
    when(mockedStore.getJoins).thenReturn(mockJoinRegistry)
+   // Test pagination with:
+   // - mockJoinRegistry containing joins with IDs 0-99
+   // - filtering for IDs containing "1"
+   // - starting from offset 3 (4th item)
+   // - limiting to 6 items
    val startOffset = 3
    val number = 6
    val result = controller.search("1", Some(startOffset), Some(number)).apply(FakeRequest())
-   // we have names: 0, 1, 2, .. 99
-   // our result should give us: 1, 10, 11, 12, .. 19, 21, 31, .. 91
```
73-80: Enhance mock data generation for better test coverage.

The mock join generation could be improved to:
- Support different types of group by configurations
- Vary the number of features
- Include edge cases
This would help ensure the tests cover a wider range of scenarios.
```diff
 object MockJoinService {
   def generateMockJoin(id: String): Join = {
-    val groupBys = Seq(GroupBy("my_groupBy", Seq("g1", "g2")))
-    Join(id, Seq("ext_f1", "ext_f2", "d_1", "d2"), groupBys)
+    // Vary the number of group bys and their values based on the id
+    val numGroupBys = (id.toInt % 3) + 1 // 1 to 3 group bys
+    val groupBys = (1 to numGroupBys).map(i =>
+      GroupBy(s"groupBy_$i", Seq(s"g${i}_1", s"g${i}_2"))
+    )
+
+    // Vary the number of features based on the id
+    val numFeatures = (id.toInt % 5) + 2 // 2 to 6 features
+    val features = (1 to numFeatures).map(i => s"feature_$i")
+
+    Join(id, features, groupBys)
   }
```
Inconsistency found in store dependency
While the test cases correctly use `getJoins` and join-based functionality, the `MonitoringModelStore` type is still being used for mocking. This should be updated to use the join-specific store type for consistency with the join-based transition.

```scala
val mockedStore: MonitoringModelStore = mock(classOf[MonitoringModelStore])
```

should be updated to use the join-specific store type.

🔗 Analysis chain
Line range hint 3-26: Verify complete transition from models to joins.

The import changes and class setup correctly reflect the transition to join-based functionality. However, let's verify that all model-related code has been properly migrated.
Let me gather more information to verify if there are any remaining model-related dependencies in the SearchControllerSpec file specifically.
Let me check one final aspect to see if there are any actual usages of the MonitoringModelStore in the test cases.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check for any remaining model-related references that might have been missed

# Search for any remaining model-related imports or usages
rg -i "mockdataservice|mockmodelregistry" --type scala

# Search for any remaining model-related classes or methods
ast-grep --pattern 'class $_ extends $_ { $$$ getModels($$$) { $$$ } $$$ }'
```

Length of output: 653
Script:
```bash
#!/bin/bash
# Check for any model-related code specifically in SearchControllerSpec
rg -l "Model" hub/test/controllers/SearchControllerSpec.scala

# Check for any references to monitoring models or stores
rg "MonitoringModel|getModels" hub/test/controllers/SearchControllerSpec.scala

# Look for any test cases that might still be using model-based approach
ast-grep --pattern 'describe("$_") { $$$ "should $$$" in { $$$ getModels($$$) $$$ } $$$ }'
```

Length of output: 415
Script:
```bash
#!/bin/bash
# Get the full content of SearchControllerSpec to analyze the test cases
rg -A 50 "class SearchControllerSpec" hub/test/controllers/SearchControllerSpec.scala

# Check for any method calls on mockedStore
rg "mockedStore\." hub/test/controllers/SearchControllerSpec.scala
```

Length of output: 2315
hub/test/store/MonitoringModelStoreTest.scala (2)
44-47: Consider updating variable name for consistency.

The variable name `dynamoDBMonitoringStore` should be updated to match the new implementation:

```diff
- val dynamoDBMonitoringStore = new MonitoringModelStore(api)
- when(kvStore.list(any())).thenReturn(generateListResponse())
-
- validateLoadedConfigs(dynamoDBMonitoringStore)
+ val monitoringStore = new MonitoringModelStore(api)
+ when(kvStore.list(any())).thenReturn(generateListResponse())
+
+ validateLoadedConfigs(monitoringStore)
```
50-50: Update parameter name for consistency.

The parameter name should reflect the new type:

```diff
- private def validateLoadedConfigs(dynamoDBMonitoringStore: MonitoringModelStore): Unit = {
+ private def validateLoadedConfigs(monitoringStore: MonitoringModelStore): Unit = {
```

hub/conf/routes (3)
16-16: Consider documenting query parameters.

The endpoint has numerous query parameters that would benefit from documentation: `metricType`, `metrics`, `granularity`, `offset`, `algorithm`.
This documentation would help API consumers understand:
- Valid values for each parameter
- Default values
- Parameter dependencies or constraints
18-19: Address concerns about temporary in-memory KV store.

While this endpoint addresses the immediate DynamoDB ingestion issues mentioned in the PR objectives, several concerns need attention:
- Data persistence strategy in case of service restart
- Memory management and potential OOM issues
- Request size limits for bulk operations
- Timeline for replacing this temporary solution
Consider:
- Adding request size limits
- Implementing data persistence
- Planning for a more permanent solution
Remove or document commented API endpoints
The commented-out endpoints appear to be unused across the codebase, as our searches found no references to these endpoints outside of the routes file. Since these endpoints are part of a transition from a model-based to join-based approach, they should either be:
- Removed if they're no longer needed in the new architecture
- Documented with a clear migration timeline if they're planned for future implementation
hub/conf/routes: Remove the commented endpoints or add a documentation block explaining the migration plan and timeline.

🔗 Analysis chain
Line range hint 22-31: Document the API transition plan.

There's a significant block of commented-out endpoints with a TODO note about moving to fine-grained endpoints. This raises several concerns:
- The timeline for implementing these fine-grained endpoints
- Whether these endpoints are still needed given the new join-based approach
- Potential impact on API consumers who might be expecting these endpoints
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check if any of these endpoints are referenced in the codebase
rg "GET.*?/api/v1/feature/[^/]+/(percentiles|values|drift|skew|compare)" --type ts --type js
```

Length of output: 94
Script:
```bash
#!/bin/bash
# Let's try a broader search to see if these endpoints are referenced anywhere in the codebase
rg -l "/api/v1/feature.*?/(percentiles|values|drift|skew|compare)"

# Also check for any documentation or migration plans
fd -e md -e txt | xargs rg -l "api/v1/feature.*?/(percentiles|values|drift|skew)"

# Check the TimeSeriesController implementation
ast-grep --pattern 'class TimeSeriesController { $$$ fetchFeature($$$) { $$$ } $$$ }'
```

Length of output: 268
Script:
```bash
#!/bin/bash
# Let's check the git history of these routes to understand when they were commented out
git log -p hub/conf/routes | grep -A 5 "^[-+].*api/v1/feature.*/(percentiles|values|drift|skew|compare)"

# Check for any related configuration or feature flags
rg -l "feature.*?(percentiles|values|drift|skew|compare)" --type yaml --type conf --type properties

# Look for any TODO comments related to API migration
rg "TODO.*api" --type scala --type conf
```

Length of output: 201
spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala (2)
105-106: Consider adding detailed success logging.

While the completion message is clear, consider adding more detailed logging about the number of summaries uploaded and their status. This would help with monitoring and debugging in production.

```diff
- println("Done uploading summaries! \uD83E\uDD73".green)
+ val summaryCount = tableUtils.loadTable(packedTable).count()
+ logger.info(s"Successfully uploaded $summaryCount summaries! \uD83E\uDD73".green)
```
Line range hint 1-110: Document the script's role in the new architecture.

As this script is now part of the critical path for connecting Spark-based drift metrics computation with Hub, consider adding documentation that explains:
- The script's role in the overall data flow
- The interaction pattern with Hub via HTTPKVStore
- Expected data volumes and performance characteristics
- Recovery procedures in case of failures
.github/workflows/test_scala_and_python.yaml (2)
185-188: Consider moving hub tests to a dedicated job.

Given that the hub component now handles bulk operations and drift metrics computation, it would be better to:
- Move these tests to a dedicated job for better isolation
- Configure appropriate resource limits
- Enable parallel execution with other tests
Example structure:
```yaml
hub_tests:
  runs-on: ubuntu-latest
  container:
    image: ghcr.io/${{ github.repository }}-ci:latest
    credentials:
      username: ${{ github.actor }}
      password: ${{ secrets.GITHUB_TOKEN }}
  steps:
    - uses: actions/checkout@v4
    - name: Run hub tests
      run: |
        export SBT_OPTS="-Xmx8G -Xms2G"
        sbt "++ 2.12.18 hub/test"
```
🪛 yamllint (1.35.1)
[error] 188-188: no new line character at the end of file
(new-line-at-end-of-file)
188-188: Add newline at end of file.

Add a newline character at the end of the file to follow YAML best practices and resolve the yamllint warning.
🧰 Tools
🪛 yamllint (1.35.1)
[error] 188-188: no new line character at the end of file
(new-line-at-end-of-file)
hub/app/store/MonitoringModelStore.scala (1)
Line range hint 39-45: Review TTLCache configuration with InMemoryKV store.

Since we're transitioning to an InMemoryKV store, consider reviewing the TTLCache configuration (10 seconds timeout) to ensure it aligns with the new storage backend's characteristics. You might want to:
- Adjust the timeout based on the InMemoryKV store's performance characteristics (see the configuration sketch after this list)
- Consider adding metrics for cache hits/misses to monitor effectiveness
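A minimal sketch of making that timeout configurable rather than hard-coded, assuming a Play `Configuration` instance is available; the config key below is hypothetical:

```scala
import play.api.Configuration
import scala.concurrent.duration._

// Hypothetical settings holder: reads the cache TTL from application.conf so it can be
// tuned per storage backend without a code change. The key name is an assumption.
class MonitoringCacheSettings(config: Configuration) {
  val configCacheTtl: FiniteDuration =
    config.getOptional[FiniteDuration]("monitoring.store.cache.ttl").getOrElse(10.seconds)
}
```

The resolved duration could then be threaded into wherever the store currently hard-codes its 10-second TTL, with hit/miss counters emitted alongside it.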
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1)
202-202: Track and clarify the TODO comment.

The TODO suggests two different approaches but lacks context about the tradeoffs and implementation plan.
Would you like me to:
- Create a GitHub issue to track this technical debt?
- Help evaluate the tradeoffs between using a well-known list vs. exposing as part of summaries?
The issue should capture:
- Current implementation limitations
- Pros and cons of each approach
- Impact on the drift metrics computation pipeline
- Dependencies and prerequisites
build.sbt (1)
136-139: Consider managing the STTP client version consistently.

While the STTP client is marked for removal, consider defining its version as a variable at the top of the file, consistent with other dependency version management practices.
spark/src/main/scala/ai/chronon/spark/TableUtils.scala (1)
Line range hint 1-1000: Consider implementing a consistent logging strategy.

The codebase shows a pattern of replacing structured logging with println statements. For a production-grade Spark application, it's crucial to:
- Use a consistent logging framework throughout
- Include appropriate log levels (INFO, ERROR, etc.)
- Maintain structured logging for better observability
- Consider implementing a logging facade for better flexibility (see the sketch after this list)
This will improve:
- Production monitoring capabilities
- Debugging efficiency
- Log aggregation and analysis
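A minimal sketch of the facade idea, using the SLF4J API that Spark applications typically already have on the classpath; the object name and messages are illustrative, not the PR's actual code:

```scala
import org.slf4j.LoggerFactory

// Illustrative replacement for ad-hoc println calls: leveled logging that
// log aggregators can filter by severity.
object TableUtilsLoggingSketch {
  private val logger = LoggerFactory.getLogger(getClass)

  def reportInsert(tableName: String, rowCount: Long): Unit =
    logger.info(s"Inserted $rowCount rows into table $tableName")

  def reportFailure(tableName: String, error: Throwable): Unit =
    logger.error(s"Failed writing to table $tableName", error)
}
```

Routing everything through one small object like this also makes it easy to swap the backend (e.g. Log4j vs. Logback) without touching call sites.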
hub/app/controllers/InMemKVStoreController.scala (3)
43-44: Avoid exposing internal error messages to clients.

Returning the raw error message from `decode` can expose internal details to the client, which is not a best practice for security reasons.

Consider logging the detailed error message and returning a generic message to the client:

```diff
  case Left(error) =>
+   logger.error("Failed to decode PutRequests", error)
-   Future.successful(BadRequest(error.getMessage))
+   Future.successful(BadRequest("Invalid request payload"))
```
29-30: Simplify request parsing using `parse.json`.

The current implementation uses `parse.raw` and manually handles byte conversion and JSON decoding, which can be simplified.

Consider using `Action.async(parse.json)` to automatically parse the JSON body into a `JsValue`, reducing manual parsing and potential errors:

```diff
-import play.api.mvc.RawBuffer
+import play.api.libs.json.JsValue
+import play.api.libs.json.Json
 ...
-def bulkPut(): mvc.Action[RawBuffer] =
-  Action(parse.raw).async { request =>
-    request.body.asBytes() match {
-      case Some(bytes) =>
-        decode[Array[PutRequest]](bytes.utf8String) match {
+def bulkPut(): Action[JsValue] =
+  Action.async(parse.json) { request =>
+    request.body.validate[Array[PutRequest]] match {
+      case JsSuccess(putRequests, _) =>
         // Handle success
-      case Left(error) => Future.successful(BadRequest(error.getMessage))
+      case JsError(errors) =>
+        Future.successful(BadRequest("Invalid JSON format"))
```

This approach leverages Play's JSON utilities for parsing and validating the request body.
50-60: Validate Base64 encoding and decoding.

The custom encoders and decoders for `Array[Byte]` using Base64 do not handle potential decoding errors.

Ensure that any invalid Base64 strings are properly handled to prevent runtime exceptions. Consider adding error handling or input validation when decoding:

```scala
implicit val byteArrayDecoder: Decoder[Array[Byte]] = Decoder.decodeString.emap { str =>
  try {
    Right(Base64.getDecoder.decode(str))
  } catch {
    case _: IllegalArgumentException => Left("Invalid Base64 string")
  }
}
```
87-112: Simplify nested `Future` and `Try` handling in `doFetchJoinDrift`.

The current implementation matches over a `Try[Future[...]]`, which introduces nested `Future` and `Try` constructs. This can complicate error handling and readability. Consider using `Future.fromTry` to flatten the `Try[Future[T]]` into a `Future[T]`, simplifying the subsequent code.

Apply this refactor to simplify the handling:

```diff
- val maybeDriftSeries = driftStore.getDriftSeries(joinPath, driftMetric, window, startTs, endTs)
- maybeDriftSeries match {
-   case Failure(exception) =>
-     Future.successful(InternalServerError(s"Error computing join drift - ${exception.getMessage}"))
-   case Success(driftSeriesFuture) =>
-     driftSeriesFuture.map { driftSeries =>
-       // processing logic
-     }
- }
+ val driftSeriesFuture = Future.fromTry(maybeDriftSeries).flatMap(identity)
+ driftSeriesFuture.map { driftSeries =>
+   // processing logic
+ }.recover {
+   case exception =>
+     InternalServerError(s"Error computing join drift - ${exception.getMessage}")
+ }
```
140-196: Simplify asynchronous error handling in `doFetchFeatureDrift`.

In the `doFetchFeatureDrift` method, the nested matching over `Try[Future[T]]` can be refactored to improve clarity. Utilize `Future.fromTry` and combine the futures appropriately to handle errors more gracefully.

Refactor suggestion:

```diff
- val maybeDriftSeries = driftStore.getDriftSeries(...)
- maybeDriftSeries match {
-   case Failure(exception) =>
-     Future.successful(InternalServerError(...))
-   case Success(driftSeriesFuture) =>
-     driftSeriesFuture.map { driftSeries =>
-       // processing logic
-     }
- }
+ val driftSeriesFuture = Future.fromTry(maybeDriftSeries).flatMap(identity)
+ driftSeriesFuture.map { driftSeries =>
+   // processing logic
+ }.recover {
+   case exception =>
+     InternalServerError(s"Error computing feature drift - ${exception.getMessage}")
+ }
```

Similarly, when dealing with multiple futures, consider using `Future.sequence` and handling errors with combinators like `recover` or `recoverWith`
.
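A minimal self-contained sketch of that combination (the method and type names are illustrative, not the controller's actual code):

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureCombinatorsSketch {
  // Illustrative only: run several drift-series lookups concurrently, flatten the results,
  // and degrade gracefully if any lookup fails, instead of nesting Try/Future matches.
  def combineSeries[T](parts: Seq[Future[Seq[T]]])(implicit ec: ExecutionContext): Future[Seq[T]] =
    Future
      .sequence(parts)
      .map(_.flatten)
      .recover { case _: Exception => Seq.empty[T] }
}
```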
265-269: Handle unsupported algorithms more gracefully in `parseAlgorithm`.

In the `parseAlgorithm` method, any algorithm not matching the predefined options results in `None`. Consider providing a default algorithm or a more informative error message to guide users.

Suggestion:

```diff
- case _ => None
+ case Some(unknown) => Some(DriftMetric.DEFAULT)
+ // or return an error message indicating the unsupported algorithm
```
277-279: Remove commented-out code or annotate with TODO if planned for future use.

The cases for `"skew"` and `"ooc"` are commented out in `parseMetricChoice`. If these metrics are not intended to be supported, consider removing the commented code to keep the codebase clean. If they are planned for future implementation, add a TODO comment for clarity.
131-136: Improve error messages for invalid parameters.

The error responses for invalid `metricChoice`, `metricRollup`, and `granularity` provide basic feedback. Enhance these messages to include the invalid value received, aiding debugging and user experience.

Apply this diff to improve error messages:

```diff
- Future.successful(BadRequest("Invalid metric choice. Expect drift"))
+ Future.successful(BadRequest(s"Invalid metric choice '$metricType'. Expect 'drift'"))
```

hub/test/controllers/TimeSeriesControllerSpec.scala (2)
38-38: Use a dedicated `ExecutionContext` for test isolation.

At line 38, `ExecutionContext.global` is used as the implicit `ExecutionContext`. Using a global execution context in tests can lead to unpredictable behavior due to shared resources. Consider using a dedicated or stubbed `ExecutionContext` for better test isolation.

Apply this diff to use the execution context from `stubControllerComponents()`:

```diff
- implicit val ec: ExecutionContext = ExecutionContext.global
+ implicit val ec: ExecutionContext = stubCC.executionContext
```
269-275: Optimize data generation for test efficiency.

Generating a large number of timestamps and data points between
startTs
andendTs
with hourly intervals can slow down tests if the time range is extensive.Consider limiting the number of data points for tests or adjusting the interval to ensure tests run efficiently without sacrificing coverage.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (35)
- .github/workflows/test_scala_and_python.yaml (1 hunks)
- build.sbt (3 hunks)
- docker-init/Dockerfile (1 hunks)
- docker-init/demo/Dockerfile (0 hunks)
- docker-init/demo/README.md (1 hunks)
- docker-init/demo/build.sh (0 hunks)
- docker-init/demo/load_summaries.sh (1 hunks)
- docker-init/demo/run.sh (0 hunks)
- docker-init/start.sh (0 hunks)
- hub/app/controllers/InMemKVStoreController.scala (1 hunks)
- hub/app/controllers/JoinController.scala (1 hunks)
- hub/app/controllers/ModelController.scala (1 hunks)
- hub/app/controllers/Paginate.scala (1 hunks)
- hub/app/controllers/SearchController.scala (2 hunks)
- hub/app/controllers/TimeSeriesController.scala (3 hunks)
- hub/app/model/Model.scala (2 hunks)
- hub/app/module/DriftStoreModule.scala (1 hunks)
- hub/app/module/DynamoDBModule.scala (0 hunks)
- hub/app/module/InMemoryKVStoreModule.scala (1 hunks)
- hub/app/module/ModelStoreModule.scala (1 hunks)
- hub/app/store/MonitoringModelStore.scala (4 hunks)
- hub/conf/application.conf (1 hunks)
- hub/conf/routes (1 hunks)
- hub/test/controllers/JoinControllerSpec.scala (1 hunks)
- hub/test/controllers/ModelControllerSpec.scala (2 hunks)
- hub/test/controllers/SearchControllerSpec.scala (4 hunks)
- hub/test/controllers/TimeSeriesControllerSpec.scala (4 hunks)
- hub/test/store/MonitoringModelStoreTest.scala (2 hunks)
- online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (1 hunks)
- online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala (1 hunks)
- online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1 hunks)
- online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/TableUtils.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala (3 hunks)
💤 Files with no reviewable changes (5)
- docker-init/demo/Dockerfile
- docker-init/demo/build.sh
- docker-init/demo/run.sh
- docker-init/start.sh
- hub/app/module/DynamoDBModule.scala
✅ Files skipped from review due to trivial changes (2)
- online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala
- online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
🧰 Additional context used
📓 Learnings (3)
hub/app/controllers/SearchController.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#22
File: hub/app/controllers/SearchController.scala:55-58
Timestamp: 2024-11-12T09:38:33.532Z
Learning: Enhancements to search functionality in `SearchController` are planned for future updates; the current implementation is acceptable for now.
hub/test/controllers/SearchControllerSpec.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#44
File: hub/app/controllers/ModelController.scala:15-18
Timestamp: 2024-11-12T09:38:33.532Z
Learning: References to `MockDataService` in `hub/test/controllers/SearchControllerSpec.scala` and `hub/test/controllers/ModelControllerSpec.scala` are needed for tests and should not be removed.
hub/test/store/MonitoringModelStoreTest.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#44
File: hub/test/store/DynamoDBMonitoringStoreTest.scala:69-86
Timestamp: 2024-11-12T09:38:33.532Z
Learning: In `hub/test/store/DynamoDBMonitoringStoreTest.scala`, the current implementation of the `generateListResponse` method is acceptable as-is, and changes for resource handling and error management are not necessary at this time.
🪛 yamllint (1.35.1)
.github/workflows/test_scala_and_python.yaml
[error] 188-188: no new line character at the end of file
(new-line-at-end-of-file)
🪛 Shellcheck (0.10.0)
docker-init/demo/load_summaries.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🔇 Additional comments (36)
hub/app/controllers/Paginate.scala (1)
8-8: LGTM! Good generalization to support multiple types.

The change from `Seq[Model]` to `Seq[T]` improves reusability and type safety, allowing the pagination functionality to work with any type while maintaining compile-time type checking.
docker-init/demo/load_summaries.sh (1)
3-12
:
Verify resource allocation and add error handling
The Spark configuration needs attention in several areas:
- The memory allocation (8GB) and result size (6GB) are quite high for local testing
- Network binding to 0.0.0.0 might expose the service unnecessarily
- Missing error handling for container state and job execution
Run this script to verify container memory limits:
Apply this diff to add error handling and container validation:
#!/bin/bash
+
+set -euo pipefail
+
+# Verify container is running
+if ! docker-compose -f docker-init/compose.yaml ps app | grep -q "Up"; then
+ echo "Error: app container is not running"
+ echo "Please start containers first using docker-compose up -d"
+ exit 1
+fi
+
+# Configure memory based on available resources
+DRIVER_MEMORY="8g"
+RESULT_SIZE="6g"
+
+# Validate memory configuration
+if ! docker-compose -f docker-init/compose.yaml exec app cat /sys/fs/cgroup/memory/memory.limit_in_bytes > /dev/null 2>&1; then
+ echo "Warning: Unable to verify container memory limits"
+ echo "Proceeding with default memory configuration"
+fi
docker-compose -f docker-init/compose.yaml exec app /opt/spark/bin/spark-submit \
--master "local[*]" \
- --driver-memory 8g \
- --conf "spark.driver.maxResultSize=6g" \
- --conf "spark.driver.memory=8g" \
+ --driver-memory "${DRIVER_MEMORY}" \
+ --conf "spark.driver.maxResultSize=${RESULT_SIZE}" \
+ --conf "spark.driver.memory=${DRIVER_MEMORY}" \
--driver-class-path "/opt/spark/jars/*:/app/cli/*" \
--conf "spark.driver.host=localhost" \
- --conf "spark.driver.bindAddress=0.0.0.0" \
+ --conf "spark.driver.bindAddress=127.0.0.1" \
--class ai.chronon.spark.scripts.ObservabilityDemo \
- /app/cli/spark.jar
+ /app/cli/spark.jar || {
+ echo "Error: Spark job failed"
+ exit 1
+ }
hub/app/module/ModelStoreModule.scala (2)
12-12
:
Empty AWS configuration is potentially problematic
The AWS API is being initialized with an empty configuration map. This seems incorrect and could lead to runtime issues.
Consider utilizing the injected configuration
parameter to properly configure AWS:
- val awsApiImpl = new AwsApiImpl(Map.empty)
+ val awsConfig = configuration.get[Configuration]("aws").underlying
+ val awsApiImpl = new AwsApiImpl(awsConfig)
Likely invalid or redundant comment.
14-14
: Verify singleton binding implications
The MonitoringModelStore is bound as a singleton instance. Please verify that this is the intended behavior and won't cause issues with concurrent access to the in-memory store.
✅ Verification successful
Let me gather more information about the MonitoringModelStore implementation and its usage.
Singleton binding is appropriate for this implementation
The MonitoringModelStore implementation is thread-safe and designed for concurrent access:
- It uses
Future
-based async operations for data retrieval - The state is managed through a
TTLCache
which is designed for concurrent access - The underlying
KVStore
provides its ownExecutionContext
for handling concurrent operations - All mutable state is properly encapsulated in thread-safe constructs
The singleton binding is the correct approach here as the store acts as a shared cache with its own concurrency handling.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for concurrent access patterns to MonitoringModelStore
rg -A 5 "MonitoringModelStore" | grep -i "concurrent|async|future|promise"
# Look for other singleton bindings for comparison
ast-grep --pattern 'bind($_).toInstance($_)'
Length of output: 339
Script:
#!/bin/bash
# Look for MonitoringModelStore implementation and usage
rg -A 10 "class MonitoringModelStore"
# Look for DynamoDBMonitoringStore implementation
rg -A 10 "class DynamoDBMonitoringStore"
# Search for any concurrent/async operations in files related to monitoring store
rg -l "MonitoringModelStore|DynamoDBMonitoringStore" | xargs rg -A 5 "concurrent|async|Future|Promise|synchronized"
Length of output: 5414
hub/app/module/DriftStoreModule.scala (1)
1-9
: LGTM! Clean and well-organized imports.
The imports are logically grouped and all are being used in the implementation.
hub/app/module/InMemoryKVStoreModule.scala (2)
8-9
: LGTM! Clear class purpose documentation.
The comment clearly explains the module's purpose for Docker testing scenarios.
16-16
: Verify thread-safety of the KVStore binding.
Since this is bound as a singleton, verify that InMemoryKvStore
is thread-safe for concurrent access in the testing environment.
✅ Verification successful
Let me gather more information about the thread safety implementation details.
Thread-safety is properly implemented in InMemoryKVStore
The implementation is thread-safe through the use of ConcurrentHashMap
for both the main database and individual tables:
- The class uses
ConcurrentHashMap[DataSet, Table]
for the main database - Each table is implemented as
ConcurrentHashMap[Key, VersionedData]
- The singleton store management is also thread-safe using
ConcurrentHashMap[String, InMemoryKvStore]
These concurrent data structures provide the necessary thread-safety guarantees for the testing environment without requiring additional synchronization mechanisms.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for concurrent usage patterns
rg -A 5 "InMemoryKvStore"
# Look for any synchronization mechanisms
ast-grep --pattern 'class InMemoryKvStore {
$$$
synchronized $$$
$$$
}'
Length of output: 12488
Script:
#!/bin/bash
# Look for concurrent data structures in InMemoryKvStore
rg -A 10 "ConcurrentHashMap|AtomicReference|Concurrent|volatile|synchronized" spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
# Check the implementation of key operations
ast-grep --pattern 'def put($$$) {
$$$
}'
ast-grep --pattern 'def get($$$) {
$$$
}'
ast-grep --pattern 'def create($$$) {
$$$
}'
Length of output: 2247
docker-init/demo/README.md (1)
1-16
: Verify the documented workflow
Let's verify if the documented workflow aligns with the actual implementation and PR objectives.
✅ Verification successful
Let me check the content of devnotes.md
to see if the references to old scripts are just documentation or if they indicate potential inconsistencies.
Documentation accurately reflects the setup and workflow
The verification confirms that:
- The
load_summaries.sh
script exists and is executable - The
docker-compose.yaml
exists and correctly exposes port 9000 as documented - The reference to
build.sh
indevnotes.md
is unrelated to the demo setup, as it's part of the cloud release workflow
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence and permissions of the load_summaries.sh script
# and check for any remaining references to old scripts
# Check if load_summaries.sh exists and is executable
echo "Checking load_summaries.sh..."
if [ -x "docker-init/demo/load_summaries.sh" ]; then
echo "✓ load_summaries.sh exists and is executable"
else
echo "✗ load_summaries.sh not found or not executable"
fi
# Check for any remaining references to old scripts
echo -e "\nChecking for references to removed scripts..."
rg -l "build\.sh|run\.sh" --type md
# Verify docker-compose file exists
echo -e "\nChecking docker-compose.yaml..."
if [ -f "docker-init/compose.yaml" ]; then
echo "✓ compose.yaml exists"
else
echo "✗ compose.yaml not found"
fi
Length of output: 542
Script:
#!/bin/bash
# Check the context of build.sh/run.sh references in devnotes.md
echo "Content of devnotes.md with context around build.sh/run.sh:"
rg "build\.sh|run\.sh" -C 3 devnotes.md
# Also verify the docker-compose.yaml content to ensure it matches the documented command
echo -e "\nVerifying docker-compose.yaml path in compose.yaml:"
rg "9000" docker-init/compose.yaml
Length of output: 542
hub/conf/application.conf (1)
31-33
: LGTM! Module changes align with PR objectives.
The replacement of DynamoDBModule with these new modules aligns well with the PR's goal of addressing DynamoDB performance issues and implementing in-memory storage for data serving.
hub/app/controllers/ModelController.scala (2)
Line range hint 22-38
: LGTM: Pagination logic remains unchanged
The pagination and error handling logic is unaffected by the store change, which is good. The controller maintains its responsibility for input validation and response formatting.
15-15
: Verify MonitoringModelStore compatibility
The constructor parameter type change from DynamoDBMonitoringStore
to MonitoringModelStore
needs verification to ensure interface compatibility.
hub/app/controllers/JoinController.scala (1)
35-35
: Verify MonitoringModelStore implementation
Need to ensure that monitoringStore.getJoins
is properly connected to the InMemoryKV store mentioned in the PR objectives.
✅ Verification successful
Let me gather more information about the MonitoringModelStore implementation and its connection to InMemoryKV.
Let me check one more thing to verify the getJoins implementation and its connection to the KVStore.
Implementation properly connects to InMemoryKV store
The verification confirms that:
MonitoringModelStore
retrieves joins from a cached configuration registry that's backed by the KVStoreInMemoryKVStoreModule
correctly sets up and injects the InMemoryKV store implementation- The
getJoins
method properly transforms the stored thrift join configurations into the required format
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for MonitoringModelStore implementation
ast-grep --pattern 'class MonitoringModelStore {
$$$
getJoins($$$) {
$$$
}
$$$
}'
# Check for InMemoryKV store usage
rg -l "InMemoryKV"
Length of output: 795
Script:
#!/bin/bash
# Search for MonitoringModelStore class definition and implementation
rg -A 10 "class MonitoringModelStore"
# Check InMemoryKVStoreModule implementation
cat hub/app/module/InMemoryKVStoreModule.scala
# Search for MonitoringModelStore instantiation
rg "new MonitoringModelStore"
Length of output: 2687
Script:
#!/bin/bash
# Search for getJoins implementation in MonitoringModelStore
rg -A 10 "def getJoins" hub/app/store/MonitoringModelStore.scala
# Search for any join-related operations in KVStore
rg "def.*[Jj]oin" --type scala
Length of output: 17084
hub/app/controllers/SearchController.scala (3)
5-8
: LGTM: Import changes align with architectural transition
The import changes correctly reflect the transition from model-based to join-based architecture.
38-38
: LGTM: Response handling properly updated
The response handling correctly uses the new SearchJoinResponse
type while maintaining the existing pagination logic.
15-15
: Verify consistent store implementation usage across controllers
The change from DynamoDBMonitoringStore
to MonitoringModelStore
is a good abstraction. Let's verify this pattern is consistently applied.
✅ Verification successful
Let me gather more information to verify if there are other controllers that should be using MonitoringModelStore.
Based on the results, I can now generate the final response as we have enough information about the store usage patterns across controllers.
Store implementation change is consistently applied
The migration from DynamoDBMonitoringStore
to MonitoringModelStore
is consistently implemented across all relevant controllers:
- JoinController
- ModelController
- SearchController
Other controllers (TimeSeriesController, InMemKVStoreController) use different store types (DriftStore
, KVStore
) for their specific purposes, which is correct as they serve different functionality.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining DynamoDBMonitoringStore usage in controllers
# and verify MonitoringModelStore is used consistently
echo "Checking for remaining DynamoDBMonitoringStore usage in controllers:"
rg "DynamoDBMonitoringStore" "hub/app/controllers/"
echo -e "\nVerifying MonitoringModelStore usage pattern:"
ast-grep --pattern 'class $_ @Inject() ($$$, monitoringStore: MonitoringModelStore)'
Length of output: 2919
Script:
#!/bin/bash
# Find all controller files to analyze the dependency pattern
echo "Finding all controllers:"
fd "Controller.scala$" hub/app/controllers/
# Check for any monitoring store related patterns in controllers
echo -e "\nChecking monitoring store patterns in controllers:"
rg -A 2 "Store" hub/app/controllers/
Length of output: 5959
hub/test/controllers/ModelControllerSpec.scala (2)
18-18
: LGTM! Store type changes align with architectural shift
The transition from DynamoDBMonitoringStore
to MonitoringModelStore
aligns with the PR's objective of moving away from DynamoDB to an InMemoryKV store. The test cases remain valid as they focus on controller behavior rather than store implementation.
Also applies to: 27-27
27-27
: Verify interface compatibility between store implementations
Let's ensure the new MonitoringModelStore
maintains the same interface as the previous DynamoDBMonitoringStore
to prevent any breaking changes.
hub/test/store/MonitoringModelStoreTest.scala (2)
24-24
: LGTM! Class rename aligns with architectural changes.
The rename from DynamoDBMonitoringStoreTest
to MonitoringModelStoreTest
appropriately reflects the shift from DynamoDB to a more generic store implementation.
Line range hint 24-89
: Verify the test coverage for new store implementation.
The test appears to focus on configuration loading and model validation. Given the architectural shift from DynamoDB to InMemoryKV store, we should verify that all critical paths are tested.
Let's check for any missing test coverage:
✅ Verification successful
Let me gather more information about the implementation and test coverage.
Let me check for more details about the public interface and configuration registry.
Based on the gathered information, I can now provide the final response:
Test coverage appears adequate for the current implementation
The MonitoringModelStore has a minimal public interface with just two main methods:
getConfigRegistry
: Tested through multiple assertions on models, groupBys, joins, and stagingQueriesgetModels
: Explicitly tested with detailed validation of model properties
The implementation uses a TTL cache for configuration and handles the DynamoDB to InMemoryKV store transition internally. The test covers both the loading of configurations and validation of the loaded data, which are the critical paths for this store implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check test coverage for the new MonitoringModelStore implementation
# Find the main store implementation
echo "Looking for MonitoringModelStore implementation..."
rg -l "class MonitoringModelStore"
# Check for additional test files
echo "Looking for related test files..."
fd -e scala -e java . | rg -i "test.*monitoring.*store"
# Look for untested public methods
echo "Looking for potentially untested public methods..."
ast-grep --pattern 'class MonitoringModelStore {
$$$
def $name($$$): $ret {
$$$
}
$$$
}'
Length of output: 629
Script:
#!/bin/bash
# Get the implementation details of MonitoringModelStore
echo "=== MonitoringModelStore Implementation ==="
rg -A 5 "class MonitoringModelStore" hub/app/store/MonitoringModelStore.scala
# Find all public methods in the implementation
echo -e "\n=== Public Methods in MonitoringModelStore ==="
rg "def \w+\(" hub/app/store/MonitoringModelStore.scala
# Find test method calls to these public methods
echo -e "\n=== Test Coverage in MonitoringModelStoreTest ==="
rg "dynamoDBMonitoringStore\.\w+" hub/test/store/MonitoringModelStoreTest.scala
Length of output: 1321
Script:
#!/bin/bash
# Get more context around the MonitoringModelStore implementation
echo "=== Full MonitoringModelStore Class ==="
rg -A 20 "class MonitoringModelStore" hub/app/store/MonitoringModelStore.scala
# Look for ConfigRegistry related code
echo -e "\n=== ConfigRegistry Implementation ==="
rg "ConfigRegistry" -A 5 hub/app/store/MonitoringModelStore.scala
# Check for any initialization or setup methods
echo -e "\n=== Initialization Methods ==="
rg "def (init|load|setup|refresh)" hub/app/store/MonitoringModelStore.scala
Length of output: 1355
hub/conf/routes (2)
7-8: Clarify the transition plan for model timeseries endpoint.
The endpoint is marked as TBD, but there's no clear indication of:
- The replacement approach for model predictions and drift
- Timeline for implementing the new approach
- Impact on existing consumers of this API
Would you like me to help create a GitHub issue to track this transition?
4-4: LGTM! Verify JoinController implementation.
The new joins endpoint follows RESTful conventions and maintains consistency with existing list endpoints by supporting pagination.
✅ Verification successful
JoinController implementation verified and properly handles pagination
The controller exists and is properly implemented with:
- Input validation for offset and limit parameters
- Appropriate error responses for invalid inputs
- Default values and max limit handling
- Unit tests are also present in JoinControllerSpec.scala
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify the JoinController implementation exists
ast-grep --pattern 'class JoinController {
$$$
def list(offset: Option[Int], limit: Option[Int]) {
$$$
}
}'
Length of output: 119
Script:
#!/bin/bash
# Let's try a broader search for JoinController using ripgrep
rg "class JoinController" -A 10
# Also search for the file itself
fd "JoinController.scala"
Length of output: 1956
Script:
#!/bin/bash
# Let's examine the full implementation of the list method
rg "def list" -A 10 hub/app/controllers/JoinController.scala
Length of output: 597
spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala (1)
5-5: Verify complete removal of drift-related imports
The addition of HTTPKVStore import aligns with the PR's objective of connecting with Hub. However, let's ensure all drift-related imports have been properly cleaned up.
✅ Verification successful
Let me verify if there are any remaining drift-related imports in the ObservabilityDemo.scala file specifically.
Drift-related imports have been properly cleaned up
The verification confirms that there are no remaining drift-related imports or references in ObservabilityDemo.scala. The import changes are clean and align with the PR's objectives of removing drift-related functionality.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any remaining drift-related imports that might have been missed
rg "import.*Drift" --type scala
Length of output: 2622
Script:
#!/bin/bash
# Search for any remaining drift-related imports or usages in ObservabilityDemo.scala
rg "Drift" spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala
Length of output: 80
.github/workflows/test_scala_and_python.yaml (1)
183-184: LGTM! Integration follows established patterns
The new hub tests step:
- Maintains consistency with other test configurations
- Uses the same SBT version (2.12.18)
- Follows the same memory configuration pattern
Also applies to: 185-188
hub/app/store/MonitoringModelStore.scala (3)
34-37: LGTM! Good architectural change.
The renaming from DynamoDBMonitoringStore to MonitoringModelStore and the corresponding variable changes make the implementation more generic and not tied to a specific storage backend. This aligns well with the PR objectives of transitioning to an InMemoryKV store.
62-62: Verify the impact of output columns change.
The change to use getOrElse("derivations", Array.empty) instead of flattening all values might exclude output columns from other groups. Please verify this is intentional and doesn't break existing functionality.
95-95: Verify bulk operation handling in InMemoryKV store.
Given the PR objectives mention previous performance issues with DynamoDB ingestion, please verify that the new InMemoryKV store properly handles bulk operations without rejected execution exceptions.
✅ Verification successful
Let me gather more information about the InMemoryKV store implementation.
Let me try a different approach to locate the InMemoryKvStore implementation.
Let me check the specific list implementation in the InMemoryKvStore.
Based on the implementation details of InMemoryKvStore, I can now provide the final response:
InMemoryKV store is safe for bulk operations
The InMemoryKvStore implementation is well-suited for bulk operations because:
- It uses ConcurrentHashMap for thread-safe operations
- The bulkPut method is specifically designed for batch uploads
- There are no throttling or rate limits that could cause rejected execution exceptions
- Operations are performed in-memory without the network/IO constraints that affected DynamoDB
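As a quick way to exercise this path, a rough bulk-ingestion smoke test could look like the sketch below. It leans on the KVStore.PutRequest shape and the InMemoryKvStore.build signature shown elsewhere in this review; the dataset name, item count, and timeout are made-up values, not anything from this PR.
import ai.chronon.online.KVStore.PutRequest
import ai.chronon.spark.utils.InMemoryKvStore
import scala.concurrent.Await
import scala.concurrent.duration._

object BulkPutSmokeTest {
  def main(args: Array[String]): Unit = {
    // Build a throwaway store; the null TableUtils provider mirrors the hub module's current usage.
    val store = InMemoryKvStore.build("bulk-smoke-test", () => null)
    store.create("TEST_SUMMARY_DATASET")

    // Generate a reasonably large batch to mimic the summary upload volume.
    val puts = (1 to 10000).map { i =>
      PutRequest(s"key-$i".getBytes, s"value-$i".getBytes, "TEST_SUMMARY_DATASET", Some(System.currentTimeMillis()))
    }

    // multiPut returns a Future[Seq[Boolean]]; every entry should come back true.
    val results = Await.result(store.multiPut(puts), 30.seconds)
    assert(results.forall(identity), "bulk put should succeed without rejected executions")
  }
}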
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for InMemoryKV store implementation details
rg -A 10 "class.*InMemoryKV.*Store"
# Look for any bulk operation handling or rate limiting
rg -l "bulk|batch|throttle|limit" --type scala
Length of output: 11975
Script:
#!/bin/bash
# Search for InMemoryKvStore implementation details
ast-grep --pattern 'class InMemoryKvStore {
$$$
}'
# Look for list method implementation in InMemoryKvStore
rg -A 10 "def list.*InMemoryKvStore"
# Check for any rate limiting or bulk operation handling in InMemoryKvStore
rg -A 10 "throttle|limit|bulk|batch" --type scala "InMemoryKvStore"
Length of output: 217
Script:
#!/bin/bash
# Search for InMemoryKvStore implementation with correct casing
rg -A 20 "class.*InMemoryKvStore" --type scala
# Look for the list method implementation
rg -A 10 "def list.*KVStore" --type scala
# Search in utils directory since previous output showed InMemoryKvStore in spark/utils
fd -t f "InMemoryKvStore.scala"
Length of output: 2521
Script:
#!/bin/bash
# Get the full content of InMemoryKvStore.scala to examine list implementation
cat spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
# Look for any test files that verify bulk operations
rg -l "InMemoryKvStore.*list" --type scala
Length of output: 6241
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1)
201-223: Verify integration points with Spark and frontend.
Since this code connects drift metrics computation between Spark and the frontend, please ensure:
- These percentile values match the ones used in Spark computations
- The frontend is prepared to handle this fixed set of percentiles
- Performance impact of processing all these percentiles has been evaluated
Let's verify the consistency across the codebase:
build.sbt (2)
81-85: LGTM! Well-structured circe dependency definition.
The grouping of circe dependencies with a shared version variable follows good practices for dependency management.
251-251: Verify completeness of hub dependencies for new features.
The addition of spark and circe dependencies aligns with the PR objectives for drift metrics computation and JSON handling. However, let's verify that all required dependencies for the new InMemoryKV store and HTTP endpoints are included.
Also applies to: 262-262
✅ Verification successful
All required dependencies are present for new features
Based on the imports analysis:
- Spark dependencies are correctly included, used by InMemoryKvStore in InMemoryKVStoreModule
- Circe dependencies are present and used across controllers for JSON handling
- Play Framework dependencies are properly imported for HTTP endpoints
- AWS dependencies are available for DynamoDB integration
- All necessary testing dependencies (mockito, scalatest) are included
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify all required dependencies are declared for new features
# Check for InMemoryKV and HTTP endpoint related imports in hub
echo "Checking for imports in hub module..."
rg -t scala "^import" hub/app/
# Check for any potential missing dependencies in hub tests
echo "Checking for imports in hub tests..."
rg -t scala "^import" hub/test/
Length of output: 14769
spark/src/main/scala/ai/chronon/spark/Driver.scala (2)
21-21: LGTM!
The import statement is correctly placed and necessary for the MetadataDataset constant usage.
569-569: Great improvement using MetadataDataset constant!
Replacing the string literal "ZIPLINE_METADATA" with the MetadataDataset constant is a good practice as it:
- Centralizes the dataset name definition
- Reduces the risk of typos
- Makes maintenance easier
hub/app/controllers/InMemKVStoreController.scala (1)
32-33: Ensure consistent character encoding
Using bytes.utf8String assumes that the request body is encoded in UTF-8, which might not always be the case.
Verify that all clients sending requests to this endpoint use UTF-8 encoding. If not, consider specifying the character encoding explicitly or handling different encodings as needed.
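A minimal sketch of honoring the request's declared charset instead of hard-coding UTF-8 is shown below; it relies on Play's RequestHeader.charset and Akka's ByteString.decodeString, and the helper name is illustrative rather than part of the controller today.
import java.nio.charset.{Charset, StandardCharsets}
import play.api.mvc.{RawBuffer, Request}

def bodyAsString(request: Request[RawBuffer]): Option[String] =
  request.body.asBytes().map { bytes =>
    // Fall back to UTF-8 only when the client did not declare a charset in Content-Type.
    val charset = request.charset.map(Charset.forName).getOrElse(StandardCharsets.UTF_8)
    bytes.decodeString(charset)
  }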
hub/app/controllers/TimeSeriesController.scala (3)
149-149: Double-check the restriction on raw granularity for drift metrics
The message "We don't support Raw granularity for drift metric types" may limit future functionality. Confirm whether this restriction is necessary, and if not, consider implementing support for raw granularity.
Would you like assistance in evaluating support for raw granularity?
53-54: Update all calls to fetchFeature to include the new join parameter
The fetchFeature method now requires a join parameter. Ensure that all existing calls to this method have been updated to include the new parameter to prevent compilation errors or unexpected behavior.
Run the following script to identify method calls to fetchFeature that may need updating:
✅ Verification successful
All calls to fetchFeature already include the new join parameter
Based on the search results, all invocations of fetchFeature in the codebase are in the test file TimeSeriesControllerSpec.scala, and they all correctly include the join parameter as the first argument with the value "my_join". No updates are needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all invocations of 'fetchFeature' that may not include the new 'join' parameter.
# Search for invocations of 'fetchFeature' in Scala code
rg --type scala 'fetchFeature\s*\('
Length of output: 2053
26-28: Ensure proper injection of DriftStore and update instantiations accordingly
The TimeSeriesController constructor now includes DriftStore and an implicit ExecutionContext. Verify that all instances where TimeSeriesController is instantiated, including in tests and configuration files, have been updated to include these new parameters to prevent runtime errors.
Run the following script to identify instantiations of TimeSeriesController that may need updating:
✅ Verification successful
Let me check the test file to verify if the instantiation is properly handling the implicit ExecutionContext.
The instantiation of TimeSeriesController is properly handled
The test file shows that the TimeSeriesController is correctly instantiated with all required parameters:
- The ControllerComponents is provided via stubCC
- The DriftStore is provided via mockedStore
- The ExecutionContext is properly defined as an implicit value
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all instantiations of TimeSeriesController that may not include the new parameters.
# Search for instantiations of TimeSeriesController in Scala files
rg --type scala 'new\s+TimeSeriesController\s*\('
Length of output: 168
Script:
#!/bin/bash
# Check the content of the test file to verify the ExecutionContext handling
cat hub/test/controllers/TimeSeriesControllerSpec.scala
Length of output: 14201
hub/test/controllers/TimeSeriesControllerSpec.scala (1)
310-310: Verify the structure of histogramMap
At line 310, when constructing histogramMap, ensure that it aligns with the expected structure required by tileSummarySeries.setHistogram. Misalignment may lead to unexpected behavior in tests.
Please confirm that the map keys and values match the expected types and formats.
# Kick off the ObsDemo spark job in the app container
🛠️ Refactor suggestion
Add shell script essentials
The script is missing crucial shell script components:
- Shebang directive
- Execution permissions
- Usage instructions and requirements
Apply this diff to improve the script header:
+#!/bin/bash
+
+# Script: load_summaries.sh
+# Purpose: Initiates ObservabilityDemo Spark job in the app container
+#
+# Requirements:
+# - Docker and docker-compose must be running
+# - Containers must be up (see docker-init/demo/README.md)
+#
+# Usage: ./load_summaries.sh
+
# Kick off the ObsDemo spark job in the app container
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
package module

import ai.chronon.online.KVStore
import ai.chronon.online.stats.DriftStore
import com.google.inject.AbstractModule

import javax.inject.Inject
import javax.inject.Provider

class DriftStoreModule extends AbstractModule {

  override def configure(): Unit = {
    // TODO swap to concrete api impl in a follow up
    bind(classOf[DriftStore]).toProvider(classOf[DriftStoreProvider]).asEagerSingleton()
  }
}

class DriftStoreProvider @Inject() (kvStore: KVStore) extends Provider[DriftStore] {
  override def get(): DriftStore = {
    new DriftStore(kvStore)
  }
}
💡 Codebase verification
DynamoDB dependencies still present in ModelStoreModule
The transition from DynamoDB to InMemoryKV is incomplete. While DriftStoreModule has been updated, ModelStoreModule still uses DynamoDB:
- hub/app/module/ModelStoreModule.scala is still creating and binding AwsApiImpl
- DynamoDB related constants and metrics are present in MonitoringModelStore.scala
- Test files still reference DynamoDB implementation
🔗 Analysis chain
Verify DynamoDB module replacement.
Since this module replaces the DynamoDB implementation, we should verify the configuration changes.
I'll search for any remaining DynamoDB or AWS related configurations to ensure complete transition.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if DynamoDB module is properly disabled and replaced
rg -g "*.conf" "DynamoDBModule|DriftStoreModule" hub/conf/
Length of output: 136
Script:
#!/bin/bash
# Let's check for any remaining DynamoDB related configurations or modules
rg -g '*.{conf,scala}' -i 'dynamodb|aws' hub/
Length of output: 1753
inMemoryKVStore.create(Constants.MetadataDataset)
inMemoryKVStore.create(Constants.TiledSummaryDataset)
🛠️ Refactor suggestion
Add error handling for dataset creation.
The dataset creation calls could fail silently. Consider:
- Adding error handling
- Logging the creation status
- Ensuring idempotent creation
- inMemoryKVStore.create(Constants.MetadataDataset)
- inMemoryKVStore.create(Constants.TiledSummaryDataset)
+ try {
+ inMemoryKVStore.create(Constants.MetadataDataset)
+ inMemoryKVStore.create(Constants.TiledSummaryDataset)
+ logger.info("Successfully created KV store datasets")
+ } catch {
+ case e: Exception =>
+ logger.error("Failed to create KV store datasets", e)
+ throw e
+ }
Committable suggestion skipped: line range outside the PR's diff.
class InMemoryKVStoreModule extends AbstractModule {

  override def configure(): Unit = {
    val inMemoryKVStore = InMemoryKvStore.build("hub", () => null)
💡 Codebase verification
Passing null as TableUtils provider is unsafe and needs to be fixed
The tableUtils parameter is used in the bulkPut method where it's directly invoked, which would lead to a NullPointerException. While other methods might work without TableUtils, providing null is dangerous because:
- The bulkPut method explicitly calls tableUtils() and uses its return value
- The code comment indicates TableUtils contains SparkContext which is needed for executors
- Other test files properly initialize it with a SparkSession (e.g., in OnlineUtils.scala)
Locations that need attention:
- hub/app/module/InMemoryKVStoreModule.scala: Replace () => null with proper TableUtils initialization
- spark/src/test/scala/ai/chronon/spark/test/stats/drift/DriftTest.scala: Also using null unsafely
🔗 Analysis chain
Review the null function provider.
The () => null function provider could lead to runtime issues if the underlying implementation attempts to use this provider. Consider providing a meaningful default or documenting why null is safe here.
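If the hub module ever needs bulkPut to work, one option is to build the provider from a local SparkSession instead of passing null. This is only a sketch under the assumption that TableUtils can be constructed from a SparkSession the way the Spark-side tests do; it is not the existing hub wiring.
import org.apache.spark.sql.SparkSession
import ai.chronon.spark.TableUtils

val inMemoryKVStore = InMemoryKvStore.build(
  "hub",
  () => {
    // Lazily create (or reuse) a local SparkSession so bulkPut's tableUtils() call has something real.
    val spark = SparkSession.builder().master("local[*]").appName("hub-kv-store").getOrCreate()
    TableUtils(spark)
  }
)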
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for usages of the value provider in InMemoryKvStore
ast-grep --pattern 'class InMemoryKvStore {
$$$
build($_, $provider) {
$$$
}
$$$
}'
Length of output: 97
Script:
#!/bin/bash
# Let's try to find the InMemoryKvStore implementation using ripgrep
rg -A 10 "class InMemoryKvStore"
# Also search for any references to InMemoryKvStore.build
rg "InMemoryKvStore.build"
# Search for files containing InMemoryKvStore to ensure we're not missing anything
fd "InMemoryKvStore"
Length of output: 1878
Script:
#!/bin/bash
# Let's examine the InMemoryKvStore implementation more closely
cat spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
# Also check the companion object where build method should be defined
rg -A 10 "object InMemoryKvStore"
Length of output: 7580
class InMemoryKVStoreModule extends AbstractModule {

  override def configure(): Unit = {
    val inMemoryKVStore = InMemoryKvStore.build("hub", () => null)
    // create relevant datasets in kv store
    inMemoryKVStore.create(Constants.MetadataDataset)
    inMemoryKVStore.create(Constants.TiledSummaryDataset)
    bind(classOf[KVStore]).toInstance(inMemoryKVStore)
  }
}
💡 Codebase verification
The lifecycle management concern is valid and needs attention
The verification reveals that:
- InMemoryKvStore uses ConcurrentHashMap for storage but lacks cleanup mechanisms
- The store is singleton-like with a static stores map that keeps references to all instances
- Neither the KVStore trait nor the implementation provides cleanup/shutdown methods
- The store is used across multiple test files without clear cleanup patterns
Key issues:
- Memory leaks: Static stores map in InMemoryKvStore object retains all instances
- No cleanup API: Missing methods to clear data between tests
- Resource management: No way to release resources when store is no longer needed
🔗 Analysis chain
Consider adding lifecycle management.
The in-memory store lacks explicit cleanup mechanisms. This could lead to:
- Memory leaks in long-running tests
- Data bleeding between tests
- Resource management issues
Consider implementing a cleanup mechanism or documenting the lifecycle management approach.
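One possible shape for such a cleanup hook on the test side is sketched below; the truncate method does not exist today and is purely hypothetical, as is the trait name.
import org.scalatest.{BeforeAndAfterEach, Suite}

trait KVStoreCleanup extends BeforeAndAfterEach { this: Suite =>
  // The store under test; provided by the concrete spec.
  def kvStore: InMemoryKvStore

  override def afterEach(): Unit = {
    try {
      // Hypothetical API: clear per-dataset state so tests don't bleed into each other.
      Seq(Constants.MetadataDataset, Constants.TiledSummaryDataset).foreach(kvStore.truncate)
    } finally {
      super.afterEach()
    }
  }
}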
Let me gather more information about the KVStore implementation and its usage in tests.
Let me check one more aspect - the KVStore interface and its implementation details.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for potential test files using this module
rg -l "InMemoryKVStoreModule"
# Check if there's any existing cleanup mechanism
ast-grep --pattern 'class InMemoryKvStore {
$$$
(cleanup|close|dispose|shutdown)($$$) {
$$$
}
$$$
}'
Length of output: 213
Script:
#!/bin/bash
# Check the InMemoryKvStore implementation
rg -A 5 "class InMemoryKvStore"
# Look for test files that might use KVStore
fd -e scala test
rg -l "KVStore.*test"
# Check if there's any cleanup in test configurations
rg -A 5 "beforeEach|afterEach|beforeAll|afterAll.*KVStore"
Length of output: 9522
Script:
#!/bin/bash
# Check the KVStore interface definition
rg -A 10 "trait KVStore"
# Look for any cleanup/shutdown patterns in the actual implementation
rg -A 10 "object InMemoryKvStore"
# Check if there are any memory management related imports or patterns
rg "import.*concurrent.*Map|ConcurrentHashMap|WeakHashMap" spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
Length of output: 3804
  def bulkPut(): mvc.Action[RawBuffer] =
    Action(parse.raw).async { request =>
      request.body.asBytes() match {
        case Some(bytes) =>
          decode[Array[PutRequest]](bytes.utf8String) match {
            case Right(putRequests) =>
              logger.info(s"Attempting a bulkPut with ${putRequests.length} items")
              val resultFuture = kvStore.multiPut(putRequests)
              resultFuture.map { responses =>
                if (responses.contains(false)) {
                  logger.warn("Some write failures encountered")
                }
                Ok("Success")
              }
            case Left(error) => Future.successful(BadRequest(error.getMessage))
          }
        case None => Future.successful(BadRequest("Empty body"))
      }
    }
Improve error handling and response status codes in the bulkPut method
Currently, the bulkPut method returns an Ok("Success") response even if some writes fail during kvStore.multiPut(putRequests). This could lead to clients believing that all data was successfully written when it wasn't.
Consider modifying the response to accurately reflect the success or failure of the operation. Here's a suggested change:
resultFuture.map { responses =>
if (responses.forall(identity)) {
Ok("Success")
} else {
logger.warn("Some write failures encountered")
- Ok("Success")
+ InternalServerError("Some writes failed during bulkPut")
}
}
Alternatively, you could return detailed information about which writes failed to provide more transparency to the client.
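A small sketch of that alternative, keeping the same multiPut call but reporting the positions of failed writes (the response shape is illustrative):
resultFuture.map { responses =>
  // Pair each result with its position in the incoming PutRequest array.
  val failedIndices = responses.zipWithIndex.collect { case (false, idx) => idx }
  if (failedIndices.isEmpty) Ok("Success")
  else InternalServerError(s"Writes failed for request indices: ${failedIndices.mkString(", ")}")
}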
  private def convertTileDriftSeriesInfoToTimeSeries(tileDriftSeries: TileDriftSeries,
                                                     metric: Metric): FeatureTimeSeries = {
    val lhsList = if (metric == NullMetric) {
      tileDriftSeries.nullRatioChangePercentSeries.asScala
    } else {
      val featureTsJson = if (granularity == Raw) {
        val featureTs = ComparedFeatureTimeSeries(name,
          generateMockRawTimeSeriesPoints(startTs, 100),
          generateMockRawTimeSeriesPoints(startTs, 100))
        featureTs.asJson.noSpaces
      // check if we have a numeric / categorical feature. If the percentile drift series has non-null doubles
      // then we have a numeric feature at hand
      val isNumeric =
        tileDriftSeries.percentileDriftSeries.asScala != null && tileDriftSeries.percentileDriftSeries.asScala
          .exists(_ != null)
      if (isNumeric) tileDriftSeries.percentileDriftSeries.asScala
      else tileDriftSeries.histogramDriftSeries.asScala
    }
    val points = lhsList.zip(tileDriftSeries.timestamps.asScala).map {
      case (v, ts) => TimeSeriesPoint(v, ts)
    }

    FeatureTimeSeries(tileDriftSeries.getKey.getColumn, points)
  }
Add null checks to prevent NullPointerException in data conversion methods
In convertTileDriftSeriesInfoToTimeSeries, accessing properties like tileDriftSeries.percentileDriftSeries and tileDriftSeries.histogramDriftSeries without null checks may lead to NullPointerException. Ensure that null checks are in place or use Option types to safely handle potential null values.
Apply this diff to handle null values safely:
- val isNumeric =
- tileDriftSeries.percentileDriftSeries.asScala != null && tileDriftSeries.percentileDriftSeries.asScala
- .exists(_ != null)
+ val percentileSeries = Option(tileDriftSeries.percentileDriftSeries).map(_.asScala).getOrElse(Seq.empty)
+ val histogramSeries = Option(tileDriftSeries.histogramDriftSeries).map(_.asScala).getOrElse(Seq.empty)
+ val isNumeric = percentileSeries.nonEmpty && percentileSeries.exists(_ != null)
    if (metric == NullMetric) {
      summarySeries.nullCount.asScala.zip(summarySeries.timestamps.asScala).map {
        case (nullCount, ts) => TimeSeriesPoint(0, ts, nullValue = Some(nullCount.intValue()))
      }
    } else {
      // check if we have a numeric / categorical feature. If the percentile drift series has non-null doubles
      // then we have a numeric feature at hand
      val isNumeric = summarySeries.percentiles.asScala != null && summarySeries.percentiles.asScala.exists(_ != null)
      if (isNumeric) {
        summarySeries.percentiles.asScala.zip(summarySeries.timestamps.asScala).flatMap {
          case (percentiles, ts) =>
            DriftStore.percentileLabels.zip(percentiles.asScala).map {
              case (l, value) => TimeSeriesPoint(value, ts, Some(l))
            }
        }
      } else {
        val featuresTs = FeatureTimeSeries(name, generateMockTimeSeriesPercentilePoints(startTs, endTs))
        featuresTs.asJson.noSpaces
        summarySeries.timestamps.asScala.zipWithIndex.flatMap {
          case (ts, idx) =>
            summarySeries.histogram.asScala.map {
              case (label, values) =>
                TimeSeriesPoint(values.get(idx).toDouble, ts, Some(label))
            }
        }
Ensure safe handling of potential null values in convertTileSummarySeriesToTimeSeries
Similar to the previous method, convertTileSummarySeriesToTimeSeries may encounter null values when accessing summarySeries.percentiles and summarySeries.histogram. Add null checks or use Option types to prevent NullPointerException.
Apply this diff to handle null values safely:
- val isNumeric = summarySeries.percentiles.asScala != null && summarySeries.percentiles.asScala.exists(_ != null)
+ val percentiles = Option(summarySeries.percentiles).map(_.asScala).getOrElse(Seq.empty)
+ val histogram = Option(summarySeries.histogram).map(_.asScala).getOrElse(Map.empty)
+ val isNumeric = percentiles.nonEmpty && percentiles.exists(_ != null)
Committable suggestion skipped: line range outside the PR's diff.
val isNumeric = if (feature % 2 == 0) true else false
val percentileDrifts = if (isNumeric) List.fill(timestamps.size())(JDouble.valueOf(0.12)).asJava else List.fill[JDouble](timestamps.size())(null).asJava
val histogramDrifts = if (isNumeric) List.fill[JDouble](timestamps.size())(null).asJava else List.fill(timestamps.size())(JDouble.valueOf(0.23)).asJava
val nullRationChangePercents = List.fill(timestamps.size())(JDouble.valueOf(0.25)).asJava
Typo in variable name nullRationChangePercents
There's a typographical error in the variable name nullRationChangePercents at line 275. It should be nullRatioChangePercents.
Apply this diff to correct the typo:
- val nullRationChangePercents = List.fill(timestamps.size())(JDouble.valueOf(0.25)).asJava
+ val nullRatioChangePercents = List.fill(timestamps.size())(JDouble.valueOf(0.25)).asJava
Also, update its usage at line 278:
tileDriftSeries.setPercentileDriftSeries(percentileDrifts)
- tileDriftSeries.setNullRatioChangePercentSeries(nullRationChangePercents)
+ tileDriftSeries.setNullRatioChangePercentSeries(nullRatioChangePercents)
tileDriftSeries.setHistogramDriftSeries(histogramDrifts)
val tileKey = new TileSeriesKey()
tileKey.setNodeName(join)
tileKey.setGroupName(groupBy)
tileKey.setNodeName(join)
Duplicate setNodeName call on tileKey
In the generateSummarySeries method, tileKey.setNodeName(join) is called twice (lines 286 and 288). The second call at line 288 is redundant or may be a mistake where a different property was intended to be set.
Apply this diff to remove the redundant call:
val tileKey = new TileSeriesKey()
tileKey.setNodeName(join)
tileKey.setGroupName(groupBy)
- tileKey.setNodeName(join)
tileKey.setColumn(featureName)
If a different property was intended, please correct it accordingly.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (7)
hub/app/controllers/JoinController.scala (1)
11-17: Enhance class documentation
While the basic documentation is present, consider adding more details about:
- The purpose and functionality of the Paginate trait
- Expected behavior when store operations fail
- Response format examples
/**
 * Controller for the Zipline Join entities
+ *
+ * This controller provides REST endpoints for managing Zipline Join entities.
+ * It implements pagination through the Paginate trait and uses MonitoringModelStore
+ * for data persistence.
+ *
+ * Response Format:
+ * - List: { "offset": number, "results": Join[] }
+ * - Get: Join object
+ *
+ * @see MonitoringModelStore for storage implementation
+ * @see Paginate for pagination functionality
 */
hub/app/model/Model.scala (3)
5-10: Consider reducing data duplication and adding field validation
The Join case class now mirrors several fields from the Model class (online, production, team). While this provides flexibility, it might lead to data inconsistency.
Consider:
- Adding field validation for the team field to ensure standardization
- Evaluating if these fields could be consolidated into a shared trait/class to avoid duplication
- Adding documentation to clarify the relationship between Join and Model fields
+/** Represents configuration for feature joins with metadata about deployment status and ownership */
 case class Join(name: String, joinFeatures: Seq[String], groupBys: Seq[GroupBy], online: Boolean, production: Boolean, team: String)
13-19: Enhance documentation format and clarity
While the workflow documentation is helpful, it could be improved for better readability and documentation generation.
Consider reformatting as:
-// 1.) metadataUpload: join -> map<gb, list<cols>>
-// 2.) fetchJoinConf + listColumns: join => list<column>
-// 3.) (columns, start, end) -> list<tileKey>
-
-// 4.) 1:n/fetchTile: tileKey -> TileSummaries
-// 5.) 1:n:n/compareTiles: TileSummaries, TileSummaries -> TileDrift
-// 6.) Map[column, Seq[tileDrift]] -> TimeSeriesController
+/** Data Processing Workflow:
+ * 1. Metadata Upload: Processes join configuration into grouped column mappings
+ *    Input: join -> Output: map<groupBy, List<columns>>
+ *
+ * 2. Configuration Fetch: Retrieves and lists columns for a join
+ *    Input: join -> Output: List<column>
+ *
+ * 3. Tile Key Generation: Generates tile keys for given columns and time range
+ *    Input: (columns, startTime, endTime) -> Output: List<tileKey>
+ *
+ * 4. Tile Fetching (1:n): Retrieves summaries for each tile
+ *    Input: tileKey -> Output: TileSummaries
+ *
+ * 5. Tile Comparison (1:n:n): Computes drift between tile summaries
+ *    Input: (TileSummaries, TileSummaries) -> Output: TileDrift
+ *
+ * 6. Time Series Generation: Prepares data for TimeSeriesController
+ *    Input: Map[column, Seq[tileDrift]] -> Output: TimeSeriesController data
+ */
64-67: LGTM! Consider adding documentation for response classes
The response class structure is consistent and the naming better reflects the functionality.
Consider adding documentation to clarify the usage of each response type:
+/** Response for listing available models with pagination support */
 case class ListModelResponse(offset: Int, items: Seq[Model])
+/** Response for searching joins with pagination support */
 case class SearchJoinResponse(offset: Int, items: Seq[Join])
+/** Response for listing available joins with pagination support */
 case class ListJoinResponse(offset: Int, items: Seq[Join])
hub/conf/routes (1)
17-17: Consider refactoring the complex parameter list.
While the endpoint correctly incorporates the join parameter, the large number of query parameters suggests potential for improvement.
Consider these alternatives:
- Use a POST endpoint with a request body for complex queries
- Group related parameters into meaningful path segments
Example refactor:
-GET /api/v1/join/:join/feature/:name/timeseries
+POST /api/v1/join/:join/feature/:name/timeseries
With request body:
{
  "timeRange": { "start": "timestamp", "end": "timestamp" },
  "metrics": { "type": "drift", "names": ["metric1", "metric2"], "granularity": "hour" },
  "options": { "offset": "value", "algorithm": "name" }
}
hub/app/store/MonitoringModelStore.scala (2)
34-37: LGTM! Good architectural improvement.
The transition from DynamoDB-specific naming to a more generic store interface improves modularity and allows for different storage implementations (like the InMemoryKV mentioned in PR objectives).
105-105: LGTM! Consider adding performance monitoring.
The transition to using kvStore is good. Since this is a critical data access point, consider adding performance monitoring metrics to track latency and success rates.
- kvStore.list(listRequest).flatMap { response =>
+ val startTime = System.currentTimeMillis()
+ kvStore.list(listRequest).flatMap { response =>
+   val latency = System.currentTimeMillis() - startTime
+   Metrics.recordTimer("kvstore.list.latency", latency)
+   Metrics.incrementCounter("kvstore.list.success")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)
hub/app/controllers/InMemKVStoreController.scala
(1 hunks)hub/app/controllers/JoinController.scala
(1 hunks)hub/app/model/Model.scala
(2 hunks)hub/app/store/MonitoringModelStore.scala
(4 hunks)hub/conf/routes
(1 hunks)hub/test/controllers/JoinControllerSpec.scala
(1 hunks)hub/test/controllers/ModelControllerSpec.scala
(3 hunks)hub/test/controllers/SearchControllerSpec.scala
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- hub/app/controllers/InMemKVStoreController.scala
- hub/test/controllers/JoinControllerSpec.scala
- hub/test/controllers/ModelControllerSpec.scala
- hub/test/controllers/SearchControllerSpec.scala
🔇 Additional comments (5)
hub/app/controllers/JoinController.scala (2)
24-40: Add error handling and consider performance optimization
The previous review comment about error handling and response structure is still valid. Additionally:
- Consider adding result size limits to prevent memory issues with large datasets
- The getJoins call might be expensive - consider caching or streaming results
Consider these additional improvements:
def list(offset: Option[Int], limit: Option[Int]): Action[AnyContent] =
Action { implicit request: Request[AnyContent] =>
+ // Add upper bound for result set size
+ val maxResultSize = 10000
val offsetValue = offset.getOrElse(defaultOffset)
- val limitValue = limit.map(l => math.min(l, maxLimit)).getOrElse(defaultLimit)
+ val limitValue = limit
+ .map(l => math.min(l, maxLimit))
+ .getOrElse(defaultLimit)
+
+ if (offsetValue > maxResultSize) {
+ BadRequest(s"Offset exceeds maximum allowed value of $maxResultSize")
+ } else
45-53: Optimize join retrieval and add input validation
The current implementation has several areas for improvement:
- Loading all joins to find one is inefficient
- No validation or sanitization of the name parameter
- Missing error handling for store operations
Consider this implementation:
def get(name: String): Action[AnyContent] = {
Action { implicit request: Request[AnyContent] =>
+ if (name == null || name.trim.isEmpty) {
+ BadRequest("Join name cannot be empty")
+ } else {
+ Try {
+ // Assuming monitoringStore can be updated to support direct lookup
+ monitoringStore.getJoinByName(name.trim)
+ } match {
+ case Success(Some(join)) => Ok(join.asJson.noSpaces)
+ case Success(None) => NotFound(s"Join: $name wasn't found")
+ case Failure(ex) =>
+ Logger.error(s"Failed to fetch join: $name", ex)
+ InternalServerError(s"Failed to fetch join: ${ex.getMessage}")
+ }
+ }
}
}
Let's verify the potential impact of loading all joins:
hub/conf/routes (2)
4-5: LGTM! Verify error handling for join endpoints.
The new join endpoints follow RESTful conventions and properly support pagination. They align well with the PR objective of transitioning to configured joins.
Let's verify the error handling implementation:
✅ Verification successful
Error handling in join endpoints is properly implemented
The JoinController implements comprehensive error handling:
GET /api/v1/joins
:- Returns
BadRequest
for negative offset/limit values - Properly handles pagination with default values
- Returns
GET /api/v1/join/:name
:- Returns
NotFound
with descriptive message when join doesn't exist - Successfully returns join data for valid names
- Returns
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check error handling in JoinController
# Expect: Error handling for invalid join names and proper validation
ast-grep --pattern 'def get(name: String) = Action { implicit request =>
$$$
case None => NotFound($$$)
$$$
}'
Length of output: 118
Script:
#!/bin/bash
# Let's try a broader search for the JoinController implementation
rg -l "class JoinController"
# Then search for error handling patterns in Play Framework
rg "NotFound|BadRequest|InternalServerError" --type scala
Length of output: 3683
Script:
#!/bin/bash
# Let's examine the full implementation of JoinController to understand the complete error handling
rg -A 10 "class JoinController" hub/app/controllers/JoinController.scala
Length of output: 589
Script:
#!/bin/bash
# Let's get the complete file content to see all error handling patterns
cat hub/app/controllers/JoinController.scala
Length of output: 1861
8-9: Clarify the status and impact of the TBD model endpoint.
While commenting out this endpoint aligns with removing mock data endpoints, the TBD status needs clarification as it might block PR completion.
Let's check for any remaining dependencies on this endpoint:
✅ Verification successful
The commented model timeseries endpoint can be safely removed
The commented endpoint /api/v1/model/:id/timeseries
is not implemented in the codebase. The TimeSeriesController only has implementations for join and feature-level timeseries endpoints, and there are no test cases for model-level timeseries. The TBD comment can be safely removed along with the commented route as this functionality is not part of the current implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining references to the model timeseries endpoint
# Expect: No active code should be depending on this endpoint
# Search for any remaining references to fetchModel
rg "fetchModel|model/:id/timeseries" --type ts --type js --type scala
# Check for any TODO/FIXME comments related to this
rg "TODO.*model.*timeseries|FIXME.*model.*timeseries"
Length of output: 127
Script:
#!/bin/bash
# Let's broaden our search to find any related implementations or references
# Check for TimeSeriesController implementation
fd -e scala "TimeSeriesController"
# Search for any references to timeseries in routes file
cat hub/conf/routes
# Look for any model-related endpoints or implementations
rg "model.*endpoint|model.*api" --type scala
Length of output: 4254
Script:
#!/bin/bash
# Let's check the TimeSeriesController implementation to understand if the commented endpoint has any existing code
# Check TimeSeriesController implementation
cat hub/app/controllers/TimeSeriesController.scala
# Check if there are any references to model timeseries in tests
cat hub/test/controllers/TimeSeriesControllerSpec.scala
Length of output: 28238
hub/app/store/MonitoringModelStore.scala (1)
77-91: Add error handling and optimize performance.
-
The previous review comment about error handling is still valid.
-
Consider collecting the results in parallel for better performance when dealing with large datasets.
def getJoins: Seq[Join] = {
- configRegistryCache("default").joins.map { thriftJoin =>
+ configRegistryCache("default").joins.par.map { thriftJoin =>
val groupBys = thriftJoin.joinParts.asScala.map { part =>
GroupBy(part.groupBy.metaData.name, part.groupBy.valueColumns)
}
val outputColumns = thriftJoin.outputColumnsByGroup.getOrElse("derivations", Array.empty)
Join(thriftJoin.metaData.name,
outputColumns,
groupBys,
thriftJoin.metaData.online,
thriftJoin.metaData.production,
thriftJoin.metaData.team)
- }
+ }.seq
}
# Temporary in-memory kv store endpoint
POST /api/v1/dataset/data controllers.InMemKVStoreController.bulkPut()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Memory management and data persistence concerns identified
The implementation shows several potential issues:
- The in-memory store uses unbounded
ConcurrentHashMap
andBuffer
collections without size limits - No eviction policy is implemented for old data
- Play's memory buffer is capped at 512K per request, but there's no limit on total store size
- No persistence mechanism for data recovery
Key locations requiring attention:
spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
: Implement size limits and eviction policyhub/app/module/InMemoryKVStoreModule.scala
: Add configuration for memory limitsapplication.conf
: Consider increasingmaxMemoryBuffer
if needed for bulk operations
Recommendations:
- Add configurable size limits for the ConcurrentHashMap
- Implement LRU or time-based eviction for the VersionedData buffers (see the sketch after this list)
- Add monitoring for memory usage
- Consider implementing persistence to disk for recovery
- Add metrics for store size and operation latencies
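To make the eviction recommendation concrete, here is an illustrative size-bounded LRU wrapper that could back a per-dataset map; it is not the current InMemoryKvStore implementation, and the capacity value is a placeholder to be tuned.
import java.util.{Collections, LinkedHashMap, Map => JMap}

object BoundedMaps {
  // Access-ordered LinkedHashMap evicts the least-recently-used entry once maxEntries is exceeded.
  def boundedLru[K, V](maxEntries: Int): JMap[K, V] =
    Collections.synchronizedMap(
      new LinkedHashMap[K, V](16, 0.75f, true) {
        override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean = size() > maxEntries
      }
    )
}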
🔗 Analysis chain
Verify memory management and data persistence strategy.
While this temporary solution addresses the DynamoDB ingestion issues, we should ensure proper memory management and data persistence guarantees.
Let's check the implementation details:
Consider:
- Implementing memory limits for the in-memory store
- Adding monitoring for memory usage
- Implementing data eviction policies
- Planning for persistence strategy if the temporary solution becomes long-term
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify memory management and bulk operation handling
# Expect: Memory limits and error handling for large requests
# Check for memory management code
ast-grep --pattern 'class InMemKVStoreController $$ {
$$$
def bulkPut() = Action(parse.raw) { implicit request =>
$$$
}
}'
# Look for configuration related to memory limits
rg "memory|heap|buffer" --type conf
Length of output: 201
Script:
#!/bin/bash
# Let's try different approaches to find the implementation and configuration
# Find the controller file
fd "InMemKVStoreController"
# Search for memory-related configurations in all files
rg -t scala -t conf -t properties -t yaml "memory|heap|buffer"
# Look for the controller implementation
ast-grep --pattern 'class InMemKVStoreController'
# Search for any data store or cache implementation related to this controller
rg -t scala "class.*Store|class.*Cache"
Length of output: 8655
Script:
#!/bin/bash
# Let's examine the InMemoryKVStore implementation and module configuration
# Check InMemoryKvStore implementation
cat spark/src/main/scala/ai/chronon/spark/utils/InMemoryKvStore.scala
# Check the module configuration
cat hub/app/module/InMemoryKVStoreModule.scala
# Look for any size/capacity configurations in application.conf
fd "application.conf" --type f --exec cat {}
Length of output: 8099
      val outputColumns = thriftJoin.outputColumnsByGroup.getOrElse("derivations", Array.empty)
      val join = Join(thriftJoin.metaData.name,
                      outputColumns,
                      groupBys,
                      thriftJoin.metaData.online,
                      thriftJoin.metaData.production,
                      thriftJoin.metaData.team)
Add null checks for metadata fields.
While the output column handling is improved with getOrElse, the metadata fields (online, production, team) should be validated to prevent potential NullPointerException.
- val join = Join(thriftJoin.metaData.name,
- outputColumns,
- groupBys,
- thriftJoin.metaData.online,
- thriftJoin.metaData.production,
- thriftJoin.metaData.team)
+ val metadata = thriftJoin.metaData
+ if (metadata == null) {
+ logger.error(s"Missing metadata for join in model ${m.metaData.name}")
+ throw new IllegalStateException("Missing join metadata")
+ }
+ val join = Join(
+ metadata.name,
+ outputColumns,
+ groupBys,
+ Option(metadata.online).getOrElse(false),
+ Option(metadata.production).getOrElse(false),
+ Option(metadata.team).getOrElse("")
+ )
docker-init/demo/load_summaries.sh
Outdated
  --driver-class-path "/opt/spark/jars/*:/app/cli/*" \
  --conf "spark.driver.host=localhost" \
  --conf "spark.driver.bindAddress=0.0.0.0" \
  --class ai.chronon.spark.scripts.ObservabilityDemo \
made some changes to this to make logs appear.
I have been seeing logs in my runs:
$ ./docker-init/demo/load_summaries.sh
...
24/11/26 23:05:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, localhost, 39793, None)
---- running query ----
CREATE DATABASE IF NOT EXISTS observability_demo
---- call path ----
TableUtils.sql(TableUtils.scala:368)
TableUtils.createDatabase(TableUtils.scala:143)
scripts.ObservabilityDemo$.main(ObservabilityDemo.scala:67)
scripts.ObservabilityDemo.main(ObservabilityDemo.scala)
---- end ----
generating anomalous data for fraud
Generated 0/700000 rows of data.
Generated 100000/700000 rows of data.
...
Finished computing range 2/2
input: observability_demo.risk_user_transactions_txn_join_summary (2023-01-31 -> 2023-02-27)
output: observability_demo.risk_user_transactions_txn_join_summary_packed (2023-01-31 -> 2023-02-27)
Done uploading summaries! 🥳
0da87a9
to
45532b7
Compare
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (19)
docker-init/demo/load_summaries.sh (1)
8-12: Consider parameterizing hardcoded paths
The script contains hardcoded paths which could make it less portable across different environments:
/opt/spark/jars/*
/app/cli/*
/app/cli/spark.jar
Consider making these configurable through environment variables:
+# Default paths (can be overridden through environment variables)
+SPARK_JARS_PATH=${SPARK_JARS_PATH:-"/opt/spark/jars/*"}
+APP_CLI_PATH=${APP_CLI_PATH:-"/app/cli"}
+SPARK_JAR_PATH=${SPARK_JAR_PATH:-"$APP_CLI_PATH/spark.jar"}
+
 docker-compose -f docker-init/compose.yaml exec app /opt/spark/bin/spark-submit \
   --master "local[*]" \
   --driver-memory 8g \
   --conf "spark.driver.maxResultSize=6g" \
   --conf "spark.driver.memory=8g" \
-  --driver-class-path "/opt/spark/jars/*:/app/cli/*" \
+  --driver-class-path "$SPARK_JARS_PATH:$APP_CLI_PATH/*" \
   --conf "spark.driver.host=localhost" \
   --conf "spark.driver.bindAddress=0.0.0.0" \
   --class ai.chronon.spark.scripts.ObservabilityDemo \
-  /app/cli/spark.jar
+  "$SPARK_JAR_PATH"
hub/conf/application.conf (1)
31-33: Consider data persistence and recovery strategy with in-memory storage
The shift from DynamoDB to in-memory storage resolves the ingestion performance issues but introduces new considerations:
- Data persistence across service restarts
- Recovery mechanism in case of crashes
- Memory management for growing datasets
Consider implementing:
- Periodic snapshots to persistent storage
- Memory monitoring and alerting
- Data retention/cleanup policies
hub/app/controllers/SearchController.scala (2)
20-24: Enhance API documentation with response format details.
While the documentation clearly explains the parameters, it would be helpful to include:
- Response format and example
- Any specific limitations on join name search
- Pagination response structure
Consider adding:
 /**
  * Powers the /api/v1/search endpoint. Returns a list of joins
  * @param term - Search term to search for (currently we only support searching join names)
  * @param offset - For pagination. We skip over offset entries before returning results
  * @param limit - Number of elements to return
+ * @return JSON response containing:
+ *   - offset: Current offset
+ *   - joins: Array of matching joins, each containing:
+ *     - name: Join name
+ *     - [other join properties...]
  */
43-46
: Consider improving search performance and accuracy. The current implementation has several potential limitations:
- Case-sensitive string matching might miss relevant results
- Full scan of all joins for each search request
- No caching strategy visible
Consider these improvements:
private def searchRegistry(term: String): Seq[Join] = { + // Convert to lowercase for case-insensitive search + val searchTerm = term.toLowerCase val joins = monitoringStore.getJoins - joins.filter(j => j.name.contains(term)) + joins.filter(j => j.name.toLowerCase.contains(searchTerm)) }Additionally, consider:
- Adding caching for frequently accessed joins
- Implementing a more sophisticated search algorithm for large datasets
- Adding fuzzy matching for better search results
For better scalability, consider:
- Implementing an index-based search solution
- Adding a caching layer with TTL
- Using a dedicated search service (e.g., Elasticsearch) for large datasets
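As a rough illustration of the in-process options above (case-insensitive matching plus a small TTL cache), using only the standard library; `Join` here is a stand-in for the hub's model class:

```scala
import scala.collection.concurrent.TrieMap

// Stand-in model; the real Join carries more fields.
case class Join(name: String)

class CachedJoinSearch(loadJoins: () => Seq[Join], ttlMillis: Long = 60000L) {
  private case class Entry(joins: Seq[Join], loadedAt: Long)
  private val cache = TrieMap.empty[String, Entry]

  // Case-insensitive substring match over a cached copy of the join registry.
  def search(term: String): Seq[Join] = {
    val lowered = term.toLowerCase
    allJoins().filter(_.name.toLowerCase.contains(lowered))
  }

  private def allJoins(): Seq[Join] = {
    val now = System.currentTimeMillis()
    cache.get("all") match {
      case Some(Entry(joins, loadedAt)) if now - loadedAt < ttlMillis => joins
      case _ =>
        val fresh = loadJoins()
        cache.put("all", Entry(fresh, now))
        fresh
    }
  }
}
```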
hub/app/controllers/JoinController.scala (2)
11-13
: Enhance controller documentation. Consider adding more detailed documentation including:
- Available endpoints and their paths
- Response formats
- Pagination details
- Error scenarios
/** * Controller for the Zipline Join entities + * + * Endpoints: + * - GET /api/v1/joins?offset={offset}&limit={limit} - List all joins with pagination + * - GET /api/v1/join/{name} - Get a specific join by name + * + * Responses: + * - 200 OK: Successfully retrieved joins + * - 400 Bad Request: Invalid pagination parameters + * - 404 Not Found: Join not found (get endpoint) + * + * Pagination: + * - Default offset: 0 + * - Default limit: defined by defaultLimit + * - Maximum limit: defined by maxLimit */
47-47
: Consider optimizing join retrieval. The current implementation fetches all joins from the store and then filters in memory. This could be inefficient with a large number of joins.
Consider pushing the filtering down to the store layer:
-val maybeJoin = monitoringStore.getJoins.find(j => j.name.equalsIgnoreCase(name)) +val maybeJoin = monitoringStore.getJoinByName(name) // New method to implement in storehub/app/model/Model.scala (2)
5-10
: Add documentation for new Join fields and consider status handling. While the new fields align well with the Model class structure, please consider:
- Adding ScalaDoc comments to document the purpose and valid values for each new field
- Consider using an enumeration or sealed trait for team instead of String to ensure valid values
- Consider if the boolean flags (online, production) might better be represented as a single status field for future extensibility
+/** Represents a join configuration with its features, groupings and operational status + * @param name Unique identifier for the join + * @param joinFeatures List of features used in the join + * @param groupBys Grouping configurations + * @param online Whether the join is available for online serving + * @param production Whether the join is used in production + * @param team Team responsible for maintaining this join + */ case class Join(name: String, joinFeatures: Seq[String], groupBys: Seq[GroupBy], online: Boolean, production: Boolean, team: String)
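If the two booleans were ever folded into a single status field, one possible shape is sketched below. This is purely illustrative and assumes the flags describe an ordered lifecycle rather than independent switches:

```scala
// Illustrative alternative to the online/production booleans; not the current hub model.
sealed trait JoinStatus
object JoinStatus {
  case object Offline    extends JoinStatus
  case object Online     extends JoinStatus
  case object Production extends JoinStatus
}

case class JoinInfo(name: String, joinFeatures: Seq[String], team: String, status: JoinStatus)
```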
57-60
: Standardize response class naming and add pagination constraints. There's an inconsistency in the naming convention between "Search" and "List" prefixes. Additionally, the pagination implementation needs constraints:
- Standardize on either "Search" or "List" prefix for consistency
- Consider adding a maximum page size to prevent performance issues
Consider implementing a base trait for pagination:
trait PaginatedResponse[T] { def offset: Int def items: Seq[T] def maxPageSize: Int = 100 // Default max page size require(items.size <= maxPageSize, s"Page size cannot exceed $maxPageSize items") } case class ListModelResponse(offset: Int, items: Seq[Model]) extends PaginatedResponse[Model] case class ListJoinResponse(offset: Int, items: Seq[Join]) extends PaginatedResponse[Join]online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (2)
29-31
: Add success logging for monitoring. Consider adding debug/info level logging for successful operations to aid in monitoring and debugging.
response.code match { - case StatusCode.Ok => Seq(true) + case StatusCode.Ok => + logger.debug(s"Successfully put ${putRequests.length} items") + Seq(true)
11-11
: Add performance monitoring and documentation. Given that this implementation replaces DynamoDB to address performance issues:
- Add metrics collection for request latency and success rates
- Document the expected performance characteristics
- Add retry/backoff strategy documentation
Would you like assistance in implementing metrics collection using your metrics framework?
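As a starting point, a hedged sketch of in-process counters around the put path, using only the JDK rather than any particular metrics framework (all names here are illustrative):

```scala
import java.util.concurrent.atomic.{AtomicLong, LongAdder}

// Minimal counters; a real deployment would publish these through the project's metrics framework.
class KVStoreMetrics {
  val putSuccesses = new LongAdder
  val putFailures = new LongAdder
  val lastPutLatencyMs = new AtomicLong(0L)

  // Time a block and record success/failure; rethrows so callers still see the error.
  def timed[T](block: => T): T = {
    val start = System.currentTimeMillis()
    try { val result = block; putSuccesses.increment(); result }
    catch { case e: Throwable => putFailures.increment(); throw e }
    finally lastPutLatencyMs.set(System.currentTimeMillis() - start)
  }
}

// Usage sketch inside an HTTP-backed KV store wrapper:
// metrics.timed { backend.send(request) }
```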
hub/conf/routes (1)
17-17
: Consider optimizing the API design for feature timeseries. While the addition of the join parameter aligns with the PR objectives, the endpoint has numerous query parameters which could be better handled using a request body. This would:
- Improve readability and maintainability
- Make it easier to validate input
- Allow for future parameter additions without URL length concerns
Consider refactoring to use POST with a request body:
-GET /api/v1/join/:join/feature/:name/timeseries controllers.TimeSeriesController.fetchFeature(join: String, name: String, startTs: Long, endTs: Long, metricType: String, metrics: String, granularity: String, offset: Option[String], algorithm: Option[String]) +POST /api/v1/join/:join/feature/:name/timeseries controllers.TimeSeriesController.fetchFeature(join: String, name: String)With a request body schema:
{ "startTs": number, "endTs": number, "metricType": string, "metrics": string, "granularity": string, "offset": string?, "algorithm": string? }hub/app/store/MonitoringModelStore.scala (1)
77-91
: Consider refactoring to reduce code duplication. The join processing logic is duplicated between
getModels
andgetJoins
. Consider extracting the common logic into a shared method.+ private def processJoin(thriftJoin: api.Join): Join = { + val metadata = Option(thriftJoin.metaData).getOrElse( + throw new IllegalStateException("Missing join metadata") + ) + + val groupBys = thriftJoin.joinParts.asScala.map { part => + if (part.groupBy == null || part.groupBy.metaData == null) { + throw new IllegalStateException(s"Invalid join configuration: missing groupBy metadata in join ${metadata.name}") + } + GroupBy(part.groupBy.metaData.name, part.groupBy.valueColumns) + } + + val outputColumns = thriftJoin.outputColumnsByGroup.getOrElse("derivations", Array.empty) + Join( + metadata.name, + outputColumns, + groupBys, + Option(metadata.online).getOrElse(false), + Option(metadata.production).getOrElse(false), + Option(metadata.team).getOrElse("") + ) + }hub/app/controllers/TimeSeriesController.scala (4)
53-62
: Consider API versioning for breaking changesThe addition of the
join
parameter tofetchFeature
is a breaking change. Consider implementing API versioning (e.g., via URL path/v2/feature
or Accept header) to maintain backward compatibility.
90-91
: Extract error handling to a reusable functionThe error handling pattern for offset and algorithm validation is repeated across methods. Consider extracting it to a reusable function.
+ private def validateDriftParams(offset: Option[String], algorithm: Option[String]): Either[Result, (Duration, DriftMetric)] = { + (parseOffset(offset), parseAlgorithm(algorithm)) match { + case (None, _) => Left(BadRequest(s"Unable to parse offset - $offset")) + case (_, None) => Left(BadRequest("Invalid drift algorithm. Expect JSD, PSI or Hellinger")) + case (Some(o), Some(a)) => Right((o, a)) + } + }
96-112
: Consider using for-comprehension for cleaner error handlingThe nested match cases and Future transformations can be simplified using for-comprehension.
- maybeDriftSeries match { - case Failure(exception) => - Future.successful(InternalServerError(s"Error computing join drift - ${exception.getMessage}")) - case Success(driftSeriesFuture) => - driftSeriesFuture.map { driftSeries => + val result = for { + driftSeriesFuture <- Future.fromTry(maybeDriftSeries) + driftSeries <- driftSeriesFuture + } yield { val grpToDriftSeriesList = driftSeries.groupBy(_.key.groupName) // ... rest of the logic ... Ok(tsData.asJson.noSpaces) - } } + result.recover { + case e => InternalServerError(s"Error computing join drift - ${e.getMessage}") + }
26-28
: Consider splitting controller responsibilitiesThis controller handles multiple distinct responsibilities (join operations, feature operations, time series conversions). Consider splitting it into separate controllers (e.g.,
JoinTimeSeriesController
,FeatureTimeSeriesController
) to improve maintainability and follow the Single Responsibility Principle.hub/test/controllers/TimeSeriesControllerSpec.scala (3)
Line range hint
87-107
: Enhance test coverage with more specific assertions. While the test covers the basic success case, consider adding more specific assertions:
- Verify the exact number of items returned
- Assert the specific values in the response points
- Validate the structure of each group
response.items.foreach { grpByTs => grpByTs.items.isEmpty mustBe false grpByTs.items.foreach(featureTs => featureTs.points.length mustBe expectedLength) + // Assert specific values + grpByTs.items.foreach { featureTs => + featureTs.points.foreach { point => + point.percentileDrift must (be >= 0.0 and be <= 1.0) + point.nullRatioChange must (be >= 0.0 and be <= 1.0) + } + } }
217-246
: Add validation for mock categories in categorical feature tests. The test should validate that the mock categories used in the response match the expected categories.
response.current.zip(response.baseline).foreach { case (current, baseline) => (current.ts - baseline.ts) mustBe offset.toMillis current.label.isEmpty mustBe false baseline.label.isEmpty mustBe false + mockCategories must contain(current.label.get) + mockCategories must contain(baseline.label.get) }
271-275
: Extract magic numbers into named constants. The test data generation uses several magic numbers that would be clearer as named constants:
- 0.12 for percentile drifts
- 0.23 for histogram drifts
- 0.25 for null ratio changes
+ private val MockPercentileDrift = 0.12 + private val MockHistogramDrift = 0.23 + private val MockNullRatioChange = 0.25 + private val MockHistogramCount = 1L private def generateDriftSeries(...) = { ... - val percentileDrifts = if (isNumeric) List.fill(timestamps.size())(JDouble.valueOf(0.12)).asJava else ... - val histogramDrifts = if (isNumeric) List.fill[JDouble](timestamps.size())(null).asJava else List.fill(timestamps.size())(JDouble.valueOf(0.23)).asJava - val nullRationChangePercents = List.fill(timestamps.size())(JDouble.valueOf(0.25)).asJava + val percentileDrifts = if (isNumeric) List.fill(timestamps.size())(JDouble.valueOf(MockPercentileDrift)).asJava else ... + val histogramDrifts = if (isNumeric) List.fill[JDouble](timestamps.size())(null).asJava else List.fill(timestamps.size())(JDouble.valueOf(MockHistogramDrift)).asJava + val nullRationChangePercents = List.fill(timestamps.size())(JDouble.valueOf(MockNullRatioChange)).asJavaAlso applies to: 302-302, 308-308
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (35)
.github/workflows/test_scala_no_spark.yaml
(1 hunks)build.sbt
(3 hunks)docker-init/Dockerfile
(1 hunks)docker-init/demo/Dockerfile
(0 hunks)docker-init/demo/README.md
(1 hunks)docker-init/demo/build.sh
(0 hunks)docker-init/demo/load_summaries.sh
(1 hunks)docker-init/demo/run.sh
(0 hunks)docker-init/start.sh
(0 hunks)hub/app/controllers/InMemKVStoreController.scala
(1 hunks)hub/app/controllers/JoinController.scala
(1 hunks)hub/app/controllers/ModelController.scala
(1 hunks)hub/app/controllers/Paginate.scala
(1 hunks)hub/app/controllers/SearchController.scala
(2 hunks)hub/app/controllers/TimeSeriesController.scala
(3 hunks)hub/app/model/Model.scala
(2 hunks)hub/app/module/DriftStoreModule.scala
(1 hunks)hub/app/module/DynamoDBModule.scala
(0 hunks)hub/app/module/InMemoryKVStoreModule.scala
(1 hunks)hub/app/module/ModelStoreModule.scala
(1 hunks)hub/app/store/MonitoringModelStore.scala
(4 hunks)hub/conf/application.conf
(1 hunks)hub/conf/routes
(1 hunks)hub/test/controllers/JoinControllerSpec.scala
(1 hunks)hub/test/controllers/ModelControllerSpec.scala
(3 hunks)hub/test/controllers/SearchControllerSpec.scala
(4 hunks)hub/test/controllers/TimeSeriesControllerSpec.scala
(4 hunks)hub/test/store/MonitoringModelStoreTest.scala
(2 hunks)online/src/main/scala/ai/chronon/online/HTTPKVStore.scala
(1 hunks)online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala
(1 hunks)online/src/main/scala/ai/chronon/online/stats/DriftStore.scala
(2 hunks)online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
(1 hunks)spark/src/main/scala/ai/chronon/spark/Driver.scala
(2 hunks)spark/src/main/scala/ai/chronon/spark/TableUtils.scala
(1 hunks)spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala
(3 hunks)
💤 Files with no reviewable changes (5)
- docker-init/demo/Dockerfile
- docker-init/demo/build.sh
- docker-init/demo/run.sh
- docker-init/start.sh
- hub/app/module/DynamoDBModule.scala
🚧 Files skipped from review as they are similar to previous changes (19)
- build.sbt
- docker-init/Dockerfile
- docker-init/demo/README.md
- hub/app/controllers/InMemKVStoreController.scala
- hub/app/controllers/ModelController.scala
- hub/app/controllers/Paginate.scala
- hub/app/module/DriftStoreModule.scala
- hub/app/module/InMemoryKVStoreModule.scala
- hub/app/module/ModelStoreModule.scala
- hub/test/controllers/JoinControllerSpec.scala
- hub/test/controllers/ModelControllerSpec.scala
- hub/test/controllers/SearchControllerSpec.scala
- hub/test/store/MonitoringModelStoreTest.scala
- online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala
- online/src/main/scala/ai/chronon/online/stats/DriftStore.scala
- online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
- spark/src/main/scala/ai/chronon/spark/Driver.scala
- spark/src/main/scala/ai/chronon/spark/TableUtils.scala
- spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemo.scala
🧰 Additional context used
📓 Learnings (1)
hub/test/controllers/TimeSeriesControllerSpec.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#22
File: hub/test/controllers/TimeSeriesControllerSpec.scala:40-49
Timestamp: 2024-11-12T09:38:33.532Z
Learning: When testing with mocked data, it's acceptable to not add detailed assertions on the data contents.
🪛 Shellcheck (0.10.0)
docker-init/demo/load_summaries.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🔇 Additional comments (24)
docker-init/demo/load_summaries.sh (2)
1-2
: Add shell script essentials
The script is missing crucial shell script components like shebang directive and documentation.
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
3-12
: 🛠️ Refactor suggestion
Add error handling and environment validation
The script could benefit from some robustness improvements:
- Validate that required containers are running
- Check if required files exist
- Add error handling for the spark-submit command
Here's a suggested improvement:
+#!/bin/bash
+
+# Script: load_summaries.sh
+# Purpose: Initiates ObservabilityDemo Spark job in the app container
+
+set -e
+
+# Validate environment
+if ! docker-compose -f docker-init/compose.yaml ps | grep -q "app.*running"; then
+ echo "Error: App container is not running. Please start containers first."
+ exit 1
+fi
+
+if [ ! -f /app/cli/spark.jar ]; then
+ echo "Error: Spark JAR not found at /app/cli/spark.jar"
+ exit 1
+fi
+
+# Execute spark-submit
docker-compose -f docker-init/compose.yaml exec app /opt/spark/bin/spark-submit \
--master "local[*]" \
--driver-memory 8g \
--conf "spark.driver.maxResultSize=6g" \
--conf "spark.driver.memory=8g" \
--driver-class-path "/opt/spark/jars/*:/app/cli/*" \
--conf "spark.driver.host=localhost" \
--conf "spark.driver.bindAddress=0.0.0.0" \
--class ai.chronon.spark.scripts.ObservabilityDemo \
- /app/cli/spark.jar
+ /app/cli/spark.jar || {
+ echo "Error: Spark job failed"
+ exit 1
+}
Likely invalid or redundant comment.
hub/conf/application.conf (1)
31-33
: Verify memory configuration for in-memory storage
Since the application now uses in-memory storage, we should verify if the current memory configuration is sufficient.
✅ Verification successful
Let me gather more information about the in-memory storage implementation to better understand its requirements.
Based on the search results, I can now provide a final response regarding the memory configuration for in-memory storage:
Memory configuration appears sufficient with defined limits
The application has the following memory-related configurations in place:
- HTTP parser memory limits:
maxMemoryBuffer=512K
andmaxDiskBuffer=10M
- JVM memory settings in Docker:
-Xms1g -Xmx1g
- The InMemoryKVStore implementation is a simple in-memory store without explicit memory limits, but it's bounded by:
- The JVM heap limits above
- Natural limits from the data stored (only two datasets are created: MetadataDataset and TiledSummaryDataset)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check memory-related configurations in the codebase
# Search for memory configurations in application.conf and related files
echo "Checking for memory configurations..."
rg -i "memory|heap|ram" --type conf
# Look for JVM memory settings in deployment files
echo -e "\nChecking for JVM memory settings..."
fd -e yaml -e yml -e properties | xargs rg -i "xmx|xms|java.*opts"
# Check for any memory management related code
echo -e "\nChecking for memory management implementations..."
ast-grep --pattern 'class $_ implements MemoryManager { $$$ }'
Length of output: 623
Script:
#!/bin/bash
# Check the implementation of InMemoryKVStoreModule and related classes
echo "Checking InMemoryKVStore implementation..."
rg -A 10 "class InMemoryKVStore"
# Look for cache or storage size configurations
echo -e "\nChecking for cache/storage size configurations..."
rg -i "cache\.size|store\.size|capacity|buffer"
# Check if there are any existing memory-related tests
echo -e "\nChecking for memory-related tests..."
fd -e test -e spec | xargs rg -i "memory|cache|store" -A 5
Length of output: 111843
.github/workflows/test_scala_no_spark.yaml (2)
63-64
: LGTM!
The formatting change improves readability by adding a consistent newline between steps.
65-68
: 🛠️ Refactor suggestion
Remove redundant SBT_OPTS export.
The SBT_OPTS
environment variable is already set globally in the "Set SBT resources" step. The local export in this step is redundant and could lead to maintenance issues if memory settings need to be adjusted in the future.
Apply this diff to remove the redundant export:
- name: Run hub tests
run: |
- export SBT_OPTS="-Xmx8G -Xms2G"
sbt "++ 2.12.18 hub/test"
Additionally, let's verify the hub module's test dependencies are properly configured:
✅ Verification successful
Remove redundant SBT_OPTS export
The verification confirms that SBT_OPTS
is indeed set globally via GITHUB_ENV
in the "Set SBT resources" step with the exact same values (-Xmx8G -Xms2G
). The local export in the "Run hub tests" step is redundant and should be removed as suggested.
- Remove the
export SBT_OPTS="-Xmx8G -Xms2G"
line from the "Run hub tests" step - The hub module's dependencies look properly configured in build.sbt with necessary test dependencies and no circular dependencies detected
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check hub module's test dependencies in build.sbt
# Check if hub module exists and has test dependencies
rg -A 10 'lazy val hub' build.sbt
# Check for potential circular dependencies
rg 'dependsOn.*hub' build.sbt
Length of output: 512
Script:
#!/bin/bash
# Check for SBT_OPTS setting in the workflow file
rg -B 5 -A 5 'SBT_OPTS' .github/workflows/test_scala_no_spark.yaml
Length of output: 475
hub/app/controllers/SearchController.scala (1)
5-15
: LGTM! Verify MonitoringModelStore implementation.
The transition from DynamoDBMonitoringStore to MonitoringModelStore improves abstraction and aligns with addressing the DynamoDB performance issues mentioned in the PR objectives.
Let's verify the MonitoringModelStore implementation:
✅ Verification successful
MonitoringModelStore implementation verified and properly used
The transition from DynamoDBMonitoringStore to MonitoringModelStore is well-implemented:
- MonitoringModelStore is properly defined in
hub/app/store/MonitoringModelStore.scala
and initialized with API implementation - It's correctly bound in
ModelStoreModule
for dependency injection - Used consistently across multiple controllers (Search, Join, Model)
- Has comprehensive test coverage in
MonitoringModelStoreTest
- The implementation maintains the required functionality for both model and join operations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify MonitoringModelStore implementation and its usage
# Expected: Find the implementation and other potential usages
# Find MonitoringModelStore implementation
ast-grep --pattern 'class MonitoringModelStore {
$$$
}'
# Find other controllers using MonitoringModelStore
ast-grep --pattern 'class $_ @Inject() ($$$, monitoringStore: MonitoringModelStore) {
$$$
}'
Length of output: 156
Script:
#!/bin/bash
# Let's try a different approach to find MonitoringModelStore
# Find the file containing MonitoringModelStore definition
fd "MonitoringModelStore.scala"
# Search for MonitoringModelStore usage with ripgrep for better context
rg "MonitoringModelStore" -A 3
# Look for trait or abstract class definition as well
ast-grep --pattern 'trait MonitoringModelStore {
$$$
}'
# Search for any references to monitoring store in the codebase
rg "monitoring.*store" -i
Length of output: 9116
hub/app/controllers/JoinController.scala (1)
24-40
:
Add error handling and improve response structure
The current implementation needs improvement in error handling and response structure.
Additionally, consider adding request logging for better observability:
def list(offset: Option[Int], limit: Option[Int]): Action[AnyContent] =
Action { implicit request: Request[AnyContent] =>
+ Logger.info(s"Received request to list joins with offset=$offset, limit=$limit")
hub/app/model/Model.scala (1)
Line range hint 5-60
: Verify usage of renamed response types and new Join fields.
Let's ensure all usages of the renamed and modified classes have been updated accordingly.
✅ Verification successful
Let me generate additional verification scripts to check for any potential issues with the response types and pagination patterns.
Response types and Join fields are consistently used across the codebase
The verification shows that:
SearchModelResponse
has been completely removed with no lingering references- The new response types (
ListModelResponse
,SearchJoinResponse
,ListJoinResponse
) are properly used in their respective controllers Join
case class fields are consistently used in the codebase, with proper initialization inMonitoringModelStore
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any remaining references to SearchModelResponse
rg "SearchModelResponse"
# Check for consistent usage of new Join fields
ast-grep --pattern 'new Join($$$)'
# Check for potential pagination usage patterns
rg -A 2 "items: Seq\[(Model|Join)\]"
Length of output: 5930
Script:
#!/bin/bash
# Check for any references to SearchModelResponse in controllers and services
rg -t scala "SearchModelResponse" --glob "hub/app/**/*"
# Check for response type usage patterns in controllers
rg -t scala "(ListModelResponse|SearchJoinResponse|ListJoinResponse)" -A 3 --glob "hub/app/**/*"
# Look for Join constructor calls in hub module
rg -t scala "Join\(" -A 3 --glob "hub/app/**/*"
Length of output: 4251
online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (4)
13-14
:
Backend initialization needs proper serialization handling
The backend initialization could cause serialization issues in Spark executors.
16-16
:
Implement or document unimplemented methods
The multiGet
and bulkPut
methods are marked with ???
.
Also applies to: 39-39
18-37
:
Improve error handling and response mapping in multiPut
Several issues in the multiPut implementation need to be addressed.
46-50
:
Fix potential NPE and improve JSON string construction
The current implementation has several issues that need to be addressed.
hub/conf/routes (3)
4-5
: LGTM! Well-designed RESTful endpoints for joins.
The new join endpoints follow REST conventions and maintain consistency with existing endpoints. The pagination parameters for the list endpoint align with the model listing endpoint.
19-20
:
Address memory management for bulk data ingestion.
The bulk ingestion endpoint using raw parser could lead to memory issues with large payloads.
Consider:
- Adding request size limits in
application.conf
- Implementing streaming for large payloads
- Adding rate limiting for the endpoint
Example configuration:
play.http.parser.maxMemoryBuffer=512K
play.http.parser.maxDiskBuffer=10MB
8-9
: Clarify the status and plan for model timeseries endpoint.
The comment indicates this endpoint is "TBD at the moment" but doesn't explain:
- Why it's temporarily disabled
- What's the plan for replacing/re-enabling it
- How this affects the frontend integration mentioned in PR objectives
Let's check for any related model timeseries implementations:
hub/app/store/MonitoringModelStore.scala (4)
34-37
: LGTM! Good abstraction of implementation details.
The renaming from DynamoDBMonitoringStore to MonitoringModelStore and dynamoDBKVStore to kvStore improves the abstraction by removing implementation details, making the code more maintainable and flexible for future storage backend changes.
105-105
: LGTM! Consistent with the store abstraction.
The change from dynamoDBKVStore.list
to kvStore.list
maintains consistency with the store abstraction changes.
62-68
:
Add null checks for metadata fields.
While the output column handling is improved with getOrElse
, the metadata fields still need null checks to prevent potential NullPointerException.
Previous review suggested adding null checks for metadata fields. This is still applicable:
- val join = Join(thriftJoin.metaData.name,
- outputColumns,
- groupBys,
- thriftJoin.metaData.online,
- thriftJoin.metaData.production,
- thriftJoin.metaData.team)
+ val metadata = thriftJoin.metaData
+ if (metadata == null) {
+ logger.error(s"Missing metadata for join in model ${m.metaData.name}")
+ throw new IllegalStateException("Missing join metadata")
+ }
+ val join = Join(
+ metadata.name,
+ outputColumns,
+ groupBys,
+ Option(metadata.online).getOrElse(false),
+ Option(metadata.production).getOrElse(false),
+ Option(metadata.team).getOrElse("")
+ )
77-91
:
Add error handling for join configurations.
The getJoins method should include error handling for invalid join configurations to prevent silent failures.
Previous review suggested adding error handling. This is still applicable:
def getJoins: Seq[Join] = {
configRegistryCache("default").joins.map { thriftJoin =>
+ try {
val groupBys = thriftJoin.joinParts.asScala.map { part =>
+ if (part.groupBy == null || part.groupBy.metaData == null) {
+ logger.error(s"Invalid join configuration: missing groupBy metadata in join ${thriftJoin.metaData.name}")
+ throw new IllegalStateException("Invalid join configuration: missing groupBy metadata")
+ }
GroupBy(part.groupBy.metaData.name, part.groupBy.valueColumns)
}
val outputColumns = thriftJoin.outputColumnsByGroup.getOrElse("derivations", Array.empty)
Join(thriftJoin.metaData.name, outputColumns, groupBys)
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to process join configuration: ${thriftJoin.metaData.name}", e)
+ throw e
+ }
}
}
hub/app/controllers/TimeSeriesController.scala (2)
3-28
: LGTM: Good transition to async operations
The addition of DriftStore dependency and ExecutionContext follows Play Framework best practices for handling async operations. This will improve scalability and responsiveness.
209-214
: Previous null safety concerns still apply
The null-checking patterns in both conversion methods could lead to NullPointerException. Consider using Option types as suggested in previous reviews.
Also applies to: 231-238
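A small sketch of the Option-based pattern this refers to, written against a stand-in field rather than the real Thrift accessors:

```scala
import scala.jdk.CollectionConverters._

object NullSafeSeries {
  // Stand-in for a Thrift-generated field that may be null.
  case class RawSeries(percentiles: java.util.List[java.lang.Double])

  // Wrap possibly-null collections (and elements) in Option instead of checking for null inline.
  def safePoints(raw: RawSeries): Seq[Double] =
    Option(raw.percentiles)
      .map(_.asScala.toSeq)
      .getOrElse(Seq.empty)
      .flatMap(v => Option(v).map(_.doubleValue))
}
```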
hub/test/controllers/TimeSeriesControllerSpec.scala (3)
Line range hint 3-42
: LGTM! Well-structured test setup with proper mocking.
The test setup follows best practices with:
- Appropriate imports for the required functionality
- Proper mock initialization for DriftStore
- Clear test organization with PlaySpec
275-275
: Typo in variable name nullRationChangePercents
288-288
: Duplicate setNodeName
call on tileKey
--master "local[*]" \ | ||
--driver-memory 8g \ | ||
--conf "spark.driver.maxResultSize=6g" \ | ||
--conf "spark.driver.memory=8g" \ |
💡 Codebase verification
Memory configuration in demo environment needs adjustment
The verification reveals inconsistency in memory configurations across the codebase. While the demo environment uses fixed high values (8GB driver memory, 6GB max result size), other scripts in the test environment use a more flexible approach with environment variables defaulting to 1GB, which is more suitable for demo purposes.
Suggested changes:
- Consider adopting the same environment variable approach used in test scripts (
DRIVER_MEMORY:-1G
) - Reduce the default memory allocation to be more accommodating for demo environments
- Remove the duplicate memory configuration (
--driver-memory
andspark.driver.memory
)
🔗 Analysis chain
Review memory configuration for demo environment
The current configuration allocates 8GB of memory and 6GB for max result size, which might be excessive for a demo environment and could cause issues on machines with limited resources.
Let's check if these memory settings are documented or consistent across the codebase:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for similar Spark memory configurations
rg -A 2 "driver-memory|maxResultSize" --type yaml --type sh
Length of output: 935
def get(name: String): Action[AnyContent] = {
  Action { implicit request: Request[AnyContent] =>
    val maybeJoin = monitoringStore.getJoins.find(j => j.name.equalsIgnoreCase(name))
    maybeJoin match {
      case None       => NotFound(s"Join: $name wasn't found")
      case Some(join) => Ok(join.asJson.noSpaces)
    }
  }
}
Improve error handling and input validation in get endpoint
The endpoint needs several improvements:
- Add error handling for store operations
- Validate the name parameter
- Make response format consistent with list endpoint
Apply these improvements:
def get(name: String): Action[AnyContent] = {
Action { implicit request: Request[AnyContent] =>
+ Logger.info(s"Received request to get join with name=$name")
+
+ if (name.trim.isEmpty) {
+ BadRequest("Join name cannot be empty")
+ } else Try {
val maybeJoin = monitoringStore.getJoins.find(j => j.name.equalsIgnoreCase(name))
maybeJoin match {
case None => NotFound(s"Join: $name wasn't found")
- case Some(join) => Ok(join.asJson.noSpaces)
+ case Some(join) => Ok(Map("data" -> join).asJson.noSpaces)
}
+ } recover {
+ case ex: Exception =>
+ Logger.error(s"Failed to fetch join: $name", ex)
+ InternalServerError(s"Failed to fetch join: ${ex.getMessage}")
+ } get
}
}
// Hacky test kv store that we use to send objects to the in-memory KV store that lives in a different JVM (e.g spark -> hub)
class HTTPKVStore(host: String = "localhost", port: Int = 9000) extends KVStore with Serializable {
Remove or clarify the "hacky" designation
The comment indicates this is a "hacky test kv store", but based on the PR objectives, this appears to be used in a production context for drift metrics computation. Either:
- Remove the comment if this is intended for production use
- Document the temporary nature and create a tracking issue for proper implementation
override def create(dataset: String): Unit = {
  logger.warn(s"Skipping creation of $dataset in HTTP kv store implementation")
}
Improve create method implementation
The current implementation silently skips dataset creation which could lead to issues:
- No documentation explaining why creation is skipped
- No error propagation to caller
- Using warn level logging for expected behavior
Consider either:
- Implementing proper dataset creation
- Throwing UnsupportedOperationException with explanation
- Documenting why creation is unnecessary in this implementation
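For the UnsupportedOperationException option, a hedged sketch against a simplified stand-in trait (the real KVStore interface in `online` has more methods) could look like:

```scala
// Simplified stand-in for the real KVStore trait.
trait SimpleKVStore {
  def create(dataset: String): Unit
}

class HttpBackedKVStore extends SimpleKVStore {
  // Fail loudly so callers don't silently assume the dataset now exists.
  override def create(dataset: String): Unit =
    throw new UnsupportedOperationException(
      s"HTTP KV store does not manage datasets; '$dataset' must be created by the hub-side store")
}
```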
(maybeCurrentSummarySeries, maybeBaselineSummarySeries) match {
  case (Failure(exceptionA), Failure(exceptionB)) =>
    Future.successful(InternalServerError(
      s"Error computing feature percentiles for current + offset time window.\nCurrent window error: ${exceptionA.getMessage}\nOffset window error: ${exceptionB.getMessage}"))
  case (_, Failure(exception)) =>
    Future.successful(
      InternalServerError(
        s"Error computing feature percentiles for offset time window - ${exception.getMessage}"))
  case (Failure(exception), _) =>
    Future.successful(
      InternalServerError(
        s"Error computing feature percentiles for current time window - ${exception.getMessage}"))
  case (Success(currentSummarySeriesFuture), Success(baselineSummarySeriesFuture)) =>
    Future.sequence(Seq(currentSummarySeriesFuture, baselineSummarySeriesFuture)).map { merged =>
      val currentSummarySeries = merged.head
      val baselineSummarySeries = merged.last
      val currentFeatureTs = convertTileSummarySeriesToTimeSeries(currentSummarySeries.head, metric)
      val baselineFeatureTs = convertTileSummarySeriesToTimeSeries(baselineSummarySeries.head, metric)
      val comparedTsData = ComparedFeatureTimeSeries(name, baselineFeatureTs, currentFeatureTs)
      Ok(comparedTsData.asJson.noSpaces)
    }
}
🛠️ Refactor suggestion
Simplify complex Future composition with for-comprehension
The nested pattern matching for handling multiple Future results can be simplified using for-comprehension and Future.sequence.
- (maybeCurrentSummarySeries, maybeBaselineSummarySeries) match {
- case (Failure(exceptionA), Failure(exceptionB)) =>
- Future.successful(InternalServerError(...))
- case (_, Failure(exception)) =>
- Future.successful(InternalServerError(...))
- case (Failure(exception), _) =>
- Future.successful(InternalServerError(...))
- case (Success(currentSummarySeriesFuture), Success(baselineSummarySeriesFuture)) =>
+ val result = for {
+ current <- Future.fromTry(maybeCurrentSummarySeries).flatten
+ baseline <- Future.fromTry(maybeBaselineSummarySeries).flatten
+ } yield {
+ val currentFeatureTs = convertTileSummarySeriesToTimeSeries(current.head, metric)
+ val baselineFeatureTs = convertTileSummarySeriesToTimeSeries(baseline.head, metric)
+ val comparedTsData = ComparedFeatureTimeSeries(name, baselineFeatureTs, currentFeatureTs)
+ Ok(comparedTsData.asJson.noSpaces)
+ }
+ result.recover {
+ case e: Exception => InternalServerError(s"Error computing feature percentiles: ${e.getMessage}")
+ }
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (1)
spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala (1)
23-29
: Consider enhancing the timing utility for robustness. While the implementation is functional, consider these improvements:
- Add exception handling to ensure timing completion is logged even if the block fails
- Use System.nanoTime() for more precise measurements
- Consider adding a fallback for environments where color codes aren't supported
Here's a suggested implementation:
- def time(message: String)(block: => Unit): Unit = { - logger.info(s"$message..".yellow) - val start = System.currentTimeMillis() - block - val end = System.currentTimeMillis() - logger.info(s"$message took ${end - start} ms".green) + def time(message: String)(block: => Unit): Unit = { + logger.info(s"$message..") + val start = System.nanoTime() + try { + block + } catch { + case e: Exception => + logger.error(s"$message failed: ${e.getMessage}") + throw e + } finally { + val end = System.nanoTime() + val duration = (end - start) / 1000000 // Convert to milliseconds + logger.info(s"$message took $duration ms") + } + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
build.sbt
(4 hunks)docker-init/demo/load_summaries.sh
(1 hunks)hub/app/controllers/InMemKVStoreController.scala
(1 hunks)spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- build.sbt
- hub/app/controllers/InMemKVStoreController.scala
🧰 Additional context used
🪛 Shellcheck (0.10.0)
docker-init/demo/load_summaries.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🔇 Additional comments (5)
docker-init/demo/load_summaries.sh (4)
1-2
: Add shell script essentials
The script is missing crucial shell script components:
- Shebang directive
- Execution permissions
- Usage instructions and requirements
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
5-7
: 🛠️ Refactor suggestion
Simplify and optimize memory configuration
Issues with current memory settings:
- Driver memory is configured twice (lines 5 and 7)
- Fixed high memory values (8GB, 6GB) might be excessive for demo environments
Consider this simplified configuration:
- --driver-memory 8g \
- --conf "spark.driver.maxResultSize=6g" \
- --conf "spark.driver.memory=8g" \
+ --driver-memory "${DRIVER_MEMORY:-2g}" \
+ --conf "spark.driver.maxResultSize=${MAX_RESULT_SIZE:-2g}" \
Likely invalid or redundant comment.
9-10
: Verify network configuration for Docker environment
The current setup uses localhost
as driver host while binding to all interfaces (0.0.0.0
). This might cause connectivity issues in certain Docker network configurations.
✅ Verification successful
Let me gather more context about the Docker environment configuration to make a more informed decision.
Let me check the Spark configuration files to see if there are any default network settings that might provide more context.
Network configuration is appropriate for Docker environment
The current setup using localhost
as driver host with 0.0.0.0
bind address is actually the correct configuration for a Spark application running in Docker. This combination allows the driver to be accessible from within the container (localhost
) while still accepting connections from other containers or hosts (0.0.0.0
). This is a common pattern in Docker deployments and is consistent across both demo scripts (load_summaries.sh
and run.sh
).
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if there are any similar Spark network configurations in other scripts
# that might provide insights into the correct setup
rg -A 2 "spark.driver.(host|bindAddress)" --type yaml --type sh
Length of output: 673
Script:
#!/bin/bash
# Check Docker-related configuration files for network setup
fd Dockerfile --type f --exec cat {}
Length of output: 8341
Script:
#!/bin/bash
# Check Spark configuration files for network-related settings
fd "spark-defaults.conf" --type f --exec cat {}
Length of output: 678
8-8
: Verify application configuration and dependencies
Please verify:
- The existence of required JARs in the specified paths
- The main class name and its dependencies
Also applies to: 11-12
✅ Verification successful
Configuration and dependencies are properly set up
The verification confirms:
- The main class
ObservabilityDemoDataLoader
exists with a propermain
method inspark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala
- The JAR paths in the script align with the Docker setup:
- Spark JARs are correctly located at
/opt/spark/jars/*
- Additional classpath entries are properly configured in the Docker environment
- Spark JARs are correctly located at
- The application is built with Spark 3.5.x and Scala 2.12, matching the Docker image configuration
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the main class exists in the codebase
ast-grep --pattern 'class ObservabilityDemoDataLoader'
# Check if there are any references to this loader in other files
rg "ObservabilityDemoDataLoader" --type scala --type java
Length of output: 231
Script:
#!/bin/bash
# Check the implementation of ObservabilityDemoDataLoader
rg -A 10 "object ObservabilityDemoDataLoader" --type scala
# Check for JAR files in the repository
fd ".jar$"
# Check if there are any build or dependency files that might indicate required JARs
fd "build.sbt|pom.xml|build.gradle"
Length of output: 1400
Script:
#!/bin/bash
# Check the content of build.sbt for dependencies
cat build.sbt
# Check if there are any Dockerfile or docker-compose files that might show the JAR paths
fd "Dockerfile|docker-compose.yml"
# Look for any main method in ObservabilityDemoDataLoader
ast-grep --pattern 'def main(args: Array[String])'
Length of output: 11330
Script:
#!/bin/bash
# Check the Dockerfile in docker-init/demo for JAR paths
cat docker-init/demo/Dockerfile
# Check for the main method specifically in ObservabilityDemoDataLoader
rg -A 5 "def main" spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala
Length of output: 1473
spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala (1)
20-22
: LGTM! Logger setup follows Spark best practices.
The logger implementation correctly uses @transient and lazy initialization to prevent serialization issues in Spark.
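For reference, the general shape of that pattern (a generic illustration, not the loader's actual code):

```scala
import org.slf4j.{Logger, LoggerFactory}

// Keep the logger out of serialized closure state: @transient skips it during serialization,
// and lazy re-creates it on each executor when first used.
class DemoStage extends Serializable {
  @transient private lazy val logger: Logger = LoggerFactory.getLogger(getClass)

  def process(rows: Seq[String]): Unit =
    logger.info(s"Processing ${rows.size} rows")
}
```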
val spark = SparkSessionBuilder.build(namespace, local = true)
implicit val tableUtils: TableUtils = TableUtils(spark)
tableUtils.createDatabase(namespace)
🛠️ Refactor suggestion
Avoid hardcoding Spark configuration.
The local=true
parameter should be configurable through command-line options to support different deployment environments.
Add a configuration option:
class Conf(arguments: Seq[String]) extends ScallopConf(arguments) {
+ val localMode: ScallopOption[Boolean] = opt[Boolean](
+ name = "local-mode",
+ default = Some(false),
+ descr = "Run Spark in local mode"
+ )
}
def main(args: Array[String]): Unit = {
val config = new Conf(args)
- val spark = SparkSessionBuilder.build(namespace, local = true)
+ val spark = SparkSessionBuilder.build(namespace, local = config.localMode())
Committable suggestion skipped: line range outside the PR's diff.
time("Uploading summaries") { | ||
val uploader = new SummaryUploader(tableUtils.loadTable(packedTable), hubApi) | ||
uploader.run() | ||
} | ||
|
||
println("Done uploading summaries! \uD83E\uDD73".green) | ||
// clean up spark session and force jvm exit | ||
spark.stop() | ||
System.exit(0) | ||
} |
Improve cleanup and error handling.
The current implementation has several issues:
- Uses System.exit which bypasses proper cleanup
- Lacks try-with-resources pattern for Spark session
- Missing error handling for upload operation
Consider this improvement:
- time("Uploading summaries") {
- val uploader = new SummaryUploader(tableUtils.loadTable(packedTable), hubApi)
- uploader.run()
- }
-
- println("Done uploading summaries! \uD83E\uDD73".green)
- // clean up spark session and force jvm exit
- spark.stop()
- System.exit(0)
+ try {
+ time("Uploading summaries") {
+ val uploader = new SummaryUploader(tableUtils.loadTable(packedTable), hubApi)
+ uploader.run()
+ }
+ logger.info("Done uploading summaries! \uD83E\uDD73")
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to upload summaries: ${e.getMessage}", e)
+ throw e
+ } finally {
+ if (spark != null) {
+ spark.stop()
+ }
+ }
// mock api impl for online fetching and uploading
val inMemKvStoreFunc: () => KVStore = () => {
  // cannot reuse the variable - or serialization error
  val result = InMemoryKvStore.build(namespace, () => null)
  result
}
val inMemoryApi = new MockApi(inMemKvStoreFunc, namespace)
Improve error handling for KVStore initialization.
The KVStore initialization passes a null function and lacks error handling.
Consider this improvement:
val inMemKvStoreFunc: () => KVStore = () => {
- // cannot reuse the variable - or serialization error
- val result = InMemoryKvStore.build(namespace, () => null)
- result
+ try {
+ InMemoryKvStore.build(namespace, () => throw new UnsupportedOperationException("Mock store doesn't support this operation"))
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to initialize InMemoryKvStore: ${e.getMessage}")
+ throw e
+ }
}
class Conf(arguments: Seq[String]) extends ScallopConf(arguments) {
  val startDs: ScallopOption[String] = opt[String](
    name = "start-ds",
    default = Some("2023-01-01"),
    descr = "Start date in YYYY-MM-DD format"
  )

  val endDs: ScallopOption[String] = opt[String](
    name = "end-ds",
    default = Some("2023-02-30"),
    descr = "End date in YYYY-MM-DD format"
  )

  val rowCount: ScallopOption[Int] = opt[Int](
    name = "row-count",
    default = Some(700000),
    descr = "Number of rows to generate"
  )

  val namespace: ScallopOption[String] = opt[String](
    name = "namespace",
    default = Some("observability_demo"),
    descr = "Namespace for the demo"
  )

  verify()
}
Fix configuration validation issues.
Several validation issues need to be addressed:
- The default end date "2023-02-30" is invalid as February never has 30 days
- Missing validation for date format
- Missing validation for minimum row count
Here's a suggested implementation:
class Conf(arguments: Seq[String]) extends ScallopConf(arguments) {
+ // Date format validation
+ private val dateFormat = "yyyy-MM-dd"
+ private def validateDate(date: String): Boolean = {
+ try {
+ java.time.LocalDate.parse(date, java.time.format.DateTimeFormatter.ofPattern(dateFormat))
+ true
+ } catch {
+ case _: Exception => false
+ }
+ }
+
val startDs: ScallopOption[String] = opt[String](
name = "start-ds",
default = Some("2023-01-01"),
- descr = "Start date in YYYY-MM-DD format"
+ descr = s"Start date in $dateFormat format",
+ validate = validateDate
)
val endDs: ScallopOption[String] = opt[String](
name = "end-ds",
- default = Some("2023-02-30"),
+ default = Some("2023-02-28"),
- descr = "End date in YYYY-MM-DD format"
+ descr = s"End date in $dateFormat format",
+ validate = validateDate
)
val rowCount: ScallopOption[Int] = opt[Int](
name = "row-count",
default = Some(700000),
- descr = "Number of rows to generate"
+ descr = "Number of rows to generate (minimum: 1)",
+ validate = _ > 0
)
// Add cross-validation for dates
+ validate(startDs, endDs) { (start, end) =>
+ val startDate = java.time.LocalDate.parse(start)
+ val endDate = java.time.LocalDate.parse(end)
+ if (startDate.isAfter(endDate)) {
+ Left("Start date must be before or equal to end date")
+ } else {
+ Right(())
+ }
+ }
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (6)
hub/app/controllers/TimeSeriesController.scala (6)
3-8
: Consider organizing imports for better readability
The imports could be organized into logical groups:
- Play Framework imports
- Chronon API imports
- Scala standard library imports
- Project-specific imports
```diff
-import ai.chronon.api.DriftMetric
-import ai.chronon.api.Extensions.WindowOps
-import ai.chronon.api.TileDriftSeries
-import ai.chronon.api.TileSummarySeries
-import ai.chronon.api.TimeUnit
-import ai.chronon.api.Window
+// Chronon API imports
+import ai.chronon.api.{DriftMetric, TileDriftSeries, TileSummarySeries, TimeUnit, Window}
+import ai.chronon.api.Extensions.WindowOps

-import scala.concurrent.ExecutionContext
-import scala.concurrent.Future
-import scala.concurrent.duration._
-import scala.jdk.CollectionConverters._
-import scala.util.Failure
-import scala.util.Success
+// Scala imports
+import scala.concurrent.{ExecutionContext, Future}
+import scala.concurrent.duration._
+import scala.jdk.CollectionConverters._
+import scala.util.{Failure, Success}
```
Also applies to: 15-20
53-62
: Update method documentation to reflect new parameters
The documentation for `fetchFeature` should be updated to include information about the new `join` parameter and its purpose.
```diff
 /**
   * Helps retrieve a time series (drift or skew) for a given feature. Time series is
   * retrieved between the start and end ts. Choice of granularity (raw, aggregate, percentiles) along with the
   * metric type (drift / skew) dictates the shape of the returned time series.
+  * @param join The join identifier used to locate the feature's data
   */
```
90-91
: Enhance error messages with format examples
The error messages could be more helpful by including the expected format.
```diff
- case (None, _) => Future.successful(BadRequest(s"Unable to parse offset - $offset"))
- case (_, None) => Future.successful(BadRequest("Invalid drift algorithm. Expect JSD, PSI or Hellinger"))
+ case (None, _) => Future.successful(BadRequest(s"Unable to parse offset - $offset. Expected format: <number>h or <number>d (e.g., 24h, 7d)"))
+ case (_, None) => Future.successful(BadRequest("Invalid drift algorithm. Expected one of: JSD, PSI, Hellinger"))
```
94-94
: Extract path transformation logic to a utility method
The path transformation logic appears in multiple places and should be extracted to a utility method.
```diff
+ private def transformJoinPath(path: String): String = path.replaceFirst("\\.", "/")
+
  private def doFetchJoinDrift(...) = {
-   val joinPath = name.replaceFirst("\\.", "/")
+   val joinPath = transformJoinPath(name)
```
221-223
: Add validation for timestamps and values in point conversion
The point conversion should validate that timestamps are not negative and values are within reasonable bounds.
```diff
  val points = lhsList.zip(tileDriftSeries.timestamps.asScala).map {
-   case (v, ts) => TimeSeriesPoint(v, ts)
+   case (v, ts) if ts >= 0 =>
+     require(v.isFinite, s"Invalid value: $v")
+     TimeSeriesPoint(v, ts)
+   case (_, ts) =>
+     throw new IllegalArgumentException(s"Invalid timestamp: $ts")
  }
```
283-284
: Remove commented out code
The commented out cases for "skew" and "ooc" should be removed if they are no longer needed. If they are planned for future implementation, consider adding a TODO comment with a ticket reference.
```diff
-// case Some("skew") => Some(Skew)
-// case Some("ooc") => Some(Skew)
+// TODO(TICKET-123): Add support for skew and out-of-control metrics
```
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (3)
77-78
: Consider adding error handling for serialization operations.
While the ThreadLocal usage is correct, the serialization operations could benefit from explicit error handling to prevent potential runtime exceptions.
Consider wrapping the serialization operations in Try:
```diff
- val serializer: TSerializer = binarySerializer.get()
- val tileKeyMap = tileKeysForJoin(joinConf, None, columnPrefix)
- val requestContextMap: Map[GetRequest, SummaryRequestContext] = tileKeyMap.flatMap {
+ Try {
+   val serializer: TSerializer = binarySerializer.get()
+   val tileKeyMap = tileKeysForJoin(joinConf, None, columnPrefix)
+   val requestContextMap: Map[GetRequest, SummaryRequestContext] = tileKeyMap.flatMap {
```
203-211
: Consider handling potential initialization failures in ThreadLocal.
While the ThreadLocal implementation is correct, the initialValue method could potentially throw exceptions during initialization.
Consider adding error handling in initialValue:
```diff
 @transient lazy val binarySerializer: ThreadLocal[TSerializer] = new ThreadLocal[TSerializer] {
-  override def initialValue(): TSerializer = new TSerializer(new TBinaryProtocol.Factory())
+  override def initialValue(): TSerializer = Try(new TSerializer(new TBinaryProtocol.Factory())).getOrElse {
+    // Log the error
+    throw new RuntimeException("Failed to initialize binary serializer")
+  }
 }
```
203-211
: Consider abstracting serialization strategy into KVStore interface.
The serialization strategy (binary vs compact) could be encapsulated within the KVStore interface rather than being managed by the DriftStore (a rough sketch follows the list below). This would:
- Make the serialization strategy configurable per KVStore implementation
- Reduce coupling between DriftStore and serialization details
- Simplify testing by allowing mock serializers
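A rough shape of that abstraction, as a hedged sketch only — the `TileSerde` trait and `serde` member are hypothetical names, not part of the existing KVStore API:
```scala
import org.apache.thrift.TBase
import org.apache.thrift.TSerializer
import org.apache.thrift.protocol.TBinaryProtocol

// Hypothetical serde abstraction: the store decides how tiles are encoded,
// and DriftStore would call kvStore.serde.serialize(key) instead of owning a serializer.
trait TileSerde {
  def serialize(value: TBase[_, _]): Array[Byte]
}

object BinaryTileSerde extends TileSerde {
  // one serializer per thread, mirroring the ThreadLocal pattern used in DriftStore
  private val serializer = new ThreadLocal[TSerializer] {
    override def initialValue(): TSerializer = new TSerializer(new TBinaryProtocol.Factory())
  }
  override def serialize(value: TBase[_, _]): Array[Byte] = serializer.get().serialize(value)
}

// A KVStore implementation (or a test mixin) could then swap the strategy freely,
// e.g. a mock serde that records calls instead of producing real Thrift bytes.
trait SerdeAwareKVStore {
  def serde: TileSerde = BinaryTileSerde
}
```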
spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (1)
Line range hint
1-479
: Consider documenting the new serialization strategy
The transition from mock data to generated data, coupled with the new binary serialization approach, represents a significant architectural change. Consider adding documentation that explains:
- The rationale behind choosing binary serialization
- Any performance implications
- Data format compatibility considerations
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- docker-init/demo/log4j2.properties (0 hunks)
- online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (4 hunks)
- spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (2 hunks)
💤 Files with no reviewable changes (1)
- docker-init/demo/log4j2.properties
🔇 Additional comments (3)
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (2)
15-16
: LGTM: Import changes align with serializer implementation.
The imports correctly reference the new binary serializer implementations.
213-234
: Consider making percentile labels more configurable and type-safe.
This comment reiterates a previous review suggestion about making the percentile labels more maintainable and flexible.
spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (1)
9-9
: Verify backward compatibility of the new serializer
The switch from `compactSerializer` to `binarySerializer` might affect the ability to read previously serialized data.
✅ Verification successful
Switch to binary serializer is intentional and safe
The change from `compactSerializer` to `binarySerializer` is actually a necessary fix. The DriftStore implementation explicitly mentions "crazy bug in compact protocol - do not change to compact" in the comments. The `binarySerializer` is the only serializer defined in DriftStore and is used consistently across the codebase for both serialization and deserialization of drift metrics data; a minimal sketch of that pattern follows the key findings below.
Key findings:
- The binary protocol is explicitly chosen to avoid bugs in the compact protocol
- Both serialization and deserialization use the same TBinaryProtocol consistently
- No compatibility issues as there are no remaining uses of compact protocol for drift metrics
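To make the protocol pairing concrete, here is a small sketch of a matched write/read pair. The `ai.chronon.api.TileSeriesKey` import path is an assumption; any Thrift-generated struct behaves the same way:
```scala
import ai.chronon.api.TileSeriesKey // assumed location of the generated tile key struct
import org.apache.thrift.{TDeserializer, TSerializer}
import org.apache.thrift.protocol.TBinaryProtocol

// Writer and reader must be built from the same protocol factory; pairing a
// TBinaryProtocol writer with a TCompactProtocol reader (or vice versa) will not decode.
val writeBinary = new TSerializer(new TBinaryProtocol.Factory())
val readBinary  = new TDeserializer(new TBinaryProtocol.Factory())

val key = new TileSeriesKey() // fields left unset; an empty struct round-trips the same way
val bytes = writeBinary.serialize(key)

val decoded = new TileSeriesKey()
readBinary.deserialize(decoded, bytes) // succeeds because both sides agree on TBinaryProtocol
```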
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for existing uses of compactSerializer and potential compatibility issues
# Search for any remaining uses of compactSerializer
rg "compactSerializer" --type scala
# Search for any serialization compatibility tests
rg -l "SerializationTest|SerializerTest" --type scala
Length of output: 350
Script:
#!/bin/bash
# Let's check for the usage and implementation of binarySerializer
rg "binarySerializer" --type scala -A 3
# Also check for any migration or compatibility related code
rg "Migration|Compatibility" --type scala
# Check the DriftStore implementation
fd "DriftStore.scala" --type f --exec cat {}
Length of output: 12153
val serializer = binarySerializer.get()
🛠️ Refactor suggestion
Consider moving serializer initialization outside the map operation
The serializer is currently initialized for each row in the RDD, which could impact performance. Consider moving the initialization outside the map operation and broadcasting it to all workers.
def packSummaryDf(df: DataFrame): (DataFrame, Unit) = {
+ val serializer = binarySerializer.get()
val packedRdd: RDD[sql.Row] = df.rdd.flatMap(func).map { tileRow =>
- val serializer = binarySerializer.get()
val partition = tileRow.partition
val timestamp = tileRow.tileTs
Committable suggestion skipped: line range outside the PR's diff.
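For reference, the usual way to get this effect in Spark without shipping a non-serializable serializer from the driver is to initialize it once per partition on the executors. A minimal, self-contained illustration of that pattern (Base64 stands in for the Thrift serializer; this is not the Summarizer code itself):
```scala
import org.apache.spark.sql.SparkSession

object PerPartitionInitSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("per-partition-init").getOrCreate()
    val numbers = spark.sparkContext.parallelize(1 to 1000, numSlices = 4)

    val encoded = numbers.mapPartitions { nums =>
      // created once per partition on the executor, analogous to binarySerializer.get()
      val encoder = java.util.Base64.getEncoder
      nums.map(n => encoder.encodeToString(BigInt(n).toByteArray))
    }

    println(encoded.take(3).mkString(", "))
    spark.stop()
  }
}
```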
```diff
 trait Paginate {
   val defaultOffset = 0
   val defaultLimit = 10
   val maxLimit = 100

-  def paginateResults(results: Seq[Model], offset: Int, limit: Int): Seq[Model] = {
+  def paginateResults[T](results: Seq[T], offset: Int, limit: Int): Seq[T] = {
```
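With the type parameter in place, the same helper can page any result type; a hypothetical call site:
```scala
// Hypothetical usage: allJoins / allModels are whatever sequences the controllers hold.
val joinPage: Seq[Join]   = paginateResults(allJoins, offset = 0, limit = defaultLimit)
val modelPage: Seq[Model] = paginateResults(allModels, offset = 20, limit = 10)
```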
nice!!
9688376 to 4965c8a (Compare)
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (16)
docker-init/demo/README.md (1)
1-25
: Add architectural overview and data flow description
The README should explain the new architecture connecting Spark drift metrics computation with Hub.
Add a new section at the beginning:
+# Architecture Overview
+
+This demo showcases the integration between:
+1. Spark-based drift metrics computation
+2. Hub's InMemoryKV store for data persistence
+3. Frontend visualization
+
+## Data Flow
+```mermaid
+graph LR
+    A[Spark Job] -->|Compute Drift Metrics| B[Hub]
+    B -->|Serve via API| C[Frontend]
+    B -->|Serve via API| D[Streamlit]
+```
+
 # Populate Observability Demo Data
🪛 LanguageTool
[grammar] ~20-~20: Replace ‘exec’ with an appropriate verb.
Context: ...n build.sh once, and you can repeatedly exec to quickly visualize In first termina...(MD_NN)
[uncategorized] ~22-~22: You might be missing the article “the” here.
Context: ...peatedly exec to quickly visualize In first terminal:sbt spark/assembly
In seco...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~23-~23: You might be missing the article “the” here.
Context: ...irst terminal:sbt spark/assembly
In second terminal:./run.sh
to load the built ...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~24-~24: You might be missing the article “the” here.
Context: ...and serve the data on localhost:8181 In third terminal:streamlit run viz.py
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
hub/app/controllers/SearchController.scala (2)
20-24
: Document API response type change in changelog
The endpoint now returns joins instead of models, which is a breaking change for API consumers. While the documentation is updated correctly, we should ensure this change is properly communicated.
Consider:
- Adding API version headers for backward compatibility
- Documenting the response schema change in the API changelog
- Updating any API documentation (e.g., Swagger/OpenAPI specs)
38-39
: Consider caching search results
Given that this endpoint serves the frontend and might be called frequently, consider implementing caching for the search results to improve performance (a sketch follows the list below).
Consider:
- Adding response caching with appropriate cache invalidation
- Using Play's built-in caching mechanisms
- Adding cache headers in the response
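One way this could look with Play's built-in cache — a hedged sketch where the controller name, key scheme, TTL, and registry lookup are placeholders rather than the actual SearchController wiring:
```scala
import javax.inject.Inject
import play.api.cache.AsyncCacheApi
import play.api.mvc.{AbstractController, ControllerComponents}
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.duration._

class CachedSearchController @Inject() (cache: AsyncCacheApi, cc: ControllerComponents)(
    implicit ec: ExecutionContext)
    extends AbstractController(cc) {

  // Placeholder for the real registry lookup (e.g. filtering monitoringStore joins by term).
  private def searchJoins(term: String): Future[Seq[String]] = Future.successful(Seq.empty)

  def search(term: String, offset: Int, limit: Int) = Action.async {
    // Re-run the search at most once per minute per term; invalidate when configs are re-uploaded.
    cache.getOrElseUpdate(s"search:$term", 1.minute)(searchJoins(term)).map { results =>
      Ok(results.slice(offset, offset + limit).mkString(","))
    }
  }
}
```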
hub/app/model/Model.scala (1)
5-10
: Fix parameter indentation in Join case class
While the new fields (online, production, team) are logically sound additions, the parameter indentation is inconsistent.
Consider applying this formatting:
```diff
-case class Join(name: String,
-                joinFeatures: Seq[String],
-                groupBys: Seq[GroupBy],
-                online: Boolean,
-                production: Boolean,
-                team: String)
+case class Join(
+    name: String,
+    joinFeatures: Seq[String],
+    groupBys: Seq[GroupBy],
+    online: Boolean,
+    production: Boolean,
+    team: String
+)
```
Line range hint
39-46
: Add error handling for serialization operationsThe serialization logic could fail due to various reasons (invalid data, memory issues). Consider wrapping the operations in a try-catch block to handle potential failures gracefully.
Here's a suggested improvement:
private def convertToBytesMap[T <: TBase[_, _]: Manifest: ClassTag]( series: T, keyF: T => TileSeriesKey): Map[String, String] = { - val serializerInstance = DriftStore.binarySerializer.get() - val encoder = Base64.getEncoder - val keyBytes = serializerInstance.serialize(keyF(series)) - val valueBytes = serializerInstance.serialize(series) - Map( - "keyBytes" -> encoder.encodeToString(keyBytes), - "valueBytes" -> encoder.encodeToString(valueBytes) - ) + try { + val serializerInstance = DriftStore.binarySerializer.get() + val encoder = Base64.getEncoder + val keyBytes = serializerInstance.serialize(keyF(series)) + val valueBytes = serializerInstance.serialize(series) + Map( + "keyBytes" -> encoder.encodeToString(keyBytes), + "valueBytes" -> encoder.encodeToString(valueBytes) + ) + } catch { + case e: Exception => + logger.error(s"Failed to serialize series: ${e.getMessage}", e) + throw new RuntimeException("Failed to serialize data", e) + }
Line range hint
39-46
: Consider performance optimization for high-throughput scenariosThe current implementation creates a new serializer instance and performs Base64 encoding for every request. For high-throughput scenarios, consider:
- Caching the encoder instance
- Reusing serializer instances where possible
- Evaluating if Base64 encoding is necessary (could potentially use binary responses)
Line range hint
58-59
: Inconsistent serialization approach between endpointsThe drift-series endpoint appears to bypass the byte serialization (commented code), while summary-series uses it. This inconsistency could lead to:
- Different data formats being served
- Potential integration issues with clients
- Maintenance challenges
Consider applying a consistent serialization approach across all endpoints.
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1)
Line range hint
91-117
: Enhance error handling for deserialization failuresThe current error handling simply prints stack traces and drops failed entries. Consider:
- Adding proper error logging
- Implementing retry logic for transient failures
- Adding metrics for monitoring deserialization failures
- case Failure(exception) => exception.printStackTrace(); None + case Failure(exception) => + logger.error(s"Failed to deserialize summary: ${exception.getMessage}", exception) + metrics.incrementCounter("drift.deserialization.failures") + Nonebuild.sbt (2)
83-88
: Enhance the comment to specify which case classes use Circe
The comment provides context but could be more specific about which case classes or payloads are being serialized/deserialized.
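For context, the pattern these libraries enable is generic derivation for the hub's response payloads, roughly as below; the case classes here are hypothetical stand-ins for illustration, not the actual hub models:
```scala
import io.circe.generic.auto._
import io.circe.parser.decode
import io.circe.syntax._

// Hypothetical payload shape, standing in for the hub response case classes
// that are serialized with .asJson.noSpaces in the controllers.
case class TimeSeriesPointExample(value: Double, ts: Long)
case class FeatureSeriesExample(feature: String, points: Seq[TimeSeriesPointExample])

val payload = FeatureSeriesExample("txn_amount", Seq(TimeSeriesPointExample(0.12, 1700000000000L)))
val json: String = payload.asJson.noSpaces            // encoder derived at compile time
val roundTripped = decode[FeatureSeriesExample](json) // decoder derived the same way
```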
139-140
: Create a tracking issue for HTTPKvStore removalThe comment indicates this dependency is temporary. Consider creating a tracking issue to ensure its removal doesn't get forgotten.
Would you like me to create a GitHub issue to track the removal of HTTPKvStore and this dependency?
hub/test/controllers/TimeSeriesControllerSpec.scala (3)
76-84
: Enhance error handling test coverageConsider verifying the error message in the response body to ensure proper error propagation.
val result = controller.fetchJoin("my_join", startTs, endTs, "drift", "null", Some("10h"), Some("psi")).apply(FakeRequest()) status(result) mustBe INTERNAL_SERVER_ERROR +val bodyText = contentAsString(result) +bodyText must include("Some internal error")
Line range hint
87-109
: Strengthen assertions in success case testThe test could be more specific about the expected values in the response. Consider adding assertions for:
- The actual values in the response items
- The structure of the grouped items
- The timestamp values
254-282
: Enhance mock data generationThe current implementation uses fixed values for drift metrics (0.12, 0.23, 0.25). Consider:
- Using a random number generator with fixed seed for reproducibility
- Adding more variation in the mock data
- Adding documentation about the mock data pattern
Example improvement:
+ private val random = new Random(42) // Fixed seed for reproducibility - val percentileDrifts = if (isNumeric) List.fill(timestamps.size())(JDouble.valueOf(0.12)).asJava else List.fill[JDouble](timestamps.size())(null).asJava + val percentileDrifts = if (isNumeric) + timestamps.map(_ => JDouble.valueOf(0.1 + random.nextDouble() * 0.1)).asJava + else + List.fill[JDouble](timestamps.size())(null).asJavahub/app/controllers/TimeSeriesController.scala (3)
175-202
: Simplify complex pattern matching with for-comprehensionThe nested pattern matching for handling multiple Future results can be simplified.
- (maybeCurrentSummarySeries, maybeBaselineSummarySeries) match { - case (Failure(exceptionA), Failure(exceptionB)) => - Future.successful(InternalServerError( - s"Error computing feature percentiles for current + offset time window.\nCurrent window error: ${exceptionA.getMessage}\nOffset window error: ${exceptionB.getMessage}")) - case (_, Failure(exception)) => - Future.successful( - InternalServerError( - s"Error computing feature percentiles for offset time window - ${exception.getMessage}")) - case (Failure(exception), _) => - Future.successful( - InternalServerError( - s"Error computing feature percentiles for current time window - ${exception.getMessage}")) - case (Success(currentSummarySeriesFuture), Success(baselineSummarySeriesFuture)) => + val result = for { + current <- Future.fromTry(maybeCurrentSummarySeries).flatten + baseline <- Future.fromTry(maybeBaselineSummarySeries).flatten + } yield { + val currentFeatureTs = if (current.isEmpty) Seq.empty + else convertTileSummarySeriesToTimeSeries(current.head, metric) + val baselineFeatureTs = if (baseline.isEmpty) Seq.empty + else convertTileSummarySeriesToTimeSeries(baseline.head, metric) + val comparedTsData = ComparedFeatureTimeSeries(name, baselineFeatureTs, currentFeatureTs) + Ok(comparedTsData.asJson.noSpaces) + } + result.recover { + case e: Exception => + InternalServerError(s"Error computing feature percentiles: ${e.getMessage}") + }
283-285
: Remove commented out codeThe commented cases for "skew" and "ooc" should be removed if they're not being used. If they're planned for future implementation, consider tracking this in a GitHub issue instead.
-// case Some("skew") => Some(Skew) -// case Some("ooc") => Some(Skew)
26-28
: Consider adding performance monitoring
With the transition to bulk HTTP calls and InMemoryKV store, consider adding metrics to monitor (a sketch follows the list below):
- Response times for bulk operations
- Memory usage patterns
- Cache hit/miss rates
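A minimal shape for that instrumentation, illustrative only — a production setup would report to a metrics backend rather than printing:
```scala
import scala.concurrent.{ExecutionContext, Future}

// Illustrative timing wrapper for bulk KV calls; swap the println for StatsD/Prometheus.
def timedFuture[T](metric: String)(block: => Future[T])(implicit ec: ExecutionContext): Future[T] = {
  val start = System.nanoTime()
  val fut = block
  fut.onComplete { _ =>
    val elapsedMs = (System.nanoTime() - start) / 1000000
    println(s"$metric took ${elapsedMs} ms")
  }
  fut
}

// e.g. timedFuture("kvstore.bulkPut")(kvStore.multiPut(requests))
```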
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (34)
- .github/workflows/test_scala_no_spark.yaml (1 hunks)
- build.sbt (4 hunks)
- docker-init/Dockerfile (1 hunks)
- docker-init/demo/README.md (1 hunks)
- docker-init/demo/load_summaries.sh (1 hunks)
- docker-init/demo/log4j2.properties (0 hunks)
- docker-init/start.sh (0 hunks)
- hub/app/controllers/InMemKVStoreController.scala (1 hunks)
- hub/app/controllers/JoinController.scala (1 hunks)
- hub/app/controllers/ModelController.scala (1 hunks)
- hub/app/controllers/Paginate.scala (1 hunks)
- hub/app/controllers/SearchController.scala (2 hunks)
- hub/app/controllers/TimeSeriesController.scala (3 hunks)
- hub/app/model/Model.scala (2 hunks)
- hub/app/module/DriftStoreModule.scala (1 hunks)
- hub/app/module/DynamoDBModule.scala (0 hunks)
- hub/app/module/InMemoryKVStoreModule.scala (1 hunks)
- hub/app/module/ModelStoreModule.scala (1 hunks)
- hub/app/store/MonitoringModelStore.scala (4 hunks)
- hub/conf/application.conf (1 hunks)
- hub/conf/routes (1 hunks)
- hub/test/controllers/JoinControllerSpec.scala (1 hunks)
- hub/test/controllers/ModelControllerSpec.scala (3 hunks)
- hub/test/controllers/SearchControllerSpec.scala (4 hunks)
- hub/test/controllers/TimeSeriesControllerSpec.scala (4 hunks)
- hub/test/store/MonitoringModelStoreTest.scala (2 hunks)
- online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (1 hunks)
- online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala (1 hunks)
- online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (4 hunks)
- online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/scripts/DataServer.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala (2 hunks)
💤 Files with no reviewable changes (3)
- docker-init/demo/log4j2.properties
- docker-init/start.sh
- hub/app/module/DynamoDBModule.scala
🚧 Files skipped from review as they are similar to previous changes (19)
- .github/workflows/test_scala_no_spark.yaml
- docker-init/Dockerfile
- hub/app/controllers/InMemKVStoreController.scala
- hub/app/controllers/JoinController.scala
- hub/app/controllers/ModelController.scala
- hub/app/controllers/Paginate.scala
- hub/app/module/DriftStoreModule.scala
- hub/app/module/InMemoryKVStoreModule.scala
- hub/app/module/ModelStoreModule.scala
- hub/app/store/MonitoringModelStore.scala
- hub/conf/application.conf
- hub/test/controllers/JoinControllerSpec.scala
- hub/test/controllers/ModelControllerSpec.scala
- hub/test/controllers/SearchControllerSpec.scala
- hub/test/store/MonitoringModelStoreTest.scala
- online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala
- online/src/main/scala/ai/chronon/online/stats/TileDriftCalculator.scala
- spark/src/main/scala/ai/chronon/spark/Driver.scala
- spark/src/main/scala/ai/chronon/spark/stats/drift/Summarizer.scala
🧰 Additional context used
📓 Learnings (1)
hub/test/controllers/TimeSeriesControllerSpec.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#22
File: hub/test/controllers/TimeSeriesControllerSpec.scala:40-49
Timestamp: 2024-11-12T09:38:33.532Z
Learning: When testing with mocked data, it's acceptable to not add detailed assertions on the data contents.
🪛 LanguageTool
docker-init/demo/README.md
[grammar] ~20-~20: Replace ‘exec’ with an appropriate verb.
Context: ...n build.sh once, and you can repeatedly exec to quickly visualize In first termina...
(MD_NN)
[uncategorized] ~22-~22: You might be missing the article “the” here.
Context: ...peatedly exec to quickly visualize In first terminal: sbt spark/assembly
In seco...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~23-~23: You might be missing the article “the” here.
Context: ...irst terminal: sbt spark/assembly
In second terminal: ./run.sh
to load the built ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~24-~24: You might be missing the article “the” here.
Context: ...and serve the data on localhost:8181 In third terminal: streamlit run viz.py
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 Shellcheck (0.10.0)
docker-init/demo/load_summaries.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🔇 Additional comments (33)
docker-init/demo/load_summaries.sh (2)
1-2
: Add shell script essentials
The script is missing crucial shell script components that were previously flagged.
Apply this diff to improve the script header:
+#!/bin/bash
+
+# Script: load_summaries.sh
+# Purpose: Loads summary data for ObservabilityDemo using Spark
+#
+# Requirements:
+# - Docker and docker-compose must be running
+# - Containers must be up (see docker-init/demo/README.md)
+#
+# Usage: ./load_summaries.sh
+
# Kick off the ObsDemo spark job in the app container
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
3-12
: 🛠️ Refactor suggestion
Optimize Spark configuration for demo environment
There are several issues with the current Spark configuration:
- Memory settings are duplicated (
--driver-memory
andspark.driver.memory
) - The allocated memory (8GB) might be excessive for a demo environment
- The class path and JAR locations should be verified
Suggested improvements:
docker-compose -f docker-init/compose.yaml exec app /opt/spark/bin/spark-submit \
--master "local[*]" \
- --driver-memory 8g \
+ --driver-memory "${DRIVER_MEMORY:-1g}" \
--conf "spark.driver.maxResultSize=6g" \
- --conf "spark.driver.memory=8g" \
--driver-class-path "/opt/spark/jars/*:/app/cli/*" \
--conf "spark.driver.host=localhost" \
--conf "spark.driver.bindAddress=0.0.0.0" \
--class ai.chronon.spark.scripts.ObservabilityDemoDataLoader \
/app/cli/spark.jar
Consider:
- Using environment variables for memory configuration
- Removing the duplicate memory setting
- Adding error handling for missing JARs or class
docker-init/demo/README.md (1)
12-17
: Documentation needs more details about the data loading process
The data loading section requires more detailed information about the process, expected duration, verification steps, and troubleshooting guidance.
hub/app/controllers/SearchController.scala (1)
5-8
: Verify MonitoringModelStore binding in dependency injection module
The transition from DynamoDBMonitoringStore to MonitoringModelStore looks good and aligns with the PR objectives to address performance issues. However, we should verify the DI setup.
Also applies to: 15-15
✅ Verification successful
Based on the analysis of the code, I can now generate the final response:
DI setup for MonitoringModelStore is correctly configured
The dependency injection setup is properly implemented:
ModelStoreModule
correctly bindsMonitoringModelStore
using the new abstraction- The transition from DynamoDB is handled through the
AwsApiImpl
which provides the underlying KVStore implementation - The store abstraction allows for different KVStore implementations (MongoKvStore, HTTPKVStore, etc.) making it flexible for testing and future changes
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the MonitoringModelStore binding in the DI module
# Expected: Find the module configuration that binds MonitoringModelStore
# Search for the DI module configuration
rg -t scala "bind.*MonitoringModelStore"
# Verify no remaining references to DynamoDBMonitoringStore
rg -t scala "DynamoDBMonitoringStore"
Length of output: 194
Script:
#!/bin/bash
# Let's verify the complete DI setup and implementation
# 1. Check the ModelStoreModule implementation
cat hub/app/module/ModelStoreModule.scala
# 2. Check if there are any remaining DynamoDB related imports or references
rg -t scala "DynamoDB|dynamodb" -A 2 -B 2
# 3. Look for the MonitoringModelStore trait/class definition
ast-grep --pattern 'trait MonitoringModelStore {
$$$
}'
ast-grep --pattern 'class MonitoringModelStore {
$$$
}'
Length of output: 20441
Script:
#!/bin/bash
# Let's check the MonitoringModelStore implementation and its relationship with DynamoDB
cat hub/app/store/MonitoringModelStore.scala
# Also check if there are any other store implementations that might be relevant
fd -t f ".*Store.scala" -x cat {}
Length of output: 44933
hub/app/model/Model.scala (1)
57-60
: Clarify the distinction between List and Search responses
A few concerns about the response classes:
- The comment "Currently search only covers joins" seems to conflict with having
ListModelResponse
SearchJoinResponse
andListJoinResponse
have identical structures. Consider:- Merging them if they serve the same purpose
- Adding documentation to clarify their distinct use cases if they differ
Let's check how these response types are used:
online/src/main/scala/ai/chronon/online/HTTPKVStore.scala (6)
10-11
: Remove or clarify the "hacky" designation
Based on the PR objectives, this code is used in production for drift metrics computation. The "hacky test" designation is inappropriate for production code.
13-14
: Consider making backend lazy or transient to avoid serialization issues
Given this code's integration with Spark (as per PR objectives), the current backend initialization could cause serialization issues in Spark executors.
16-16
: Implement or document unimplemented methods
The multiGet
and bulkPut
methods are marked with ???
. Given this is for production drift metrics computation, these methods need proper implementation or documentation.
Also applies to: 39-39
41-43
: Improve create method implementation
The current implementation silently skips dataset creation which could lead to issues in production.
18-37
:
Add retry mechanism to prevent data loss
Given the PR objectives mention previous issues with DynamoDB ingestion and data loss, this implementation should include a retry mechanism with exponential backoff.
+ import scala.concurrent.duration._
+ import scala.annotation.tailrec
+
override def multiPut(putRequests: collection.Seq[KVStore.PutRequest]): Future[collection.Seq[Boolean]] = {
if (putRequests.isEmpty) {
Future.successful(Seq.empty)
} else {
+ @tailrec
+ def retryWithBackoff(attempts: Int, backoff: Duration): (StatusCode, String) = {
+ if (attempts <= 0) {
+ throw new RuntimeException("Max retry attempts exceeded")
+ }
+
val response = basicRequest
.post(uri"$baseUrl/data")
.header("Content-Type", "application/json")
.body(jsonList(putRequests))
+ .timeout(30.seconds)
.send(backend)
+
+ response.code match {
+ case StatusCode.Ok => (response.code, response.body)
+ case StatusCode.TooManyRequests | StatusCode.ServiceUnavailable =>
+ Thread.sleep(backoff.toMillis)
+ retryWithBackoff(attempts - 1, backoff * 2)
+ case _ => (response.code, response.body)
+ }
+ }
+
+ Future {
+ retryWithBackoff(3, 1.second)
}.map { response =>
- response.code match {
+ response._1 match {
case StatusCode.Ok => Seq(true)
case _ =>
- logger.error(s"HTTP multiPut failed with status ${response.code}: ${response.body}")
+ logger.error(s"HTTP multiPut failed with status ${response._1}: ${response._2}")
Seq(false)
}
}
}
}
Likely invalid or redundant comment.
46-56
:
Add input validation for dataset field
The dataset field is directly interpolated into the JSON string without validation, which could lead to JSON injection vulnerabilities.
def jsonString(request: PutRequest): String = {
+ require(request.dataset.matches("[a-zA-Z0-9_-]+"), "Dataset name contains invalid characters")
+
val keyBase64 = Base64.getEncoder.encodeToString(request.keyBytes)
val valueBase64 = Base64.getEncoder.encodeToString(request.valueBytes)
- s"""{ "keyBytes": "${keyBase64}", "valueBytes": "${valueBase64}", "dataset": "${request.dataset}", "tsMillis": ${request.tsMillis.orNull}}""".stripMargin
+ s"""{ "keyBytes": "${keyBase64}", "valueBytes": "${valueBase64}", "dataset": "${request.dataset.replaceAll("\"", "\\\"")}", "tsMillis": ${request.tsMillis.map(_.toString).getOrElse("null")}}""".stripMargin
}
Likely invalid or redundant comment.
hub/conf/routes (3)
4-5
: LGTM! Well-structured RESTful endpoints for join operations.
The new join-related endpoints follow RESTful conventions and properly handle pagination.
19-20
:
Address memory management concerns and clarify temporary status.
While this endpoint addresses the DynamoDB ingestion issues mentioned in the PR objectives, there are important considerations:
-
Memory Management (previously raised):
- No size limits on the in-memory store
- Missing eviction policy
- No persistence mechanism
- Potential OOM risks with bulk operations
-
Temporary Solution Clarity:
- Define criteria for when this temporary solution will be replaced
- Document data lifecycle and cleanup strategy
- Add monitoring for memory usage
Consider adding configuration parameters to application.conf
:
+ # In-memory store configuration
+ inmemory.store {
+ # Maximum size in MB
+ max-size = 512
+ # Eviction policy: LRU, FIFO, etc.
+ eviction-policy = "LRU"
+ # Cleanup interval in minutes
+ cleanup-interval = 60
+ }
8-9
: Please clarify the plan for the TBD model timeseries endpoint.
While commenting out the endpoint aligns with the transition from a model-based approach, leaving it as "TBD" without a clear plan could create technical debt. Consider:
- Adding a timeline for resolving the TBD status
- Documenting whether this endpoint will be replaced or removed
- Creating a tracking issue for this work
Let's check if this endpoint is still referenced elsewhere:
spark/src/main/scala/ai/chronon/spark/scripts/ObservabilityDemoDataLoader.scala (5)
20-22
: LGTM! Well-structured logger initialization.
The logger is properly initialized using SLF4J with appropriate transient and lazy modifiers to handle serialization correctly.
31-57
: Fix configuration validation issues.
The configuration class has several validation issues that need to be addressed, including invalid default end date and missing validations.
This issue was previously identified in a past review comment. The same concerns about date validation and minimum row count validation remain valid.
67-70
: Avoid hardcoding Spark configuration.
This issue was previously identified in a past review comment. The same concern about hardcoded local mode remains valid.
80-87
: Improve error handling for KVStore initialization.
This issue was previously identified in a past review comment. The same concerns about null function and error handling remain valid.
110-119
: Improve cleanup and error handling.
This issue was previously identified in a past review comment. The same concerns about System.exit and proper cleanup remain valid.
spark/src/main/scala/ai/chronon/spark/scripts/DataServer.scala (1)
39-39
: Verify thread safety of DriftStore.binarySerializer
The code obtains a new serializer instance for each request. Let's verify that the serializer implementation is thread-safe.
✅ Verification successful
Based on the search results, I found the implementation of DriftStore.binarySerializer
in online/src/main/scala/ai/chronon/online/stats/DriftStore.scala
:
lazy val binarySerializer: ThreadLocal[TSerializer] = new ThreadLocal[TSerializer] {
override def initialValue(): TSerializer = new TSerializer()
}
Thread safety verified for DriftStore.binarySerializer
The implementation uses ThreadLocal
which provides thread-safe access by maintaining a separate TSerializer
instance for each thread. This is a correct and thread-safe approach for managing serializer instances that are not thread-safe themselves.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for DriftStore implementation to verify thread safety
ast-grep --pattern 'class DriftStore {
$$$
val binarySerializer = $$$
$$$
}'
# Look for any synchronization or thread-safety related comments/annotations
rg -i "thread|synchronized|concurrent" --type scala
Length of output: 18709
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (4)
15-16
: LGTM: Import changes align with serializer implementation
The imports correctly reference the new binary serializer implementations.
203-211
: LGTM: Efficient thread-safe serializer implementation
Good improvements:
- Thread-local implementation ensures thread safety
- @transient annotation prevents serialization issues
- Binary protocol is more efficient than compact protocol
- Lazy initialization prevents unnecessary object creation
213-234
: Consider making percentile labels more configurable and type-safe
The hard-coded sequence of percentile labels should be made more maintainable and flexible, as suggested in the previous review.
77-78
: Verify thread-safety of serializer usage in multi-get operations
While the thread-local implementation is correct, ensure that the multiGet
operation in kvStore
properly handles concurrent requests, as this is in a performance-critical path.
✅ Verification successful
Based on the analysis of the shell script outputs, I can now generate the final response:
Thread-safety of serializer usage is properly implemented
The serializer usage in DriftStore.scala
is thread-safe because:
- The
binarySerializer
is implemented as aThreadLocal
variable, ensuring each thread gets its own instance - The
get()
method is used correctly to access the thread-local value - The serializer is used in a read-only manner for key serialization within the scope of the method
- The implementation follows the same pattern used in other parts of the codebase for thread-safe serializer handling (e.g.,
ThriftJsonCodec
,AvroCodec
)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for multiGet implementations and concurrent access patterns
ast-grep --pattern 'def multiGet($$$) {
$$$
}'
Length of output: 50
Script:
#!/bin/bash
# Search for multiGet and related concurrent patterns with ripgrep for broader coverage
rg -U "multiGet|ConcurrentMap|synchronized|ThreadLocal" -A 5
Length of output: 20270
Script:
#!/bin/bash
# Look for KVStore implementations and thread safety patterns
rg "trait\s+KVStore|class\s+.*KVStore" -A 10
Length of output: 25348
Script:
#!/bin/bash
# Search for serializer usage patterns
rg "serializer" -A 3
Length of output: 27919
build.sbt (2)
268-271
: LGTM! Comprehensive logging exclusions
The additional log4j exclusions are well-structured and ensure consistent logging behavior across the application.
248-248
: Verify the impact of spark dependency on hub project
Adding spark as a dependency to hub is a significant change that could affect:
- Build times
- Artifact size
- Potential circular dependencies
Let's verify the dependency graph:
✅ Verification successful
Based on the output, I can now generate the final response:
No circular dependencies detected in spark and hub integration
The dependency analysis shows:
- The spark project depends on aggregator and online projects, not on hub
- The hub project depends on cloud_aws and spark, forming a linear dependency chain
- The spark project has proper test and compile scope configurations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for circular dependencies and analyze dependency tree
# Note: This will help identify any potential issues with the new dependency structure
# Check if there are any circular dependencies
sbt "hub/dependencyTree"
# Check the size impact
sbt "hub/assembly"
ls -lh hub/target/scala-*/hub-assembly-*.jar
Length of output: 365
Script:
#!/bin/bash
# Let's analyze the dependency structure using available tools
# Check if spark is referenced in other projects' dependencies
rg -l "dependsOn.*spark.*)"
# Look for any potential circular references where spark depends on hub
rg -l "dependsOn.*hub.*)"
# Check build.sbt for spark project definition and its dependencies
fd "build.sbt" --exec grep -A 10 "lazy val spark" {}
# Look for any other potential dependency relationships
fd "build.sbt" --exec grep -A 5 "dependsOn" {}
Length of output: 3218
hub/test/controllers/TimeSeriesControllerSpec.scala (3)
275-275
: Typo in variable name nullRationChangePercents
The variable name has a typo: "Ration" should be "Ratio".
288-288
: Remove duplicate setNodeName
call
The setNodeName(join)
is called twice on the tileKey
object.
211-214
: 🛠️ Refactor suggestion
Consider edge cases in duration calculation
The expected length calculation assumes that the time difference will always result in whole hours. This might fail for timestamps that don't align perfectly with hour boundaries.
Consider using ceiling division to handle partial hours:
- val expectedLength = DriftStore.percentileLabels.length * expectedHours(startTs, endTs)
+ val expectedLength = DriftStore.percentileLabels.length * ((endTs - startTs + 3600000 - 1) / 3600000)
hub/app/controllers/TimeSeriesController.scala (4)
3-8
: LGTM! Constructor changes align with async processing requirements.
The addition of DriftStore
dependency and ExecutionContext
follows best practices for dependency injection and async processing in Play Framework.
Also applies to: 15-20, 26-28
90-99
: LGTM! Comprehensive error handling with specific messages.
The error handling provides clear, actionable messages for different failure scenarios.
215-220
:
Add null checks to prevent NullPointerException
in data conversion methods
In convertTileDriftSeriesInfoToTimeSeries
, accessing properties without null checks may lead to NullPointerException
.
- val isNumeric =
- tileDriftSeries.percentileDriftSeries.asScala != null && tileDriftSeries.percentileDriftSeries.asScala
- .exists(_ != null)
+ val percentileSeries = Option(tileDriftSeries.percentileDriftSeries).map(_.asScala).getOrElse(Seq.empty)
+ val histogramSeries = Option(tileDriftSeries.histogramDriftSeries).map(_.asScala).getOrElse(Seq.empty)
+ val isNumeric = percentileSeries.nonEmpty && percentileSeries.exists(_ != null)
46-46
: Verify backward compatibility of API changes
The removal of the slice parameter from fetchJoin
and addition of the join parameter to fetchFeature
might impact existing clients.
Also applies to: 53-62
✅ Verification successful
Let me check the actual usage context of these endpoints to better understand the impact.
API changes are safe and backward compatible
The verification shows that:
- The
slice
parameter was never part of thefetchJoin
endpoint signature in the routes or tests - The
join
parameter infetchFeature
is part of a new endpoint path/api/v1/join/:join/feature/:name/timeseries
while the old endpoints are commented out but preserved - All tests have been updated to use the new signatures consistently
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for existing usages of these endpoints
echo "Searching for fetchJoin and fetchFeature endpoint usages..."
rg -l 'fetchJoin.*slice' || rg -l 'fetchFeature[^j]'
Length of output: 297
Script:
#!/bin/bash
# Check the actual endpoint definitions and usages
echo "Checking routes file..."
cat hub/conf/routes | grep -E "fetchJoin|fetchFeature"
echo -e "\nChecking controller implementation..."
rg "fetchFeature" hub/app/controllers/TimeSeriesController.scala -A 5
echo -e "\nChecking test implementations..."
rg "fetchFeature|fetchJoin" hub/test/controllers/TimeSeriesControllerSpec.scala -A 5
Length of output: 8173
# Populate Observability Demo Data
To populate the observability demo data:
* Launch the set of docker containers:
```bash
~/workspace/chronon $ docker-compose -f docker-init/compose.yaml up --build
...
app-1  | [info] 2024-11-26 05:10:45,758 [main] INFO  play.api.Play - Application started (Prod) (no global state)
app-1  | [info] 2024-11-26 05:10:45,958 [main] INFO  play.core.server.AkkaHttpServer - Listening for HTTP on /[0:0:0:0:0:0:0:0]:9000
```
(you can skip the --build if you don't wish to rebuild your code)
🛠️ Refactor suggestion
Enhance Docker setup instructions
The setup instructions should include prerequisites and verification steps.
Add the following sections:
# Populate Observability Demo Data
+
+## Prerequisites
+- Docker v20.10 or higher
+- Docker Compose v2.0 or higher
+- Minimum 4GB RAM available for containers
+
+## Setup
To populate the observability demo data:
* Launch the set of docker containers:
```bash
~/workspace/chronon $ docker-compose -f docker-init/compose.yaml up --build
...
app-1 | [info] 2024-11-26 05:10:45,758 [main] INFO play.api.Play - Application started (Prod) (no global state)
app-1 | [info] 2024-11-26 05:10:45,958 [main] INFO play.core.server.AkkaHttpServer - Listening for HTTP on /[0:0:0:0:0:0:0:0]:9000
(you can skip the --build if you don't wish to rebuild your code)
+
+Expected startup time: ~2-3 minutes
+
+### Verify Setup
+Ensure all containers are running:
+bash +~/workspace/chronon $ docker-compose -f docker-init/compose.yaml ps +
+
+All services should show status as "running".
<!-- This is an auto-generated comment by CodeRabbit -->
# Streamlit local experimentation
run build.sh once, and you can repeatedly exec to quickly visualize

In first terminal: `sbt spark/assembly`
In second terminal: `./run.sh` to load the built jar and serve the data on localhost:8181
In third terminal: `streamlit run viz.py`
🛠️ Refactor suggestion
Restructure and clarify Streamlit instructions
The Streamlit section needs better formatting, clearer instructions, and proper grammar.
Replace the section with:
-# Streamlit local experimentation
-run build.sh once, and you can repeatedly exec to quickly visualize
-
-In first terminal: `sbt spark/assembly`
-In second terminal: `./run.sh` to load the built jar and serve the data on localhost:8181
-In third terminal: `streamlit run viz.py`
+# Local Development with Streamlit
+
+For local experimentation with the Streamlit visualization:
+
+1. Build the Spark assembly (required once):
+ ```bash
+ ~/workspace/chronon $ sbt spark/assembly
+ ```
+
+2. Start the data server:
+ ```bash
+ ~/workspace/chronon $ ./run.sh
+ ```
+ This loads the built JAR and serves data at http://localhost:8181
+
+3. Launch the Streamlit application:
+ ```bash
+ ~/workspace/chronon $ streamlit run viz.py
+ ```
+ Access the visualization at http://localhost:8501
🧰 Tools
🪛 LanguageTool
[grammar] ~20-~20: Replace ‘exec’ with an appropriate verb.
Context: ...n build.sh once, and you can repeatedly exec to quickly visualize In first termina...
(MD_NN)
[uncategorized] ~22-~22: You might be missing the article “the” here.
Context: ...peatedly exec to quickly visualize In first terminal: sbt spark/assembly
In seco...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~23-~23: You might be missing the article “the” here.
Context: ...irst terminal: sbt spark/assembly
In second terminal: ./run.sh
to load the built ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~24-~24: You might be missing the article “the” here.
Context: ...and serve the data on localhost:8181 In third terminal: streamlit run viz.py
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
// a trivial search where we check the join name for similarity with the search term
private def searchRegistry(term: String): Seq[Join] = {
  val joins = monitoringStore.getJoins
  joins.filter(j => j.name.contains(term))
🛠️ Refactor suggestion
Consider enhancing search functionality and error handling
The current implementation has several areas for improvement:
- Case-sensitive search might not be user-friendly
- No error handling for monitoringStore.getJoins
- Potential performance impact of filtering all joins in memory
Consider this improvement:
private def searchRegistry(term: String): Seq[Join] = {
- val joins = monitoringStore.getJoins
- joins.filter(j => j.name.contains(term))
+ try {
+ val normalizedTerm = term.toLowerCase
+ monitoringStore.getJoins.filter(j =>
+ j.name.toLowerCase.contains(normalizedTerm)
+ )
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to search joins: ${e.getMessage}", e)
+ Seq.empty
+ }
}
Committable suggestion skipped: line range outside the PR's diff.
def time(message: String)(block: => Unit): Unit = {
  logger.info(s"$message..".yellow)
  val start = System.currentTimeMillis()
  block
  val end = System.currentTimeMillis()
  logger.info(s"$message took ${end - start} ms".green)
}
🛠️ Refactor suggestion
Add error handling to timing utility.
The timing utility should handle exceptions while preserving timing information.
Consider this improvement:
- def time(message: String)(block: => Unit): Unit = {
+ def time[T](message: String)(block: => T): T = {
logger.info(s"$message..".yellow)
val start = System.currentTimeMillis()
- block
- val end = System.currentTimeMillis()
- logger.info(s"$message took ${end - start} ms".green)
+ try {
+ val result = block
+ val end = System.currentTimeMillis()
+ logger.info(s"$message took ${end - start} ms".green)
+ result
+ } catch {
+ case e: Exception =>
+ val end = System.currentTimeMillis()
+ logger.error(s"$message failed after ${end - start} ms: ${e.getMessage}".red)
+ throw e
+ }
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```scala
def time[T](message: String)(block: => T): T = {
  logger.info(s"$message..".yellow)
  val start = System.currentTimeMillis()
  try {
    val result = block
    val end = System.currentTimeMillis()
    logger.info(s"$message took ${end - start} ms".green)
    result
  } catch {
    case e: Exception =>
      val end = System.currentTimeMillis()
      logger.error(s"$message failed after ${end - start} ms: ${e.getMessage}".red)
      throw e
  }
}
```
val httpKvStoreFunc: () => KVStore = () => {
  // cannot reuse the variable - or serialization error
  val result = new HTTPKVStore()
  result
}
val hubApi = new MockApi(httpKvStoreFunc, namespace)
Ensure thread-safe access to shared KV store.
The KV store initialization and usage pattern could lead to thread safety issues in a concurrent environment.
Consider these improvements:
- Make the KV store thread-safe or ensure synchronized access
- Use connection pooling for HTTP KV store
- Add proper resource cleanup
val httpKvStoreFunc: () => KVStore = () => {
- // cannot reuse the variable - or serialization error
- val result = new HTTPKVStore()
- result
+ synchronized {
+ val store = new HTTPKVStore()
+ Runtime.getRuntime().addShutdownHook(new Thread(() => {
+ store.close()
+ }))
+ store
+ }
}
Committable suggestion skipped: line range outside the PR's diff.
@@ -565,7 +566,7 @@ object Driver {
 lazy val api: Api = impl(serializableProps)

 def metaDataStore =
   new MetadataStore(impl(serializableProps).genKvStore, "ZIPLINE_METADATA", timeoutMillis = 10000)
good catch!
looks great! minor comments
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (11)
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (3)
Line range hint
77-91
: Consider moving serializer.get() inside the Future for better thread safety
While the deserializer is correctly obtained inside the Future's map operation, the serializer is obtained outside. This could lead to thread safety issues if the Future executes in a different thread pool.
Consider this improvement:
- val serializer: TSerializer = binarySerializer.get() val tileKeyMap = tileKeysForJoin(joinConf, None, columnPrefix) val requestContextMap: Map[GetRequest, SummaryRequestContext] = tileKeyMap.flatMap { case (group, keys) => keys.map { key => + val serializer = binarySerializer.get() val keyBytes = serializer.serialize(key) val get = GetRequest(keyBytes, summaryDataset, startTsMillis = startMs, endTsMillis = endMs) get -> SummaryRequestContext(get, key, group) } }
203-211
: LGTM: Thread-safe serialization implementation
The ThreadLocal implementation with @transient lazy val is correct and addresses thread safety concerns. The BinaryProtocol choice is well-justified.
Consider improving the warning comment to be more specific about the compact protocol bug:
- // crazy bug in compact protocol - do not change to compact + // WARNING: TCompactProtocol has known issues with concurrent access and data corruption. + // Use TBinaryProtocol for thread-safe serialization.
213-214
: Consider adding validation and edge case handling
Good implementation that addresses the previous review feedback about making percentile labels programmatic. However, consider these improvements:
```diff
- def breaks(count: Int): Seq[String] = (0 to count).map(_ * (100 / count)).map("p" + _.toString)
+ def breaks(count: Int): Seq[String] = {
+   require(count > 0, "Count must be positive")
+   require(count <= 100, "Count cannot exceed 100")
+
+   if (count == 1) Seq("p0", "p100")
+   else (0 to count).map(i => "p" + (i * (100 / count)).toString)
+ }
```
- Input validation
- Special handling for count=1
- More readable formatting
hub/test/controllers/TimeSeriesControllerSpec.scala (4)
76-84
: Consider adding more specific error assertionsWhile the error handling test is good, consider asserting the error message or response body to ensure the correct error is being propagated.
- status(result) mustBe INTERNAL_SERVER_ERROR + val response = contentAsString(result) + status(result) mustBe INTERNAL_SERVER_ERROR + response must include("Some internal error")
Line range hint
87-110
: Add more specific assertions for drift metricsThe successful case test could benefit from more specific assertions on the drift metrics values to ensure the data transformation is correct.
Consider adding assertions for:
- Specific drift metric values
- Time series continuity
- Data point validity
189-194
: Extract magic numbers into named constantsThe offset of 7 days and other numeric values should be extracted into named constants for better maintainability and clarity.
+ private val DefaultOffsetDays = 7 + private val DefaultPercentileDrift = 0.12 + private val DefaultHistogramDrift = 0.23 + private val DefaultNullRatioChange = 0.25 - val offset = Duration.apply(7, TimeUnit.DAYS) + val offset = Duration.apply(DefaultOffsetDays, TimeUnit.DAYS)
254-282
: Consider parameterizing mock data generationThe mock data generation could be more flexible and maintainable by accepting parameters for the drift values instead of using hard-coded values.
Consider refactoring to:
private def generateDriftSeries( startTs: Long, endTs: Long, join: String, numGroups: Int, numFeaturesPerGroup: Int, + percentileDrift: Double = 0.12, + histogramDrift: Double = 0.23, + nullRatioChange: Double = 0.25 ): Try[Future[Seq[TileDriftSeries]]] = {hub/app/controllers/TimeSeriesController.scala (4)
53-62
: Document the new 'join' parameter in the method's scaladocThe scaladoc comment should be updated to explain the purpose and format of the new 'join' parameter.
90-112
: Simplify error handling with for-comprehensionThe nested pattern matching can be simplified using for-comprehension for better readability.
- (parseOffset(offset), parseAlgorithm(algorithm)) match { - case (None, _) => Future.successful(BadRequest(s"Unable to parse offset - $offset")) - case (_, None) => Future.successful(BadRequest("Invalid drift algorithm. Expect JSD, PSI or Hellinger")) - case (Some(o), Some(driftMetric)) => + for { + o <- Future.fromOption(parseOffset(offset))(BadRequest(s"Unable to parse offset - $offset")) + driftMetric <- Future.fromOption(parseAlgorithm(algorithm))(BadRequest("Invalid drift algorithm. Expect JSD, PSI or Hellinger")) + result <- { val window = new Window(o.toMinutes.toInt, TimeUnit.MINUTES) val joinPath = name.replaceFirst("\\.", "/") driftStore.getDriftSeries(joinPath, driftMetric, window, startTs, endTs) .fold( ex => Future.successful(InternalServerError(s"Error computing join drift - ${ex.getMessage}")), _.map { driftSeries => // ... rest of the success handling ... } ) + } + } yield result
171-202
: Consider extracting the percentile computation logic

The percentile computation logic is complex and would benefit from being extracted into a separate method for better maintainability.

```diff
+ private def computePercentiles(
+     currentSeries: Seq[TileSummarySeries],
+     baselineSeries: Seq[TileSummarySeries],
+     metric: Metric,
+     name: String
+ ): ComparedFeatureTimeSeries = {
+   val currentFeatureTs = {
+     if (currentSeries.isEmpty) Seq.empty
+     else convertTileSummarySeriesToTimeSeries(currentSeries.head, metric)
+   }
+   val baselineFeatureTs = {
+     if (baselineSeries.isEmpty) Seq.empty
+     else convertTileSummarySeriesToTimeSeries(baselineSeries.head, metric)
+   }
+   ComparedFeatureTimeSeries(name, baselineFeatureTs, currentFeatureTs)
+ }
```
283-285
: Remove commented out code

The commented-out cases for "skew" and "ooc" should be removed if they are no longer needed. If they are planned for future implementation, consider adding a TODO comment with a ticket reference.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- hub/app/controllers/TimeSeriesController.scala (3 hunks)
- hub/test/controllers/TimeSeriesControllerSpec.scala (4 hunks)
- online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (4 hunks)
🧰 Additional context used
📓 Learnings (1)
hub/test/controllers/TimeSeriesControllerSpec.scala (1)
Learnt from: piyush-zlai
PR: zipline-ai/chronon#22
File: hub/test/controllers/TimeSeriesControllerSpec.scala:40-49
Timestamp: 2024-11-12T09:38:33.532Z
Learning: When testing with mocked data, it's acceptable to not add detailed assertions on the data contents.
🔇 Additional comments (7)
online/src/main/scala/ai/chronon/online/stats/DriftStore.scala (1)
15-16
: LGTM: Import changes align with new serialization implementation
The new imports correctly reference the ThreadLocal binary serializers that replace the previous compact serializers.
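For context, the per-thread serializer pattern referred to here usually looks like the sketch below; this is illustrative and not the DriftStore code itself, and the object/val names are assumptions:

```scala
import org.apache.thrift.protocol.TBinaryProtocol
import org.apache.thrift.{TDeserializer, TSerializer}

object BinarySerdes {
  // Thrift (de)serializers are not thread-safe, so each thread gets its own instance.
  val serializer: ThreadLocal[TSerializer] =
    ThreadLocal.withInitial(() => new TSerializer(new TBinaryProtocol.Factory()))

  val deserializer: ThreadLocal[TDeserializer] =
    ThreadLocal.withInitial(() => new TDeserializer(new TBinaryProtocol.Factory()))
}
```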
hub/test/controllers/TimeSeriesControllerSpec.scala (3)
Line range hint 3-42
: LGTM: Well-structured test setup with proper mocking
The test setup follows best practices with clear organization of imports and proper initialization of mocked components.
275-275
: Fix typo in variable name
The variable name has a typo: nullRationChangePercents should be nullRatioChangePercents.
288-288
: Remove duplicate setNodeName call
The setNodeName(join) is called twice on the tileKey object.
hub/app/controllers/TimeSeriesController.scala (3)
26-28
: LGTM: Clean dependency injection setup
The controller's constructor properly uses dependency injection for DriftStore and follows Play Framework best practices with implicit ExecutionContext.
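As a reference point, the injected-constructor shape being praised here typically looks like the following in Play; the parameter list is a sketch under assumed names, not the controller's exact signature:

```scala
import javax.inject.{Inject, Singleton}

import scala.concurrent.ExecutionContext

import ai.chronon.online.stats.DriftStore
import play.api.mvc.{AbstractController, ControllerComponents}

// Illustrative shape only; async actions map over Futures using the injected ExecutionContext.
@Singleton
class TimeSeriesControllerSketch @Inject() (cc: ControllerComponents, driftStore: DriftStore)(implicit
    ec: ExecutionContext
) extends AbstractController(cc)
```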
215-220
: Add null checks to prevent NullPointerException

This is a duplicate of a previous review comment. The null handling issues for percentileDriftSeries and histogramDriftSeries still need to be addressed.
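A minimal sketch of the kind of guard meant here, assuming the two series values can arrive as null from the upstream structs (the helper name and usage are illustrative, not the controller's actual code):

```scala
// Hypothetical helper: treat a null sequence as empty before mapping over it,
// so a missing percentile or histogram series cannot throw a NullPointerException.
def orEmpty[A](xs: Seq[A]): Seq[A] = Option(xs).getOrElse(Seq.empty)

// usage sketch: orEmpty(percentileDriftSeries).map(toPoint)
//               orEmpty(histogramDriftSeries).map(toPoint)
```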
71-202
: Verify error handling paths for async operations
Let's verify that all error paths are properly handled and tested.
✅ Verification successful
Based on the results, I can now generate the final response since we have comprehensive information about the error handling patterns in the codebase.
Error handling implementation is comprehensive and well-structured
The error handling for async operations is properly implemented with:
- Consistent validation of input parameters (metric choice, rollup, granularity) with clear error messages
- Proper handling of computation failures in drift calculations
- Detailed error messages for both single and compound failure scenarios
- Appropriate HTTP status codes (400 for client errors, 500 for server errors)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Search for test coverage of error scenarios
rg -A 5 "it should.*error" hub/test/controllers/TimeSeriesControllerSpec.scala

# Search for error handling patterns
ast-grep --pattern 'Future.successful(BadRequest($_))'
ast-grep --pattern 'Future.successful(InternalServerError($_))'
```
Length of output: 3361
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
hub/app/model/Model.scala (1)
57-60
: Add documentation for pagination parameters

The response classes have a consistent structure with offset-based pagination. Consider adding documentation to clarify:
- The expected offset values
- Any limits on the number of items returned
- Pagination behavior when offset exceeds available items
Example documentation:
```diff
+ /** Response for listing models with offset-based pagination
+   * @param offset Zero-based offset for pagination
+   * @param items Sequence of models, limited to maximum of N items
+   */
  case class ListModelResponse(offset: Int, items: Seq[Model])
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- hub/app/model/Model.scala (2 hunks)
- hub/app/store/MonitoringModelStore.scala (4 hunks)
- hub/test/controllers/ModelControllerSpec.scala (3 hunks)
- hub/test/controllers/SearchControllerSpec.scala (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- hub/app/store/MonitoringModelStore.scala
- hub/test/controllers/ModelControllerSpec.scala
- hub/test/controllers/SearchControllerSpec.scala
🔇 Additional comments (1)
hub/app/model/Model.scala (1)
5-10
: LGTM! Verify the optional team field requirement.
The enhanced Join case class with online, production, and team fields creates better consistency with the Model case class structure. However, it's worth noting that team is optional (Option[String]) in Join but required (String) in Model.
Let's verify if this difference in team field requirement is intentional:
✅ Verification successful
Let me analyze the search results to verify the team field requirement in Join case class usage:
- Looking at test cases and actual usage:
  - In hub/test/controllers/SearchControllerSpec.scala and ModelControllerSpec.scala, the team field is consistently used with Some("my_team")
  - In hub/app/store/MonitoringModelStore.scala, the team field is populated from thriftJoin.metaData.team and wrapped in Option()
- Looking at Python API usage:
  - Most Join instantiations don't explicitly set the team field
  - When set, it's optional via the team_override parameter (e.g., in api/py/test/test_join.py)
- Analyzing the pattern:
  - The team field being optional (Option[String]) in Join is consistent with its usage across the codebase
  - It allows flexibility where team information may or may not be available
  - The difference from Model (which has required String) appears intentional for different use cases
The optional team field requirement is correctly implemented.
The Join case class's optional team field aligns with its usage patterns across the codebase, where team information is sometimes provided (wrapped in Some) and sometimes omitted. This design choice provides flexibility while maintaining compatibility with the thrift metadata structure.
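To make the contrast concrete, the two shapes under discussion are roughly as follows; the fields other than online, production, and team are assumptions for illustration only:

```scala
// Sketch of the case classes discussed above (field lists are illustrative, not Model.scala verbatim).
case class Join(name: String, joinFeatures: Seq[String], groupBys: Seq[String],
                online: Boolean, production: Boolean, team: Option[String])

case class Model(name: String, join: Join, online: Boolean, production: Boolean, team: String)
```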
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Check for team field usage across the codebase
# Look for patterns where Join or Model objects are created to understand the team field usage
ast-grep --pattern 'new Join($$$)'
ast-grep --pattern 'Join($$$)'
```
Length of output: 45910
Summary

Builds on a couple of the summary computation PRs and data generation to wire things up so that Hub can serve them.

- Yanked out the mock-data-based endpoints (model perf / drift, join & feature skew) - decided it would be confusing to have a mix of mock and generated data, so we just serve the generated data.
- Dropped a few of the scripts introduced in #87. We bring up our containers the usual way, and we have a script `load_summaries.sh` that we can trigger that leverages the existing app container to load data.
- DDB ingestion was taking too long and we were dropping a lot of data due to rejected execution exceptions. To unblock for now, we've gone with an approach of making a bulk put HTTP call from the ObservabilityDemo app -> Hub, with Hub utilizing an InMemoryKV store to persist and serve up features (a rough sketch of that idea follows below).
- Added an endpoint to serve the joins that are configured, as we've switched from the model-based world.

There's still an issue to resolve around fetching individual feature series data. Once I resolve that, we can switch this PR out of wip mode.
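The sketch below illustrates the bulk-put-into-an-in-memory-map idea; it is not the actual Hub `HTTPKVStore` or in-memory store code, and the request shape and names are assumptions:

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical request shape: the demo app POSTs a batch of these to Hub in one call.
case class PutRequest(dataset: String, key: Array[Byte], value: Array[Byte], tsMillis: Long)

// Minimal in-memory store: one concurrent map per dataset, bulk writes applied in a single pass.
class InMemoryKvSketch {
  private val store = TrieMap.empty[String, TrieMap[Seq[Byte], (Array[Byte], Long)]]

  def bulkPut(requests: Seq[PutRequest]): Int = {
    requests.foreach { r =>
      val table = store.getOrElseUpdate(r.dataset, TrieMap.empty)
      table.put(r.key.toSeq, (r.value, r.tsMillis))
    }
    requests.size
  }

  def get(dataset: String, key: Array[Byte]): Option[Array[Byte]] =
    store.get(dataset).flatMap(_.get(key.toSeq)).map(_._1)
}
```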
To test / run:

Start up our docker containers:

```
$ docker-compose -f docker-init/compose.yaml up --build
...
```

In a different term, load data:

```
$ ./docker-init/demo/load_summaries.sh
Done uploading summaries! 🥳
```

You can now curl join & feature time series data.

Join drift (null ratios)

```
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=null&offset=10h&algorithm=psi'
```

Join drift (value drift)

```
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=10h&algorithm=psi'
```

Feature drift:

```
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=aggregates'
```

Feature summaries:

```
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=percentile'
```

Join metadata

```
curl -X GET 'http://localhost:9000/api/v1/joins'
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join'
```
Checklist

- [x] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update
Summary by CodeRabbit

Release Notes

New Features

- Introduced a new `JoinController` for managing joins with pagination support.
- Added functionality for an in-memory key-value store with bulk data upload capabilities.
- Implemented observability demo data loading within a Spark application.
- Added a new `HTTPKVStore` class for remote key-value store interactions over HTTP.

Improvements

- Enhanced the `ModelController` and `SearchController` to align with the new join data structure.
- Updated the `TimeSeriesController` to support asynchronous operations and improved error handling.
- Refined dependency management in the build configuration for better clarity and maintainability.
- Updated API routes to include new endpoints for listing and retrieving joins.
- Updated configuration to replace the `DynamoDBModule` with `ModelStoreModule`, adding `InMemoryKVStoreModule` and `DriftStoreModule`.

Documentation

- Revised README instructions for Docker container setup and demo data loading.
- Updated API routes documentation to reflect new endpoints for joins and in-memory data operations.

Bug Fixes

- Resolved issues related to error handling in various controllers and improved logging for better traceability.