Skip to content

[wip] use native annotation tags #75

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

tchow-zlai
Copy link
Collaborator

@tchow-zlai tchow-zlai commented Nov 21, 2024

Summary

  • Switching to using native tag annotations, as this allows us the full flexibility of the scala test CLI grammar.

For example, now we can do tag exclusions:

sbt sbt "spark/testOnly -- -l ai.chronon.spark.test.FetcherTestTag"

note the -l.

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

  • New Features

    • Introduced new test tags: JoinTestTag, FetcherTestTag, and MutationsTestTag for improved test categorization.
    • Updated test classes to utilize the new tagging mechanism, enhancing clarity and maintainability.
  • Bug Fixes

    • Corrected comments in the FeatureWithLabelJoinTest class for accuracy.
  • Chores

    • Removed the TaggedFilterSuite trait to streamline testing processes.

Copy link

coderabbitai bot commented Nov 21, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request modifies the GitHub Actions workflow for testing Scala and Python, focusing on specific test execution commands in several jobs. It introduces a new Java file defining three annotations for tagging tests, updates existing test classes to use these annotations instead of a previous tagging method, and corrects minor comments in the test files. Additionally, a trait used for conditional test execution has been removed.

Changes

File Path Change Summary
.github/workflows/test_scala_and_python.yaml Updated test execution commands in join_spark_tests, mutation_spark_tests, and fetcher_spark_tests jobs to use new test tags.
spark/src/test/java/ai/chronon/spark/test/Tags.java Introduced Tags.java with annotations: JoinTestTag, FetcherTestTag, and MutationsTestTag for categorizing tests in ScalaTest.
spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala Corrected comment regarding label in FeatureWithLabelJoinTest.
spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala Added @FetcherTestTag annotation to FetcherTest class, replacing the tagName method, and updated comments for running tests.
spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala Added @JoinTestTag annotation to JoinTest class, replacing the tagName method, and included necessary imports.
spark/src/test/scala/ai/chronon/spark/test/MutationsTest.scala Added @MutationsTestTag annotation to MutationsTest class, replacing the tagName method, and updated comments for running tests.
spark/src/test/scala/ai/chronon/spark/test/TaggedFilterSuite.scala Removed the TaggedFilterSuite trait, which was used for conditional test execution based on a specified tag.

Possibly related PRs

Suggested reviewers

  • piyush-zlai
  • nikhil-zlai
  • varant-zlai

Poem

🐰 In the meadow where tests do play,
New tags hop in, brightening the day.
Fetcher and Join, with mutations in tow,
Testing's a breeze, watch our coverage grow!
With a flick of a command, the errors we chase,
In the world of code, we find our place! 🌼


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.ElementType;

@org.scalatest.TagAnnotation
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@piyush-zlai wdyt about this? Reason for the change is that this more natively leverages the scalatest TagAnnotation framework and allows us to do exclusions on the commandline without additional code ie:

sbt "spark/testOnly -- -l ai.chronon.spark.test.JoinTestTag"

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
.github/workflows/test_scala_and_python.yaml (1)

86-87: Update step names to reflect specific test categories.

The step names in these jobs say "Run other spark tests" but they are actually running specific tagged tests (join, mutations, and fetcher tests respectively). Consider updating the names for clarity:

- name: Run other spark tests
+ name: Run join spark tests
- name: Run other spark tests
+ name: Run mutations spark tests
- name: Run other spark tests
+ name: Run fetcher spark tests

Also applies to: 105-106, 124-125

spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala (1)

Line range hint 141-144: Consider enhancing method documentation.

The test helper methods (generateMutationData, generateRandomData, generateEventOnlyData) would benefit from additional documentation explaining:

  • Input parameters and their impact
  • Expected test data patterns
  • Use cases for different test scenarios

Also applies to: 234-237, 467-470

spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala (1)

43-43: Consider removing unused imports if flagged by static analysis.

The import statement includes Ignore which might not be used in this file.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fc7f408 and eb480b5.

📒 Files selected for processing (7)
  • .github/workflows/test_scala_and_python.yaml (3 hunks)
  • spark/src/test/java/ai/chronon/spark/test/Tags.java (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/MutationsTest.scala (1 hunks)
  • spark/src/test/scala/ai/chronon/spark/test/TaggedFilterSuite.scala (0 hunks)
💤 Files with no reviewable changes (1)
  • spark/src/test/scala/ai/chronon/spark/test/TaggedFilterSuite.scala
✅ Files skipped from review due to trivial changes (2)
  • spark/src/test/java/ai/chronon/spark/test/Tags.java
  • spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala
🔇 Additional comments (8)
.github/workflows/test_scala_and_python.yaml (2)

87-87: LGTM! Consistent update to native annotation tags.

The changes correctly update the test tags to use fully qualified annotation class names, which aligns with the transition to native tag annotations. This change improves clarity and maintainability by explicitly referencing the annotation classes.

Also applies to: 106-106, 125-125


86-87: Verify resource allocation strategy.

There are notable differences in resource allocation between test jobs:

  • other_spark_tests uses ubuntu-8_cores-32_gb runner with 24G heap
  • These specific test jobs use ubuntu-latest with 8G heap

Let's verify if this allocation is intentional and optimal.

Also applies to: 105-106, 124-125

✅ Verification successful

Let me gather more information about the test characteristics and memory usage patterns.


Based on the gathered information, I can now provide a final response:

Memory allocation differences are justified by test characteristics

The different memory allocations between test jobs are appropriate because:

  • JoinTest primarily works with in-memory data transformations and Spark operations
  • The other Spark tests (using 24G heap) include more comprehensive test suites that likely process larger datasets and run multiple operations in parallel

The current allocation of 8G heap for JoinTest is sufficient as the test focuses on join operations with in-memory data structures and doesn't show evidence of heavy resource requirements in the implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test execution times and memory usage patterns

# Search for test execution patterns and memory requirements
rg -A 5 "testOnly.*JoinTestTag|testOnly.*MutationsTestTag|testOnly.*FetcherTestTag" --glob '!.github'

# Look for documentation about resource requirements
fd -t f "README|CONTRIBUTING" -X cat {} 2>/dev/null | grep -i "memory\|resource\|requirement"

Length of output: 2190


Script:

#!/bin/bash
# Check the test files for memory-intensive operations and data generation
ast-grep --pattern 'class JoinTest {
  $$$
}'

# Look for other test configurations and memory settings
rg -l "SBT_OPTS.*-Xmx" --glob '!.github/workflows/test_scala_and_python.yaml'

# Check for any performance or resource-related comments in the test files
rg -i "memory|performance|resource|intensive" spark/src/test/scala/

Length of output: 13181

spark/src/test/scala/ai/chronon/spark/test/FetcherTest.scala (2)

67-69: LGTM! Class annotation changes align with native tag transition.

The transition to using @FetcherTestTag annotation and removal of TaggedFilterSuite extension successfully implements the native tag annotation approach.


Line range hint 825-844: Add test coverage for tag-based filtering.

Given the transition to native tag annotations, consider adding test cases that specifically verify:

  • Tag-based test filtering functionality
  • Tag exclusion using the -l option
  • Multiple tag combinations
spark/src/test/scala/ai/chronon/spark/test/MutationsTest.scala (2)

49-49: LGTM: Test execution command updated correctly.

The comment has been updated to use the fully qualified annotation name, which aligns with the new native tag annotations approach.


52-53: LGTM: Test class properly annotated.

The @MutationsTestTag annotation has been correctly added to the test class, replacing the previous tagging mechanism.

Let's verify the annotation definition:

✅ Verification successful

Annotation is properly defined and used as expected

The verification confirms that:

  • The @MutationsTestTag is correctly defined as a Java annotation interface in spark/src/test/java/ai/chronon/spark/test/Tags.java
  • It's properly used in MutationsTest.scala with correct documentation for test execution
  • No unexpected or incorrect usages were found in the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the MutationsTestTag annotation definition and usage

# Check the annotation definition
ast-grep --pattern 'class MutationsTestTag {
  $$$
}'

# Check for any other usages of this tag
rg "MutationsTestTag" -A 2

Length of output: 764

spark/src/test/scala/ai/chronon/spark/test/JoinTest.scala (2)

49-49: Verify consistent annotation usage across test files.

Since this is a WIP PR transitioning to native tag annotations, ensure that:

  1. All test files follow the same pattern of using native annotations
  2. The running instructions are consistently documented

Let's check other test files for annotation usage:

✅ Verification successful

No consistency issues found with test annotations

Based on the search results, there are only a few test files using custom test tags (@JoinTestTag, @MutationsTestTag, @FetcherTestTag), while most test files use standard JUnit annotations (@Test, @Before, @After). The change to use @JoinTestTag in JoinTest.scala appears to be consistent with the existing pattern of using custom test tags where specific test categorization is needed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check annotation usage consistency across test files
# Look for files with @*TestTag annotations
ast-grep --pattern '@$_TestTag'

# Look for files still using the old tagging method
rg 'override def tagName'

Length of output: 61


Script:

#!/bin/bash
# Let's try a broader search for test files and their annotations
# First, find all test files
fd "Test.scala$" --type f

# Then look for any annotations in these files
rg "^[[:space:]]*@\w+" --type scala

# Also check for any imports of test-related annotations
rg "import.*Test.*Tag" --type scala

Length of output: 42368


48-50: LGTM! Clear documentation and proper annotation usage.

The comment clearly documents how to run the test with the new tag syntax, and the @JoinTestTag annotation is correctly applied at the class level.

Let's verify the annotation class exists:

@tchow-zlai tchow-zlai closed this Nov 21, 2024
@tchow-zlai tchow-zlai deleted the tchow/shard-tests branch November 21, 2024 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant