Backport service changes to zipline fork #70
Conversation
Walkthrough

The pull request introduces significant enhancements to the project's build configuration and structure, particularly in the new service module.
Actionable comments posted: 25
🧹 Outside diff range and nitpick comments (30)
service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java (2)
17-17: Consider making the service name configurable

The service name "feature-service" is hardcoded. Consider making it configurable through a configuration file or environment variable for better flexibility across different environments.
- JavaFetcher fetcher = api.buildJavaFetcher("feature-service", false);
+ JavaFetcher fetcher = api.buildJavaFetcher(config.getServiceName(), false);
10-12: Enhance class documentation with Javadoc format

Consider improving the documentation by:
- Using proper Javadoc format
- Adding endpoint details (HTTP methods, request/response formats)
- Including example usage
-// Configures the routes for our get features endpoints
-// We support bulkGets of groupBys and bulkGets of joins
+/**
+ * Configures routes for feature endpoints supporting bulk operations.
+ *
+ * Endpoints:
+ * - POST /groupby/:name - Bulk get operations for groupBy features
+ *   Request body: {"ids": ["id1", "id2"]}
+ *   Response: {"features": [...]}
+ *
+ * - POST /join/:name - Bulk get operations for join features
+ *   Request body: {"ids": ["id1", "id2"]}
+ *   Response: {"features": [...]}
+ */

service/src/main/java/ai/chronon/service/ConfigStore.java (1)
12-17: Enhance class documentation with additional Javadoc tags

The documentation clearly explains the purpose, but could benefit from standard Javadoc tags.
Add the following tags:
 /**
  * Helps keep track of the various Chronon fetcher service configs.
  * We currently read configs once at startup - this makes sense for configs
  * such as the server port and we can revisit / extend things in the future to
  * be able to hot-refresh configs like Vertx supports under the hood.
+ *
+ * @since 1.0
+ * @author piyush-zlai
+ * @throws IllegalStateException if unable to load service config
  */

service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java (3)
16-31: Consider enhancing test coverage and readability

While the basic happy path is well tested, consider these improvements:
- Add test cases for edge cases:
- Empty list
- Multiple items in the list
- Verify actual values, not just types
- Use more descriptive assertion messages
Example improvement:
 @Test
 public void testParsingOfSimpleJavaRequests() {
     String mockRequest = "[{\"user\":\"user1\",\"zip\":10010}]";
     RequestBody mockRequestBody = mock(RequestBody.class);
     when(mockRequestBody.asString()).thenReturn(mockRequest);
     String groupByName = "my_groupby.1";
     JTry<List<JavaRequest>> maybeRequest = FeaturesHandler.parseJavaRequest(groupByName, mockRequestBody);
-    assertTrue(maybeRequest.isSuccess());
+    assertTrue("Request parsing should succeed", maybeRequest.isSuccess());
     List<JavaRequest> reqs = maybeRequest.getValue();
-    assertEquals(1, reqs.size());
+    assertEquals("Should parse exactly one request", 1, reqs.size());
     JavaRequest req = reqs.get(0);
-    assertEquals(req.name, groupByName);
+    assertEquals("Group by name should match input", groupByName, req.name);
-    assertTrue(req.keys.containsKey("user") && req.keys.get("user").getClass().equals(String.class));
+    assertEquals("User value should match input", "user1", req.keys.get("user"));
-    assertTrue(req.keys.containsKey("zip") && req.keys.get("zip").getClass().equals(Integer.class));
+    assertEquals("Zip value should match input", 10010, req.keys.get("zip"));
 }
33-44: Consider strengthening error validation

The error case testing is good, but could be enhanced by:
- Verifying the specific exception type/message
- Testing different JSON error scenarios (e.g., malformed arrays, invalid types)
Example improvement:
 @Test
 public void testParsingInvalidRequest() {
     // mess up the colon after the zip field
     String mockRequest = "[{\"user\":\"user1\",\"zip\"10010}]";
     RequestBody mockRequestBody = mock(RequestBody.class);
     when(mockRequestBody.asString()).thenReturn(mockRequest);
     String groupByName = "my_groupby.1";
     JTry<List<JavaRequest>> maybeRequest = FeaturesHandler.parseJavaRequest(groupByName, mockRequestBody);
-    assertFalse(maybeRequest.isSuccess());
-    assertNotNull(maybeRequest.getException());
+    assertFalse("Parsing should fail for invalid JSON", maybeRequest.isSuccess());
+    assertTrue("Should contain JSON parsing exception",
+        maybeRequest.getException().getMessage().contains("JSON parse error"));
 }
46-56: Consider refactoring test strategy

This test case might be redundant as it doesn't significantly differ from `testParsingInvalidRequest`. Consider either:
- Removing this test if the validation behavior is the same
- Modifying it to test partial success scenarios if the feature supports it
- Adding more assertions about the specific failure mode
If partial success scenarios are not supported, you could remove this test and add a comment in `testParsingInvalidRequest` indicating that any invalid JSON in the array causes complete failure.

service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java (2)
17-23: Enhance documentation with configuration details

While the documentation explains the purpose well, it would be helpful to add the following (a sketch follows the list):
- Required environment variables or configuration parameters
- Example StatsD metric names/formats that will be emitted
- Links to related documentation about the metrics system
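For instance, a Javadoc sketch along these lines would cover the first two points; the property names and tag set below are assumptions inferred from Metrics.Context, not confirmed by this PR:

```java
/**
 * Launches the Chronon feature service with a Micrometer StatsD meter registry
 * wired into the Vert.x runtime before the verticle is deployed.
 *
 * Assumed configuration knobs (illustrative only; verify against Metrics.Context):
 *   -Dai.chronon.metrics.host=localhost   StatsD host, or a Unix socket such as
 *                                         unix:///opt/datadog-agent/run/dogstatsd.sock
 *   -Dai.chronon.metrics.port=8125        StatsD port (0 when a Unix socket is used)
 *
 * Emitted metrics include Vert.x HTTP server metrics tagged with method, code and
 * path, plus standard JVM metrics.
 */
public class ChrononServiceLauncher extends io.vertx.core.Launcher {
    // existing metrics wiring unchanged
}
```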
58-60: Add logging and error handling to main method

Consider enhancing the main method with:
- Startup logging to confirm configuration
- Error handling for dispatch failures
 public static void main(String[] args) {
+    try {
+        // Add logging framework import at the top
+        Logger.info("Starting ChrononServiceLauncher...");
         new ChrononServiceLauncher().dispatch(args);
+        Logger.info("ChrononServiceLauncher started successfully");
+    } catch (Exception e) {
+        Logger.error("Failed to start ChrononServiceLauncher", e);
+        System.exit(1);
+    }
 }

service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java (2)
9-14: Enhance class documentation for better maintainability

Consider adding more details to the JavaDoc:
- Document the response structure and each field's purpose
- Specify thread safety guarantees (the class appears immutable)
- Document null handling behavior and the purpose of @JsonInclude
 /**
  * PoJo capturing the response we return back as part of /v1/features/groupby and /v1/features/join endpoints
  * when the individual bulkGet lookups were either all successful or partially successful.
+ *
+ * This class is immutable and thread-safe. The response contains a list of Results, each representing
+ * the outcome of a feature lookup operation. Null fields are excluded from JSON serialization.
+ *
+ * @see Result
  */
1-105: Consider architectural improvements for better API design

While the overall structure is good, consider these architectural improvements:

- Add a factory method for common response patterns (e.g., `successResponse()`, `failureResponse()`); see the sketch below
- Consider using a more type-safe approach for features (e.g., generic type parameter)
- Add validation annotations (e.g., `@NotNull`) to make API contracts more explicit
- Consider implementing `toString()`, `equals()`, and `hashCode()` for better debugging and testing

Would you like me to provide example implementations for any of these suggestions?
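As an illustration of the first bullet, a minimal factory-method sketch; the nested Result stand-in and its failure helper are assumptions, since the real field layout is not shown in this PR:

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: mirrors the idea of successResponse()/failureResponse() factories.
public class GetFeaturesResponse {
    private final List<Result> results;

    private GetFeaturesResponse(List<Result> results) {
        this.results = results;
    }

    /** All lookups succeeded. */
    public static GetFeaturesResponse successResponse(List<Result> results) {
        return new GetFeaturesResponse(results);
    }

    /** Every lookup failed with the same error message. */
    public static GetFeaturesResponse failureResponse(List<String> entityKeys, String error) {
        List<Result> failed = entityKeys.stream()
                .map(key -> Result.failure(key, error))
                .collect(Collectors.toList());
        return new GetFeaturesResponse(failed);
    }

    /** Stand-in for the existing nested Result type; fields and helper are assumed. */
    public static class Result {
        private final String entityKey;
        private final String error;

        private Result(String entityKey, String error) {
            this.entityKey = entityKey;
            this.error = error;
        }

        public static Result failure(String entityKey, String error) {
            return new Result(entityKey, error);
        }
    }
}
```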
service/src/main/java/ai/chronon/service/WebServiceVerticle.java (4)
13-22: Consider enhancing Javadoc with additional tags

While the documentation is informative, consider adding standard Javadoc tags like @author and @since for better documentation practices.
 /**
  * Entry point for the Chronon webservice. We wire up our API routes and configure and launch our HTTP service here.
  * We choose to use just 1 verticle for now as it allows us to keep things simple and we don't need to scale /
  * independently deploy different endpoint routes.
+ * @author [your-name]
+ * @since [version]
  */
23-27: Consider enhancing error handling and configuration validation

The start method could benefit from:
- Proper error handling around ConfigStore initialization
- Validation of configuration values before server start
- Integration with metrics system for monitoring
 @Override
 public void start(Promise<Void> startPromise) throws Exception {
-    ConfigStore cfgStore = new ConfigStore(vertx);
-    startHttpServer(cfgStore.getServerPort(), cfgStore.encodeConfig(), ApiProvider.buildApi(cfgStore), startPromise);
+    try {
+        ConfigStore cfgStore = new ConfigStore(vertx);
+        int port = cfgStore.getServerPort();
+        if (port <= 0 || port > 65535) {
+            throw new IllegalArgumentException("Invalid port number: " + port);
+        }
+        // Initialize metrics
+        setupMetrics();
+        startHttpServer(port, cfgStore.encodeConfig(), ApiProvider.buildApi(cfgStore), startPromise);
+    } catch (Exception e) {
+        logger.error("Failed to initialize service", e);
+        startPromise.fail(e);
+    }
 }
38-40: Enhance health check response with system status

The current health check response is too basic. Consider including system status information.
 router.get("/ping").handler(ctx -> {
-    ctx.json("Pong!");
+    ctx.json(new JsonObject()
+        .put("status", "UP")
+        .put("timestamp", System.currentTimeMillis())
+        .put("version", getClass().getPackage().getImplementationVersion())
+        .put("memory", Runtime.getRuntime().freeMemory()));
 });
50-54: Move hard-coded values to configuration

The TCP idle timeout is hard-coded. Consider moving it to configuration for better maintainability.
 HttpServerOptions httpOptions = new HttpServerOptions()
     .setTcpKeepAlive(true)
-    .setIdleTimeout(60);
+    .setIdleTimeout(cfgStore.getHttpIdleTimeout());
1-5: Add security considerations section

Since this service exposes feature data through HTTP/gRPC endpoints, consider adding a section about security best practices, authentication, and authorization mechanisms available to protect sensitive feature data.
7-10: Add Vert.x version compatibility information

Consider specifying:
- The Vert.x version being used
- Any version compatibility requirements
- Supported Java versions
52-76: Enhance metrics documentation

Consider adding:
- Description of each metric type and its significance
- Recommended alerting thresholds
- Grafana/dashboard setup instructions
- Performance benchmarks and expected metric ranges
1-76: Improve formatting consistency

Consider these minor improvements:
- Use hyphenation for compound terms (e.g., "gRPC-based")
- Standardize compound words (e.g., "submodule" instead of "sub-module")
- Show command outputs in code blocks for better readability
online/src/main/scala/ai/chronon/online/Metrics.scala (2)
130-132: Consider enhancing configuration management and security

While the Unix socket support is a good addition, consider these improvements:
- Extract property names as constants to prevent typos and enable easier refactoring
- Add validation for the host value
- Log the configured values at startup
- Consider using a more secure configuration mechanism than system properties
 object Context {
+  private val METRICS_HOST_PROPERTY = "ai.chronon.metrics.host"
+  private val METRICS_DEFAULT_HOST = "localhost"
+
   // Host can also be a Unix socket like: unix:///opt/datadog-agent/run/dogstatsd.sock
   // In the unix socket case port is configured to be 0
-  val statsHost: String = System.getProperty("ai.chronon.metrics.host", "localhost")
+  val statsHost: String = {
+    val host = System.getProperty(METRICS_HOST_PROPERTY, METRICS_DEFAULT_HOST)
+    require(host.nonEmpty, s"$METRICS_HOST_PROPERTY must not be empty")
+    logger.info(s"Configured metrics host: $host")
+    host
+  }
141-141: Consider improving testability and error handling

The statsClient initialization could benefit from these architectural improvements:
- Add error handling for connection failures
- Make the client more testable by allowing injection
- Add proper shutdown handling
 object Context {
+  trait MetricsClient {
+    def shutdown(): Unit
+    // ... other methods
+  }
+
+  private class StatsMetricsClient(host: String, port: Int) extends MetricsClient {
+    private val client = try {
+      new NonBlockingStatsDClientBuilder()
+        .prefix("ai.zipline")
+        .hostname(host)
+        .port(port)
+        .build()
+    } catch {
+      case e: Exception =>
+        logger.error(s"Failed to initialize metrics client: ${e.getMessage}", e)
+        throw e
+    }
+
+    override def shutdown(): Unit = client.stop()
+    // ... implement other methods
+  }
+
-  private val statsClient: NonBlockingStatsDClient =
-    new NonBlockingStatsDClientBuilder().prefix("ai.zipline").hostname(statsHost).port(statsPort).build()
+  private val statsClient: MetricsClient = new StatsMetricsClient(statsHost, statsPort)

build.sbt (3)
270-270: Remove duplicate assemblyJarName setting

The assemblyJarName is defined twice (lines 270 and 298). Remove one instance to avoid confusion.
  assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
  assembly / artifact := {
    val art = (assembly / artifact).value
    art.withClassifier(Some("assembly"))
  },
  addArtifact(assembly / artifact, assembly),
  libraryDependencies ++= Seq(
    ...
  ),
- // Assembly settings
- assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
291-292: Update Mockito comment and version

The comment mentions Java 8, but according to the file comments, the project supports Java 11 (Dataproc) and Java 17 (EMR). Consider updating to Mockito 5.x for better Java 11+ support.
-// use mockito 4.x as Chronon builds on Java8
-"org.mockito" % "mockito-core" % "4.11.0" % Test,
+// Mockito 5.x for Java 11+ support
+"org.mockito" % "mockito-core" % "5.8.0" % Test,
306-317: Consider inheriting merge strategy from ThisBuild

The service module's merge strategy largely duplicates the root level strategy. Consider inheriting from ThisBuild's strategy and only override specific cases if needed.
 // Merge strategy for assembly
 assembly / assemblyMergeStrategy := {
-  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
-  case PathList("META-INF", xs @ _*) => MergeStrategy.first
   case PathList("javax", "activation", xs @ _*) => MergeStrategy.first
   case PathList("org", "apache", "logging", xs @ _*) => MergeStrategy.first
   case PathList("org", "slf4j", xs @ _*) => MergeStrategy.first
   case "application.conf" => MergeStrategy.concat
   case "reference.conf" => MergeStrategy.concat
   case x =>
-    val oldStrategy = (assembly / assemblyMergeStrategy).value
-    oldStrategy(x)
+    (ThisBuild / assemblyMergeStrategy).value(x)
 }
36-47: Consider grouping related mocks together

The mock declarations could be organized better by grouping related mocks together. Consider grouping `routingContext` and `requestBody` together, as they are closely related.

 @Mock
 private JavaFetcher mockFetcher;
 @Mock
 private RoutingContext routingContext;
 @Mock
 private HttpServerResponse response;
 @Mock
-RequestBody requestBody;
+private RequestBody requestBody; // Fix inconsistent access modifier
95-95: Extract timer delay to a constant

The timer delay of 1000ms is duplicated across multiple test methods. Consider extracting it to a constant for better maintainability.

+private static final long ASSERTION_DELAY_MS = 1000L;

Then replace all occurrences of 1000 with ASSERTION_DELAY_MS.
Also applies to: 123-123, 153-153, 206-206, 261-261
275-275: Fix double semicolon

There's a double semicolon at the end of the async.complete() call.

-async.complete();;
+async.complete();
287-306: Improve validation method robustness and readability

The validateSuccessfulResponse method could be improved in several ways:
- Add validation for empty results
- Reduce nesting by extracting validation logic
- Use safer type casting with instanceof checks
Consider refactoring like this:
private void validateSuccessfulResponse(JsonObject actualResponse, List<GetFeaturesResponse.Result> expectedResults, TestContext context) {
    context.assertTrue(actualResponse.containsKey("results"), "Response should contain 'results' field");
    JsonArray results = actualResponse.getJsonArray("results");
    context.assertNotNull(results, "Results array should not be null");
    context.assertEquals(results.size(), expectedResults.size(), "Results size mismatch");
    for (int i = 0; i < expectedResults.size(); i++) {
        validateSingleResult(results.getJsonObject(i), expectedResults.get(i), context);
    }
}

private void validateSingleResult(JsonObject result, GetFeaturesResponse.Result expected, TestContext context) {
    Map<String, Object> resultMap = result.getMap();
    context.assertTrue(resultMap.containsKey("status"), "Result should contain status");
    context.assertEquals(resultMap.get("status"), expected.getStatus().name());
    if (expected.getStatus().equals(Success)) {
        validateSuccessResult(resultMap, expected, context);
    } else {
        validateFailureResult(resultMap, expected, context);
    }
}
1-307: Consider adding more error scenarios

While the test coverage is good, consider adding these additional test cases (a sketch of the first follows the list):
- Test with empty request body
- Test with malformed feature map response
- Test with null values in the feature map
- Test concurrent requests handling
Would you like me to help generate these additional test cases?
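As a starting point for the first of these, a sketch that reuses the parsing helper exercised in the SerDe tests above; whether an empty JSON array should be treated as success is an assumption to confirm against the intended handler behavior:

```java
// Would live alongside the existing SerDe tests and reuse their JUnit/Mockito imports.
@Test
public void testParsingEmptyRequestBody() {
    // Assumed behavior: an empty JSON array parses to an empty request list.
    // If empty bodies should be rejected instead, flip these assertions.
    RequestBody mockRequestBody = mock(RequestBody.class);
    when(mockRequestBody.asString()).thenReturn("[]");

    JTry<List<JavaRequest>> maybeRequest =
            FeaturesHandler.parseJavaRequest("my_groupby.1", mockRequestBody);

    assertTrue("Empty array should parse", maybeRequest.isSuccess());
    assertTrue("No requests expected", maybeRequest.getValue().isEmpty());
}
```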
service/src/main/java/ai/chronon/service/ApiProvider.java (2)
43-46: Ensure proper resource management for URLClassLoader

Since `URLClassLoader` implements `Closeable`, it's good practice to close the class loader when it's no longer needed to prevent resource leaks. Consider using a try-with-resources statement to automatically close the class loader:
-URLClassLoader apiClassLoader = new URLClassLoader(
-    new URL[]{jarUrl},
-    ApiProvider.class.getClassLoader()
-);
+try (URLClassLoader apiClassLoader = new URLClassLoader(
+    new URL[]{jarUrl},
+    ApiProvider.class.getClassLoader()
+)) {
+    // Move the code that uses apiClassLoader inside this block
+    // Load and instantiate the API implementation
+    Class<?> apiClass = Class.forName(className, true, apiClassLoader);
+    // ... rest of the code
+}

Make sure to adjust the scope of the class loader usage accordingly.
49-54: Consider the security implications of loading external JARs

Loading classes from external JAR files specified at runtime can introduce security risks. Ensure that the JAR files are from trusted sources, and consider implementing security measures such as code signing verification or input validation to prevent malicious code execution.
Review your security policies and consider adding safeguards to validate the integrity and authenticity of the JAR files being loaded.
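One possible safeguard, sketched under the assumption that a trusted SHA-256 checksum for the API jar is available from configuration (the checksum source and class name are hypothetical, not part of this PR); the check would run before the jar URL is handed to the URLClassLoader:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public final class JarIntegrityCheck {

    /** Throws if the jar's SHA-256 digest does not match the trusted value. */
    public static void verifySha256(Path jarPath, String expectedHex) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(jarPath)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder actualHex = new StringBuilder();
        for (byte b : digest.digest()) {
            actualHex.append(String.format("%02x", b));
        }
        if (!actualHex.toString().equalsIgnoreCase(expectedHex)) {
            throw new IllegalStateException("API jar checksum mismatch: " + jarPath);
        }
    }
}
```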
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
- build.sbt (3 hunks)
- online/src/main/scala/ai/chronon/online/Metrics.scala (2 hunks)
- service/README.md (1 hunks)
- service/src/main/java/ai/chronon/service/ApiProvider.java (1 hunks)
- service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java (1 hunks)
- service/src/main/java/ai/chronon/service/ConfigStore.java (1 hunks)
- service/src/main/java/ai/chronon/service/WebServiceVerticle.java (1 hunks)
- service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (1 hunks)
- service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java (1 hunks)
- service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java (1 hunks)
- service/src/main/resources/example_config.json (1 hunks)
- service/src/main/resources/logback.xml (1 hunks)
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java (1 hunks)
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerTest.java (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- service/src/main/resources/example_config.json
🔇 Additional comments (17)
service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java (1)
19-20
: Consider adding security measures
The endpoints might need:
- Authentication/Authorization checks
- Rate limiting
- Request size limits
Let's check if any security measures are implemented elsewhere:
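To illustrate the request-size and auth items above, a hedged Vert.x sketch; the shared-secret scheme and the 256 KB body limit are assumptions for illustration, not part of this PR:

```java
import io.vertx.core.Vertx;
import io.vertx.ext.web.Router;
import io.vertx.ext.web.handler.BodyHandler;

public class SecuredFeatureRoutes {
    private static final long MAX_BODY_BYTES = 256 * 1024; // assumed cap on request payloads

    public static Router withBasicGuards(Vertx vertx, String expectedToken) {
        Router router = Router.router(vertx);
        // Reject oversized request bodies before they reach the feature handlers
        router.route().handler(BodyHandler.create().setBodyLimit(MAX_BODY_BYTES));
        // Minimal shared-secret check; a real deployment would use proper authn/authz
        router.route("/v1/features/*").handler(ctx -> {
            String header = ctx.request().getHeader("Authorization");
            if (header == null || !header.equals("Bearer " + expectedToken)) {
                ctx.response().setStatusCode(401).end("Unauthorized");
            } else {
                ctx.next();
            }
        });
        return router;
    }
}
```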
service/src/main/resources/logback.xml (3)
7-11
: LGTM! Well-structured console logging pattern
The console appender is well configured with:
- Essential logging fields (timestamp, thread, level, logger)
- Reasonable stack trace limit (10 lines)
46-49
: LGTM! Clean root logger configuration
The root logger is well configured with:
- Dynamic log level from environment
- Both console and file logging through async appenders
31-36
: Review async appender configuration for production readiness
The current async configuration might cause memory issues under heavy load:

- discardingThreshold=0 means no events will be discarded
- neverBlock=true means the queue will keep growing
- queueSize=512 is relatively small for high-throughput scenarios
Consider:

- Setting discardingThreshold=20 (default) to drop DEBUG/TRACE events under pressure
- Increasing queueSize to 2048 for better throughput
- Setting neverBlock=false for back-pressure in critical scenarios
Also applies to: 38-43
service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java (2)
1-14
: LGTM! Well-organized imports and clear class structure.
The package structure and imports are clean and follow Java conventions.
1-57
: Verify the existence and signature of parseJavaRequest method.
The tests look good overall, but let's verify the existence and signature of the method being tested.
✅ Verification successful
The parseJavaRequest method exists with the correct signature and implementation
The method exists in FeaturesHandler.java with the exact signature being tested:
public static JTry<List<JavaRequest>> parseJavaRequest(String name, RequestBody body)
The implementation correctly:
- Takes a name and RequestBody as parameters
- Parses JSON from the request body into a List of Maps
- Converts each map into a JavaRequest
- Returns a JTry containing either the List or a failure
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the parseJavaRequest method exists in FeaturesHandler
# Expected: Should find the method with signature matching the test usage
ast-grep --pattern 'class FeaturesHandler {
$$$
static JTry<List<JavaRequest>> parseJavaRequest(String $_, RequestBody $_) {
$$$
}
$$$
}'
Length of output: 151
Script:
#!/bin/bash
# Let's try a broader search to find the FeaturesHandler class and its method
fd "FeaturesHandler.java" --type f --exec cat {}
Length of output: 5921
Script:
#!/bin/bash
# Let's also search for any file containing parseJavaRequest method using ripgrep
rg "parseJavaRequest" -A 5
Length of output: 3982
Script:
#!/bin/bash
# Search for JavaRequest class usage to understand the context
rg "JavaRequest" --type java
Length of output: 5841
service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java (2)
45-56
: Verify metrics configuration requirements.
The metrics setup looks correct, but please verify:
- Are all required HTTP labels included? (currently method, code, and path)
- Is enabling all JVM metrics appropriate for your use case?
- Consider adding error handling around registry creation
✅ Verification successful
Let me gather more information about the metrics configuration to make a more informed decision.
Let me check one more aspect regarding the Label enum to understand the available HTTP labels.
Based on the gathered information, I can now provide the final response:
Metrics configuration appears robust and well-configured
The current setup is appropriate because:
- The HTTP labels (method, code, path) are standard Vert.x metrics labels and cover the essential HTTP metrics dimensions
- JVM metrics are properly enabled and commonly used for monitoring JVM health
- Error handling for registry creation isn't critical here as:
- StatsdConfig is properly configured with host and port from Metrics.Context
- StatsdMeterRegistry creation failures are handled by Micrometer's internal error handling
The configuration is the only metrics setup in the codebase and follows standard practices.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other metric configurations in the codebase for consistency
rg -l "MicrometerMetricsOptions|MetricsOptions" --type java --type scala
# Look for existing JVM metrics usage
rg "JvmMetrics|jvmMetrics" --type java --type scala
Length of output: 310
Script:
#!/bin/bash
# Check for any metric-related configuration files
fd -e yaml -e properties -e conf -e xml | xargs rg -l "metrics|statsd"
# Look for Label enum or similar classes defining metric labels
ast-grep --pattern 'enum $_ {
$$$
HTTP_METHOD,
$$$
}'
# Search for StatsdMeterRegistry usage and configuration
rg "StatsdMeterRegistry|MeterRegistry" -A 3 --type java
Length of output: 1339
Script:
# Look for Label enum definition to see all available HTTP-related labels
rg "enum Label" -A 10 --type java
# Check if there are any custom metric configurations
rg "StatsdConfig|MeterConfig" -A 5 --type java
Length of output: 1537
1-61
: Verify consistency with original OSS implementation.
Since this is a backport from airbnb/chronon#873, please verify:
- All necessary dependencies have been correctly ported
- The metrics configuration matches the original implementation
- No environment-specific configurations were missed
✅ Verification successful
Let me gather more information about the metrics configuration and dependencies.
Let me check the specific metrics implementation and dependencies to ensure complete verification.
Based on the comprehensive verification of the metrics implementation, I can now provide the final response:
✓ Implementation correctly matches the original OSS version
The backport has been properly implemented with:
- All required dependencies present in build.sbt: vertx-micrometer-metrics and micrometer-registry-statsd
- Correct metrics configuration using the shared Metrics.scala context
- Proper integration with the existing statsd infrastructure through statsHost and statsPort from Metrics.Context
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any additional files that might need porting
rg -l "StatsdConfig|MicrometerMetricsOptions" --type java --type scala
# Look for potential missing configuration files
fd -e yaml -e properties -e conf
Length of output: 458
Script:
#!/bin/bash
# Check build files for metrics-related dependencies
rg -l "micrometer|statsd" build.sbt project/
# Check metrics configuration in application.conf
cat hub/conf/application.conf
# Look for any other metrics-related code
rg -g '!*.{sbt,conf}' -l "metrics|statsd|micrometer"
Length of output: 3740
Script:
#!/bin/bash
# Check the Metrics.scala file referenced in ChrononServiceLauncher
cat online/src/main/scala/ai/chronon/online/Metrics.scala
# Check build.sbt for micrometer dependencies
cat build.sbt
# Look for any other StatsdMeterRegistry usage
rg "StatsdMeterRegistry" --type java --type scala -C 3
Length of output: 23715
service/src/main/java/ai/chronon/service/WebServiceVerticle.java (3)
1-12
: LGTM! Clean and well-organized imports.
The imports are properly organized and all seem necessary for the implemented functionality.
67-83
: LGTM! Well-implemented graceful shutdown.
The stop method properly handles server shutdown with appropriate error handling and logging.
25-26
: Verify integration with dependent components.
Let's verify the integration with FeaturesRouter, ConfigStore, and ApiProvider components.
Also applies to: 35-35
✅ Verification successful
Let me gather more information about the implementations and usages of these components.
Integration with dependent components is properly implemented
The code shows correct integration between all components:
- ConfigStore is properly instantiated and used to provide configuration (port, encoded config)
- ApiProvider.buildApi() correctly accepts the ConfigStore instance
- FeaturesRouter.createFeaturesRoutes() is properly integrated in WebServiceVerticle with the required vertx and api parameters
- WebServiceVerticle creates ConfigStore
- Uses ConfigStore to initialize ApiProvider
- Sets up HTTP routes using FeaturesRouter with the created API instance
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the integration between components
# Check if all required components are present
echo "Checking for required components..."
rg -l "class (FeaturesRouter|ConfigStore|ApiProvider)"
# Verify FeaturesRouter implementation
echo "Checking FeaturesRouter implementation..."
ast-grep --pattern 'class FeaturesRouter {
$$$
static $_ createFeaturesRoutes($_, $_) {
$$$
}
}'
# Verify ConfigStore implementation
echo "Checking ConfigStore implementation..."
ast-grep --pattern 'class ConfigStore {
$$$
int getServerPort() {
$$$
}
String encodeConfig() {
$$$
}
}'
Length of output: 737
Script:
#!/bin/bash
# Check ConfigStore implementation details
echo "=== ConfigStore Implementation ==="
rg "class ConfigStore" -A 5 service/src/main/java/ai/chronon/service/ConfigStore.java
# Check ApiProvider implementation
echo -e "\n=== ApiProvider Implementation ==="
rg "class ApiProvider" -A 10 service/src/main/java/ai/chronon/service/ApiProvider.java
# Check FeaturesRouter implementation
echo -e "\n=== FeaturesRouter Implementation ==="
rg "class FeaturesRouter" -A 10 service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java
# Check usages of these components
echo -e "\n=== Component Usages ==="
rg "new ConfigStore|buildApi|createFeaturesRoutes" -n
Length of output: 2642
service/README.md (1)
19-20
: Verify and future-proof external links
The links to specific files in the airbnb/chronon repository might break in the future. Consider:
- Using relative links within the repository
- Adding code snippets or interface definitions directly in the documentation
build.sbt (3)
55-55
: LGTM: Root project aggregation updated correctly
The service module is properly integrated into the root project's aggregation.
277-288
: Verify Vertx and Micrometer versions compatibility
Ensure the selected versions of Vertx (4.5.10) and Micrometer (1.13.6) are compatible and stable.
✅ Verification successful
Let me gather more information about Micrometer compatibility.
Based on the security vulnerabilities found and the available information, I'll generate the final response:
The current Vertx version 4.5.10 is newer than versions with known vulnerabilities (last reported issue was in 4.5.3). However, for Micrometer integration with Vertx, both libraries are commonly used together without strict version coupling, and the current versions (Vertx 4.5.10 and Micrometer 1.13.6) are stable releases.
- No compatibility issues found between Vertx 4.5.10 and Micrometer 1.13.6
- The Vertx version is secure and up-to-date
- The integration is properly configured with vertx-micrometer-metrics module
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for known vulnerabilities in Vertx and Micrometer
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: MAVEN, package: "io.vertx:vertx-core") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}'
Length of output: 1468
Script:
#!/bin/bash
# Search for Micrometer version requirements in Vertx documentation or configuration
rg -A 5 "micrometer|metrics" --type md
# Check if there are any explicit version constraints in build files
rg -A 5 "micrometer|metrics" --type sbt --type gradle --type xml --type properties
Length of output: 13512
30-31
: Consider updating to newer logging library versions
The specified versions are outdated:
- slf4j 1.7.36 (current stable: 2.0.9)
- logback 1.2.11 (current stable: 1.4.14)
Newer versions include security fixes and performance improvements.
service/src/main/java/ai/chronon/service/ApiProvider.java (2)
49-54: Use specific constructor instead of getConstructors()[0]

Relying on getConstructors()[0] is unsafe because the order of constructors is not guaranteed and may lead to invoking the wrong constructor. It's safer to retrieve the constructor that matches the expected parameter types.
Modify the code to obtain the specific constructor:
-return (Api) apiClass.getConstructors()[0].newInstance(scalaPropsMap);
+Constructor<?> constructor = apiClass.getConstructor(scala.collection.immutable.Map.class);
+return (Api) constructor.newInstance(scalaPropsMap);
This ensures that you are invoking the correct constructor that accepts a scala.collection.immutable.Map as a parameter.
Likely invalid or redundant comment.
56-57: Verify the correctness of Java to Scala Map conversion

Ensure that the method used to convert propsMap from a Java Map to a Scala Map is correct and compatible with the Scala version you are using. There might be compatibility issues between different Scala versions.
Run the following script to verify that the collection converters are correct:
This will help confirm that the collection converters used are compatible with your Scala version.
✅ Verification successful
Java to Scala Map conversion is correctly implemented
The ScalaVersionSpecificCollectionsConverter.convertJavaMapToScala method is correctly implemented using the standard Scala collection converters. The implementation:

- Uses the appropriate asScala converter from scala.jdk.CollectionConverters (Scala 2.12/2.13) or scala.collection.JavaConverters (Scala 2.11)
- Properly converts to an immutable Scala Map using toMap
- Handles the conversion in a type-safe manner with proper generic type parameters
The code is compatible across different Scala versions as evidenced by the version-specific implementations in different source directories (scala-2.11, scala-2.12, scala-2.13).
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the Scala collection converters availability and usage.
# Expected: The appropriate collection converters should be present.
# Check for the correct import statements in the codebase
rg 'import scala\.(collection|jdk)\.'
# List the available collection converter classes
jar tf path/to/scala-library.jar | grep -E 'CollectionConverters|JavaConverters'
Length of output: 12869
Script:
#!/bin/bash
# Let's verify the specific conversion implementation and its compatibility
# Check the implementation of ScalaVersionSpecificCollectionsConverter
fd ScalaVersionSpecificCollectionsConverter.scala --exec cat {}
# Check for any test files related to this converter
fd -t f ".*Test.*\.scala" -X rg -l "ScalaVersionSpecificCollectionsConverter"
Length of output: 6562
public static Router createFeaturesRoutes(Vertx vertx, Api api) {
    Router router = Router.router(vertx);
    router.route().handler(BodyHandler.create());
    JavaFetcher fetcher = api.buildJavaFetcher("feature-service", false);

    router.post("/groupby/:name").handler(new FeaturesHandler(GroupBy, fetcher));
    router.post("/join/:name").handler(new FeaturesHandler(Join, fetcher));

    return router;
}
🛠️ Refactor suggestion
Add error handling and input validation
The current implementation could benefit from:
- Explicit error handling for failed requests
- Input validation for the :name parameter
- Documentation for the false parameter in buildJavaFetcher
public static Router createFeaturesRoutes(Vertx vertx, Api api) {
Router router = Router.router(vertx);
router.route().handler(BodyHandler.create());
- JavaFetcher fetcher = api.buildJavaFetcher("feature-service", false);
+ // Set enableCache to false for real-time feature fetching
+ JavaFetcher fetcher = api.buildJavaFetcher("feature-service", /* enableCache */ false);
+
+ // Add input validation
+ router.route("/*/:name").handler(ctx -> {
+ String name = ctx.pathParam("name");
+ if (name == null || name.trim().isEmpty()) {
+ ctx.response()
+ .setStatusCode(400)
+ .end("Invalid name parameter");
+ return;
+ }
+ ctx.next();
+ });
router.post("/groupby/:name").handler(new FeaturesHandler(GroupBy, fetcher));
router.post("/join/:name").handler(new FeaturesHandler(Join, fetcher));
+
+ // Add error handling
+ router.route().failureHandler(ctx -> {
+ ctx.response()
+ .setStatusCode(500)
+ .end("Internal server error");
+ });
return router;
}
public ConfigStore(Vertx vertx) {
    ConfigRetriever configRetriever = ConfigRetriever.create(vertx);
    configRetriever.getConfig().onComplete(ar -> {
        if (ar.failed()) {
            throw new IllegalStateException("Unable to load service config", ar.cause());
        }
        jsonConfig = ar.result();
    });
}
Fix race condition in constructor and add thread safety
The current implementation has several critical issues:
- The constructor returns before config loading completes
- The jsonConfig field could be null if accessed before loading completes
- The jsonConfig field lacks thread safety guarantees
Consider this safer implementation:
- private JsonObject jsonConfig;
+ private volatile JsonObject jsonConfig;
+ private final Object lock = new Object();
public ConfigStore(Vertx vertx) {
+ // Use CountDownLatch to wait for config loading
+ CountDownLatch latch = new CountDownLatch(1);
ConfigRetriever configRetriever = ConfigRetriever.create(vertx);
configRetriever.getConfig().onComplete(ar -> {
if (ar.failed()) {
throw new IllegalStateException("Unable to load service config", ar.cause());
}
- jsonConfig = ar.result();
+ synchronized (lock) {
+ jsonConfig = ar.result();
+ }
+ latch.countDown();
});
+ try {
+ if (!latch.await(5, TimeUnit.SECONDS)) {
+ throw new IllegalStateException("Timeout waiting for config");
+ }
+ } catch (InterruptedException e) {
+ Thread.currentThread().interrupt();
+ throw new IllegalStateException("Interrupted while loading config", e);
+ }
}
Don't forget to add these imports:
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
public int getServerPort() {
    return jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
}
🛠️ Refactor suggestion
Add port range validation
The server port should be validated to ensure it's within the valid range (0-65535).
public int getServerPort() {
- return jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
+ int port = jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
+ if (port < 0 || port > 65535) {
+ throw new IllegalStateException("Invalid port number: " + port);
+ }
+ return port;
}
Class<?> apiClass = Class.forName(className, true, apiClassLoader);
if (!Api.class.isAssignableFrom(apiClass)) {
    throw new IllegalArgumentException(
        "Class " + className + " does not extend the Api abstract class"
    );
}
🛠️ Refactor suggestion
Handle exceptions when instantiating the API class
When calling newInstance, several exceptions can be thrown (e.g., InstantiationException, IllegalAccessException, InvocationTargetException). Ensure that these exceptions are properly handled or propagated with meaningful messages to aid in debugging.
Wrap the instantiation in a try-catch block and provide informative error messages:
try {
Constructor<?> constructor = apiClass.getConstructor(scala.collection.immutable.Map.class);
return (Api) constructor.newInstance(scalaPropsMap);
} catch (NoSuchMethodException | InstantiationException | IllegalAccessException | InvocationTargetException e) {
throw new RuntimeException("Failed to instantiate API class: " + className, e);
}
maybeFeatureResponses.onFailure(err -> {
    List<String> failureMessages = Collections.singletonList(err.getMessage());
    ctx.response()
        .setStatusCode(500)
        .putHeader("content-type", "application/json")
        .end(new JsonObject().put("errors", failureMessages).encode());
});
Log exception details and avoid leaking internal messages
Currently, the exception err in maybeFeatureResponses.onFailure is not logged, and its message is returned to the client. This may leak sensitive information and hinders debugging.
Apply the following diff to log the exception and send a generic error message to the client:
+ logger.error("Failed to fetch features", err);
- List<String> failureMessages = Collections.singletonList(err.getMessage());
+ List<String> failureMessages = Collections.singletonList("Internal server error.");
ctx.response()
.setStatusCode(500)
.putHeader("content-type", "application/json")
.end(new JsonObject().put("errors", failureMessages).encode());
Committable suggestion skipped: line range outside the PR's diff.
@piyush-zlai is this ready for review?
@tchow-zlai - yeah pretty much. It's a port from the OSS pr. I need to rebase to get all the test fixes in which I can do later but that won't change any of the code in this pr. |
Branch updated from 1fc51f3 to bc3391d.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (5)
service/README.md (4)
9-10: Minor formatting improvement needed

Add a hyphen to "gRPC-based" for better readability.

-web framework which supports HTTP and gRPC based services.
+web framework which supports HTTP and gRPC-based services.
14-28: Enhance setup instructions

Consider the following improvements:
- Add minimum version requirements for sbt and Java
- Use relative paths instead of hardcoded ~/workspace/chronon
- Document common error scenarios and their solutions
- Combine compound words: "submodule" and "subdirectory"
47-49: Improve curl examples formatting

Consider using code blocks with example responses for better readability:

-$ curl 'http://localhost:9000/ping'
-$ curl 'http://localhost:9000/config'
-$ curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}]'
+curl 'http://localhost:9000/ping'
+# Response: pong
+
+curl 'http://localhost:9000/config'
+# Response: {...}
+
+curl -X POST \
+  'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' \
+  -H 'Content-Type: application/json' \
+  -d '[{"user_id": "5"}]'
+# Response: {...}
52-76: Enhance metrics documentation

Consider adding:
- Metrics retention policy
- Aggregation intervals
- Recommended monitoring thresholds
- Dashboard setup instructions
build.sbt (1)
313-324: Consider simplifying merge strategy

The current merge strategy could be simplified by leveraging the root project's strategy.
 assembly / assemblyMergeStrategy := {
   case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
-  case PathList("META-INF", xs @ _*) => MergeStrategy.first
-  case PathList("javax", "activation", xs @ _*) => MergeStrategy.first
-  case PathList("org", "apache", "logging", xs @ _*) => MergeStrategy.first
-  case PathList("org", "slf4j", xs @ _*) => MergeStrategy.first
+  case PathList("META-INF", _*) => MergeStrategy.filterDistinctLines
   case "application.conf" => MergeStrategy.concat
   case "reference.conf" => MergeStrategy.concat
   case x =>
     val oldStrategy = (assembly / assemblyMergeStrategy).value
     oldStrategy(x)
 }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
- build.sbt (3 hunks)
- online/src/main/scala/ai/chronon/online/Metrics.scala (2 hunks)
- service/README.md (1 hunks)
- service/src/main/java/ai/chronon/service/ApiProvider.java (1 hunks)
- service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java (1 hunks)
- service/src/main/java/ai/chronon/service/ConfigStore.java (1 hunks)
- service/src/main/java/ai/chronon/service/WebServiceVerticle.java (1 hunks)
- service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (1 hunks)
- service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java (1 hunks)
- service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java (1 hunks)
- service/src/main/resources/example_config.json (1 hunks)
- service/src/main/resources/logback.xml (1 hunks)
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java (1 hunks)
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- online/src/main/scala/ai/chronon/online/Metrics.scala
- service/src/main/java/ai/chronon/service/ApiProvider.java
- service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java
- service/src/main/java/ai/chronon/service/ConfigStore.java
- service/src/main/java/ai/chronon/service/WebServiceVerticle.java
- service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java
- service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java
- service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java
- service/src/main/resources/example_config.json
- service/src/main/resources/logback.xml
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerTest.java
🔇 Additional comments (4)
build.sbt (4)
57-57
: LGTM!
The service project is correctly added to the root project's aggregate.
269-270
: LGTM with a note
Good use of centralized version management. The implementation is correct, but please note this is affected by the version update recommendation made earlier.
274-276
: Add cross-compilation settings
The previous review comment about adding cross-compilation settings is still valid. The service module should specify cross-compilation settings like other modules.
283-303
: Verify dependency compatibility
The mix of Vert.x, Netty, and Spark dependencies might cause runtime conflicts.
✅ Verification successful
Dependency configuration is correctly managed
The codebase already handles the potential conflicts between Vert.x, Netty, and Spark dependencies:
- The service module explicitly forces Netty version to 4.1.111.Final to resolve conflicts with Spark dependencies, as documented in the comment: "force netty versions -> without this we conflict with the versions pulled in from our online module's spark deps"
- Spark dependencies are properly marked as "provided" in relevant modules using
spark_all_provided
- The build.sbt file includes proper dependency overrides and exclusions to manage conflicts
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for potential dependency conflicts in the service module
# List all dependency versions
sbt 'service/dependencyTree' | grep -E 'vertx|netty|spark'
# Check for multiple versions of the same dependency
sbt 'service/dependencyTree' | sort | uniq -c | sort -nr | grep -E 'vertx|netty|spark' | grep -v '^\s*1\s'
Length of output: 353
Script:
#!/bin/bash
# Let's try a different approach to analyze dependencies
# First, let's check if there are any Spark dependencies in build.sbt
rg -A 3 -B 3 "spark" build.sbt
# Then check for any other build files that might contain Spark dependencies
fd -e sbt . | xargs rg "spark"
# Look for any explicit Netty version overrides or exclusions
rg -i "netty" build.sbt
# Check for any comments or documentation about dependency conflicts
rg -i "conflict|override|exclude" build.sbt
Length of output: 8289
lazy val service = (project in file("service"))
  .dependsOn(online)
  .settings(
    assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
🛠️ Refactor suggestion
Remove duplicate assemblyJarName setting
The assemblyJarName is defined twice in the service project settings.
.settings(
assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
assembly / artifact := {
val art = (assembly / artifact).value
art.withClassifier(Some("assembly"))
},
addArtifact(assembly / artifact, assembly),
libraryDependencies ++= Seq(
// ... dependencies ...
),
- assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
Also applies to: 305-305
good observation - what happens if we don't specify this?
@nikhil-zlai, if we don't specify `assembly / assemblyJarName`, the sbt-assembly plugin will use its default naming convention for the assembled JAR file, which is typically `assembly.jar`. This means the generated JAR won't include the project name or version in its filename.
Not specifying a custom `assemblyJarName` can lead to confusion or potential overwriting of JAR files if multiple projects are assembled, as their artifacts might have the same default name. By specifying:
assembly / assemblyJarName := s"${name.value}-${version.value}.jar"
we ensure that the assembled JAR file includes both the project name and version, making it uniquely identifiable and preventing naming conflicts. This helps maintain clear and consistent artifact naming across different builds.
"com.typesafe" % "config" % "1.4.3", | ||
// force netty versions -> without this we conflict with the versions pulled in from | ||
// our online module's spark deps which causes the web-app to not serve up content | ||
"io.netty" % "netty-all" % "4.1.111.Final", |
Update Netty version
The current Netty version might have security vulnerabilities.
-"io.netty" % "netty-all" % "4.1.111.Final",
+"io.netty" % "netty-all" % "4.1.112.Final",
build.sbt (Outdated)
lazy val slf4jApiVersion = "1.7.36"
lazy val logbackClassicVersion = "1.2.11"
Consider updating to newer versions of logging libraries
The current versions have known vulnerabilities:
- logback-classic 1.2.11 has CVE-2023-6378 and CVE-2023-6481
- slf4j-api 1.7.36 is from an older branch; 2.0.x is the current stable branch
Recommend updating to:
-lazy val slf4jApiVersion = "1.7.36"
-lazy val logbackClassicVersion = "1.2.11"
+lazy val slf4jApiVersion = "2.0.12"
+lazy val logbackClassicVersion = "1.4.14"
service/README.md (Outdated)
We can now use this to start up the feature service:
~/workspace/chronon $ java -jar service/target/scala-2.12/service-*.jar run ai.chronon.service.WebServiceVerticle \
how does this command know of the mongo jar?
that is configured in the example_config.json along with other props for the mongo kv store.
(vertx allows you to configure on the command line or cfg file so we could potentially inline all those configs on the launch command line)
i see. either is fine w/ me. cli would definitely be nicer / more explicit.
yeah agree that it would be nice to wrap this in a command line script for better ergonomics. We'll still want some parameterization there to pass params to configure kv store and such. Lets iterate on this as we start wiring up with k8s, we can get to a nice place based on our needs there + local dev.
service/src/main/java/ai/chronon/service/WebServiceVerticle.java (Outdated, resolved)
bc3391d to b931a6d
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (9)
service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java (1)
9-12: Enhance JavaDoc documentation. While the current documentation explains the purpose, it would be beneficial to include:
- Description of the response structure
- Example usage
- Parameter descriptions for the nested Result class
/** * PoJo capturing the response we return back as part of /v1/features/groupby and /v1/features/join endpoints * when the individual bulkGet lookups were either all successful or partially successful. + * + * The response contains a list of Result objects, each representing the outcome of a single lookup: + * - For successful lookups: contains features and entity keys + * - For failed lookups: contains error message + * + * Example: + * { + * "results": [ + * {"status": "Success", "features": {...}, "entityKeys": {...}}, + * {"status": "Failure", "error": "Not found"} + * ] + * } */service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (2)
47-55: Consider making EntityKeyToValues immutable
The inner class EntityKeyToValues has public fields which could be modified after construction. Consider making the fields final and private with getter methods.
 private static class EntityKeyToValues {
-    public Map<String, Object> entityKeys;
-    public JTry<Map<String, Object>> features;
+    private final Map<String, Object> entityKeys;
+    private final JTry<Map<String, Object>> features;
     public EntityKeyToValues(Map<String, Object> keys, JTry<Map<String, Object>> values) {
         this.entityKeys = keys;
         this.features = values;
     }
+    public Map<String, Object> getEntityKeys() {
+        return entityKeys;
+    }
+    public JTry<Map<String, Object>> getFeatures() {
+        return features;
+    }
 }
39-44: Consider splitting GroupBy and Join handlers
The current design handles both GroupBy and Join operations in a single class. As these features evolve, they might require different handling logic, validation rules, or error handling. Consider splitting them into separate handler classes that implement a common interface.
This would improve:
- Separation of concerns
- Maintainability
- Testability
- Future extensibility
Example interface:
public interface FeatureHandler {
    void handle(RoutingContext ctx);
}

public class GroupByHandler implements FeatureHandler {
    // Specific implementation for GroupBy
}

public class JoinHandler implements FeatureHandler {
    // Specific implementation for Join
}
46-48: Consider making the config load timeout configurable
The 1-second timeout for waiting on the configuration load might be too short in some environments. Consider making this timeout configurable to enhance flexibility.
Apply this diff to make the timeout configurable:
private static final int DEFAULT_PORT = 8080; +private static final long CONFIG_LOAD_TIMEOUT_SECONDS = 1; //... try { if (!latch.await(CONFIG_LOAD_TIMEOUT_SECONDS, TimeUnit.SECONDS)) { throw new IllegalStateException("Timed out waiting for Vertx config read"); } } catch (InterruptedException e) { Thread.currentThread().interrupt(); throw new IllegalStateException("Interrupted while loading config", e); }
service/src/main/java/ai/chronon/service/ApiProvider.java (1)
42-46: Ensure proper resource management of URLClassLoader
Consider closing the URLClassLoader after loading the API class to prevent potential resource leaks.
Apply this diff to close the class loader:
URLClassLoader apiClassLoader = new URLClassLoader( new URL[]{jarUrl}, ApiProvider.class.getClassLoader() ); +try { // Load and instantiate the API implementation Class<?> apiClass = Class.forName(className, true, apiClassLoader); // ... existing code ... +} finally { + apiClassLoader.close(); +}Ensure that closing the class loader does not interfere with the application's functionality.
service/README.md (4)
10-10: Typographical correction: Hyphenate 'gRPC-based services'. In line 10, "gRPC based services" should be hyphenated as "gRPC-based services" for correct grammar.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~10-~10: This expression is usually spelled with a hyphen.
Context: ...e web framework which supports HTTP and gRPC based services. ## Running locally To buil...(BASED_HYPHEN)
14-14: Typographical correction: Combine 'sub-module' into 'submodule'. In line 14, "sub-module" is usually written as one word: "submodule".
🧰 Tools
🪛 LanguageTool
[misspelling] ~14-~14: This word is normally spelled as one.
Context: ...# Running locally To build the service sub-module: ```bash ~/workspace/chronon $ sbt "pro...(EN_COMPOUNDS_SUB_MODULE)
27-27: Typographical correction: Combine 'sub-directory' into 'subdirectory'. In line 27, "sub-directory" should be written as "subdirectory".
🧰 Tools
🪛 LanguageTool
[misspelling] ~27-~27: This word is normally spelled as one.
Context: ...ite out a file in the target/scala-2.12 sub-directory. We can now use this to start up the f...(EN_COMPOUNDS_SUB_DIRECTORY)
47-49: Style improvement: Remove dollar signs from command examples. To enhance readability and allow easy copying of commands, consider removing the dollar signs from the command examples.
Apply this diff:
$ curl 'http://localhost:9000/ping'
$ curl 'http://localhost:9000/config'
$ curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}]'
+curl 'http://localhost:9000/ping'
+curl 'http://localhost:9000/config'
+curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}]'
And in line 82:
$ curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}, {"user_id": "7"}]'
+curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}, {"user_id": "7"}]'
$ curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}, {"user_id": "7"}]' +curl -X POST 'http://localhost:9000/v1/features/join/quickstart%2Ftraining_set.v2' -H 'Content-Type: application/json' -d '[{"user_id": "5"}, {"user_id": "7"}]'
Also applies to: 82-82
🧰 Tools
🪛 Markdownlint (0.35.0)
47-47: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
48-48: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
49-49: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
build.sbt
(3 hunks)online/src/main/scala/ai/chronon/online/Metrics.scala
(2 hunks)service/README.md
(1 hunks)service/src/main/java/ai/chronon/service/ApiProvider.java
(1 hunks)service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java
(1 hunks)service/src/main/java/ai/chronon/service/ConfigStore.java
(1 hunks)service/src/main/java/ai/chronon/service/WebServiceVerticle.java
(1 hunks)service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java
(1 hunks)service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java
(1 hunks)service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java
(1 hunks)service/src/main/resources/example_config.json
(1 hunks)service/src/main/resources/logback.xml
(1 hunks)service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java
(1 hunks)service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerTest.java
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- service/src/main/resources/example_config.json
- online/src/main/scala/ai/chronon/online/Metrics.scala
- service/src/main/java/ai/chronon/service/handlers/FeaturesRouter.java
- service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java
- service/src/main/resources/logback.xml
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerJsonSerDeTest.java
- service/src/main/java/ai/chronon/service/WebServiceVerticle.java
- service/src/test/java/ai/chronon/service/handlers/FeaturesHandlerTest.java
🧰 Additional context used
📓 Learnings (1)
service/src/main/java/ai/chronon/service/ApiProvider.java (2)
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ApiProvider.java:6-6
Timestamp: 2024-12-03T04:04:33.809Z
Learning: The import `scala.util.ScalaVersionSpecificCollectionsConverter` in `service/src/main/java/ai/chronon/service/ApiProvider.java` is correct and should not be flagged in future reviews.
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ApiProvider.java:25-25
Timestamp: 2024-12-03T04:04:28.349Z
Learning: In `ApiProvider.java`, the method `buildApi` should continue to throw `Exception` rather than specific exceptions.
🪛 LanguageTool
service/README.md
[uncategorized] ~10-~10: This expression is usually spelled with a hyphen.
Context: ...e web framework which supports HTTP and gRPC based services. ## Running locally To buil...
(BASED_HYPHEN)
[misspelling] ~14-~14: This word is normally spelled as one.
Context: ...# Running locally To build the service sub-module: ```bash ~/workspace/chronon $ sbt "pro...
(EN_COMPOUNDS_SUB_MODULE)
[misspelling] ~27-~27: This word is normally spelled as one.
Context: ...ite out a file in the target/scala-2.12 sub-directory. We can now use this to start up the f...
(EN_COMPOUNDS_SUB_DIRECTORY)
[uncategorized] ~86-~86: You might be missing the article “the” here.
Context: ...200 (some / all successful lookups). In case of the 200 response, the payload looks ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 Markdownlint (0.35.0)
service/README.md
47-47: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
48-48: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
49-49: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
82-82: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
🔇 Additional comments (12)
service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java (5)
15-19
: Protect against list modification for true immutability.
While the results field is final, its contents can still be modified. Consider defensive copying to ensure true immutability.
21-23
: Prevent list modification via getter.
The getter returns a direct reference to the internal list, which could be modified by clients.
32-35
: Add null check in builder's results setter.
The results setter should validate its input to prevent NullPointerException.
59-64
: Add validation in Result constructor.
The Result constructor should validate its state:
- Status should be required
- Error should only be present for Failure status
- Features map should be immutable
70-72
: Prevent map modification via getter.
The features and entityKeys getters should return an unmodifiable view of the maps; a short sketch of this pattern follows this list of comments.
Also applies to: 74-76
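A minimal sketch of the defensive-copy / unmodifiable-view pattern that these immutability comments are asking for. The class and field names below are illustrative stand-ins, not the actual GetFeaturesResponse implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

public class ResponseSketch {
    private final List<String> results;

    public ResponseSketch(List<String> results) {
        // Defensive copy: later changes to the caller's list cannot leak into this object.
        this.results = new ArrayList<>(Objects.requireNonNull(results, "results cannot be null"));
    }

    public List<String> getResults() {
        // Unmodifiable view: callers can read but cannot mutate the internal list.
        return Collections.unmodifiableList(results);
    }
}
```

The same idea applies to the features and entityKeys maps via Collections.unmodifiableMap.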
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (1)
74-79
: Avoid exposing internal exception messages in HTTP responses
This issue was previously identified and the recommendation still stands. Replace the exception message with a generic error message.
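A hedged sketch of what that recommendation amounts to in a Vert.x handler; the helper name and the message wording are placeholders, and a real handler should log the underlying exception server-side before replying:

```java
import io.vertx.core.json.JsonObject;
import io.vertx.ext.web.RoutingContext;

import java.util.Collections;

public final class GenericErrorSketch {
    // Reply with a generic message; keep the real cause in server-side logs only.
    static void replyWithGenericError(RoutingContext ctx) {
        ctx.response()
                .setStatusCode(500)
                .putHeader("content-type", "application/json")
                .end(new JsonObject()
                        .put("errors", Collections.singletonList("Internal server error."))
                        .encode());
    }
}
```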
service/src/main/java/ai/chronon/service/ConfigStore.java (2)
55-57: Add port range validation in getServerPort()
The method getServerPort() should validate the port number to ensure it is within the valid range (0-65535) to prevent potential errors or security issues.
Apply this diff to add port range validation:
public int getServerPort() {
int port = jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
+ if (port < 0 || port > 65535) {
+ throw new IllegalStateException("Invalid port number: " + port);
+ }
return port;
}
79-81: Mask sensitive information in encodeConfig()
The encodeConfig() method may expose sensitive configuration data. Consider masking or removing sensitive fields before encoding the configuration to prevent accidental exposure.
Apply this diff to sanitize sensitive fields:
public String encodeConfig() {
+ synchronized (lock) {
+ // Create a copy and remove sensitive fields
+ JsonObject sanitized = jsonConfig.copy();
+ // List of sensitive keys to mask
+ String[] sensitiveKeys = {ONLINE_API_PROPS};
+ for (String key : sensitiveKeys) {
+ if (sanitized.containsKey(key)) {
+ sanitized.put(key, "****");
+ }
+ }
+ return sanitized.encodePrettily();
+ }
}
build.sbt (4)
291-291: Remove duplicate assemblyJarName setting
The assemblyJarName is defined twice in the service project settings. Remove the duplicate to clean up the build configuration.
Apply this diff:
.settings(
assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
assembly / artifact := {
val art = (assembly / artifact).value
art.withClassifier(Some("assembly"))
},
addArtifact(assembly / artifact, assembly),
libraryDependencies ++= Seq(
// ... dependencies ...
),
- assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
Also applies to: 319-319
290-290: Add cross-compilation settings to the service project
To ensure compatibility across Scala versions, the service project should specify crossScalaVersions like other modules.
Apply this diff:
.settings(
+ crossScalaVersions := supportedVersions,
assembly / assemblyJarName := s"${name.value}-${version.value}.jar",
33-34: Update logging library versions to address security vulnerabilities
The current versions of slf4j-api (1.7.36) and logback-classic (1.2.11) have known security vulnerabilities. Consider updating to the latest stable versions.
Apply this diff:
lazy val slf4jApiVersion = "1.7.36"
lazy val logbackClassicVersion = "1.2.11"
+lazy val slf4jApiVersion = "2.0.12"
+lazy val logbackClassicVersion = "1.4.14"
306-306: Update Netty version to address potential security vulnerabilities
The current Netty version "4.1.111.Final" may have known security issues. Consider updating to a newer version, such as "4.1.112.Final", to enhance security.
Apply this diff:
"io.netty" % "netty-all" % "4.1.111.Final",
+"io.netty" % "netty-all" % "4.1.112.Final",
public Builder addResult(Result result) {
    this.results.add(result);
    return this;
}
Add null validation in addResult method.
The addResult method should validate its input parameter.
public Builder addResult(Result result) {
+ if (result == null) {
+ throw new NullPointerException("result cannot be null");
+ }
this.results.add(result);
return this;
}
public Builder status(Status status) {
    this.status = status;
    return this;
}

public Builder features(Map<String, Object> features) {
    this.features = features;
    return this;
}

public Builder entityKeys(Map<String, Object> entityKeys) {
    this.entityKeys = entityKeys;
    return this;
}

public Builder error(String error) {
    this.error = error;
    return this;
}
Add validation in Result builder methods.
The builder methods should validate their inputs to prevent invalid states.
public Builder status(Status status) {
+ if (status == null) {
+ throw new NullPointerException("status cannot be null");
+ }
this.status = status;
return this;
}
public Builder features(Map<String, Object> features) {
+ if (features != null) {
+ features.values().forEach(v -> {
+ if (v == null) {
+ throw new IllegalArgumentException("feature values cannot be null");
+ }
+ });
+ }
this.features = features;
return this;
}
public Builder entityKeys(Map<String, Object> entityKeys) {
+ if (entityKeys != null) {
+ entityKeys.values().forEach(v -> {
+ if (v == null) {
+ throw new IllegalArgumentException("entity key values cannot be null");
+ }
+ });
+ }
this.entityKeys = entityKeys;
return this;
}
public Builder error(String error) {
+ if (error != null && error.trim().isEmpty()) {
+ throw new IllegalArgumentException("error message cannot be empty");
+ }
this.error = error;
return this;
}
List<JavaRequest> requests = maybeRequest.getValue();
CompletableFuture<List<JavaResponse>> resultsJavaFuture =
        entityType.equals(EntityType.GroupBy) ? fetcher.fetchGroupBys(requests) : fetcher.fetchJoin(requests);
// wrap the Java future we get in a Vert.x Future to not block the worker thread
Future<List<EntityKeyToValues>> maybeFeatureResponses =
        Future.fromCompletionStage(resultsJavaFuture)
                .map(result -> result.stream().map(FeaturesHandler::responseToPoJo)
                        .collect(Collectors.toList()));
Add timeout handling and request size limits
The feature fetching logic should include:
- Timeout handling for the CompletableFuture
- Maximum limit for bulk request size to prevent resource exhaustion
+ private static final int MAX_BULK_REQUESTS = 100;
+ private static final Duration REQUEST_TIMEOUT = Duration.ofSeconds(30);
List<JavaRequest> requests = maybeRequest.getValue();
+ if (requests.size() > MAX_BULK_REQUESTS) {
+ ctx.response()
+ .setStatusCode(400)
+ .putHeader("content-type", "application/json")
+ .end(new JsonObject().put("errors",
+ Collections.singletonList("Exceeded maximum bulk request size of " + MAX_BULK_REQUESTS)).encode());
+ return;
+ }
CompletableFuture<List<JavaResponse>> resultsJavaFuture =
entityType.equals(EntityType.GroupBy) ? fetcher.fetchGroupBys(requests) : fetcher.fetchJoin(requests);
+ resultsJavaFuture = resultsJavaFuture.orTimeout(REQUEST_TIMEOUT.toMillis(), TimeUnit.MILLISECONDS);
Committable suggestion skipped: line range outside the PR's diff.
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (Outdated, resolved)
public static JTry<List<JavaRequest>> parseJavaRequest(String name, RequestBody body) {
    TypeReference<List<Map<String, Object>>> ref = new TypeReference<List<Map<String, Object>>>() { };
    try {
        List<Map<String, Object>> entityKeysList = objectMapper.readValue(body.asString(), ref);
        List<JavaRequest> requests = entityKeysList.stream().map(m -> new JavaRequest(name, m)).collect(Collectors.toList());
        return JTry.success(requests);
    } catch (Exception e) {
        return JTry.failure(e);
    }
}
🛠️ Refactor suggestion
Enhance request parsing with validation and specific error handling
The parseJavaRequest
method could benefit from:
- Validation of parsed entity keys
- More specific exception handling to provide better error messages
public static JTry<List<JavaRequest>> parseJavaRequest(String name, RequestBody body) {
TypeReference<List<Map<String, Object>>> ref = new TypeReference<List<Map<String, Object>>>() { };
try {
List<Map<String, Object>> entityKeysList = objectMapper.readValue(body.asString(), ref);
+ // Validate entity keys
+ for (Map<String, Object> keys : entityKeysList) {
+ if (keys == null || keys.isEmpty()) {
+ throw new IllegalArgumentException("Entity keys cannot be null or empty");
+ }
+ // Add additional validation rules as needed
+ }
List<JavaRequest> requests = entityKeysList.stream()
.map(m -> new JavaRequest(name, m))
.collect(Collectors.toList());
return JTry.success(requests);
- } catch (Exception e) {
+ } catch (JsonParseException e) {
+ return JTry.failure(new IllegalArgumentException("Invalid JSON format"));
+ } catch (JsonMappingException e) {
+ return JTry.failure(new IllegalArgumentException("Invalid request structure"));
+ } catch (IllegalArgumentException e) {
+ return JTry.failure(e);
+ } catch (Exception e) {
+ logger.error("Unexpected error parsing request", e);
return JTry.failure(e);
}
}
Committable suggestion skipped: line range outside the PR's diff.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
build.sbt (1)
297-314: Group dependencies by purpose
Consider organizing dependencies into logical groups for better maintainability:
libraryDependencies ++= Seq( + // Vert.x core dependencies "io.vertx" % "vertx-core" % "4.5.10", "io.vertx" % "vertx-web" % "4.5.10", "io.vertx" % "vertx-config" % "4.5.10", + + // Logging dependencies "ch.qos.logback" % "logback-classic" % logbackClassicVersion, "org.slf4j" % "slf4j-api" % slf4jApiVersion, + + // Configuration "com.typesafe" % "config" % "1.4.3", + + // Netty dependencies "io.netty" % "netty-all" % "4.1.111.Final", + + // Metrics dependencies "io.vertx" % "vertx-micrometer-metrics" % "4.5.10", "io.micrometer" % "micrometer-registry-statsd" % "1.13.6", + + // Test dependencies "junit" % "junit" % "4.13.2" % Test, "com.novocode" % "junit-interface" % "0.11" % Test, "mockito" % "mockito-core" % "5.12.0" % Test, "io.vertx" % "vertx-unit" % "4.5.10" % Test, )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
build.sbt
(3 hunks)
🔇 Additional comments (6)
build.sbt (6)
58-58
: LGTM: Service project correctly added to root aggregate
The service project is properly integrated into the build system.
283-284
: LGTM: Consistent logging versions
Good use of version variables to ensure consistency across projects.
288-290
: Add cross-compilation settings
The service module should specify cross-compilation settings like other modules to ensure compatibility across Scala versions.
291-291
: Remove duplicate assemblyJarName setting
The assemblyJarName is defined twice in the service project settings.
Also applies to: 316-316
306-306
: Update Netty version
The current Netty version might have security vulnerabilities.
323-335
: LGTM: Merge strategy properly configured
The merge strategy is well-defined and handles common conflict cases appropriately.
"io.vertx" % "vertx-web" % "4.5.10", | ||
"io.vertx" % "vertx-config" % "4.5.10", | ||
"ch.qos.logback" % "logback-classic" % logbackClassicVersion, | ||
"org.slf4j" % "slf4j-api" % slf4jApiVersion, |
Since we're at the point of committing to web frameworks, could we make a call on the logging implementation we're using project-wide? If we're using logback for app logging we might as well do the same for spark etc.
although I'd slightly prefer log4j2 (for the sole reason of not wanting to write xml
😬 )
I don't mind swapping to logj2 if we can mimic all the settings we have going in the logback.xml. I like the settings to limit file sizes, roll over, write to log + stdout async. I don't really think its the end of the world though if we do have different configs for logging for spark jobs & services. The considerations are fairly different so its fine if they're separate I think.
as long as we don't combine classpaths between hub and spark in the final version of this it's 👍 with me
lets do log4j2 everywhere - preferably with mostly the same settings/pattern file in all environments. Hunting these log lib class path inconsistencies is gnarly. Also less things for folks to learn.
I like logback's functionality and log4j2 syntax. I think spark likes log4j2 as well - as their recommended logging lib.
spark uses slf4j so it's technically compatible with either - although if we were doing logback for spark we have to be sure to do the correct classpath sanitization in the spark builds. I do like log4j2's syntax. They do have size-based appending: https://logging.apache.org/log4j/2.x/manual/appenders/rolling-file.html
Let's address this in a follow-up - I don't mind swapping to log4j2 since you folks seem to be in favor of it. I don't want to work through the testing of the configs etc. again as part of the critical path of this PR. It took a little iterating and I don't think it's worth re-spending the time immediately; I / someone can pick this up async.
LGTM - want to defer to @nikhil-zlai probably for a more contextual review.
@@ -30,6 +30,8 @@ lazy val jackson_2_15 = "2.15.2"
lazy val avro_1_11 = "1.11.2"
lazy val circeVersion = "0.14.9"
lazy val deltaVersion = "3.2.0"
lazy val slf4jApiVersion = "2.0.12"
lazy val logbackClassicVersion = "1.5.6"
lets do log4j2 everywhere
I'm ok with log4j2 though I do want to ensure the settings we've included in the logback.xml are mimicked correctly in the log4j2 props as many of them are useful for prod safety and perf (async logging, log rotation on file sizes etc). If we want to swap I'll propose doing this in a follow up to not drag this PR out.
);

// Load and instantiate the API implementation
Class<?> apiClass = Class.forName(className, true, apiClassLoader);
could this be null?
shouldn't be null as we create an Optional.ofNullable in the cfg store and check the optional is present else throw an ex. It could be empty / misconfigured in which case we'd throw an error at startup.
public Optional<String> getOnlineJar() {
    return Optional.ofNullable(jsonConfig.getString(ONLINE_JAR));
}

public Optional<String> getOnlineClass() {
    return Optional.ofNullable(jsonConfig.getString(ONLINE_CLASS));
}
These should never be null right?
could be if we forget to configure I think
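To make the "fail fast at startup if misconfigured" behaviour described in this thread concrete, here is a hedged sketch built around the getOnlineJar()/getOnlineClass() getters quoted above; the wrapper class and the error messages are illustrative, not the actual ConfigStore or ApiProvider code:

```java
import java.util.Optional;

public final class OnlineApiConfigCheck {
    private final Optional<String> onlineJar;
    private final Optional<String> onlineClass;

    public OnlineApiConfigCheck(Optional<String> onlineJar, Optional<String> onlineClass) {
        this.onlineJar = onlineJar;
        this.onlineClass = onlineClass;
    }

    // Called once at startup: a missing or empty value surfaces immediately as a clear
    // exception instead of a null creeping into the class-loading code later on.
    public void validate() {
        onlineJar.filter(s -> !s.trim().isEmpty())
                .orElseThrow(() -> new IllegalArgumentException("Online API jar path is not configured"));
        onlineClass.filter(s -> !s.trim().isEmpty())
                .orElseThrow(() -> new IllegalArgumentException("Online API class name is not configured"));
    }
}
```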
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (Outdated, resolved)
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (Outdated, resolved)
LGTM overall. Lets replace the occurrences of "Feature" into something more general and we are good to go.
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (Outdated, resolved)
service/src/main/java/ai/chronon/service/handlers/FeaturesHandler.java (Outdated, resolved)
42c2e28 to 4f61c89
@nikhil-zlai - good for a re-look when you get a chance
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (8)
service/README.md (2)
47-49
: Enhance API documentation.The curl examples need more context:
- Add descriptions for each endpoint
- Document expected responses
- Include error scenarios and status codes
🧰 Tools
🪛 Markdownlint (0.35.0)
47-47: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
48-48: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
49-49: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
85-86
: Improve error documentation.The error response structure should be documented with:
- Example payloads for different error scenarios
- Detailed explanation of error codes
- Troubleshooting guidance
🧰 Tools
🪛 LanguageTool
[uncategorized] ~86-~86: You might be missing the article “the” here.
Context: ...200 (some / all successful lookups). In case of the 200 response, the payload looks ...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
service/src/main/java/ai/chronon/service/FetcherVerticle.java (4)
24-24
: Avoid Throwing Generic 'Exception' in Method SignaturesThrowing the generic
Exception
in method signatures reduces clarity about what exceptions might occur and can make error handling less precise.Consider removing
throws Exception
from thestartHttpServer
method if it's not necessary, and handle specific exceptions within the method. For thestart
method, since it's overriding theAbstractVerticle
method which can throwException
, it's acceptable.Apply this diff to refine the exception handling:
public class FetcherVerticle extends AbstractVerticle { @Override public void start(Promise<Void> startPromise) throws Exception { ConfigStore cfgStore = new ConfigStore(vertx); startHttpServer(cfgStore.getServerPort(), cfgStore.encodeConfig(), ApiProvider.buildApi(cfgStore), startPromise); } - protected void startHttpServer(int port, String configJsonString, Api api, Promise<Void> startPromise) throws Exception { + protected void startHttpServer(int port, String configJsonString, Api api, Promise<Void> startPromise) { // Implementation }Also applies to: 29-29
57-60
: Avoid Variable Shadowing of 'server' in Lambda ExpressionThe lambda parameter
server
in the.onSuccess()
method shadows the class fieldserver
, which can cause confusion or unintended behavior.Consider renaming the lambda parameter to avoid shadowing:
server.requestHandler(router) .listen(port) - .onSuccess(server -> { - logger.info("HTTP server started on port {}", server.actualPort()); + .onSuccess(httpServer -> { + logger.info("HTTP server started on port {}", httpServer.actualPort()); startPromise.complete(); })
50-53
: Make HTTP Server Options ConfigurableCurrently, the TCP keep-alive and idle timeout settings are hardcoded. Making these options configurable enhances flexibility and allows adjustments without code changes.
Consider adding these configurations to your
ConfigStore
and applying them here:HttpServerOptions httpOptions = new HttpServerOptions() - .setTcpKeepAlive(true) - .setIdleTimeout(60); + .setTcpKeepAlive(cfgStore.isTcpKeepAliveEnabled()) + .setIdleTimeout(cfgStore.getIdleTimeout());Ensure that the
ConfigStore
class providesisTcpKeepAliveEnabled()
andgetIdleTimeout()
methods.
15-16
: Typographical Issue in Comment at Line 15There's an unnecessary
/
at the end of line 15 in the comment, which may be a typographical error.Apply this diff to correct the comment:
* Entry point for the Chronon fetcher endpoints. We wire up our API routes and configure and launch our HTTP service here. * We choose to use just 1 verticle for now as it allows us to keep things simple and we don't need to scale - * independently deploy different endpoint routes. + * or independently deploy different endpoint routes.service/src/test/java/ai/chronon/service/handlers/FetchHandlerTest.java (1)
285-285
: Remove redundant semicolonThere's an extra semicolon that should be removed to clean up the code.
Apply this diff:
- async.complete();; + async.complete();service/src/main/java/ai/chronon/service/ConfigStore.java (1)
73-77
: Optimize handling of empty API propertiesCurrently, when
apiProps
isnull
, a newHashMap
is created each timegetOnlineApiProps()
is called. Since this method may be called multiple times, it's more efficient to return a cached immutable empty map.Apply this diff to optimize the handling:
+ private static final Map<String, String> EMPTY_API_PROPS = Collections.emptyMap(); public Map<String, String> getOnlineApiProps() { JsonObject apiProps = jsonConfig.getJsonObject(ONLINE_API_PROPS); if (apiProps == null) { - return new HashMap<String, String>(); + return EMPTY_API_PROPS; } return apiProps.stream().collect(Collectors.toMap( Map.Entry::getKey, e -> String.valueOf(e.getValue()) )); }Don't forget to add the import:
import java.util.Collections;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (15)
.github/workflows/test_scala_no_spark.yaml
(1 hunks)build.sbt
(3 hunks)online/src/main/scala/ai/chronon/online/Metrics.scala
(2 hunks)service/README.md
(1 hunks)service/src/main/java/ai/chronon/service/ApiProvider.java
(1 hunks)service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java
(1 hunks)service/src/main/java/ai/chronon/service/ConfigStore.java
(1 hunks)service/src/main/java/ai/chronon/service/FetcherVerticle.java
(1 hunks)service/src/main/java/ai/chronon/service/handlers/FetchHandler.java
(1 hunks)service/src/main/java/ai/chronon/service/handlers/FetcherRouter.java
(1 hunks)service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java
(1 hunks)service/src/main/resources/example_config.json
(1 hunks)service/src/main/resources/logback.xml
(1 hunks)service/src/test/java/ai/chronon/service/handlers/FetchHandlerJsonSerDeTest.java
(1 hunks)service/src/test/java/ai/chronon/service/handlers/FetchHandlerTest.java
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- service/src/main/resources/example_config.json
- service/src/main/java/ai/chronon/service/ChrononServiceLauncher.java
- service/src/main/resources/logback.xml
- online/src/main/scala/ai/chronon/online/Metrics.scala
- service/src/main/java/ai/chronon/service/model/GetFeaturesResponse.java
🧰 Additional context used
📓 Learnings (2)
service/src/main/java/ai/chronon/service/ApiProvider.java (2)
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ApiProvider.java:6-6
Timestamp: 2024-12-03T04:04:33.809Z
Learning: The import `scala.util.ScalaVersionSpecificCollectionsConverter` in `service/src/main/java/ai/chronon/service/ApiProvider.java` is correct and should not be flagged in future reviews.
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ApiProvider.java:25-25
Timestamp: 2024-12-03T04:04:28.349Z
Learning: In `ApiProvider.java`, the method `buildApi` should continue to throw `Exception` rather than specific exceptions.
service/src/main/java/ai/chronon/service/ConfigStore.java (1)
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#70
File: service/src/main/java/ai/chronon/service/ConfigStore.java:55-77
Timestamp: 2024-12-07T22:05:19.764Z
Learning: In the `ConfigStore` class, synchronization of getter methods accessing `jsonConfig` isn't necessary because `jsonConfig` is fully initialized before being accessed and isn't modified afterward.
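A small illustration of why that holds; this is a generic example of safe publication via a final field, not the actual ConfigStore code, and the "server.port" key and default are assumed. A final field assigned once in the constructor is guaranteed by the Java memory model to be visible, fully constructed, to every thread that later reads it, so plain getters need no locking as long as the object is never mutated afterwards:

```java
import io.vertx.core.json.JsonObject;

public final class ImmutableConfigHolder {
    // Written exactly once, before the constructor returns; never mutated afterwards.
    private final JsonObject jsonConfig;

    public ImmutableConfigHolder(JsonObject loadedConfig) {
        // Defensive copy so callers cannot mutate our snapshot through their reference.
        this.jsonConfig = loadedConfig.copy();
    }

    public int getServerPort() {
        // Plain read of a final field: safe without synchronization.
        return jsonConfig.getInteger("server.port", 8080);
    }
}
```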
🪛 yamllint (1.35.1)
.github/workflows/test_scala_no_spark.yaml
[error] 73-73: no new line character at the end of file
(new-line-at-end-of-file)
🪛 LanguageTool
service/README.md
[uncategorized] ~10-~10: This expression is usually spelled with a hyphen.
Context: ...e web framework which supports HTTP and gRPC based services. ## Running locally To buil...
(BASED_HYPHEN)
[misspelling] ~14-~14: This word is normally spelled as one.
Context: ...# Running locally To build the service sub-module: ```bash ~/workspace/chronon $ sbt "pro...
(EN_COMPOUNDS_SUB_MODULE)
[misspelling] ~27-~27: This word is normally spelled as one.
Context: ...ite out a file in the target/scala-2.12 sub-directory. We can now use this to start up the f...
(EN_COMPOUNDS_SUB_DIRECTORY)
[uncategorized] ~86-~86: You might be missing the article “the” here.
Context: ...200 (some / all successful lookups). In case of the 200 response, the payload looks ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 Markdownlint (0.35.0)
service/README.md
47-47: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
48-48: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
49-49: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
82-82: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
🔇 Additional comments (13)
.github/workflows/test_scala_no_spark.yaml (1)
70-73
: LGTM: Service test step is properly configured.
The new step for running service tests is correctly configured with:
- Consistent memory settings (
SBT_OPTS
) - Correct Scala version (2.12.18)
- Proper placement in the workflow sequence
🧰 Tools
🪛 yamllint (1.35.1)
[error] 73-73: no new line character at the end of file
(new-line-at-end-of-file)
service/README.md (1)
38-41
: Document configuration schema.
The example_config.json is referenced but its schema is not documented.
build.sbt (4)
33-34
: LGTM: Latest logging library versions.
The logging library versions are up-to-date and secure:
- slf4j-api: 2.0.12
- logback-classic: 1.5.6
291-291
: Remove duplicate assemblyJarName setting.
The assemblyJarName is defined twice in the service project settings.
Also applies to: 316-316
306-306
: Update Netty version.
The current Netty version might have security vulnerabilities.
301-302
: Consider standardizing logging implementation.
As discussed in previous comments, consider migrating to log4j2 for consistency across the project.
service/src/main/java/ai/chronon/service/handlers/FetcherRouter.java (1)
30-39
: LGTM!
The routing configuration correctly sets up the endpoints for group-by and join fetch operations. The use of handler functions enhances code modularity and readability.
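For readers who have not opened the file, a rough sketch of the kind of wiring this comment refers to; the route paths, handler names, and the omitted fetcher plumbing are assumptions rather than the actual FetcherRouter code:

```java
import io.vertx.core.Vertx;
import io.vertx.ext.web.Router;
import io.vertx.ext.web.RoutingContext;
import io.vertx.ext.web.handler.BodyHandler;

public final class FetcherRouterSketch {
    public static Router createRouter(Vertx vertx) {
        Router router = Router.router(vertx);
        // BodyHandler must be installed so handlers can read the JSON request body.
        router.route().handler(BodyHandler.create());

        // One POST route per entity type; :name carries the groupBy / join name.
        router.post("/groupby/:name").handler(FetcherRouterSketch::handleGroupBy);
        router.post("/join/:name").handler(FetcherRouterSketch::handleJoin);
        return router;
    }

    private static void handleGroupBy(RoutingContext ctx) {
        // a bulk groupBy lookup would be delegated to the shared fetch handler here
    }

    private static void handleJoin(RoutingContext ctx) {
        // a bulk join lookup would be delegated to the shared fetch handler here
    }
}
```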
service/src/test/java/ai/chronon/service/handlers/FetchHandlerJsonSerDeTest.java (3)
16-31
: LGTM!
The testParsingOfSimpleJavaRequests
method effectively verifies the parsing of valid Java requests and ensures that the data types of the parsed keys are correct.
33-44
: LGTM!
The testParsingInvalidRequest
method correctly checks the behavior when parsing an invalid JSON request, ensuring that the parsing failure is properly handled.
46-56
: LGTM!
The testParsingOneValidAndInvalidRequest
method appropriately tests the parsing failure when the request contains both valid and invalid entries, verifying robustness in error handling.
service/src/main/java/ai/chronon/service/ConfigStore.java (2)
55-57
:
Add port range validation in getServerPort()
The getServerPort()
method does not validate whether the retrieved port number falls within the valid range of 0
to 65535
. Using an invalid port number can cause the server to fail at runtime.
Apply this diff to add port range validation:
public int getServerPort() {
- return jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
+ int port = jsonConfig.getInteger(SERVER_PORT, DEFAULT_PORT);
+ if (port < 0 || port > 65535) {
+ throw new IllegalStateException("Invalid port number: " + port);
+ }
+ return port;
}
Likely invalid or redundant comment.
85-87
:
Sanitize sensitive information before encoding config
The encodeConfig()
method serializes the entire configuration, which may include sensitive information like API keys or passwords. Exposing these details can lead to security vulnerabilities.
Apply this diff to mask sensitive fields before encoding:
public String encodeConfig() {
+ // Create a copy and remove sensitive fields
+ JsonObject sanitizedConfig = jsonConfig.copy();
+ if (sanitizedConfig.containsKey(ONLINE_API_PROPS)) {
+ sanitizedConfig.put(ONLINE_API_PROPS, "****");
+ }
+ return sanitizedConfig.encodePrettily();
- return jsonConfig.encodePrettily();
}
Likely invalid or redundant comment.
service/src/main/java/ai/chronon/service/ApiProvider.java (1)
47-57
: 🛠️ Refactor suggestion
Ensure correct constructor is used for API instantiation
The current implementation uses apiClass.getConstructors()[0]
to instantiate the API class, which assumes that the first constructor matches the expected parameters. This may lead to errors if the class has multiple constructors in a different order.
Apply this diff to explicitly obtain the correct constructor:
+import java.lang.reflect.Constructor;
+import java.lang.reflect.InvocationTargetException;
// Load and instantiate the API implementation
Class<?> apiClass = Class.forName(className, true, apiClassLoader);
if (!Api.class.isAssignableFrom(apiClass)) {
throw new IllegalArgumentException(
"Class " + className + " does not extend the Api abstract class"
);
}
+// Obtain the constructor that accepts scala.collection.immutable.Map
+Constructor<?> constructor = apiClass.getConstructor(scala.collection.immutable.Map.class);
+try {
+ return (Api) constructor.newInstance(scalaPropsMap);
+} catch (InstantiationException | IllegalAccessException | InvocationTargetException e) {
+ throw new RuntimeException("Failed to instantiate API class: " + className, e);
+}
-return (Api) apiClass.getConstructors()[0].newInstance(scalaPropsMap);
// Add route to show current configuration
router.get("/config").handler(ctx -> {
    ctx.response()
            .putHeader("content-type", "application/json")
            .end(configJsonString);
});
Critical Security Issue: Exposing Configuration via '/config' Endpoint
Exposing the application's configuration through the /config endpoint may lead to sensitive information disclosure, including secrets, API keys, or database credentials. This poses a significant security risk.
Consider removing this endpoint or securing it properly. If it's necessary for debugging purposes, ensure that it is not accessible in production environments or requires authentication and only returns non-sensitive information.
Apply this diff to remove the /config endpoint:
--- a/service/src/main/java/ai/chronon/service/FetcherVerticle.java
+++ b/service/src/main/java/ai/chronon/service/FetcherVerticle.java
@@ -42,14 +42,6 @@ public class FetcherVerticle extends AbstractVerticle {
// Health check route
router.get("/ping").handler(ctx -> {
ctx.json("Pong!");
});
- // Add route to show current configuration
- router.get("/config").handler(ctx -> {
- ctx.response()
- .putHeader("content-type", "application/json")
- .end(configJsonString);
- });
-
// Start HTTP server
HttpServerOptions httpOptions =
new HttpServerOptions()
List<String> failureMessages = Collections.singletonList(err.getMessage());
ctx.response()
        .setStatusCode(500)
        .putHeader("content-type", "application/json")
        .end(new JsonObject().put("errors", failureMessages).encode());
Avoid exposing internal exception messages to clients
Exposing internal errors can leak sensitive information. Replace the exception message with a generic error message to prevent potential security risks.
Apply this diff to fix the issue:
- List<String> failureMessages = Collections.singletonList(err.getMessage());
+ List<String> failureMessages = Collections.singletonList("Internal server error.");
List<String> errorMessages = Collections.singletonList(maybeRequest.getException().getMessage());
ctx.response()
        .setStatusCode(400)
        .putHeader("content-type", "application/json")
        .end(new JsonObject().put("errors", errorMessages).encode());
Avoid exposing internal exception messages to clients
Returning internal exception messages to clients can reveal sensitive information and potentially lead to security vulnerabilities. Consider returning a generic error message instead.
Apply this diff to fix the issue:
- List<String> errorMessages = Collections.singletonList(maybeRequest.getException().getMessage());
+ List<String> errorMessages = Collections.singletonList("Invalid request body.");
// Assert results
vertx.setTimer(1000, id -> {
    verify(response).setStatusCode(200);
    verify(response).putHeader("content-type", "application/json");
    verify(response).end(responseCaptor.capture());

    // Verify response format
    JsonObject actualResponse = new JsonObject(responseCaptor.getValue());
    GetFeaturesResponse.Result expectedResult = GetFeaturesResponse.Result.builder().status(Success).entityKeys(keys).features(featureMap).build();
    validateSuccessfulResponse(actualResponse, Collections.singletonList(expectedResult), context);
    async.complete();
});
🛠️ Refactor suggestion
Avoid using fixed delays in tests for asynchronous operations
Using vertx.setTimer
with a fixed delay can lead to flaky tests and slow execution. Instead, use asynchronous handlers or callbacks to perform assertions once the asynchronous operation completes.
Refactor the test to synchronize using the Async
object or callbacks. For example:
// Trigger call
handler.handle(routingContext);
// Perform assertions when response.end() is called
doAnswer(invocation -> {
// Assertions
verify(response).setStatusCode(200);
verify(response).putHeader("content-type", "application/json");
// Capture and validate the response
verify(response).end(responseCaptor.capture());
JsonObject actualResponse = new JsonObject(responseCaptor.getValue());
// Existing validation logic...
async.complete();
return null;
}).when(response).end(anyString());
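A self-contained variant of this idea, assuming vertx-unit and Mockito are on the test classpath as elsewhere in this module; the class name, the asserted "results" key, and the direct response calls standing in for handler.handle(routingContext) are all illustrative:

```java
import io.vertx.core.http.HttpServerResponse;
import io.vertx.ext.unit.Async;
import io.vertx.ext.unit.TestContext;
import io.vertx.ext.unit.junit.VertxUnitRunner;
import org.junit.Test;
import org.junit.runner.RunWith;

import static org.mockito.Mockito.anyInt;
import static org.mockito.Mockito.anyString;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

@RunWith(VertxUnitRunner.class)
public class CallbackAssertionSketchTest {

    @Test
    public void completesWhenResponseIsWritten(TestContext context) {
        Async async = context.async();
        HttpServerResponse response = mock(HttpServerResponse.class);
        when(response.setStatusCode(anyInt())).thenReturn(response);
        when(response.putHeader(anyString(), anyString())).thenReturn(response);

        // Assertions run exactly when end() is invoked - no fixed timer needed.
        doAnswer(invocation -> {
            String body = invocation.getArgument(0);
            context.assertTrue(body.contains("results"));
            async.complete();
            return null;
        }).when(response).end(anyString());

        // Stand-in for handler.handle(routingContext); a real test would invoke the handler here.
        response.setStatusCode(200);
        response.putHeader("content-type", "application/json");
        response.end("{\"results\":[]}");
    }
}
```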
// Assert results
vertx.setTimer(1000, id -> {
    verify(response).setStatusCode(400);
    verify(response).putHeader("content-type", "application/json");
    verify(response).end(responseCaptor.capture());

    // Verify response format
    validateFailureResponse(responseCaptor.getValue(), context);
    async.complete();
});
}
🛠️ Refactor suggestion
Avoid using fixed delays in tests for asynchronous operations
Similar to the previous comment, refactor to eliminate vertx.setTimer and use callbacks or handlers to ensure reliable and efficient test execution.
Apply the refactored approach as shown in the previous comment.
// Assert results
vertx.setTimer(1000, id -> {
    verify(response).setStatusCode(500);
    verify(response).putHeader("content-type", "application/json");
    verify(response).end(responseCaptor.capture());

    // Verify response format
    validateFailureResponse(responseCaptor.getValue(), context);
    async.complete();
});
}
🛠️ Refactor suggestion
Avoid using fixed delays in tests for asynchronous operations
Consistently refactor the asynchronous test handling to improve reliability and performance.
Refactor this test method using the approach outlined earlier.
// Assert results
vertx.setTimer(1000, id -> {
    verify(response).setStatusCode(200);
    verify(response).putHeader("content-type", "application/json");
    verify(response).end(responseCaptor.capture());

    // Verify response format
    JsonObject actualResponse = new JsonObject(responseCaptor.getValue());
    GetFeaturesResponse.Result expectedResult1 = GetFeaturesResponse.Result.builder().status(Success).entityKeys(keys1).features(featureMap1).build();
    GetFeaturesResponse.Result expectedResult2 = GetFeaturesResponse.Result.builder().status(Success).entityKeys(keys2).features(featureMap2).build();

    List<GetFeaturesResponse.Result> expectedResultList = new ArrayList<GetFeaturesResponse.Result>() {{
        add(expectedResult1);
        add(expectedResult2);
    }};
    validateSuccessfulResponse(actualResponse, expectedResultList, context);
    async.complete();
});
}
🛠️ Refactor suggestion
Avoid using fixed delays in tests for asynchronous operations
Refactor to remove the fixed delay and synchronize the test using asynchronous callbacks.
Modify the test as previously suggested to handle asynchronous operations properly.
// Assert results
vertx.setTimer(1000, id -> {
    verify(response).setStatusCode(200);
    verify(response).putHeader("content-type", "application/json");
    verify(response).end(responseCaptor.capture());

    // Verify response format
    JsonObject actualResponse = new JsonObject(responseCaptor.getValue());
    GetFeaturesResponse.Result expectedResult1 = GetFeaturesResponse.Result.builder().status(Success).entityKeys(keys1).features(featureMap).build();
    GetFeaturesResponse.Result expectedResult2 = GetFeaturesResponse.Result.builder().status(Failure).entityKeys(keys2).error("some failure!").build();
    List<GetFeaturesResponse.Result> expectedResultList = new ArrayList<GetFeaturesResponse.Result>() {{
        add(expectedResult1);
        add(expectedResult2);
    }};
    validateSuccessfulResponse(actualResponse, expectedResultList, context);
    async.complete();
});
}
🛠️ Refactor suggestion
Avoid using fixed delays in tests for asynchronous operations
Ensure all test methods follow best practices for asynchronous testing to prevent flaky tests.
Refactor this test method using callbacks or handlers as demonstrated earlier.
service/src/main/java/ai/chronon/service/handlers/FetcherRouter.java (outdated review thread, resolved)
minor nit but lgtm!
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
service/src/main/java/ai/chronon/service/handlers/FetchRouter.java (2)
16-28: Simplify Code by Using Lambda Expressions
The inner classes GroupByFetcherFunction and JoinFetcherFunction can be replaced with lambda expressions to make the code more concise and maintainable. Apply this diff to simplify the code:
- public static class GroupByFetcherFunction implements BiFunction<JavaFetcher, List<JavaRequest>, CompletableFuture<List<JavaResponse>>> {
-     @Override
-     public CompletableFuture<List<JavaResponse>> apply(JavaFetcher fetcher, List<JavaRequest> requests) {
-         return fetcher.fetchGroupBys(requests);
-     }
- }
-
- public static class JoinFetcherFunction implements BiFunction<JavaFetcher, List<JavaRequest>, CompletableFuture<List<JavaResponse>>> {
-     @Override
-     public CompletableFuture<List<JavaResponse>> apply(JavaFetcher fetcher, List<JavaRequest> requests) {
-         return fetcher.fetchJoin(requests);
-     }
- }
Then update the route handlers in the createFetchRoutes method:
  router.post("/groupby/:name").handler(new FetchHandler(fetcher, new GroupByFetcherFunction()));
  router.post("/join/:name").handler(new FetchHandler(fetcher, new JoinFetcherFunction()));
+ // Simplify using lambda expressions
+ router.post("/groupby/:name").handler(new FetchHandler(fetcher, (f, reqs) -> f.fetchGroupBys(reqs)));
+ router.post("/join/:name").handler(new FetchHandler(fetcher, (f, reqs) -> f.fetchJoin(reqs)));
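A further simplification, not part of the review suggestion: because each lambda only forwards to a JavaFetcher instance method, unbound method references would also satisfy the BiFunction shape shown above, assuming fetchGroupBys and fetchJoin take a List<JavaRequest> and return a CompletableFuture<List<JavaResponse>>:

// Hypothetical sketch: method references in place of the explicit lambdas
router.post("/groupby/:name").handler(new FetchHandler(fetcher, JavaFetcher::fetchGroupBys));
router.post("/join/:name").handler(new FetchHandler(fetcher, JavaFetcher::fetchJoin));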
33-33: Avoid Hardcoding Service Names
Consider externalizing the service name "feature-service" to a configuration file or a constant to improve maintainability and flexibility.
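As a sketch of what that externalization could look like — the helper class, the chronon.service.name property, and the CHRONON_SERVICE_NAME variable are illustrative names, not part of the PR:

// Hypothetical helper: resolve the caller name from a system property or
// environment variable, falling back to the current hardcoded default.
final class ServiceNameResolver {
    private static final String DEFAULT_NAME = "feature-service";

    static String getServiceName() {
        String fromProperty = System.getProperty("chronon.service.name");
        if (fromProperty != null) {
            return fromProperty;
        }
        String fromEnv = System.getenv("CHRONON_SERVICE_NAME");
        return fromEnv != null ? fromEnv : DEFAULT_NAME;
    }
}

// Usage at fetcher construction time:
// JavaFetcher fetcher = api.buildJavaFetcher(ServiceNameResolver.getServiceName(), false);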
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- service/src/main/java/ai/chronon/service/FetcherVerticle.java (1 hunks)
- service/src/main/java/ai/chronon/service/handlers/FetchRouter.java (1 hunks)
- service/src/test/java/ai/chronon/service/handlers/FetchHandlerTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- service/src/test/java/ai/chronon/service/handlers/FetchHandlerTest.java
🔇 Additional comments (4)
service/src/main/java/ai/chronon/service/FetcherVerticle.java (3)
24-27: Initialization Logic is Correct
The start method correctly initializes the configuration store and starts the HTTP server with the provided settings.
67-83: Graceful Shutdown Implemented Properly
The stop method gracefully shuts down the HTTP server and handles both success and failure cases appropriately.
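For readers unfamiliar with Vert.x shutdown hooks, a generic sketch of this pattern follows; it is not the PR's FetcherVerticle code, and the class, logger, and field names are assumptions:

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Promise;
import io.vertx.core.http.HttpServer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ExampleVerticle extends AbstractVerticle {
    private static final Logger logger = LoggerFactory.getLogger(ExampleVerticle.class);
    private HttpServer server;

    @Override
    public void stop(Promise<Void> stopPromise) {
        if (server == null) {
            stopPromise.complete();
            return;
        }
        // Close the HTTP server and propagate success/failure to the stop promise
        server.close(ar -> {
            if (ar.succeeded()) {
                logger.info("HTTP server shut down cleanly");
                stopPromise.complete();
            } else {
                logger.error("Failed to shut down HTTP server", ar.cause());
                stopPromise.fail(ar.cause());
            }
        });
    }
}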
42-47: Critical Security Issue: Exposing Configuration via '/config' Endpoint
Exposing the application's configuration through the /config endpoint may lead to sensitive information disclosure, including secrets, API keys, or database credentials. This poses a significant security risk.
Consider removing this endpoint or securing it properly. If it's necessary for debugging purposes, ensure that it is not accessible in production environments or requires authentication and only returns non-sensitive information.
Apply this diff to remove the /config endpoint:
- // Add route to show current configuration
- router.get("/config").handler(ctx -> {
- ctx.response()
- .putHeader("content-type", "application/json")
- .end(configJsonString);
- });
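If the endpoint is kept for debugging, one option is to register it only when an explicit opt-in flag is set. A sketch under that assumption — the CHRONON_EXPOSE_CONFIG variable is a made-up name, and router/configJsonString refer to the surrounding verticle code:

// Hypothetical: only expose /config when explicitly opted in via an env flag.
boolean exposeConfig = Boolean.parseBoolean(
        System.getenv().getOrDefault("CHRONON_EXPOSE_CONFIG", "false"));
if (exposeConfig) {
    router.get("/config").handler(ctx -> {
        ctx.response()
                .putHeader("content-type", "application/json")
                .end(configJsonString);
    });
}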
service/src/main/java/ai/chronon/service/handlers/FetchRouter.java (1)
30-39: Routes Configured Appropriately
The routes for /groupby/:name and /join/:name are correctly set up using FetchHandler, and the router is properly initialized.
Summary
Based on OSS PR - airbnb/chronon#873
Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
Summary by CodeRabbit
Release Notes
New Features
- Introduced the Chronon Feature Fetching Service with capabilities for HTTP and gRPC services.
- Added a new service project to enhance the overall project structure.
- Implemented a new HTTP server for handling fetcher endpoints.
- Added configuration options for metrics reporting and logging.
Bug Fixes
- Improved error handling for fetch requests and configuration loading.
Documentation
- Updated README to provide instructions for building and running the new service.
- Added example configuration file for service setup.
Tests
- Added comprehensive unit tests for the new fetch handler and its JSON parsing functionality.