-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump the Cruise Control version to 2.5.142 #11288
Conversation
Signed-off-by: ShubhamRwt <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @ShubhamRwt! Could you add the link of the related issue in the PR description too?
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
@ppatierno @im-konge Hi, Can we re-run the failed regression. I am not sure if its related to this PR |
@ShubhamRwt the test that is failing is the one that you enabled again for the CC. Could you please check it? Thanks. |
Hi, the failure on this tests seems to be unrelated to Cruise Control. The CC is pod is not deployed during this whole test. It seems like some logging changes in kafka 4.0.0 is triggering this failure. I spent a lot of time today to understand the failure. It looks like the pods are suppose to roll when we change the specification of logging to from no logging related configuration to new external logging but with Kafka 4.0.0 there is no rolling of the pods happening and I don't see any error in the logs. I also noticed that in some places we have comments like I have multiple things to ask with respect to the observations:
One other question is should we have a separate issue to track this problem and we can merge this PR since its not associated with failure. |
As I mentioned offline (and as you said in the comment as well), if it's not related to the CC and you don't want to fix this now, you can create an issue so someone else will have a look at it. So yeah, in order to move this PR forward, I would just create a new issue, write there what you observed, add it to the assumption comment and leave it disabled for now. For example me or someone else will have a look at it later. |
Signed-off-by: ShubhamRwt <[email protected]>
@im-konge Hi, Can you run the regressions again on this, whenever you get some time? Thanks |
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
The failure seems to be related to CC sending request to old pod rather then sending the request to new Pod. This is something which is coming up a lot lately even in the community. Maybe we should have some discussion around it on how we can tackle this. But again I guess this is something not intended to fix along with this PR |
Is it a test that we have had issues with before? Is it something we can reproduce locally?
Looking at the error, it appears to be related to the issue here [1], I assume this is the issue from the community you are mentioning, correct?
If we've determined that this issue is unrelated to this PR, could we consider one of the following options? (A) Keep this test disabled, as we are already re-enabling it as part of this PR. We can address this specific issue in another PR. (B) Fix the system test - perhaps by adding a line to refresh the [1] #11195 |
Did you already make the update to this PR? Or did you want to discuss it further before making the changes? |
@kyguy not yet, I was hoping to get views from @im-konge @ppatierno on this |
@ShubhamRwt it failed for the first time I guess or? Would adding the annotation fix the issue? I was referring in the issue you and @kyguy mentioned to little bit different problem, which seems to be not related. TBH I don't really like the idea of just disabling each of the test and creating a lot of issues when you can have a look and fix it by one line (if it's the case). Same for the comment of the logging test -> you can just change the comment there that when Kafka 3.9.x will be dropped, we can remove the test -> and you don't need to leave it on me or someone else. The issue there about failing test is there for quite a long time, so you can try the refresh annotation if it really fix something, but anyway if it fails on the CC rolling update (as I mentioned in the issue comment), someone will actually have a look at it. |
systemtest/src/test/java/io/strimzi/systemtest/log/LoggingChangeST.java
Outdated
Show resolved
Hide resolved
I agree with you @im-konge, Lets for now add the |
Signed-off-by: ShubhamRwt <[email protected]>
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-server</artifactId> | ||
</exclusion> | ||
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-server-common</artifactId> | ||
</exclusion> | ||
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-metadata</artifactId> | ||
</exclusion> | ||
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-raft</artifactId> | ||
</exclusion> | ||
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-group-coordinator</artifactId> | ||
</exclusion> | ||
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-storage</artifactId> | ||
</exclusion> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you come up with these exclusions? They do not seem to be a dependcies in the first place, so why exclude them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we build the operator and push the images, I was getting a lot collisions w.r.t these dependencies/jars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see some of the dependencies added in the build.gradle
file for CC ->
api "org.apache.kafka:kafka-server:$kafkaVersion"
api "org.apache.kafka:kafka-server-common:$kafkaVersion"
api "org.apache.kafka:kafka-storage:$kafkaVersion"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should likely check the pom file(s) and not the Gradle build files which mean many things. Also, if you see only "some" of them, it confirms that this is wrong and should be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review/suggestion Jakub, I was able to remove extra exclusions since I saw kafka-server
is the only exclusion which was required. Itt was the one bringing in all the other stuff. Thanks again
Signed-off-by: ShubhamRwt <[email protected]>
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
So the issue we are facing in the commnunity needs more investigation but I think that applying "refresh" here should be enough. Anyway, I see the |
As I'm discussing with Shubham. Should we annotate it with |
so as agreed offline, we can add the check of adding then "refresh" annotation only when the |
Signed-off-by: ShubhamRwt <[email protected]>
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
<exclusion> | ||
<groupId>org.apache.kafka</groupId> | ||
<artifactId>kafka-raft</artifactId> | ||
</exclusion> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the above two kafka-metadata
and kafka-raft
being brought as dependencies of the others? Just asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, its just kafka-server
which was a new addition to the CC pom
Type of change
Select the type of your PR
Description
This PR fixes #11199. This PR bumps Cruise Control to 2.5.142 which is compatible with Kafka 4.0.0
Checklist
Please go through this checklist and make sure all applicable tasks have been done