-
Notifications
You must be signed in to change notification settings - Fork 435
OCPBUGS-56220: Ensure the build controller restarts on upgrade #5100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@umohnani8: No Jira issue with key OSPBUGS-56220 exists in the tracker at https://issues.redhat.com/. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@umohnani8: This pull request references Jira Issue OCPBUGS-56220, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@umohnani8: This pull request references Jira Issue OCPBUGS-56220, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good, just a couple of clarifying questions!
// The default values from k8s seem to be set after the resource is created, so set those in the new spec | ||
// we are comparing with to ensure we don't get false positives | ||
for i := range required.Spec.Containers { | ||
ctr := &required.Spec.Containers[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is the key piece, ensuring that the container image is up to date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to ensure we don't keep updating because the existing and required config don't match. During testing, I noticed that even though nothing was changing with the config, we would always end up in updating with ApplyDeployment
, which does a DeepEqual
to confirm whether something changed or not before updating. Looking through the differences of existing and required, I saw that we were using default values for the TerminationMessagePath
and ImagePullPolicy
, which get set after the resource is created so existing had it but required did not causing the DeepEqual
to not return true.
This is an attempt to populate those default fields before hand so we are not updating constantly because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, understood!
} | ||
klog.Infof("Starting Machine OS Builder pod because MachineConfigPool(s) opted into layering") | ||
return optr.startMachineOSBuilderDeployment(mob, layeredMCPs) | ||
if len(layeredMCPs) != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an if block at the end of the function with len(layeredMCPs) != 0
, which won't be called with the return enclosed here. Perhaps we should move that into this block as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, fixed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me. I'll defer to @cheesesashimi for final tags since he wrote the original deployment code.
/retest |
3 similar comments
/retest |
/retest |
/retest |
Fix the reconcilation of the build controller to ensure that new changes are actually applied when there is an update to the deployment. Signed-off-by: Urvashi <[email protected]>
/retest |
1 similar comment
/retest |
/test e2e-gcp-op-ocl |
It looks good to me. I appreciate the comments you added in the changes to the |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cheesesashimi, djoshy, umohnani8 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
Pre-merge verification:
Just a small doubt regarding this error seen in os-machine-builder pod and saw this mentioned in ticket too is this related to the upgrade ?
|
/cherry-pick release-4.19 |
@umohnani8: once the present PR merges, I will cherry-pick it on top of In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/retest |
1 similar comment
/retest |
/retest-required |
1 similar comment
/override ci/prow/e2e-gcp-op-single-node Passed previous and is otherwise failing on CI issues, overriding |
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-gcp-op-single-node In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/override ci/prow/e2e-gcp-op Same reason as above |
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-gcp-op In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@umohnani8: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@umohnani8: Jira Issue OCPBUGS-56220: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-56220 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@umohnani8: new pull request created: #5127 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[ART PR BUILD NOTIFIER] Distgit: ose-machine-config-operator |
Closes https://issues.redhat.com/browse/OCPBUGS-56220
- What I did
Fix the reconcilation of the build controller to
ensure that new changes are actually applied when
there is an update to the deployment.
- How to verify it
Upgrade from 4.18 to 4.19, ensure that the upgrade is successful and the machine-os-builder pod is started with the new image.
- Description for the changelog
Ensure the build controller restarts when updates are detected.