OCPBUGS-42303: QE;DNM; Gate ovn-controller starting post reboot until ovnkube controller syncs #2722

Open: wants to merge 1 commit into base: master

martinkennelly
Contributor

TODO: Investigate whether this is the right approach for all deployments. We need IC enabled.

This commit fixes OCPBUGS-42303.
If ovn-controller starts before ovnkube-controller has synced and its changes have propagated to the SB DB, ovn-controller consumes stale SB DB data. This PR gates the start of ovn-controller until ovnkube-controller has synced: ovnkube-controller writes a file to non-persistent storage, and the start of ovn-controller is predicated on that file.
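
The gating idea described here can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the code in this PR: the function names `mark_synced` and `wait_for_marker`, the marker file name, and the timeout and polling values are all hypothetical. The writer side stands in for ovnkube-controller signalling that its sync is done; the reader side stands in for whatever delays the start of ovn-controller.

```python
import tempfile
import time
from pathlib import Path


def mark_synced(marker: Path) -> None:
    """Writer side (hypothetical): create the marker once the sync has completed."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()


def wait_for_marker(marker: Path, timeout: float = 300.0, interval: float = 1.0) -> bool:
    """Reader side (hypothetical): poll until the marker exists.

    Returns True as soon as the marker is seen, or False if the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        if marker.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # A temporary directory stands in for non-persistent storage in this demo.
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "ovnkube-controller-synced"  # hypothetical file name
        print(wait_for_marker(marker, timeout=0.1, interval=0.02))
        mark_synced(marker)
        print(wait_for_marker(marker, timeout=0.1, interval=0.02))
```

One plausible reason for keeping the marker on non-persistent storage is that it disappears on reboot, so a marker left over from a previous boot cannot satisfy the gate. What should happen on timeout (fail, retry, or start anyway) is a design decision this sketch leaves open.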

/hold
/cc

cc @huiran0826

openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Jun 10, 2025
Contributor

openshift-ci bot commented Jun 10, 2025

@martinkennelly: GitHub didn't allow me to request PR reviews from the following users: martinkennelly.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@martinkennelly changed the title from "QE;DNM; Gate ovn-controller starting post reboot until ovnkube controller syncs" to "OCPBUGS-42303: QE;DNM; Gate ovn-controller starting post reboot until ovnkube controller syncs" on Jun 10, 2025
@openshift-ci-robot added the following labels on Jun 10, 2025:

  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.

@openshift-ci-robot
Contributor

@martinkennelly: This pull request references Jira Issue OCPBUGS-42303, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Contributor

openshift-ci bot commented Jun 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: martinkennelly
Once this PR has been reviewed and has the lgtm label, please assign jacobtanenbaum for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@martinkennelly
Contributor Author

NB: must check whether SNO or other deployments are not IC-enabled.
/hold

@martinkennelly
Contributor Author

/retest

@martinkennelly force-pushed the gate-ovn-con branch 2 times, most recently from 408f6e8 to 0012fdb, on June 11, 2025 12:17

This commit fixes OCPBUGS-42303.
If ovn-controller starts before ovnkube-controller has synced and its
changes have propagated to the SB DB, ovn-controller consumes stale
SB DB data. This PR gates the start of ovn-controller until
ovnkube-controller has synced: ovnkube-controller writes a file to
non-persistent storage, and the start of ovn-controller is predicated
on that file.

Signed-off-by: Martin Kennelly <[email protected]>
@martinkennelly
Contributor Author

/testwith openshift/ovn-kubernetes/master/e2e-aws-ovn-upgrade ovn-kubernetes/ovn-kubernetes#5315

Contributor

openshift-ci bot commented Jun 24, 2025

@martinkennelly, testwith: Error processing request. ERROR:

could not determine job runs: couldn't get PR from GitHub: ovn-kubernetes/ovn-kubernetes#5315: Get "http://ghproxy/repos/ovn-kubernetes/ovn-kubernetes/pulls/5315": failed to get installation id for org ovn-kubernetes: the github app is not installed in organization ovn-kubernetes

@martinkennelly
Contributor Author

/testwith openshift/ovn-kubernetes/master/e2e-aws-ovn-upgrade openshift/ovn-kubernetes#2626

@martinkennelly
Contributor Author

/testwith openshift/ovn-kubernetes/master/e2e-aws-ovn openshift/ovn-kubernetes#2626

Contributor

openshift-ci bot commented Jun 24, 2025

@martinkennelly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-aws-hypershift-ovn-kubevirt | 58d2fa8 | link | false | /test e2e-aws-hypershift-ovn-kubevirt |
| ci/prow/4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-upgrade | 58d2fa8 | link | false | /test 4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-upgrade |
| ci/prow/e2e-azure-ovn-upgrade | 58d2fa8 | link | true | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration | 58d2fa8 | link | false | /test e2e-aws-ovn-shared-to-local-gateway-mode-migration |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 58d2fa8 | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/e2e-ovn-ipsec-step-registry | 58d2fa8 | link | true | /test e2e-ovn-ipsec-step-registry |
| ci/prow/security | 58d2fa8 | link | false | /test security |
| ci/prow/4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade | 58d2fa8 | link | false | /test 4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade |
| ci/prow/e2e-vsphere-ovn-dualstack | 58d2fa8 | link | false | /test e2e-vsphere-ovn-dualstack |
| ci/prow/e2e-aws-ovn-ipsec-upgrade | 58d2fa8 | link | true | /test e2e-aws-ovn-ipsec-upgrade |
| ci/prow/e2e-ovn-hybrid-step-registry | 58d2fa8 | link | false | /test e2e-ovn-hybrid-step-registry |
| ci/prow/e2e-gcp-ovn-upgrade | 58d2fa8 | link | true | /test e2e-gcp-ovn-upgrade |
| ci/prow/hypershift-e2e-aks | 58d2fa8 | link | true | /test hypershift-e2e-aks |
| ci/prow/e2e-aws-ovn-ipsec-serial | 58d2fa8 | link | false | /test e2e-aws-ovn-ipsec-serial |
| ci/prow/e2e-network-mtu-migration-ovn-ipv6 | 58d2fa8 | link | false | /test e2e-network-mtu-migration-ovn-ipv6 |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | 58d2fa8 | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-aws-ovn-serial | 58d2fa8 | link | false | /test e2e-aws-ovn-serial |
| ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration | 58d2fa8 | link | false | /test e2e-aws-ovn-local-to-shared-gateway-mode-migration |
| ci/prow/e2e-metal-ipi-ovn-dualstack-bgp | 58d2fa8 | link | true | /test e2e-metal-ipi-ovn-dualstack-bgp |
| ci/prow/e2e-aws-ovn-upgrade | 58d2fa8 | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade | 58d2fa8 | link | false | /test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade |
| ci/prow/e2e-openstack-ovn | 58d2fa8 | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw | 58d2fa8 | link | true | /test e2e-metal-ipi-ovn-dualstack-bgp-local-gw |
| ci/prow/e2e-network-mtu-migration-ovn-ipv4 | 58d2fa8 | link | false | /test e2e-network-mtu-migration-ovn-ipv4 |

Full PR test history. Your PR dashboard.

Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.