SDN-4168: Fix IPsec tests for monitor failures #29437


pperiyasamy
Member

@pperiyasamy pperiyasamy commented Jan 14, 2025

This PR fixes the following issues to stabilize the monitor tests while running the IPsec tests.

When the IPsec tests configure certificates in the libreswan NSS database for north-south traffic via a machine config, the worker nodes are rebooted by default, which causes a monitor test to fail. A reboot is not actually required just to configure certificates in the NSS database. This PR therefore adds a node disruption policy to the machine configuration so that nodes are not rebooted while the certificates are deployed on the worker nodes.
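
The idea behind the node disruption policy can be illustrated with a minimal Go sketch. It does not use the machine-config-operator API: the types `Action` and `FilePolicy`, the function `actionFor`, the prefix-matching rule, and the file paths are all hypothetical, and only model the question "does a change to this file require a reboot?".

```go
package main

import (
	"fmt"
	"strings"
)

// Action is a hypothetical stand-in for a node disruption action.
type Action string

const (
	ActionNone   Action = "None"   // apply the change without disrupting the node
	ActionReboot Action = "Reboot" // default when no policy matches
)

// FilePolicy is a hypothetical policy entry: changes under Path trigger Action.
type FilePolicy struct {
	Path   string
	Action Action
}

// actionFor returns the action of the longest policy path that is a prefix of
// the changed file. If no policy matches, it falls back to a reboot.
// Prefix matching is an assumption of this sketch.
func actionFor(changed string, policies []FilePolicy) Action {
	best := -1
	action := ActionReboot
	for _, p := range policies {
		if strings.HasPrefix(changed, p.Path) && len(p.Path) > best {
			best = len(p.Path)
			action = p.Action
		}
	}
	return action
}

func main() {
	// Hypothetical certificate directory; the real test paths may differ.
	policies := []FilePolicy{{Path: "/etc/pki/certs/", Action: ActionNone}}

	fmt.Println(actionFor("/etc/pki/certs/ipsec-example.pem", policies))
	fmt.Println(actionFor("/etc/sysctl.d/99-example.conf", policies))
}
```

With the policy in place, the certificate file resolves to `None`, while an unrelated file still falls back to `Reboot`.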

When the IPsec mode is changed across tests within the IPsec test suite, the ovnkube-node daemonset pods are restarted. Workload traffic is expected to fail temporarily until the pods settle down after IPsec is properly configured in every node's OVN and OvS across the cluster. The IPsec test suite should therefore not test IPsec mode changes. Instead, there should be one CI lane per IPsec mode, and the test should verify the corresponding configuration and traffic. This PR merges everything into a single test that can be run from the CI lanes for the Full and External IPsec modes.
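
As a rough illustration of "one test, driven by the mode the CI lane already configured", here is a hedged Go sketch. The names `Mode`, `Plan`, and `planFor` are hypothetical and not taken from the origin test code. It assumes the usual meaning of the modes, where Full covers pod-to-pod traffic and traffic with external hosts, and External covers only traffic with external hosts.

```go
package main

import "fmt"

// Mode is a hypothetical type for the cluster's configured IPsec mode.
type Mode string

const (
	Full     Mode = "Full"
	External Mode = "External"
)

// Plan describes what a single test would expect for a given mode.
type Plan struct {
	ExpectPodTrafficEncrypted      bool // east-west, pod to pod
	ExpectExternalTrafficEncrypted bool // north-south, host to external host
}

// planFor reads the mode and returns the expectations for it. The test never
// changes the mode itself, so ovnkube-node pods are not restarted mid-suite.
func planFor(mode Mode) (Plan, error) {
	switch mode {
	case Full:
		return Plan{ExpectPodTrafficEncrypted: true, ExpectExternalTrafficEncrypted: true}, nil
	case External:
		return Plan{ExpectExternalTrafficEncrypted: true}, nil
	default:
		return Plan{}, fmt.Errorf("unsupported IPsec mode %q for this test", mode)
	}
}

func main() {
	for _, m := range []Mode{Full, External} {
		plan, err := planFor(m)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Printf("%s: %+v\n", m, plan)
	}
}
```

Each CI lane would install the cluster in one mode and run the same test, which only reads the mode and picks its expectations.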

Depends on openshift/machine-config-operator#4864.

@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch 2 times, most recently from 0edd5b6 to dd54a1d Compare January 16, 2025 12:17
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

openshift-trt bot commented Jan 16, 2025

Job Failure Risk Analysis for sha: dd54a1d

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial High
[sig-imageregistry][Serial] Image signature workflow can push a signed image to openshift registry and verify it [apigroup:user.openshift.io][apigroup:image.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/serial]
This test has passed 100.00% of 185 runs on jobs [periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-single-node-serial] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-ipsec-serial Low
[bz-kube-storage-version-migrator] clusteroperator/kube-storage-version-migrator should not change condition/Available
This test has passed 69.80% of 4076 runs on release 4.19 [Overall] in the last week.

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch from dd54a1d to e68a744 Compare January 24, 2025 15:56

openshift-trt bot commented Feb 12, 2025

Job Failure Risk Analysis for sha: e68a744

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-ipsec-serial Low
[bz-kube-storage-version-migrator] clusteroperator/kube-storage-version-migrator should not change condition/Available
This test has passed 70.95% of 4743 runs on release 4.19 [Overall] in the last week.

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch from e68a744 to 744915f Compare February 17, 2025 08:23
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch from 744915f to cab8327 Compare February 17, 2025 10:41
@pperiyasamy
Member Author

/assign @tssurya

@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

1 similar comment
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

openshift-trt bot commented Feb 17, 2025

Job Failure Risk Analysis for sha: cab8327

Job Name Failure Risk
pull-ci-openshift-origin-master-okd-scos-e2e-aws-ovn High
[sig-arch] Only known images used by tests
This test has passed 100.00% of 28 runs on jobs [periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn] in the last 14 days.
pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade IncompleteTests
Tests for this run (104) are below the historical average (1679): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node Medium
[sig-node] static pods should start after being created
This test has passed 94.12% of 34 runs on jobs [periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-single-node] in the last 14 days.

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch from cab8327 to 59fcece Compare February 17, 2025 17:39
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

2 similar comments
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

openshift-trt bot commented Feb 18, 2025

Job Failure Risk Analysis for sha: f41d22e

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade IncompleteTests
Tests for this run (1994) are below the historical average (4443): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@pperiyasamy
Member Author

/assign @huiran0826

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch from f41d22e to dba2246 Compare February 18, 2025 13:01
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

@pperiyasamy pperiyasamy force-pushed the ipsec-debug-monitor-test-failure branch 2 times, most recently from 8eaaaa4 to 5db298d Compare February 18, 2025 16:53
@pperiyasamy pperiyasamy changed the title from "[DNM] Disable external ipsec mode test" to "Fix monitor test failures for IPsec serial e2e test" Feb 18, 2025
@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

openshift-trt bot commented Mar 10, 2025

Job Failure Risk Analysis for sha: c53288d

Job Name Failure Risk
pull-ci-openshift-origin-main-e2e-aws-ovn-etcd-scaling Medium
[bz-etcd][invariant] alert/etcdMembersDown should not be at or above info
Potential external regression detected for High Risk Test analysis
pull-ci-openshift-origin-main-e2e-azure IncompleteTests
Tests for this run (23) are below the historical average (2288): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-kube-apiserver-rollout Low
[Conformance][Suite:openshift/kube-apiserver/rollout][Jira:"kube-apiserver"][sig-kube-apiserver] kube-apiserver should roll out new revisions without disruption [apigroup:config.openshift.io][apigroup:operator.openshift.io]
This test has passed 42.86% of 7 runs on release 4.19 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:informing Network:ovn NetworkStack:ipv4 Owner:eng Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week.

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New tests seen in this PR at sha: c53288d

  • "[sig-network][Feature:IPsec] when using openshift ovn-kubernetes check traffic with IPsec [apigroup:config.openshift.io] [Suite:openshift/network/ipsec]" [Total: 4, Pass: 4, Fail: 0, Flake: 0]

@pperiyasamy
Member Author

/test e2e-aws-ovn-ipsec-serial

Contributor

openshift-ci bot commented Mar 11, 2025

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-kube-apiserver-rollout | c53288d | link | false | /test e2e-aws-ovn-kube-apiserver-rollout |
| ci/prow/e2e-openstack-ovn | c53288d | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-aws-disruptive | c53288d | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-azure | c53288d | link | false | /test e2e-azure |
| ci/prow/e2e-gcp-fips-serial | c53288d | link | false | /test e2e-gcp-fips-serial |
| ci/prow/okd-e2e-gcp | c53288d | link | false | /test okd-e2e-gcp |
| ci/prow/e2e-aws-ovn-etcd-scaling | c53288d | link | false | /test e2e-aws-ovn-etcd-scaling |
| ci/prow/4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback | c53288d | link | false | /test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback |
| ci/prow/e2e-azure-ovn-etcd-scaling | c53288d | link | false | /test e2e-azure-ovn-etcd-scaling |
| ci/prow/e2e-azure-ovn-upgrade | c53288d | link | false | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-metal-ipi-ovn-dualstack | c53288d | link | false | /test e2e-metal-ipi-ovn-dualstack |
| ci/prow/e2e-gcp-disruptive | c53288d | link | false | /test e2e-gcp-disruptive |
| ci/prow/e2e-vsphere-ovn-etcd-scaling | c53288d | link | false | /test e2e-vsphere-ovn-etcd-scaling |
| ci/prow/e2e-aws-ovn-single-node-upgrade | c53288d | link | false | /test e2e-aws-ovn-single-node-upgrade |
| ci/prow/e2e-metal-ipi-virtualmedia | c53288d | link | false | /test e2e-metal-ipi-virtualmedia |
| ci/prow/e2e-metal-ipi-serial-ovn-ipv6 | c53288d | link | false | /test e2e-metal-ipi-serial-ovn-ipv6 |
| ci/prow/e2e-openstack-serial | c53288d | link | false | /test e2e-openstack-serial |
| ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway | c53288d | link | false | /test e2e-metal-ipi-ovn-dualstack-local-gateway |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | c53288d | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-aws-ovn-ipsec-serial | c53288d | link | false | /test e2e-aws-ovn-ipsec-serial |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-trt bot commented Mar 11, 2025

Job Failure Risk Analysis for sha: c53288d

Job Name Failure Risk
pull-ci-openshift-origin-main-e2e-aws-ovn-etcd-scaling Medium
[bz-etcd][invariant] alert/etcdMembersDown should not be at or above info
Potential external regression detected for High Risk Test analysis
pull-ci-openshift-origin-main-e2e-azure IncompleteTests
Tests for this run (23) are below the historical average (2173): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-kube-apiserver-rollout Low
[Conformance][Suite:openshift/kube-apiserver/rollout][Jira:"kube-apiserver"][sig-kube-apiserver] kube-apiserver should roll out new revisions without disruption [apigroup:config.openshift.io][apigroup:operator.openshift.io]
This test has passed 28.57% of 7 runs on release 4.19 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:informing Network:ovn NetworkStack:ipv4 Owner:eng Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week.

Open Bugs
Component Readiness: [kube-apiserver] [Other] test regressed

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: c53288d

Job Name New Test Risk
pull-ci-openshift-origin-main-e2e-aws-ovn-ipsec-serial High - "[sig-network][Feature:IPsec] when using openshift ovn-kubernetes check traffic with IPsec [apigroup:config.openshift.io] [Suite:openshift/network/ipsec]" is a new test that was not present in all runs against the current commit.

New tests seen in this PR at sha: c53288d

  • "[sig-network][Feature:IPsec] when using openshift ovn-kubernetes check traffic with IPsec [apigroup:config.openshift.io] [Suite:openshift/network/ipsec]" [Total: 5, Pass: 5, Fail: 0, Flake: 0]

@pperiyasamy pperiyasamy changed the title from "Fix IPsec tests for monitor failures" to "SDN-4168: Fix IPsec tests for monitor failures" Mar 12, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 12, 2025
@openshift-ci-robot

openshift-ci-robot commented Mar 12, 2025

@pperiyasamy: This pull request references SDN-4168 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@pperiyasamy
Member Author

/hold cancel

We are tracking the nmstate-operator failure in bug https://issues.redhat.com/browse/OCPBUGS-52845 and will get it checked while testing the IPsec CI lane from PR openshift/release#61740.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 12, 2025
@pperiyasamy
Member Author

/assign @dgoodwin

@dgoodwin
Contributor

All looks good, but you appear to have renamed some tests. For things to work best, you should submit a test rename request here after this merges: https://github.com/openshift-eng/ci-test-mapping

Doing so will allow component readiness to track the new test's performance against its old name in 4.18 at GA time.

/approve

Contributor

openshift-ci bot commented Mar 12, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, martinkennelly, pperiyasamy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 12, 2025
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 4c1d2ce and 2 for PR HEAD c53288d in total

1 similar comment
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 4c1d2ce and 2 for PR HEAD c53288d in total

@openshift-merge-bot openshift-merge-bot bot merged commit 59d86be into openshift:main Mar 12, 2025
34 of 54 checks passed
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: openshift-enterprise-tests
This PR has been included in build openshift-enterprise-tests-container-v4.19.0-202503130217.p0.g59d86be.assembly.stream.el9.
All builds following this will include this PR.

pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Mar 13, 2025
As per changes in openshift/origin#29437 for IPsec E2E
tests, each IPsec mode Full and External must be tested with two different CI
lanes, so this commit replaces existing e2e-aws-ovn-ipsec-serial CI lane with
e2e-aws-ovn-ipsec-full-mode and e2e-aws-ovn-ipsec-external-mode CI lanes.

This commit also makes both jobs mandatory and periodic jobs which helps
to make IPsec eligible for component readiness.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Mar 19, 2025
As per changes in openshift/origin#29437 for IPsec E2E
tests, each IPsec mode Full and External must be tested with two different CI
lanes, so this commit replaces existing e2e-aws-ovn-ipsec-serial CI lane with
e2e-aws-ovn-ipsec-full-mode and e2e-aws-ovn-ipsec-external-mode CI lanes.

This commit also makes both jobs as default presubmit for cluster network operator.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Apr 4, 2025
As per changes in openshift/origin#29437 for IPsec E2E
tests, each IPsec mode Full and External must be tested with two different CI
lanes, so this commit replaces existing e2e-aws-ovn-ipsec-serial CI lane with
e2e-aws-ovn-ipsec-full-mode and e2e-aws-ovn-ipsec-external-mode CI lanes.

This commit also makes both jobs as default presubmit for cluster network operator.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Apr 11, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Apr 14, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
(cherry picked from commit f5df7d1)
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Apr 16, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request Apr 16, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
pperiyasamy added a commit to pperiyasamy/release that referenced this pull request May 28, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
openshift-merge-bot bot pushed a commit to openshift/release that referenced this pull request Jun 5, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
mehabhalodiya pushed a commit to mehabhalodiya/release that referenced this pull request Jun 12, 2025
As per changes in openshift/origin#29437 for IPsec E2E tests, each IPsec mode
Full and External must be tested separately, so this commit updates
openshift-e2e-test step with new test type called ipsec-suite and ipsec test
suite is executed under this test type for each ipsec modes.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
8 participants