-
Notifications
You must be signed in to change notification settings - Fork 1.9k
ci-operator/step-registry: Unify OPENSHIFT_*_RELEASE_IMAGE dependencies #12959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
openshift/ci-tools#1315 will give us rehearsals. |
openshift/ci-tools#1315 has landed and been published. /test pj-rehearse |
Hmm:
I dunno what that's about. I'll try again after #12966 lands to get the rehearsals down to a reasonable number. |
f2a7424
to
2784ae6
Compare
LGTM |
…grade-minor Some issues only turn up in minor-version bumps [1], and with this change machine-config developers will be able to comment on a pull request with: /test e2e-upgrade-minor to launch a job that installs the previous minor's OCP candidate release and updates to a release built from the in-flight pull request. Generated by manually editing ci-operator/config following [2], running: $ make update and editing the configurable always_run and optional properties [3]. WIP because I'm waiting on openshift#12959. I'll rebase this on top after that lands. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1890250#c5 [2]: https://github.com/openshift/ci-docs/blame/047e38285a227888733b1f5a49fd844d047d50f8/content/en/docs/architecture/ci-operator.md#L278-L283 [3]: https://github.com/openshift/ci-tools/blob/d45a03ce3a38d79a1b7bf82f35919505dc3d1c9e/GENERATOR.md#hand-edited-prow-configuration
Not sure what that's about, but it seems like it should be unrelated to my changes. |
and similar for e2e-vsphere-operator. These also seem orthogonal to my changes. |
Another e2e-vsphere-operator job:
Similar for odh-manifests-e2e, cluster-api-provider-gcp-master-e2e-gcp-operator, and openshift-cluster-samples-operator-master-e2e-aws-jenkins.
|
Whole bunch more who died in bootstrap waiting on a Kube API:
openshift-installer-master-e2e-vsphere-upi did too:
but may not have gathered logs (it also complains about |
Digging into the openshift-builder-master-e2e-aws-builds failure: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12959/rehearse-12959-pull-ci-openshift-builder-master-e2e-aws-builds/1319045517484756992/artifacts/e2e-aws-builds/ipi-install-install/log-bundle-20201021231758.tar >log-bundle.tar.gz
$ tar xOz log-bundle-20201021231758/bootstrap/journals/bootkube.log <log-bundle.tar.gz | head -n6
-- Logs begin at Wed 2020-10-21 22:53:00 UTC, end at Wed 2020-10-21 23:17:59 UTC. --
Oct 21 22:53:46 ip-10-0-26-21 systemd[1]: Started Bootstrap a Kubernetes cluster.
Oct 21 22:53:56 ip-10-0-26-21 bootkube.sh[2218]: Moving OpenShift manifests in with the rest of them
Oct 21 22:53:56 ip-10-0-26-21 bootkube.sh[2218]: Rendering Cluster Version Operator Manifests...
Oct 21 22:53:56 ip-10-0-26-21 bootkube.sh[2218]: Rendering CEO Manifests...
Oct 21 22:54:01 ip-10-0-26-21 bootkube.sh[2218]: Error: unknown flag: --etcd-ca-key
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12959/rehearse-12959-pull-ci-openshift-builder-master-e2e-aws-builds/1319045517484756992/artifacts/e2e-aws-builds/ipi-install-install/.openshift_install.log | grep 'Found override for release image.' So I'm somehow fumbling my attempt to export |
2784ae6
to
d4f03b5
Compare
Ok, more green with d4f03b5848e8. Summarizing the failures:
Drilling into openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6: $ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/12959/rehearse-12959-pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6/1319069014600716288/build-log.txt | grep -o 'OPENSHIFT_.*RELEASE_IMAGE.*' | sort | uniq -c
10 OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=virthost.ostest.test.metalkube.org:5000/localimages/local-release-image:latest
24 OPENSHIFT_RELEASE_IMAGE=registry.build01.ci.openshift.org/ci-op-dc93bctg/release@sha256:557e4f6955e28f535f63c2b51c1c1636e279546b0fa44633b4e4ef4267c4327a
1 OPENSHIFT_RELEASE_IMAGE=registry.build01.ci.openshift.org/ci-op-dc93bctg/release@sha256:557e4f6955e28f535f63c2b51c1c1636e279546b0fa44633b4e4ef4267c4327a' Comparing with a production run: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-network-operator/761/pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6/1318951606636515328/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-devscripts-setup/container-logs/test.log | grep -o 'OPENSHIFT_.*RELEASE_IMAGE.*' | sort | uniq -c
10 OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=virthost.ostest.test.metalkube.org:5000/localimages/local-release-image:latest
24 OPENSHIFT_RELEASE_IMAGE=registry.build01.ci.openshift.org/ci-op-rihshcq3/release@sha256:473efc3ec597ff3bdb94214e71fb4727878f04859880116157be98119eb01028
1 OPENSHIFT_RELEASE_IMAGE=registry.build01.ci.openshift.org/ci-op-rihshcq3/release@sha256:473efc3ec597ff3bdb94214e71fb4727878f04859880116157be98119eb01028' So that seems reasonable. I dunno. As far as I can tell this is ready to merge, and I expect we'll hear about it if any of the rehearsal failures are because I broke something. |
I guess @stbenjam , @andfasano , or one of the other baremetalds approvers may want to weigh in on whether I've broken their job ;) |
The failures on My main concern (not necessarily blocking) is about introducing now such change in the baremetalds repo, while there is another in-flight PR #12779 that will require the |
Higher where? Setting defaults on the steps with the ability to override at both the workflow and job-config level, which is what we have today, covers all the use cases I can think of. |
Directly at the workflow level. As I said, it's not a major concern for me, since it's something that eventually could be reviewed later if really needed. |
@wking: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
d4f03b5
to
1cc2c8e
Compare
Rebased on master with d4f03b5848 -> 1cc2c8e0da. |
1cc2c8e
to
dd50db3
Compare
Tests generally install one OpenShift release, and then perform some actions to validate that release. Sometimes the actions include updating to a different release, or a series of different releases. With this commit, I'm pivoting to dependency naming that more clearly reflects these goals, instead of "RELEASE_IMAGE_LATEST", which was a reflection of the default release:latest value, and not an expression of the intended semantics. I've shifted to explicitly declaring these image dependencies, taking advantage of openshift/ci-tools@782008c873 (ci-operator: allow users to depend on images in steps, 2020-07-30, openshift/ci-tools#1044), instead of relying on the implicit RELEASE_IMAGE_LATEST and similar. I've dropped explicit dependencies that were not used in a step's associated commands, like RELEASE_IMAGE_LATEST in ipi-install since eb8eb32 (step-registry: add upgrade workflows, 2020-08-25, openshift#11247). I've dropped explicit dependency overrides that set the same value as the default. For example, there was no need to set: OPENSHIFT_UPGRADE_RELEASE_IMAGE: "release:latest" in openshift-upgrade-aws-loki, because release:latest is already the default in all steps that consume OPENSHIFT_UPGRADE_RELEASE_IMAGE. I've dropped the _OVERRIDE suffix from the dependencies, because while we may set that suffix when communicating with the openshift-install command, it's a legitimate knob and not a strange override in each steps' public API.
dd50db3
to
eb611c1
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: wking The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -8,8 +8,6 @@ ref: | |||
memory: 2Gi | |||
dependencies: | |||
- name: "release:latest" | |||
env: OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vrutkovs , why was this not defaulting to stable:initial
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stable
imagestream doesn't have a release imagetag, but I'm fine with defaulting this to release:initial
@wking: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@@ -259,9 +259,6 @@ tests: | |||
- as: e2e-aws-selfupgrade | |||
steps: | |||
cluster_profile: aws | |||
dependencies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self upgrade tests need to have
OPENSHIFT_INSTALL_RELEASE_IMAGE: release:latest
OPENSHIFT_UPGRADE_RELEASE_IMAGE: release:initial
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
@wking: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Tests generally install one OpenShift release, and then perform some actions to validate that release. Sometimes the actions include updating to a different release, or a series of different releases. With this commit, I'm pivoting to dependency naming that more clearly reflects these goals, instead of
RELEASE_IMAGE_LATEST
, which was a reflection of the defaultrelease:latest
value, and not an expression of the intended semantics.I've shifted to explicitly declaring these image
dependencies
, taking advantage of openshift/ci-tools@782008c873 (openshift/ci-tools#1044), instead of relying on the implicitRELEASE_IMAGE_LATEST
and similar.I've dropped explicit
dependencies
that were not used in a step's associated commands, likeRELEASE_IMAGE_LATEST
in ipi-install since eb8eb32 (#11247).I've dropped explicit dependency overrides that set the same value as the default. For example, there was no need to set:
in
openshift-upgrade-aws-loki
, becauserelease:latest
is already the default in all steps that consumeOPENSHIFT_UPGRADE_RELEASE_IMAGE
.I've dropped the
_OVERRIDE
suffix from thedependencies
, because while we may set that suffix when communicating with theopenshift-install
command, it's a legitimate knob and not a strange override in each steps' public API.