
[release-4.18] OCPBUGS-45005: Add retry to ccoctl gcp create functions #792


openshift-cherrypick-robot

This is an automated cherry-pick of #781

/assign openshift-ci-robot

The ccoctl gcp create functions occasionally fail when recently created
resources have not yet replicated in the cloud. This change adds retry
functionality to increase the success rate when this happens.
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Detected clone of Jira Issue OCPBUGS-44933 with correct target version. Will retitle the PR to link to the clone.
/retitle [release-4.18] OCPBUGS-45005: Add retry to ccoctl gcp create functions

In response to this:

This is an automated cherry-pick of #781

/assign openshift-ci-robot

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.18] OCPBUGS-44933: Add retry to ccoctl gcp create functions" to "[release-4.18] OCPBUGS-45005: Add retry to ccoctl gcp create functions" on Dec 4, 2024
@openshift-ci-robot openshift-ci-robot added the following labels on Dec 4, 2024:
  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-45005, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.0) matches configured target version for branch (4.18.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note type set to "Release Note Not Required"
  • dependent bug Jira Issue OCPBUGS-44933 is in the state MODIFIED, which is one of the valid states (MODIFIED, ON_QA, VERIFIED)
  • dependent Jira Issue OCPBUGS-44933 targets the "4.19.0" version, which is one of the valid target versions: 4.19.0
  • bug has dependents

Requesting review from QA contact:
/cc @jianping-shu

The bug has been updated to refer to the pull request using the external bug tracker.


@huangmingxia

Built an image with cluster bot (build openshift/cloud-credential-operator#792); the verification was successful.

The logs show a delay while newly created resources become available: when adding predefined roles to an IAM service account fails, ccoctl retries and the operation then succeeds.

12-04 17:14:36.696  [INFO] Creating workload identity and IAM roles
12-04 17:14:36.697  Running Command: ./ccoctl gcp create-all  --name='mihuang1204g-10378' --project 'openshift-qe' --region='us-central1' --credentials-requests-dir='/home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/cco-cred-requests' --output-dir '/home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts'
12-04 17:14:36.697  2024/12/04 09:14:36 Credentials loaded from environment variable "GOOGLE_CREDENTIALS", file "/home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/gcpcreds20241204-270-1na6lje"
12-04 17:14:36.697  2024/12/04 09:14:36 Generating RSA keypair
12-04 17:14:37.256  2024/12/04 09:14:37 Writing private key to /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/serviceaccount-signer.private
12-04 17:14:37.256  2024/12/04 09:14:37 Writing public key to /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/serviceaccount-signer.public
12-04 17:14:37.256  2024/12/04 09:14:37 Copying signing key for use by installer
12-04 17:14:37.814  2024/12/04 09:14:37 Workload identity pool created with name mihuang1204g-10378
12-04 17:14:38.376  2024/12/04 09:14:38 Bucket mihuang1204g-10378-oidc created
12-04 17:14:38.934  2024/12/04 09:14:38 Bucket mihuang1204g-10378-oidc is set to be publicly readable
12-04 17:14:38.934  2024/12/04 09:14:38 OpenID Connect discovery document in the S3 bucket mihuang1204g-10378-oidc at .well-known/openid-configuration updated
12-04 17:14:38.934  2024/12/04 09:14:38 Reading public key
12-04 17:14:38.934  2024/12/04 09:14:38 JSON web key set (JWKS) in the S3 bucket mihuang1204g-10378-oidc at keys.json updated
12-04 17:14:39.493  2024/12/04 09:14:39 workload identity provider created with name mihuang1204g-10378
12-04 17:14:39.493  2024/12/04 09:14:39 Wrote cluster authentication manifest at path /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/cluster-authentication-02-config.yaml
12-04 17:14:39.493  2024/12/04 09:14:39 Issuer URL (serviceAccountIssuer) is https://storage.googleapis.com/mihuang1204g-10378-oidc
12-04 17:14:39.493  2024/12/04 09:14:39 Ignoring CredentialsRequest openshift-cloud-credential-operator/openshift-cluster-api-gcp with tech-preview annotation
12-04 17:14:46.013  2024/12/04 09:14:45 IAM service account mihuang1204g-10378-openshift-gcp-ccm created
12-04 17:14:46.013  2024/12/04 09:14:45 Existing IAM custom role openshift-qe-openshift-gcp-ccm found, updating permissions
12-04 17:14:48.518  2024/12/04 09:14:48 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-gcp-ccm
12-04 17:14:48.518  2024/12/04 09:14:48 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-cloud-controller-manager-gcp-ccm-cloud-credentials-credentials.yaml
12-04 17:14:53.761  2024/12/04 09:14:53 IAM service account mihuang1204g-10378-openshift-machine-api-gcp created
12-04 17:14:53.761  2024/12/04 09:14:53 Existing IAM custom role openshift-qe-openshift-machine-api-gcp found, updating permissions
12-04 17:14:53.761  2024/12/04 09:14:53 Unexpected permissions found on existing custom role openshift-qe-openshift-machine-api-gcp: compute.instanceGroups.use
12-04 17:14:55.146  2024/12/04 09:14:55 Unable to add predefined roles to IAM service account, retrying...
12-04 17:15:09.963  2024/12/04 09:15:07 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-machine-api-gcp
12-04 17:15:09.963  2024/12/04 09:15:07 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-machine-api-gcp-cloud-credentials-credentials.yaml
12-04 17:15:13.218  2024/12/04 09:15:13 IAM service account mihuang1204g-10378-cloud-credential-operator-gcp-ro-creds created
12-04 17:15:13.472  2024/12/04 09:15:13 Existing IAM custom role openshift-qe-cloud-credential-operator-gcp-ro-creds found, updating permissions
12-04 17:15:13.472  2024/12/04 09:15:13 Unexpected permissions found on existing custom role openshift-qe-cloud-credential-operator-gcp-ro-creds: iam.roles.list
12-04 17:15:15.978  2024/12/04 09:15:15 Updated policy bindings for IAM service account mihuang1204g-10378-cloud-credential-operator-gcp-ro-creds
12-04 17:15:15.979  2024/12/04 09:15:15 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-cloud-credential-operator-cloud-credential-operator-gcp-ro-creds-credentials.yaml
12-04 17:15:21.223  2024/12/04 09:15:21 IAM service account mihuang1204g-10378-openshift-image-registry-gcs created
12-04 17:15:21.477  2024/12/04 09:15:21 Existing IAM custom role openshift-qe-openshift-image-registry-gcs found, updating permissions
12-04 17:15:24.730  2024/12/04 09:15:24 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-image-registry-gcs
12-04 17:15:24.730  2024/12/04 09:15:24 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-image-registry-installer-cloud-credentials-credentials.yaml
12-04 17:15:29.984  2024/12/04 09:15:29 IAM service account mihuang1204g-10378-openshift-ingress-gcp created
12-04 17:15:30.240  2024/12/04 09:15:30 Existing IAM custom role openshift-qe-openshift-ingress-gcp found, updating permissions
12-04 17:15:31.598  2024/12/04 09:15:31 Unable to add predefined roles to IAM service account, retrying...
12-04 17:15:46.414  2024/12/04 09:15:43 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-ingress-gcp
12-04 17:15:46.414  2024/12/04 09:15:43 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-ingress-operator-cloud-credentials-credentials.yaml
12-04 17:15:49.670  2024/12/04 09:15:49 IAM service account mihuang1204g-10378-openshift-cloud-network-config-controller-gcp created
12-04 17:15:49.925  2024/12/04 09:15:49 Existing IAM custom role openshift-qe-openshift-cloud-network-config-controller-gcp found, updating permissions
12-04 17:15:52.430  2024/12/04 09:15:51 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-cloud-network-config-controller-gcp
12-04 17:15:52.430  2024/12/04 09:15:51 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-cloud-network-config-controller-cloud-credentials-credentials.yaml
12-04 17:15:57.658  2024/12/04 09:15:57 IAM service account mihuang1204g-10378-openshift-gcp-pd-csi-driver-operator created
12-04 17:15:57.912  2024/12/04 09:15:57 Existing IAM custom role openshift-qe-openshift-gcp-pd-csi-driver-operator found, updating permissions
12-04 17:16:00.417  2024/12/04 09:16:00 Updated policy bindings for IAM service account mihuang1204g-10378-openshift-gcp-pd-csi-driver-operator
12-04 17:16:00.417  2024/12/04 09:16:00 Saved credentials configuration to: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/pre_action/sts/manifests/openshift-cluster-csi-drivers-gcp-pd-cloud-credentials-credentials.yaml
12-04 17:16:00.418  cluster-authentication-02-config.yaml
12-04 17:16:00.418  openshift-cloud-controller-manager-gcp-ccm-cloud-credentials-credentials.yaml
12-04 17:16:00.418  openshift-cloud-credential-operator-cloud-credential-operator-gcp-ro-creds-credentials.yaml
12-04 17:16:00.418  openshift-cloud-network-config-controller-cloud-credentials-credentials.yaml
12-04 17:16:00.418  openshift-cluster-csi-drivers-gcp-pd-cloud-credentials-credentials.yaml
12-04 17:16:00.418  openshift-image-registry-installer-cloud-credentials-credentials.yaml
12-04 17:16:00.418  openshift-ingress-operator-cloud-credentials-credentials.yaml
12-04 17:16:00.418  openshift-machine-api-gcp-cloud-credentials-credentials.yaml
12-04 17:16:00.418  bound-service-account-signing-key.key
12-04 17:16:00.418  [INFO] >> Create gcp sts Resources Done.
12-04 17:16:00.418  [INFO] No any pre-action!
12-04 17:16:00.697  waiting for operation up to 36000 seconds..

The cluster was successfully installed.

12-04 18:00:20.333  NAME                                       VERSION                                                   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
12-04 18:00:20.333  authentication                             4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      3m41s   
12-04 18:00:20.333  baremetal                                  4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.334  cloud-controller-manager                   4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      33m     
12-04 18:00:20.334  cloud-credential                           4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      28m     
12-04 18:00:20.334  cluster-autoscaler                         4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      28m     
12-04 18:00:20.334  config-operator                            4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      30m     
12-04 18:00:20.334  console                                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      10m     
12-04 18:00:20.334  control-plane-machine-set                  4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      25m     
12-04 18:00:20.335  csi-snapshot-controller                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.335  dns                                        4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      28m     
12-04 18:00:20.335  etcd                                       4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      28m     
12-04 18:00:20.335  image-registry                             4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      14m     
12-04 18:00:20.335  ingress                                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      16m     
12-04 18:00:20.335  insights                                   4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.336  kube-apiserver                             4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      26m     
12-04 18:00:20.336  kube-controller-manager                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      26m     
12-04 18:00:20.336  kube-scheduler                             4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      26m     
12-04 18:00:20.336  kube-storage-version-migrator              4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.336  machine-api                                4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      20m     
12-04 18:00:20.336  machine-approver                           4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      28m     
12-04 18:00:20.337  machine-config                             4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.337  marketplace                                4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.337  monitoring                                 4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      11m     
12-04 18:00:20.337  network                                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      33m     
12-04 18:00:20.337  node-tuning                                4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      23m     
12-04 18:00:20.337  olm                                        4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.338  openshift-apiserver                        4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      22m     
12-04 18:00:20.338  openshift-controller-manager               4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      24m     
12-04 18:00:20.338  openshift-samples                          4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      20m     
12-04 18:00:20.338  operator-lifecycle-manager                 4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.338  operator-lifecycle-manager-catalog         4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.338  operator-lifecycle-manager-packageserver   4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      24m     
12-04 18:00:20.339  service-ca                                 4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      30m     
12-04 18:00:20.339  storage                                    4.18.0-0.ci.test-2024-12-04-090944-ci-ln-51b4vst-latest   True        False         False      29m     
12-04 18:00:20.339  NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
12-04 18:00:20.339  master   rendered-master-3636e11b46c8b1637cbc423c24c0f2d8   True      False      False      3              3                   3                     0                      29m
12-04 18:00:20.339  worker   rendered-worker-7e19b8cc8f200bcd124520ff224bc925   True      False      False      3              3                   3                     0                      29m


codecov bot commented Dec 4, 2024

Codecov Report

Attention: Patch coverage is 28.57143% with 20 lines in your changes missing coverage. Please review.

Project coverage is 46.99%. Comparing base (a49adf6) to head (5279294).
Report is 2 commits behind head on release-4.18.

Files with missing lines                                 Patch %   Lines
...kg/cmd/provisioning/gcp/create_service_accounts.go    23.07%    9 Missing and 1 partial ⚠️
...visioning/gcp/create_workload_identity_provider.go    28.57%    9 Missing and 1 partial ⚠️
Additional details and impacted files


@@               Coverage Diff                @@
##           release-4.18     #792      +/-   ##
================================================
- Coverage         47.03%   46.99%   -0.05%     
================================================
  Files                97       97              
  Lines             11835    11856      +21     
================================================
+ Hits               5567     5572       +5     
- Misses             5655     5671      +16     
  Partials            613      613              
Files with missing lines                                 Coverage Δ
...md/provisioning/azure/create_managed_identities.go    57.71% <100.00%> (ø)
...kg/cmd/provisioning/gcp/create_service_accounts.go    52.50% <23.07%> (-1.31%) ⬇️
...visioning/gcp/create_workload_identity_provider.go    49.42% <28.57%> (-1.50%) ⬇️
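
The percentages in this report can be checked by hand. The sketch below assumes coverage is simply hits divided by total lines; the function `coverage` is a hypothetical helper, and the patch figures (8 covered lines out of 28) are inferred from the reported 28.57143% with 20 lines missing, not stated in the report.

```go
package main

import "fmt"

// coverage returns hits as a percentage of lines.
func coverage(hits, lines int) float64 {
	return 100 * float64(hits) / float64(lines)
}

func main() {
	// Hits and Lines taken from the coverage diff: base, then head.
	base := coverage(5567, 11835)
	head := coverage(5572, 11856)
	fmt.Printf("base %.3f%%  head %.3f%%  delta %.3f\n", base, head, head-base)

	// Inferred patch figures: 20 uncovered lines out of 28 changed lines.
	fmt.Printf("patch %.5f%%\n", coverage(28-20, 28))
}
```

The computed project values land within about 0.01 of the reported 47.03% and 46.99%; the small differences are presumably due to how the report rounds or truncates.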

@jstuever
Contributor

jstuever commented Dec 4, 2024

/retest

@jstuever
Contributor

jstuever commented Dec 4, 2024

/lgtm
/approve
/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Dec 4, 2024
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2024
Contributor

openshift-ci bot commented Dec 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jstuever, openshift-cherrypick-robot

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 4, 2024
@jstuever
Contributor

jstuever commented Dec 4, 2024

/test e2e-gcp-manual-oidc

@jstuever
Contributor

jstuever commented Dec 4, 2024

/hold
for e2e-gcp-manual-oidc

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 4, 2024
Contributor

openshift-ci bot commented Dec 4, 2024

@openshift-cherrypick-robot: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jstuever
Contributor

jstuever commented Dec 4, 2024

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 4, 2024
@huangmingxia

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Dec 5, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 4ed7424 into openshift:release-4.18 Dec 5, 2024
14 checks passed
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Jira Issue OCPBUGS-45005: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-45005 has been moved to the MODIFIED state.


@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-cloud-credential-operator
This PR has been included in build ose-cloud-credential-operator-container-v4.18.0-202412042342.p0.g4ed7424.assembly.stream.el9.
All builds following this will include this PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

5 participants