BlueGreen Rollout Stuck in Progressing Blocking new Rollout #4294

Open
@Suraiya-Hameed

Checklist:

  • I've included steps to reproduce the bug.
  • I've included the version of argo rollouts.

Describe the bug

While using the BlueGreen strategy in Argo Rollouts (v1.8.2), I encountered a scenario where a rollout became stuck in the "Progressing" state and could not be recovered or promoted to stable. After a new deployment was triggered, the new ReplicaSet (RS) was scaled up to the full replica count and marked as preview, even though previewReplicaCount is set to 1. The new RS never gets promoted to stable/active, and the old RS remains marked as stable. I have attempted multiple recovery steps, including:

  • Deploying new versions (new Helm releases)
  • Helm rollback
  • Manual promotion (kubectl argo rollouts promote)
  • Restarting the rollout

None of these actions have resolved the issue. I’m unable to deploy any new version, and the rollout remains stuck in this broken state.
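
For readers who want to inspect a rollout in a similar state, the manual steps listed above map roughly onto the following kubectl plugin commands. This is a minimal sketch, not the exact commands that were run: the helper function names are hypothetical, and the rollout name and namespace are passed in as arguments.

```shell
# Hypothetical helpers wrapping the Argo Rollouts kubectl plugin.
# $1 = rollout name, $2 = namespace.

# Show the rollout tree, including which ReplicaSet is active, stable, or preview.
inspect_rollout() {
  kubectl argo rollouts get rollout "$1" -n "$2"
}

# Manual promotion of the rollout.
promote_rollout() {
  kubectl argo rollouts promote "$1" -n "$2"
}

# Restart the pods of the rollout.
restart_rollout() {
  kubectl argo rollouts restart "$1" -n "$2"
}
```

With the names that appear in the logs below, a call would look like `inspect_rollout simpleapp-instance-3 simpleapp-ns-3`.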

This issue occurred once, and I have not been able to reproduce it since. One notable event in the metrics was a scaling event (likely triggered by HPA or KEDA) that occurred during the rollout, which may be relevant. Both prePromotionAnalysis and postPromotionAnalysis are configured, and auto-promotion is enabled.

Both the preview and the active/stable RS get scaled on a scaling event.

To Reproduce

Unfortunately, this issue is not consistently reproducible. Here’s what led up to the incident:

  1. Apply a rollout with BlueGreen strategy
  2. Ensure previewReplicaCount is set (in our case, to 1)
  3. Trigger a new rollout (e.g. Helm upgrade)
  4. Observe that the rollout gets stuck: the new RS is fully scaled and remains in preview
  5. Attempt manual promotion or a rollout restart; neither resolves the issue
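
The setup described in these steps can be sketched as a manifest. This is an illustration under assumptions, not the actual manifest from the affected cluster: the service names and the `role: preview` label are taken from the logs below, while the replica count, container image, selector labels, and the AnalysisTemplate name `smoke-check` are hypothetical placeholders.

```shell
# Write a hypothetical Rollout manifest that matches the described setup.
cat > rollout-sketch.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: simpleapp-instance-3
  namespace: simpleapp-ns-3
spec:
  replicas: 10                        # assumption: real value unknown
  selector:
    matchLabels:
      app: simpleapp                  # hypothetical label
  template:
    metadata:
      labels:
        app: simpleapp
    spec:
      containers:
        - name: app
          image: example/simpleapp:1.0.0   # hypothetical image
  strategy:
    blueGreen:
      activeService: simpleapp-instance-3
      previewService: simpleapp-instance-3-preview
      previewReplicaCount: 1          # preview RS should start with one replica
      autoPromotionEnabled: true
      previewMetadata:
        labels:
          role: preview               # label seen in the ephemeral metadata errors
      prePromotionAnalysis:
        templates:
          - templateName: smoke-check # hypothetical AnalysisTemplate
      postPromotionAnalysis:
        templates:
          - templateName: smoke-check
EOF
```

Applying it with `kubectl apply -f rollout-sketch.yaml` and then changing the image tag would trigger a new BlueGreen rollout. An HPA or KEDA ScaledObject targeting the Rollout would be needed in addition to mimic the scaling event mentioned above.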

Expected behavior

New ReplicaSet should be marked as preview with only previewReplicaCount replicas, proceed through pre- and post-promotion analysis, and then be promoted to stable. Old ReplicaSet should be demoted and eventually scaled down.

Screenshots

Image

Version

1.8.2

Logs

# Paste the logs from the rollout controller

# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts

# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep 'rollout=<ROLLOUTNAME>'
time="2025-05-23T22:24:25Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:25Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:25Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:20:00 +0000 UTC - now: 2025-05-23 22:24:25.993005501 +0000 UTC m=+15950.933307485]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Patched: {\"status\":{\"availableReplicas\":10,\"conditions\":[{\"lastTransitionTime\":\"2025-05-22T21:33:55Z\",\"lastUpdateTime\":\"2025-05-22T21:33:55Z\",\"message\":\"Rollout is not healthy\",\"reason\":\"RolloutHealthy\",\"status\":\"False\",\"type\":\"Healthy\"},{\"lastTransitionTime\":\"2025-05-22T21:34:43Z\",\"lastUpdateTime\":\"2025-05-22T21:34:43Z\",\"message\":\"RolloutCompleted\",\"reason\":\"RolloutCompleted\",\"status\":\"False\",\"type\":\"Completed\"},{\"lastTransitionTime\":\"2025-05-23T22:09:38Z\",\"lastUpdateTime\":\"2025-05-23T22:20:00Z\",\"message\":\"ReplicaSet \\\"simpleapp-instance-3-6d9b8c7dbb\\\" is progressing.\",\"reason\":\"ReplicaSetUpdated\",\"status\":\"True\",\"type\":\"Progressing\"},{\"lastTransitionTime\":\"2025-05-23T22:24:25Z\",\"lastUpdateTime\":\"2025-05-23T22:24:25Z\",\"message\":\"Rollout does not have minimum availability\",\"reason\":\"AvailableReason\",\"status\":\"False\",\"type\":\"Available\"}],\"message\":\"updated replicas are still becoming available\",\"readyReplicas\":10}}" generation=190 namespace=simpleapp-ns-3 resourceVersion=9475232 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="persisted to informer" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9475232 rollout=simpleapp-instance-3 time_ms=30.673155
time="2025-05-23T22:24:26Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:20:00 +0000 UTC - now: 2025-05-23 22:24:26.093617553 +0000 UTC m=+15951.033919537]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="No status changes. Skipping patch" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Queueing up rollout for a progress after 333s" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:26Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3 time_ms=73.379848
time="2025-05-23T22:24:42Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Patched: {\"status\":{\"availableReplicas\":11,\"conditions\":[{\"lastTransitionTime\":\"2025-05-22T21:33:55Z\",\"lastUpdateTime\":\"2025-05-22T21:33:55Z\",\"message\":\"Rollout is not healthy\",\"reason\":\"RolloutHealthy\",\"status\":\"False\",\"type\":\"Healthy\"},{\"lastTransitionTime\":\"2025-05-22T21:34:43Z\",\"lastUpdateTime\":\"2025-05-22T21:34:43Z\",\"message\":\"RolloutCompleted\",\"reason\":\"RolloutCompleted\",\"status\":\"False\",\"type\":\"Completed\"},{\"lastTransitionTime\":\"2025-05-23T22:09:38Z\",\"lastUpdateTime\":\"2025-05-23T22:24:42Z\",\"message\":\"ReplicaSet \\\"simpleapp-instance-3-6d9b8c7dbb\\\" is progressing.\",\"reason\":\"ReplicaSetUpdated\",\"status\":\"True\",\"type\":\"Progressing\"},{\"lastTransitionTime\":\"2025-05-23T22:24:42Z\",\"lastUpdateTime\":\"2025-05-23T22:24:42Z\",\"message\":\"Rollout has minimum availability\",\"reason\":\"AvailableReason\",\"status\":\"True\",\"type\":\"Available\"}],\"message\":\"active service cutover pending\",\"readyReplicas\":11}}" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="persisted to informer" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479503 rollout=simpleapp-instance-3 time_ms=29.567562
time="2025-05-23T22:24:42Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:24:42 +0000 UTC - now: 2025-05-23 22:24:42.326139903 +0000 UTC m=+15967.266441887]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="No status changes. Skipping patch" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Queueing up rollout for a progress after 599s" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:24:42Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3 time_ms=3.029035
time="2025-05-23T22:29:11Z" level=info msg="syncing service" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3 service=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="syncing service" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3 service=simpleapp-instance-3-preview
time="2025-05-23T22:29:11Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:24:42 +0000 UTC - now: 2025-05-23 22:29:11.797111677 +0000 UTC m=+16236.737413651]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="No status changes. Skipping patch" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Queueing up rollout for a progress after 330s" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3 time_ms=5.669488
time="2025-05-23T22:29:11Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:24:42 +0000 UTC - now: 2025-05-23 22:29:11.891613395 +0000 UTC m=+16236.831915379]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="No status changes. Skipping patch" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Queueing up rollout for a progress after 330s" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3 time_ms=94.977824
time="2025-05-23T22:29:11Z" level=info msg="Started syncing rollout" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling stable ReplicaSet 'simpleapp-instance-3-6c55dfcff'" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Pre Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciling Post Promotion Analysis" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Timed out (false) [last progress check: 2025-05-23 22:24:42 +0000 UTC - now: 2025-05-23 22:29:11.997078567 +0000 UTC m=+16236.937380541]" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="No status changes. Skipping patch" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Queueing up rollout for a progress after 330s" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T22:29:11Z" level=info msg="Reconciliation completed" generation=190 namespace=simpleapp-ns-3 resourceVersion=9479675 rollout=simpleapp-instance-3 time_ms=104.95787800000001

Error logs

time="2025-05-23T16:57:03Z" level=info msg="failed to sync ephemeral metadata nil to ReplicaSet simpleapp-instance-3-7ff87c4998: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T16:57:03Z" level=error msg="roCtx.reconcile err failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" generation=169 namespace=simpleapp-ns-3 resourceVersion=9217413 rollout=simpleapp-instance-3
time="2025-05-23T16:57:03Z" level=error msg="rollout syncHandler error: failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:11:50Z" level=info msg="failed to sync ephemeral metadata &PodTemplateMetadata{Labels:map[string]string{role: preview,},Annotations:map[string]string{},} to ReplicaSet simpleapp-instance-3-7ff87c4998: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:11:50Z" level=error msg="roCtx.reconcile err failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" generation=170 namespace=simpleapp-ns-3 resourceVersion=9250514 rollout=simpleapp-instance-3
time="2025-05-23T17:11:50Z" level=error msg="rollout syncHandler error: failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-7ff87c4998: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-7ff87c4998\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:11:51Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"simpleapp-instance-3-58f9d87cb9-4m4s6\": the object has been modified; please apply your changes to the latest version and try again" generation=170 namespace=simpleapp-ns-3 resourceVersion=9250514 rollout=simpleapp-instance-3
time="2025-05-23T17:11:51Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"simpleapp-instance-3-58f9d87cb9-4m4s6\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:13:51Z" level=info msg="failed to sync ephemeral metadata &PodTemplateMetadata{Labels:map[string]string{role: preview,},Annotations:map[string]string{},} to ReplicaSet simpleapp-instance-3-58f9d87cb9: error updating replicaset in updateReplicaSet simpleapp-instance-3-58f9d87cb9: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-58f9d87cb9\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:13:51Z" level=error msg="roCtx.reconcile err failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-58f9d87cb9: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-58f9d87cb9\": the object has been modified; please apply your changes to the latest version and try again" generation=171 namespace=simpleapp-ns-3 resourceVersion=9255787 rollout=simpleapp-instance-3
time="2025-05-23T17:13:51Z" level=error msg="rollout syncHandler error: failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-58f9d87cb9: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-58f9d87cb9\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:13:51Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"simpleapp-instance-3-58f9d87cb9-b4xlj\": the object has been modified; please apply your changes to the latest version and try again" generation=171 namespace=simpleapp-ns-3 resourceVersion=9255787 rollout=simpleapp-instance-3
time="2025-05-23T17:13:51Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"simpleapp-instance-3-58f9d87cb9-b4xlj\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:17:37Z" level=info msg="failed to sync ephemeral metadata &PodTemplateMetadata{Labels:map[string]string{role: preview,},Annotations:map[string]string{},} to ReplicaSet simpleapp-instance-3-5ddf795d47: error updating replicaset in updateReplicaSet simpleapp-instance-3-5ddf795d47: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-5ddf795d47\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:17:37Z" level=error msg="roCtx.reconcile err failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-5ddf795d47: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-5ddf795d47\": the object has been modified; please apply your changes to the latest version and try again" generation=172 namespace=simpleapp-ns-3 resourceVersion=9260020 rollout=simpleapp-instance-3
time="2025-05-23T17:17:37Z" level=error msg="rollout syncHandler error: failed to sync ephemeral metadata: error updating replicaset in updateReplicaSet simpleapp-instance-3-5ddf795d47: Operation cannot be fulfilled on replicasets.apps \"simpleapp-instance-3-5ddf795d47\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
time="2025-05-23T17:17:37Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"simpleapp-instance-3-5ddf795d47-72khp\": the object has been modified; please apply your changes to the latest version and try again" generation=172 namespace=simpleapp-ns-3 resourceVersion=9260020 rollout=simpleapp-instance-3
time="2025-05-23T17:17:37Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"simpleapp-instance-3-5ddf795d47-72khp\": the object has been modified; please apply your changes to the latest version and try again" namespace=simpleapp-ns-3 rollout=simpleapp-instance-3
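
The error lines above are all optimistic-concurrency conflicts ("the object has been modified"). A small filter like the following can separate them from the much larger info-level log stream. It is a sketch only: the function names are hypothetical, and the rollout name is assumed to contain no regular-expression metacharacters.

```shell
# Print only error-level controller log lines for one rollout.
# Reads logs on stdin; $1 is the rollout name.
rollout_errors() {
  grep -F 'level=error' | grep -E "rollout=$1([[:space:]]|\$)"
}

# Count lines on stdin that report an update conflict.
count_conflicts() {
  grep -cF 'the object has been modified'
}
```

For example, `kubectl logs -n argo-rollouts deployment/argo-rollouts | rollout_errors simpleapp-instance-3 | count_conflicts` prints how many error-level conflict lines the controller logged for that rollout.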


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
