Rollouts cycle into degraded state during blue-green pause #3843

Closed
2 tasks done
miles-w-3 opened this issue Sep 20, 2024 · 2 comments · Fixed by #3845
Labels
bug Something isn't working

Comments

@miles-w-3
Contributor

miles-w-3 commented Sep 20, 2024

Checklist:

  • I've included steps to reproduce the bug.
  • I've included the version of argo rollouts.

Describe the bug
When a Rollout using the blue-green deployment strategy is left paused for 15+ minutes, it periodically flips into a Degraded state and is then set back to Paused.

To Reproduce
Trigger a blue-green preview for your rollout, then leave it paused. Eventually, you will see events cycling it between Paused and Degraded.

Expected behavior
The rollout remains in a consistent paused state until it is resumed or aborted.

Version
First discovered on 2.32.2; still reproducible on master.

Logs
The logs below show the rollout switching to the Degraded phase (ProgressDeadlineExceeded), then switching back to Paused (BlueGreenPause) on the next sync:

INFO[0917] Processing completed                          resource=default/fish
INFO[0917] Patched: {"status":{"conditions":[{"lastTransitionTime":"2024-09-11T05:30:19Z","lastUpdateTime":"2024-09-11T05:30:19Z","message":"Rollout has minimum availability","reason":"AvailableReason","status":"True","type":"Available"},{"lastTransitionTime":"2024-09-20T13:24:36Z","lastUpdateTime":"2024-09-20T13:24:36Z","message":"Rollout is not healthy","reason":"RolloutHealthy","status":"False","type":"Healthy"},{"lastTransitionTime":"2024-09-20T13:24:36Z","lastUpdateTime":"2024-09-20T13:24:36Z","message":"RolloutCompleted","reason":"RolloutCompleted","status":"False","type":"Completed"},{"lastTransitionTime":"2024-09-20T13:24:37Z","lastUpdateTime":"2024-09-20T13:24:37Z","message":"Rollout is paused","reason":"RolloutPaused","status":"True","type":"Paused"},{"lastTransitionTime":"2024-09-20T19:08:33Z","lastUpdateTime":"2024-09-20T19:08:33Z","message":"ReplicaSet \"fish-79bfcd94f7\" has timed out progressing.","reason":"ProgressDeadlineExceeded","status":"False","type":"Progressing"}],"message":"ProgressDeadlineExceeded: ReplicaSet \"fish-79bfcd94f7\" has timed out progressing.","phase":"Degraded"}}  generation=3 namespace=default resourceVersion=64792278 rollout=fish
INFO[0917] persisted to informer                         generation=3 namespace=default resourceVersion=64799182 rollout=fish
INFO[0917] Reconciliation completed                      generation=3 namespace=default resourceVersion=64792278 rollout=fish time_ms=97.999375
INFO[0917] Started syncing rollout                       generation=3 namespace=default resourceVersion=64799182 rollout=fish
INFO[0917] invalidated cache for resource in namespace: argo-rollouts with the name: argo-rollouts-notification-configmap
INFO[0917] Patched conditions: {"status":{"conditions":[{"lastTransitionTime":"2024-09-11T05:30:19Z","lastUpdateTime":"2024-09-11T05:30:19Z","message":"Rollout has minimum availability","reason":"AvailableReason","status":"True","type":"Available"},{"lastTransitionTime":"2024-09-20T13:24:36Z","lastUpdateTime":"2024-09-20T13:24:36Z","message":"Rollout is not healthy","reason":"RolloutHealthy","status":"False","type":"Healthy"},{"lastTransitionTime":"2024-09-20T13:24:36Z","lastUpdateTime":"2024-09-20T13:24:36Z","message":"RolloutCompleted","reason":"RolloutCompleted","status":"False","type":"Completed"},{"lastTransitionTime":"2024-09-20T13:24:37Z","lastUpdateTime":"2024-09-20T13:24:37Z","message":"Rollout is paused","reason":"RolloutPaused","status":"True","type":"Paused"},{"lastTransitionTime":"2024-09-20T19:08:33Z","lastUpdateTime":"2024-09-20T19:08:33Z","message":"Rollout is paused","reason":"RolloutPaused","status":"Unknown","type":"Progressing"}],"message":"BlueGreenPause","phase":"Paused"}}  generation=3 namespace=default resourceVersion=64799182 rollout=fish
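
For anyone who wants to check their own controller logs for the same symptom, here is a minimal sketch in Go. It assumes only the JSON shape visible in the patches above (`status.conditions[].type/status/reason`); the type and function names (`condition`, `patch`, `pausedButTimedOut`) are made up for illustration and are not part of Argo Rollouts.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition mirrors only the fields of a status condition used here.
// Hypothetical helper type, not the Argo Rollouts API type.
type condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
	Reason string `json:"reason"`
}

// patch mirrors the shape of the "Patched: {...}" payload in the logs.
type patch struct {
	Status struct {
		Conditions []condition `json:"conditions"`
		Phase      string      `json:"phase"`
	} `json:"status"`
}

// pausedButTimedOut reports whether a patch shows the contradictory state
// from this issue: Paused=True together with a Progressing condition whose
// reason is ProgressDeadlineExceeded.
func pausedButTimedOut(raw string) (bool, error) {
	var p patch
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return false, err
	}
	paused, exceeded := false, false
	for _, c := range p.Status.Conditions {
		switch {
		case c.Type == "Paused" && c.Status == "True":
			paused = true
		case c.Type == "Progressing" && c.Reason == "ProgressDeadlineExceeded":
			exceeded = true
		}
	}
	return paused && exceeded, nil
}

func main() {
	// Trimmed-down version of the first patch in the logs above.
	raw := `{"status":{"conditions":[` +
		`{"reason":"RolloutPaused","status":"True","type":"Paused"},` +
		`{"reason":"ProgressDeadlineExceeded","status":"False","type":"Progressing"}],` +
		`"phase":"Degraded"}}`
	bad, err := pausedButTimedOut(raw)
	fmt.Println(bad, err)
}
```

Running this over the first patch reports the contradiction; the second patch (Progressing reason `RolloutPaused`) does not.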

I believe this is a bug: the logic that excludes paused rollouts from the progress-deadline timeout only checks for canary pauses, not blue-green pauses. I will try to add logic that also checks for a blue-green pause.


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@miles-w-3 miles-w-3 added the bug Something isn't working label Sep 20, 2024
@ipeacocks

This is arguably a feature rather than a bug. You can increase progressDeadlineSeconds, which defaults to 10 minutes.

@miles-w-3
Contributor Author

The progressDeadlineSeconds timer is not supposed to keep counting while the Rollout is paused, according to the spec here:

  # The maximum time in seconds in which a rollout must make progress during
  # an update, before it is considered to be failed. Argo Rollouts will
  # continue to process failed rollouts and a condition with a
  # ProgressDeadlineExceeded reason will be surfaced in the rollout status.
  # Note that progress will not be estimated during the time a rollout is
  # paused.
  # Defaults to 600s
  progressDeadlineSeconds: 600

meeech pushed a commit to CircleCI-Public/argo-rollouts that referenced this issue Feb 10, 2025

…Fixes argoproj#3843 (argoproj#3845)

add check for overall pause condition to indefinite step

Signed-off-by: Miles <[email protected]>
Co-authored-by: Miles <[email protected]>
tperdue321 pushed a commit to tperdue321/argo-rollouts that referenced this issue Mar 28, 2025

…Fixes argoproj#3843 (argoproj#3845)

add check for overall pause condition to indefinite step

Signed-off-by: Miles <[email protected]>
Co-authored-by: Miles <[email protected]>