Could you consider adding support for the tolerateFailuresUntilDeadline field for Helm deployments in Skaffold?
Context
In GKE Autopilot clusters, Helm deployments sometimes fail in Skaffold due to delays caused by node autoscaling. For example, if a node is deleted during a deployment, the associated pod needs to be recreated on a new node. This process can take some time.
Kubernetes eventually recreates the pod and the deployment completes successfully, but by then Skaffold may have already reported the deployment as failed.
Why this is needed
Currently, there’s no mechanism for Helm deployments in Skaffold to tolerate temporary scheduling issues. Supporting tolerateFailuresUntilDeadline for Helm, similar to what was introduced for Cloud Run in v2.16.0, would allow Skaffold to keep waiting until the status-check deadline before marking the deployment as failed.
This would make deployments in Autopilot environments more resilient and improve reliability in GitHub Actions and other CI/CD setups.