Support tolerateFailuresUntilDeadline for Helm deployments #9809

esabdull opened this issue May 5, 2025 · 0 comments

Could you consider adding support for the tolerateFailuresUntilDeadline field for Helm deployments in Skaffold?

Context
In GKE Autopilot clusters, Helm deployments sometimes fail in Skaffold due to delays caused by node autoscaling. For example, if a node is deleted during a deployment, the associated pod needs to be recreated on a new node. This process can take some time.

Even though Kubernetes eventually recreates the pod and the deployment completes successfully, Skaffold may already have reported the deployment as failed by then.

Why this is needed
Currently, there’s no mechanism for Helm deployments in Skaffold to tolerate temporary scheduling issues. Supporting tolerateFailuresUntilDeadline for Helm, similar to what was introduced for Cloud Run in v2.16.0, would allow Skaffold to keep waiting until the status check deadline before marking the deployment as failed.

This would make deployments in Autopilot environments more resilient and improve reliability in GitHub Actions and other CI/CD setups.
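
To illustrate the behavior being requested, here is a minimal Python sketch of a status check that tolerates transient failures until a deadline. It is not Skaffold's implementation. The function name `status_check`, the `probe` callback, and the `"ok"` / `"pending"` / `"failed"` states are all hypothetical and only model the semantics described above.

```python
import time
from typing import Callable


def status_check(
    probe: Callable[[], str],
    deadline_seconds: float,
    tolerate_failures_until_deadline: bool,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it reports "ok" or the deadline passes.

    `probe` is a hypothetical callback returning "ok", "pending", or "failed".
    With tolerance disabled, the first "failed" result ends the check at once.
    With tolerance enabled, "failed" is treated like "pending" until the
    deadline, which gives a rescheduled pod time to come back up.
    """
    deadline = clock() + deadline_seconds
    while True:
        state = probe()
        if state == "ok":
            return True
        if state == "failed" and not tolerate_failures_until_deadline:
            return False  # fail fast on the first reported failure
        if clock() >= deadline:
            return False  # deadline reached without success
        sleep(poll_interval)
```

In this model, a pod that briefly reports a failure while its node is being replaced and then becomes healthy would pass the check when tolerance is enabled, and fail it immediately when tolerance is disabled.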
