EnableFullEviction for RemovePodsViolatingNodeAffinity #1363
Conversation
Signed-off-by: Jack Francis <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Here's a quick overview of this feature in action (I built an image from this branch and smoke-tested it on a Cluster API (CAPZ) cluster in Azure). Below is a watch stream of pod replicas that have a "foo=bar" nodeAffinity requirement (a sketch of such a Deployment follows the explanation further down):

```
$ k get pods -l run=php-apache -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
php-apache-7674886bb6-89mxt 0/1 Pending 0 10m <none> <none> <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Pending 0 10m <none> <none> <none> <none>
php-apache-7674886bb6-x5b54 0/1 Pending 0 10m <none> <none> <none> <none>
php-apache-7674886bb6-89mxt 0/1 Pending 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 Pending 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Pending 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 ContainerCreating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 1/1 Running 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 1/1 Running 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 1/1 Running 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 1/1 Running 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 1/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-x5b54 1/1 Running 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 1/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Pending 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-x5b54 1/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-slknb 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-x5b54 1/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 1/1 Running 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-slknb 0/1 Pending 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-52dwx 0/1 ContainerCreating 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-cjfpb 1/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-slknb 0/1 ContainerCreating 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-cjfpb 1/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Pending 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 ContainerCreating 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-89mxt 1/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 1/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 1/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 0/1 Terminating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 Terminating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-slknb 0/1 ContainerCreating 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Terminating 0 11m <none> capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 ContainerCreating 0 0s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 0/1 ContainerCreating 0 1s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-89mxt 0/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 0/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-89mxt 0/1 Terminating 0 11m 192.168.91.134 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-cjfpb 0/1 Terminating 0 11m 192.168.91.135 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-8fdtg 1/1 Running 0 1s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-x5b54 0/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-x5b54 0/1 Terminating 0 11m 192.168.91.136 capz-e2e-rqyahs-vmss-mp-0000002 <none> <none>
php-apache-7674886bb6-slknb 1/1 Running 0 2s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-52dwx 1/1 Running 0 2s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-slknb 1/1 Running 0 20s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-slknb 1/1 Terminating 0 20s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-cpzpz 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-52dwx 1/1 Running 0 20s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-cpzpz 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-slknb 1/1 Terminating 0 20s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-52dwx 1/1 Terminating 0 20s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-p27jj 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-8fdtg 1/1 Running 0 20s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-p27jj 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-52dwx 1/1 Terminating 0 20s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 1/1 Terminating 0 20s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-h274v 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-8fdtg 1/1 Terminating 0 20s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-h274v 0/1 Pending 0 0s <none> <none> <none> <none>
php-apache-7674886bb6-8fdtg 1/1 Terminating 0 20s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 1/1 Terminating 0 20s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Terminating 0 20s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Terminating 0 20s <none> capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Terminating 0 21s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Terminating 0 21s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-52dwx 0/1 Terminating 0 21s 192.168.54.135 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Terminating 0 21s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Terminating 0 21s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-8fdtg 0/1 Terminating 0 21s 192.168.54.134 capz-e2e-rqyahs-vmss-mp-0000004 <none> <none>
php-apache-7674886bb6-slknb 1/1 Terminating 0 21s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-slknb 0/1 Terminating 0 21s <none> capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-slknb 0/1 Terminating 0 22s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-slknb 0/1 Terminating 0 22s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
php-apache-7674886bb6-slknb 0/1 Terminating 0 22s 192.168.211.71 capz-e2e-rqyahs-vmss-mp-0000003 <none> <none>
```

What the above watch stream shows is the result of running the following commands on the cluster:

```
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000002 foo=bar
node/capz-e2e-rqyahs-vmss-mp-0000002 labeled
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000003 foo=bar
node/capz-e2e-rqyahs-vmss-mp-0000003 labeled
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000004 foo=bar
node/capz-e2e-rqyahs-vmss-mp-0000004 labeled
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000002 foo-
node/capz-e2e-rqyahs-vmss-mp-0000002 unlabeled
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000003 foo-
node/capz-e2e-rqyahs-vmss-mp-0000003 unlabeled
$ k label nodes capz-e2e-rqyahs-vmss-mp-0000004 foo-
node/capz-e2e-rqyahs-vmss-mp-0000004 unlabeled
```

To explain what we see:

1. With no node labeled `foo=bar`, all three replicas sit in Pending.
2. After the nodes are labeled, the replicas are scheduled (all onto `...vmss-mp-0000002` at first); when that node's label is removed, the descheduler evicts them, and the replacements land on the still-labeled `...vmss-mp-0000003` and `...vmss-mp-0000004`.
3. When the `foo` label is removed from the remaining nodes, the descheduler evicts every replica even though no node in the cluster satisfies the nodeAffinity any longer, so the replacement pods stay Pending:

```
$ k get pods -l run=php-apache -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
php-apache-7674886bb6-cpzpz 0/1 Pending 0 6m15s <none> <none> <none> <none>
php-apache-7674886bb6-h274v 0/1 Pending 0 6m15s <none> <none> <none> <none>
php-apache-7674886bb6-p27jj 0/1 Pending 0 6m15s <none> <none> <none> <none>
```

The outcome from step 3 above is the new behavior that the `EnableFullEviction` option enables: every replica is evicted even though there is no remaining node that satisfies its nodeAffinity, leaving the workload entirely in a non-Running state.
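For reference, here is a minimal sketch of the kind of Deployment used in this smoke test. The actual manifest isn't included in this thread, so the image and the `requiredDuringSchedulingIgnoredDuringExecution` term are assumptions:

```yaml
# Hypothetical Deployment approximating the smoke-test workload above:
# three replicas that may only schedule onto nodes labeled foo=bar.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 3
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: foo
                    operator: In
                    values: ["bar"]
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example  # assumed image
          ports:
            - containerPort: 80
```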
If we were to do something like this, I would rather see it as an option in something like NodeFit, or somewhere more general, since this same logic could apply to other strategies like taints and inter-pod affinity.

I also would give this a more descriptive name, like `EvictWithoutFit` or something that makes it clear that this will likely just put pods into a Pending state. I understand that is the goal, but what is the use case for this? I'm curious what your scenario is such that it's preferable to end up with all the pods stuck Pending. Maybe there is another avenue we could explore that better suits your needs?
@damemi thanks for the detailed feedback. I actually agree that this would be ideal in a more general area; I'll scaffold that up to see how it looks. Here's my use case: in a multi-cluster environment, I'd like to be able to leverage descheduler as a trigger to indicate when a workload no longer has any suitable node to run on according to its requirements (e.g., nodeSelector, taints). More specifically:

And so, the current default descheduler behavior can prevent the above in a simple scenario where a small number of pod replicas are able to fit onto a single node.

So, from a high level, I'd like to be able to leverage descheduler to definitively signal (via all pods in a non-Running state) that a cluster no longer has any suitable nodes to run my workload, so that I can move those workloads to another cluster. A hypothetical example of such a signal check follows.
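For illustration, a check like the following (hypothetical, reusing the `run=php-apache` label from the smoke test above) is the kind of signal a multi-cluster controller or script could key off of:

```
# List replicas that are stuck Pending; if this matches the full replica
# count, no node in this cluster can satisfy the workload's requirements.
$ kubectl get pods -l run=php-apache --field-selector=status.phase=Pending
```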
+1 for making the configuration part of NodeFit, to e.g. disable the check. Disabling the check might be translated into "refresh as many pods as you can, ignoring whether they get re-scheduled to any node". We have various configurable limits on the number of evictions that can be tuned to increase the impact. Another option is to turn NodeFit into a plugin and disable/enable the plugin as needed, providing the requested functionality for free. Anyone could then build a custom NodeFit plugin and define use-case-specific policies. A sketch of how that toggle might look today follows.
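For context, here is a minimal sketch of how the fit check can be toggled via the `DefaultEvictor` plugin args in the v1alpha2 policy API (field names per the descheduler docs; treat the exact spelling as an assumption):

```yaml
# Hypothetical v1alpha2 policy that disables the NodeFit check, approximating
# "evict regardless of whether the pod fits anywhere else".
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: default
    pluginConfig:
      - name: "DefaultEvictor"
        args:
          nodeFit: false  # skip the "would this pod fit on another node?" check
      - name: "RemovePodsViolatingNodeAffinity"
        args:
          nodeAffinityType:
            - "requiredDuringSchedulingIgnoredDuringExecution"
    plugins:
      deschedule:
        enabled:
          - "RemovePodsViolatingNodeAffinity"
```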
PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed

You can:

- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
@jackfrancis: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed

You can:

- Mark this PR as fresh with `/remove-lifecycle rotten`
- Close this PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed

You can:

- Reopen this PR with `/reopen`
- Mark this PR as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closed this PR.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
This PR adds an `EnableFullEviction` configuration option to the `RemovePodsViolatingNodeAffinity` plugin. The purpose of this feature is to enable eviction of all pod replicas whose declared nodeAffinity configuration no longer matches the node they are currently scheduled onto, *even if there is no other node in the cluster that has a nodeAffinity match*. That last part (in italics) is the change that I'm advocating for here. Enabling it would look like this:
TODO: docs and updated helm chart