kill-host-pods.py: filter pods by node #4819
Starting with k8s 1.32, AuthorizeNodeWithSelectors is enabled by default: https://kubernetes.io/docs/reference/access-authn-authz/node/

If the rbac microk8s addon is enabled, the kube-apiserver will run with "--authorization-mode=RBAC,Node". This means that kubelets (system:node:$node) will no longer be allowed to access pods that reside on other nodes.

For this reason, the "kill-host-pods.py" script is now getting access denied errors:

Error from server (Forbidden): pods is forbidden: User "system:node:myhostname" cannot list resource "pods" in API group "" at the cluster scope: can only list/watch pods with spec.nodeName field selector

As suggested by the error message, we'll solve it by filtering pods by the node name.

Fixes: #4802
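For illustration only, here is a minimal sketch of what "filtering pods by the node name" can look like when the pod list is fetched through kubectl. The helper name `build_list_pods_cmd` and the plain `kubectl` binary are assumptions made for this example, not the code in kill-host-pods.py; the `--field-selector spec.nodeName=...` flag is the selector that the error message asks for.

```python
from typing import List


def build_list_pods_cmd(node_name: str, kubectl: str = "kubectl") -> List[str]:
    """Build a kubectl argv that lists only the pods scheduled on node_name.

    Hypothetical helper for illustration; the real script may call a
    different kubectl wrapper and pass different output options.
    """
    if not node_name:
        raise ValueError("node_name must not be empty")
    return [
        kubectl,
        "get",
        "pods",
        "--all-namespaces",
        # Restrict the request to pods bound to this node, which is what the
        # node authorizer requires from a system:node:<name> identity.
        "--field-selector",
        f"spec.nodeName={node_name}",
        "-o",
        "json",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no
    # dependency on a live cluster.
    print(" ".join(build_list_pods_cmd("myhostname")))
```

The argv could then be handed to `subprocess.check_output`. Without the field selector, the same request is what triggers the Forbidden error quoted above.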
LGTM overall, I'll run some manual tests and merge if all is well
Thanks! fwiw, I used the following to trigger a pod cleanup:
LGTM
I just ran into the linked issue on upgrading to 1.32 - this PR is a sensible fix, but I wasn't aware that it is necessary to manually sync
@ianroberts Thanks for bringing this up. It seems to be by design: the default hooks are copied over only for fresh installations, probably so that custom hooks are not overridden (lines 223 to 227 in 636c313).
We have the following options:
Another PR fixed one of the default hooks [1], however we're not copying over the hooks from $SNAP/default-hooks to $SNAP/hooks when refreshing existing snap installations [2].

This is probably by design so that we don't override user defined hooks. However, it doesn't seem to be documented anywhere.

As suggested by the team, this patch will copy over the reconcile.d/10-pods-restart hook.

Downsides:
* copying just one of the hooks seems a bit unintuitive and inconsistent
* we may override user hooks, which can be unexpected

Alternatives:
* document the fact that these hooks are not refreshed automatically and that users can/should copy over the default hooks
* always refresh all hooks

[1] #4819
[2] #4819 (comment)
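As a rough sketch of the approach described in that commit message (copying a single default hook over an existing one), the logic might look like the following. This is not the actual patch: the function name `refresh_hook` and its parameters are hypothetical, and the real change presumably lives in the snap's own refresh scripts rather than in Python.

```python
import shutil
from pathlib import Path


def refresh_hook(default_hooks: Path, hooks: Path, relative_path: str) -> Path:
    """Copy one hook from the default-hooks tree into the active hooks tree.

    Hypothetical helper for illustration. It overwrites any existing file at
    the target, which is exactly the "we may override user hooks" downside
    mentioned above.
    """
    source = default_hooks / relative_path
    target = hooks / relative_path
    # Make sure the sub-directory (for example reconcile.d) exists.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file mode, so an executable hook stays executable.
    shutil.copy2(source, target)
    return target
```

A call such as `refresh_hook(Path("default-hooks"), Path("hooks"), "reconcile.d/10-pods-restart")` would refresh only that one hook and leave every other file under the hooks directory untouched.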
@petrutlucian94 if someone can merge #4473 then that will at least mitigate the issue as it fixes the kill script itself rather than just the hook that calls the kill script.
@ianroberts #4473 seems to fix a slightly different issue. Without the
Anyway, I've retriggered the CI jobs on your PR, it should pass now. Once we have the green light from the CI, it should be ready to merge.