Description
Around the time EKS 1.23 was released, we started noticing that EKS scales its kube-apiservers out and in more aggressively, and that they are replaced more frequently. We also noticed that when a kube-apiserver is removed (it no longer appears in the output of kubectl get endpoints kubernetes -n default), it does not close its existing connections. Instead, whenever a client makes a request to the removed kube-apiserver over an existing connection, the kube-apiserver returns a 401 Unauthorized. This seems to happen every time a kube-apiserver is scaled down. A 401 Unauthorized does not necessarily prompt an application to re-establish its connection to the kube-apiserver. Instead, the application might conclude that certain API resources are not available and act accordingly. This happens, for example, with the latest release of cilium.
I believe that whenever a kube-apiserver is removed as an endpoint, it should also immediately close all of its client connections, forcing the clients to establish a new connection.
Related issue: cilium/cilium#20915