Context:
There is a new addon for EKS clusters: https://aws.amazon.com/about-aws/whats-new/2024/12/node-health-monitoring-auto-repair-amazon-eks/
One of its tasks is to check whether any network interface is in the DOWN state:
{"level":"info","ts":"2024-12-23T10:04:50Z","msg":"handling export request","source":"networking","condition":{"Reason":"InterfaceNotUp","Message":"Interface \"nodelocaldns\" is not up","Severity":"Fatal","MinOccurrences":0}}
The condition is also shown in the AWS console (screenshot not reproduced here).
I think the heuristic is correct: a network interface being down is an unusual situation that should be reported.
As a workaround, I patched the DaemonSet as follows:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
spec:
  template:
    spec:
      initContainers:
        - name: interface-up
          image: public.ecr.aws/docker/library/alpine:latest
          restartPolicy: Always
          command:
            - /bin/sh
            - -c
            - |
              while :; do
                while :; do
                  ip link set dev nodelocaldns up && break
                  sleep 1
                done
                sleep 30
              done
          resources:
            requests:
              cpu: 10m
              memory: 16Mi
            limits:
              cpu: 10m
              memory: 16Mi
          securityContext:
            capabilities:
              add:
                - NET_ADMIN
This adds a sidecar (an init container with restartPolicy: Always) that sets the interface up. The interface does not actually end up in the UP state; it shows state UNKNOWN instead. However, eks-node-monitoring-agent no longer reports it.
I think the proper fix would be to call LinkSetUp in the AddDummyDevice function.
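To illustrate the proposed change, here is a rough sketch of the "add, then set up" ordering. It is not the project's actual code: `linkOps`, `addDummyDeviceUp`, and `fakeLinks` are hypothetical names made up for this example, and the two interface methods only stand in for the real link-add and LinkSetUp calls (which I assume come from the netlink library used by node-cache). The fake lets the sketch run without root or a real interface.

```go
package main

import (
	"errors"
	"fmt"
)

// linkOps is a hypothetical abstraction over the two link operations
// involved. In the real code these would presumably be the netlink
// library's link-add and LinkSetUp calls.
type linkOps interface {
	LinkAdd(name string) error
	LinkSetUp(name string) error
}

// addDummyDeviceUp creates the dummy interface and then explicitly
// brings it up, so that it is not left in the DOWN state.
func addDummyDeviceUp(ops linkOps, name string) error {
	if err := ops.LinkAdd(name); err != nil {
		return fmt.Errorf("adding dummy device %q: %w", name, err)
	}
	if err := ops.LinkSetUp(name); err != nil {
		return fmt.Errorf("setting device %q up: %w", name, err)
	}
	return nil
}

// fakeLinks is an in-memory stand-in used only to exercise the sketch.
type fakeLinks struct {
	up map[string]bool // interface name -> administratively up
}

func (f *fakeLinks) LinkAdd(name string) error {
	if _, exists := f.up[name]; exists {
		return errors.New("device already exists")
	}
	f.up[name] = false // a freshly added device starts out down
	return nil
}

func (f *fakeLinks) LinkSetUp(name string) error {
	if _, exists := f.up[name]; !exists {
		return errors.New("no such device")
	}
	f.up[name] = true
	return nil
}

func main() {
	links := &fakeLinks{up: map[string]bool{}}
	if err := addDummyDeviceUp(links, "nodelocaldns"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nodelocaldns up:", links.up["nodelocaldns"])
}
```

With this ordering the interface would be brought up by node-cache itself when it creates the device, and the sidecar workaround above should no longer be needed.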