Interface "nodelocaldns" is not up #666

Open
dex4er opened this issue Dec 23, 2024 · 2 comments

dex4er commented Dec 23, 2024

Context:

There is a new addon for EKS clusters: https://aws.amazon.com/about-aws/whats-new/2024/12/node-health-monitoring-auto-repair-amazon-eks/

One of its tasks is to check whether any network interface is in the DOWN state:

{"level":"info","ts":"2024-12-23T10:04:50Z","msg":"handling export request","source":"networking","condition":{"Reason":"InterfaceNotUp","Message":"Interface \"nodelocaldns\" is not up","Severity":"Fatal","MinOccurrences":0}}
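
To spot this condition in the agent's logs programmatically, something like the following Go sketch could work. It decodes a line of the shape shown above. The JSON field names are taken from that log line; the type and function names (`LogEntry`, `Condition`, `parseLogLine`, `isInterfaceNotUp`) are made up for illustration and are not part of any agent API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition mirrors the "condition" object in the log line (hypothetical type).
type Condition struct {
	Reason         string `json:"Reason"`
	Message        string `json:"Message"`
	Severity       string `json:"Severity"`
	MinOccurrences int    `json:"MinOccurrences"`
}

// LogEntry mirrors the top-level fields of the log line (hypothetical type).
type LogEntry struct {
	Level     string    `json:"level"`
	Timestamp string    `json:"ts"`
	Msg       string    `json:"msg"`
	Source    string    `json:"source"`
	Condition Condition `json:"condition"`
}

// parseLogLine decodes one JSON log line into a LogEntry.
func parseLogLine(line string) (LogEntry, error) {
	var e LogEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

// isInterfaceNotUp reports whether the entry is a networking
// condition with reason InterfaceNotUp.
func isInterfaceNotUp(e LogEntry) bool {
	return e.Source == "networking" && e.Condition.Reason == "InterfaceNotUp"
}

func main() {
	line := `{"level":"info","ts":"2024-12-23T10:04:50Z","msg":"handling export request","source":"networking","condition":{"Reason":"InterfaceNotUp","Message":"Interface \"nodelocaldns\" is not up","Severity":"Fatal","MinOccurrences":0}}`

	e, err := parseLogLine(line)
	if err != nil {
		fmt.Println("cannot parse log line:", err)
		return
	}
	if isInterfaceNotUp(e) {
		fmt.Printf("%s: %s\n", e.Condition.Severity, e.Condition.Message)
	}
}
```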

It is also reported in the AWS console:

[3 screenshots of the AWS console]

I think the heuristic is correct: a network interface being down is an unusual situation that should be reported.

As a workaround, I patched the DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
spec:
  template:
    spec:
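      initContainers:
        # Native sidecar: an init container with restartPolicy: Always
        # keeps running alongside the main containers.
        - name: interface-up
          image: public.ecr.aws/docker/library/alpine:latest
          restartPolicy: Always
          command:
            - /bin/sh
            - -c
            - |
              # Every 30 seconds, (re)assert that the interface is up.
              while :; do
                # Retry once per second until the link can be set up,
                # e.g. while the interface does not exist yet.
                while :; do
                  ip link set dev nodelocaldns up && break
                  sleep 1
                done
                sleep 30
              done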
          resources:
            requests:
              cpu: 10m
              memory: 16Mi
            limits:
              cpu: 10m
              memory: 16Mi
          securityContext:
            capabilities:
              add:
                - NET_ADMIN

This sidecar brings the interface up. The interface then does not actually show state UP but UNKNOWN; however, eks-node-monitoring-agent no longer reports it.

I think the proper fix would be to call LinkSetUp in the AddDummyDevice function.
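
Roughly, the idea is: create the dummy link, then bring it up before returning. The Go sketch below only shows the control flow. I have not copied the real AddDummyDevice; the `linkManager` interface and the in-memory `fakeLinks` type are hypothetical stand-ins for the netlink handle so that the example runs without root privileges, and the real signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// linkManager is a hypothetical stand-in for the netlink operations
// the real code would call.
type linkManager interface {
	LinkAdd(name string) error
	LinkSetUp(name string) error
}

// addDummyDevice creates the dummy interface and then brings it up,
// which is the change proposed in this issue.
func addDummyDevice(lm linkManager, name string) error {
	if err := lm.LinkAdd(name); err != nil {
		return fmt.Errorf("adding %q: %w", name, err)
	}
	if err := lm.LinkSetUp(name); err != nil {
		return fmt.Errorf("setting %q up: %w", name, err)
	}
	return nil
}

// fakeLinks is an in-memory linkManager used only for demonstration.
type fakeLinks struct {
	up map[string]bool // link name -> administratively up
}

func (f *fakeLinks) LinkAdd(name string) error {
	if _, ok := f.up[name]; ok {
		return errors.New("link already exists")
	}
	f.up[name] = false // a newly created link starts out down
	return nil
}

func (f *fakeLinks) LinkSetUp(name string) error {
	if _, ok := f.up[name]; !ok {
		return errors.New("no such link")
	}
	f.up[name] = true
	return nil
}

func main() {
	links := &fakeLinks{up: map[string]bool{}}
	if err := addDummyDevice(links, "nodelocaldns"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nodelocaldns up:", links.up["nodelocaldns"])
}
```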

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 23, 2025

dex4er commented Mar 23, 2025

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 23, 2025