With cri-dockerd 0.2.4, K3s with docker via cri-dockerd on Ubuntu 22.04 fails even if --network-plugin=cni is configured #104

Closed
adelton opened this issue Aug 5, 2022 · 2 comments · Fixed by #103

adelton commented Aug 5, 2022

Up until 0.2.3, I was able to run K3s with docker on Ubuntu 22.04 using

apt update
apt remove -y moby-engine moby-containerd moby-runc   # needed on GitHub Action environment to remove conflicts with docker.io
apt install -y docker.io jq
curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/latest | jq -r '.assets[].browser_download_url' | grep jammy_amd64.deb | tee /dev/stderr | xargs curl -LO
apt install -y ./cri-dockerd_*.deb
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock --kubelet-arg=cgroup-driver=systemd
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
( set +x ; while true ; do if kubectl get nodes | tee /dev/stderr | grep -q '\bReady\b' ; then break ; else sleep 5 ; fi ; done )

The K3s cluster quickly stabilizes in the Ready state:

NAME                  STATUS     ROLES                  AGE   VERSION
docker.example.test   NotReady   control-plane,master   1s    v1.24.3+k3s1
NAME                  STATUS     ROLES                  AGE   VERSION
docker.example.test   NotReady   control-plane,master   6s    v1.24.3+k3s1
NAME                  STATUS   ROLES                  AGE   VERSION
docker.example.test   Ready    control-plane,master   11s   v1.24.3+k3s1

It still works today when I replace the curl selecting the cri-dockerd release with the last-working 0.2.3 tag:

curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/tags/v0.2.3 | jq -r '.assets[].browser_download_url' | grep jammy_amd64.deb | tee /dev/stderr | xargs curl -LO

Alas, with https://github.com/Mirantis/cri-dockerd/releases/download/v0.2.4/cri-dockerd_0.2.4.3-0.ubuntu-jammy_amd64.deb the K3s cluster never gets to the Ready state.

#93 and #99 indicate that with 0.2.4, --network-plugin=cni is needed to restore the pre-0.2.4 behaviour. So I changed the steps to add that option:

[...]
curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/latest | jq -r '.assets[].browser_download_url' | grep jammy_amd64.deb | tee /dev/stderr | xargs curl -LO
apt install -y ./cri-dockerd_*.deb

# override ExecStart, add the --network-plugin=cni option
mkdir /etc/systemd/system/cri-docker.service.d
( echo '[Service]' ; echo 'ExecStart=' ; sed 's/ExecStart=.*/& --network-plugin=cni/;t;d' /lib/systemd/system/cri-docker.service ) | sudo tee /etc/systemd/system/cri-docker.service.d/network-plugin.conf
systemctl daemon-reload
systemctl restart cri-docker

curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock --kubelet-arg=cgroup-driver=systemd
[...]

However, the K3s cluster never gets to the Ready state. The journal has a stream of

k3s[2975]: E0805 08:24:09.548625    2975 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
k3s[2975]: E0805 08:24:14.569182    2975 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
k3s[2975]: E0805 08:24:19.589066    2975 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

messages.
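
Since the readiness loop in the steps above has no timeout, a failing run like this one polls forever. A minimal sketch of a bounded variant follows; `wait_for_ready` and `fake_nodes` are hypothetical names introduced here for illustration, and the stand-in function replaces `kubectl get nodes` so the sketch runs without a cluster.

```shell
# wait_for_ready (hypothetical helper): poll a command that prints
# `kubectl get nodes`-style output until a node reports Ready, or give up.
wait_for_ready() {
    max_attempts=$1; delay=$2; shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -w matches Ready only as a whole word, so NotReady does not count
        if "$@" 2>/dev/null | grep -qw 'Ready'; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in for `kubectl get nodes`, mimicking the stuck NotReady output.
fake_nodes() {
    echo 'docker.example.test   NotReady   control-plane,master   6s    v1.24.3+k3s1'
}

if wait_for_ready 3 0 fake_nodes; then
    result=ready
else
    result=timeout
fi
echo "$result"
```

On a real host this could be called as `wait_for_ready 60 5 kubectl get nodes`, and on failure the journal could be searched for the `cni config uninitialized` message shown above.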

cri-dockerd seems to be running with the expected option:

# systemctl status cri-docker --lines=0
● cri-docker.service - CRI Interface for Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/cri-docker.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/cri-docker.service.d
             └─network-plugin.conf
     Active: active (running) since Fri 2022-08-05 08:23:13 UTC; 5min ago
TriggeredBy: ● cri-docker.socket
       Docs: https://docs.mirantis.com
   Main PID: 2803 (cri-dockerd)
      Tasks: 9
     Memory: 13.5M
        CPU: 2.014s
     CGroup: /system.slice/cri-docker.service
             └─2803 /usr/bin/cri-dockerd --container-runtime-endpoint fd:// --network-plugin=cni

What else besides --network-plugin=cni needs to be changed for 0.2.4 to make it work as well as 0.2.3 did?

Author

adelton commented Aug 5, 2022

Ahh, so it seems that the correct parameter to get the 0.2.3 behaviour that works with K3s is actually no network-plugin value, meaning --network-plugin=.
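
As a sketch of what that change could look like, the drop-in from the earlier steps can be generated with an empty value instead of `cni`. `make_dropin` is a hypothetical helper name, and the sample unit file is a stand-in built from the ExecStart line visible in the `systemctl status` output, so the dry run touches nothing on the system.

```shell
# make_dropin (hypothetical helper): print a systemd drop-in that re-declares
# ExecStart from the given unit file with an empty --network-plugin= appended.
make_dropin() {
    unit_file=$1
    echo '[Service]'
    echo 'ExecStart='    # clear the packaged ExecStart before redefining it
    # print only the ExecStart line, with the extra option appended
    sed -n 's/^ExecStart=.*/& --network-plugin=/p' "$unit_file"
}

# Dry run against a stand-in unit file.
sample=$(mktemp)
printf '[Service]\nExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://\n' > "$sample"
dropin=$(make_dropin "$sample")
printf '%s\n' "$dropin"
rm -f "$sample"

# On a real host (as root), the equivalent steps would be roughly:
#   mkdir -p /etc/systemd/system/cri-docker.service.d
#   make_dropin /lib/systemd/system/cri-docker.service \
#       > /etc/systemd/system/cri-docker.service.d/network-plugin.conf
#   systemctl daemon-reload && systemctl restart cri-docker
```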

Contributor

evol262 commented Aug 5, 2022

> Ahh, so it seems that the correct parameter to get the 0.2.3 behaviour that works with K3s is actually no network-plugin value, meaning --network-plugin=.

Yeah, not intuitive. I just mentioned that in a comment, but your experimenting/grepping beat me there. I added an additional arg to explicitly select no-op to hopefully make your script look a little nicer.

I'd certainly expect k3s to use a CNI, though. Doesn't it use Flannel by default? I do see this, though, which is... interesting. Separately (thanks Google), I came across this, which does make it look like they may be somewhat deterministic now, as long as it's k3s and not k3os, maybe. Do these exist on your system? If so, they're definitely not the default paths, but either symlinking them or using these as args for --cni-bin-dir/etc may resolve without using no-op:

| Key | type | path | Description |
|---|---|---|---|
| cni.paths.bin | string | "/var/lib/rancher/k3s/data/current/bin" | CNI plugin binaries folder for k3s. Change to /opt/cni/bin for non k3s |
| cni.paths.config | string | "/var/lib/rancher/k3s/agent/etc/cni/net.d" | CNI config folder for k3s. Change to /etc/cni/net.d for non k3s |
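
A rough sketch of the check suggested above: see whether the two K3s paths exist, then assemble the options that could be appended to the cri-dockerd ExecStart line. Only `--cni-bin-dir` is named in this thread; `--cni-conf-dir` is assumed here to be its config-directory counterpart and should be confirmed against `cri-dockerd --help` before use.

```shell
# Paths copied from the table above; whether they exist depends on the K3s install.
K3S_CNI_BIN=/var/lib/rancher/k3s/data/current/bin
K3S_CNI_CONF=/var/lib/rancher/k3s/agent/etc/cni/net.d

# Report whether each directory is present on this host.
for dir in "$K3S_CNI_BIN" "$K3S_CNI_CONF"; do
    if [ -d "$dir" ]; then
        echo "found:   $dir"
    else
        echo "missing: $dir"
    fi
done

# Candidate options for the cri-dockerd command line.
# --cni-conf-dir is an assumption (not named in the thread).
cni_args="--network-plugin=cni --cni-bin-dir=$K3S_CNI_BIN --cni-conf-dir=$K3S_CNI_CONF"
echo "$cni_args"
```

If both directories are reported as found, these options could replace the bare `--network-plugin=cni` in the drop-in instead of falling back to an empty network plugin.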
