ContainerID.scope/cgroup-procs: no such file or directory #4620
Usually the scope has succeeded if there are no more processes left in it. So, if your failed scope was also running
Perhaps I didn't explain clearly enough. "The scope succeeded" indeed means that the container's primary process has completed execution, which implies that the container should exit. However, I am puzzled because the container remains in a running state and I can still execute the exec command.
Does
Yes, these are the records of my previous operations. I'm not sure if you can see the images I provided. crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope succeeded, but the container is still running and the crio-conmon-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope unit still exists.
Jan 09 10:07:45 ceasphere23-node-3 systemd[1]: Started crio-conmon-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope.
Jan 09 10:07:46 ceasphere23-node-3 conmon[330022]: conmon 0644511bc2734f5ff9f9 <ninfo>: addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Jan 09 10:07:46 ceasphere23-node-3 conmon[330022]: conmon 0644511bc2734f5ff9f9 <ninfo>: terminal_ctrl_fd: 12
Jan 09 10:07:46 ceasphere23-node-3 conmon[330022]: conmon 0644511bc2734f5ff9f9 <ninfo>: winsz read side: 16, winsz write side: 16
Jan 09 10:08:10 ceasphere23-node-3 systemd[1]: Started libcontainer container 0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.
Jan 09 10:08:13 ceasphere23-node-3 crio[27514]: time="2025-01-09 10:08:13.434244935+08:00" level=info msg="Created container 0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118: ccos-monitoring/prometheus-agent-0-0/prometheus" id=ddb88684-9f80-4cc8-a910-c5044be1a01c name=/runtime.v1.RuntimeService/CreateContainer
Jan 09 10:08:13 ceasphere23-node-3 hyperkube[57677]: I0109 10:08:13.434460 57677 remote_runtime.go:446] "[RemoteRuntimeService] CreateContainer" podSandboxID="7237e359fec361cef9d8adb215defbace399f033eb6ac56a942197e1bc02fb35" containerID="0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118"
Jan 09 10:08:13 ceasphere23-node-3 hyperkube[57677]: I0109 10:08:13.434532 57677 remote_runtime.go:459] "[RemoteRuntimeService] StartContainer" containerID="0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118" timeout="2m0s"
Jan 09 10:08:13 ceasphere23-node-3 crio[27514]: time="2025-01-09 10:08:13.434722361+08:00" level=info msg="Starting container: 0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118" id=e432b000-79cc-4498-a08f-b0e3fe21e658 name=/runtime.v1.RuntimeService/StartContainer
Jan 09 10:08:13 ceasphere23-node-3 crio[27514]: time="2025-01-09 10:08:13.469891032+08:00" level=info msg="Started container" PID=364493 containerID=0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118 description=ccos-monitoring/prometheus-agent-0-0/prometheus id=e432b000-79cc-4498-a08f-b0e3fe21e658 name=/runtime.v1.RuntimeService/StartContainer sandboxID=7237e359fec361cef9d8adb215defbace399f033eb6ac56a942197e1bc02fb35
Jan 09 10:08:13 ceasphere23-node-3 hyperkube[57677]: I0109 10:08:13.487397 57677 remote_runtime.go:477] "[RemoteRuntimeService] StartContainer Response" containerID="0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118"
Jan 09 10:08:13 ceasphere23-node-3 hyperkube[57677]: I0109 10:08:13.676730 57677 kubelet.go:2250] "SyncLoop (PLEG): event for pod" pod="ccos-monitoring/prometheus-agent-0-0" event=&{ID:f921ae10-59ef-4825-ae61-248f1989c789 Type:ContainerStarted Data:0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118}
Jan 09 10:08:16 ceasphere23-node-3 systemd[1]: crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope: Succeeded.
Jan 09 10:08:16 ceasphere23-node-3 systemd[1]: crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope: Consumed 4.450s CPU time.
Jan 09 10:08:37 ceasphere23-node-3 hyperkube[57677]: rpc error: code = Unknown desc = command error: time="2025-01-09T10:08:37+08:00" level=error msg="exec failed: unable to start container process: error adding pid 410135 to cgroups: failed to write 410135: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf921ae10_59ef_4825_ae61_248f1989c789.slice/crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope/cgroup.procs: no such file or directory"
I can see the images, but I don't see any runc commands there.
I didn't save those operations, but I did perform them using runc --root /run/runc list and runc --root /run/runc exec containerID. If I encounter this again in the future, I will provide these operation records.
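For reference, when it happens again, a capture along these lines should cover the basics. This is a minimal sketch, assuming the /run/runc root used here; CTR holds a placeholder <ctrID> to be replaced with the failing container's ID:
CTR="<ctrID>"
runc --root /run/runc list | grep "$CTR"
runc --root /run/runc state "$CTR"                      # note the "pid" field
PID=$(runc --root /run/runc state "$CTR" | grep '"pid"' | tr -dc '0-9')
cat /proc/"$PID"/cgroup                                 # per-controller cgroup of the init process
systemctl status "crio-$CTR.scope" "crio-conmon-$CTR.scope"
journalctl | grep "$CTR" | tail -n 50                   # recent journal entries mentioning the container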
Hi~, I hit the problem again:
[root@oss38 ~]# runc --root /run/runc list |grep 32da98c852206
32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6 23440 running /run/containers/storage/overlay-containers/32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6/userdata 2025-02-11T07:37:53.785872954Z root
[root@oss38 ~]# runc --root /run/runc state 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6
{
"ociVersion": "1.0.2-dev",
"id": "32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6",
"pid": 23440,
"status": "running",
"bundle": "/run/containers/storage/overlay-containers/32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6/userdata",
"rootfs": "/var/lib/containers/storage/overlay/cf28b9918ff4109b94712e852849f3f19b26ebed41e3a4f98b835d41f4605158/merged",
"created": "2025-02-11T07:37:53.785872954Z",
"annotations": {
"ccos.io/scc": "storage-scc",
"io.container.manager": "cri-o",
"io.kubernetes.container.hash": "f7f5604c",
"io.kubernetes.container.name": "engine-leader",
"io.kubernetes.container.ports": "[{\"hostPort\":28999,\"containerPort\":28999,\"protocol\":\"TCP\"}]",
"io.kubernetes.container.restartCount": "1",
"io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
"io.kubernetes.container.terminationMessagePolicy": "File",
"io.kubernetes.cri-o.Annotations": "{\"io.kubernetes.container.hash\":\"f7f5604c\",\"io.kubernetes.container.ports\":\"[{\\\"hostPort\\\":28999,\\\"containerPort\\\":28999,\\\"protocol\\\":\\\"TCP\\\"}]\",\"io.kubernetes.container.restartCount\":\"1\",\"io.kubernetes.container.terminationMessagePath\":\"/dev/termination-log\",\"io.kubernetes.container.terminationMessagePolicy\":\"File\",\"io.kubernetes.pod.terminationGracePeriod\":\"30\"}",
"io.kubernetes.cri-o.ContainerID": "32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6",
"io.kubernetes.cri-o.ContainerType": "container",
"io.kubernetes.cri-o.Created": "2025-02-11T15:37:50.537389622+08:00",
"io.kubernetes.cri-o.Image": "c3302c0a53388150404a0e21c89ab52401f6de3ecb22915d090be23aadecae50",
"io.kubernetes.cri-o.ImageName": "image.cestc.cn/ccos-ceastor/engine-leader:CeaStor_3.2.5-6013-20250109223410",
"io.kubernetes.cri-o.ImageRef": "c3302c0a53388150404a0e21c89ab52401f6de3ecb22915d090be23aadecae50",
"io.kubernetes.cri-o.Labels": "{\"io.kubernetes.container.name\":\"engine-leader\",\"io.kubernetes.pod.name\":\"engine-leader-rfff8\",\"io.kubernetes.pod.namespace\":\"product-storage\",\"io.kubernetes.pod.uid\":\"207aaad1-659f-4db8-88e0-46e0cdcc3004\"}",
"io.kubernetes.cri-o.LogPath": "/var/log/pods/product-storage_engine-leader-rfff8_207aaad1-659f-4db8-88e0-46e0cdcc3004/engine-leader/1.log",
"io.kubernetes.cri-o.Metadata": "{\"name\":\"engine-leader\",\"attempt\":1}",
"io.kubernetes.cri-o.MountPoint": "/var/lib/containers/storage/overlay/cf28b9918ff4109b94712e852849f3f19b26ebed41e3a4f98b835d41f4605158/merged",
"io.kubernetes.cri-o.Name": "k8s_engine-leader_engine-leader-rfff8_product-storage_207aaad1-659f-4db8-88e0-46e0cdcc3004_1",
"io.kubernetes.cri-o.ResolvPath": "/run/containers/storage/overlay-containers/690959f6b540f7a6da448aa2b8c951ee07d08fe890be4730d90613ebde665d16/userdata/resolv.conf",
"io.kubernetes.cri-o.SandboxID": "690959f6b540f7a6da448aa2b8c951ee07d08fe890be4730d90613ebde665d16",
"io.kubernetes.cri-o.SandboxName": "k8s_engine-leader-rfff8_product-storage_207aaad1-659f-4db8-88e0-46e0cdcc3004_0",
"io.kubernetes.cri-o.SeccompProfilePath": "",
"io.kubernetes.cri-o.Stdin": "false",
"io.kubernetes.cri-o.StdinOnce": "false",
"io.kubernetes.cri-o.TTY": "false",
"io.kubernetes.cri-o.Volumes": "[{\"container_path\":\"/opt/storage/\",\"host_path\":\"/opt/storage\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/storage/\",\"host_path\":\"/etc/storage\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/root/.ssh\",\"host_path\":\"/root/.ssh\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/run\",\"host_path\":\"/var/lib/kubelet/pods/207aaad1-659f-4db8-88e0-46e0cdcc3004/volumes/kubernetes.io~empty-dir/runpath\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/localtime\",\"host_path\":\"/usr/share/zoneinfo/Asia/Shanghai\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/hosts\",\"host_path\":\"/var/lib/kubelet/pods/207aaad1-659f-4db8-88e0-46e0cdcc3004/etc-hosts\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/dev/termination-log\",\"host_path\":\"/var/lib/kubelet/pods/207aaad1-659f-4db8-88e0-46e0cdcc3004/containers/engine-leader/50dbbca3\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/log/storage\",\"host_path\":\"/var/log/storage\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/spool/cron/\",\"host_path\":\"/var/spool/cron\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/run/secrets/kubernetes.io/serviceaccount\",\"host_path\":\"/var/lib/kubelet/pods/207aaad1-659f-4db8-88e0-46e0cdcc3004/volumes/kubernetes.io~projected/kube-api-access-zb94p\",\"readonly\":true,\"propagation\":0,\"selinux_relabel\":false}]",
"io.kubernetes.pod.name": "engine-leader-rfff8",
"io.kubernetes.pod.namespace": "product-storage",
"io.kubernetes.pod.terminationGracePeriod": "30",
"io.kubernetes.pod.uid": "207aaad1-659f-4db8-88e0-46e0cdcc3004",
"kubernetes.io/config.seen": "2025-02-11T15:37:43.833442052+08:00",
"kubernetes.io/config.source": "api",
"org.systemd.property.After": "['crio.service']",
"org.systemd.property.CollectMode": "'inactive-or-failed'",
"org.systemd.property.DefaultDependencies": "true",
"org.systemd.property.TimeoutStopUSec": "uint64 30000000"
},
"owner": ""
}
[root@oss38 ~]# runc --root /run/runc exec -t 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6 date
ERRO[0000] exec failed: unable to start container process: error adding pid 3146253 to cgroups: failed to write 3146253: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory
[root@oss38 ~]# runc --version
runc version 1.1.4
spec: 1.0.2-dev
go: go1.18.6
libseccomp: 2.5.0
Do you need any additional information?
[root@oss38 ~]# systemctl status crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
● crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
Loaded: loaded (/run/systemd/transient/crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope; transient)
Transient: yes
Active: active (running) since Tue 2025-02-11 15:37:50 CST; 51min ago
Tasks: 2
Memory: 13.1M
CPU: 1.419s
CGroup: /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
├─ 21942 /usr/bin/conmon -b /run/containers/storage/overlay-containers/32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6/userdata -c 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6 --exi>
Feb 11 15:37:50 oss38 systemd[1]: Started crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope.
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: terminal_ctrl_fd: 12
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: winsz read side: 16, winsz write side: 16
[root@oss38 ~]# systemctl status crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
Unit crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope could not be found.
[root@oss38 ~]# ls -alh /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice |grep 32da98c8
drwxr-xr-x 2 root root 0 Feb 11 15:37 crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
Partial logs:
Feb 11 15:37:50 oss38 systemd[1]: Started crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope.
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: terminal_ctrl_fd: 12
Feb 11 15:37:50 oss38 conmon[21942]: conmon 32da98c8522068be05a7 <ninfo>: winsz read side: 16, winsz write side: 16
Feb 11 15:37:53 oss38 systemd[1]: Started libcontainer container 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.
Feb 11 15:37:53 oss38 systemd[1]: crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope: Succeeded.
Feb 11 15:37:53 oss38 crio[4848]: time="2025-02-11 15:37:53.860868436+08:00" level=info msg="Created container 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6: product-storage/engine-leader-rfff8/engine-leader" id=eba59564-b1f0-4eaf-85dc-2bfa8f3b87a8 name=/runtime.v1.RuntimeService/CreateContainer
Feb 11 15:37:53 oss38 hyperkube[5170]: I0211 15:37:53.861581 5170 remote_runtime.go:446] "[RemoteRuntimeService] CreateContainer" podSandboxID="690959f6b540f7a6da448aa2b8c951ee07d08fe890be4730d90613ebde665d16" containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6"
Feb 11 15:37:53 oss38 hyperkube[5170]: I0211 15:37:53.861885 5170 remote_runtime.go:459] "[RemoteRuntimeService] StartContainer" containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" timeout="2m0s"
Feb 11 15:37:53 oss38 crio[4848]: time="2025-02-11 15:37:53.862376310+08:00" level=info msg="Starting container: 32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" id=33c88659-5f4a-454e-9608-32b89e525e62 name=/runtime.v1.RuntimeService/StartContainer
Feb 11 15:37:53 oss38 crio[4848]: time="2025-02-11 15:37:53.904496086+08:00" level=info msg="Started container" PID=23440 containerID=32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6 description=product-storage/engine-leader-rfff8/engine-leader id=33c88659-5f4a-454e-9608-32b89e525e62 name=/runtime.v1.RuntimeService/StartContainer sandboxID=690959f6b540f7a6da448aa2b8c951ee07d08fe890be4730d90613ebde665d16
Feb 11 15:37:53 oss38 hyperkube[5170]: I0211 15:37:53.935656 5170 remote_runtime.go:477] "[RemoteRuntimeService] StartContainer Response" containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6"
Feb 11 15:37:54 oss38 hyperkube[5170]: I0211 15:37:54.234019 5170 kubelet.go:2250] "SyncLoop (PLEG): event for pod" pod="product-storage/engine-leader-rfff8" event=&{ID:207aaad1-659f-4db8-88e0-46e0cdcc3004 Type:ContainerStarted Data:32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6}
Feb 11 15:38:07 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:06+08:00" level=error msg="exec failed: unable to start container process: error adding pid 29095 to cgroups: failed to write 29095: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:07 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:07 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:07+08:00" level=error msg="exec failed: unable to start container process: error adding pid 29253 to cgroups: failed to write 29253: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:07 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:07 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:07+08:00" level=error msg="exec failed: unable to start container process: error adding pid 29390 to cgroups: failed to write 29390: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:07 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:07 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:07+08:00" level=error msg="exec failed: unable to start container process: error adding pid 29390 to cgroups: failed to write 29390: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:16 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:16+08:00" level=error msg="exec failed: unable to start container process: error adding pid 32020 to cgroups: failed to write 32020: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:16 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:16 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:16+08:00" level=error msg="exec failed: unable to start container process: error adding pid 32053 to cgroups: failed to write 32053: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:16 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:17 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:17+08:00" level=error msg="exec failed: unable to start container process: error adding pid 32080 to cgroups: failed to write 32080: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:17 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:17 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:17+08:00" level=error msg="exec failed: unable to start container process: error adding pid 32080 to cgroups: failed to write 32080: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:26 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:26+08:00" level=error msg="exec failed: unable to start container process: error adding pid 35104 to cgroups: failed to write 35104: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:26 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:26 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:26+08:00" level=error msg="exec failed: unable to start container process: error adding pid 35199 to cgroups: failed to write 35199: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:26 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:27 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:27+08:00" level=error msg="exec failed: unable to start container process: error adding pid 35251 to cgroups: failed to write 35251: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:27 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:27 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:27+08:00" level=error msg="exec failed: unable to start container process: error adding pid 35251 to cgroups: failed to write 35251: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:36 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:36+08:00" level=error msg="exec failed: unable to start container process: error adding pid 38806 to cgroups: failed to write 38806: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:36 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:36 oss38 systemd[4947]: run-runc-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6-runc.WfpeCY.mount: Succeeded.
Feb 11 15:38:36 oss38 hyperkube[5170]: rpc error: code = Unknown desc = command error: time="2025-02-11T15:38:36+08:00" level=error msg="exec failed: unable to start container process: error adding pid 38841 to cgroups: failed to write 38841: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice/crio-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope/cgroup.procs: no such file or directory"
Feb 11 15:38:36 oss38 hyperkube[5170]: > containerID="32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6" cmd=[/bin/sh -c /root/leader-service-live.sh]
Feb 11 15:38:36 oss38 systemd[18672]: run-runc-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6-runc.MxUu4Z.mount: Succeeded.
Feb 11 15:38:36 oss38 systemd[4947]: run-runc-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6-runc.MxUu4Z.mount: Succeeded.
Feb 11 15:38:36 oss38 systemd[1]: run-runc-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6-runc.MxUu4Z.mount: Succeeded.
Another container:
[root@oss38 ~]# runc --root /run/runc list |grep 28a39c6a840cc
28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a 23438 running /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata 2025-02-11T07:37:53.728810543Z root
[root@oss38 ~]# runc --root /run/runc exec -t 28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a date
ERRO[0000] exec failed: unable to start container process: error adding pid 2739562 to cgroups: failed to write 2739562: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfd673ed5_bc0a_4ef4_aca6_3083d1ee619f.slice/crio-28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a.scope/cgroup.procs: no such file or directory
[root@oss38 ~]# runc --root /run/runc state 28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a
{
"ociVersion": "1.0.2-dev",
"id": "28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a",
"pid": 23438,
"status": "running",
"bundle": "/run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata",
"rootfs": "/var/lib/containers/storage/overlay/e78205534a801579cc2ebf41848a7f59686806754ead40525d993b1168b0fc3e/merged",
"created": "2025-02-11T07:37:53.728810543Z",
"annotations": {
"ccos.io/scc": "storage-scc",
"io.container.manager": "cri-o",
"io.kubernetes.container.hash": "781c3325",
"io.kubernetes.container.name": "engine-cds-blktarget-nvmeof-cont",
"io.kubernetes.container.ports": "[{\"hostPort\":4420,\"containerPort\":4420,\"protocol\":\"TCP\"}]",
"io.kubernetes.container.preStopHandler": "{\"exec\":{\"command\":[\"bin/sh\",\"-c\",\"/root/nvmeof_pre_stop.sh\"]}}",
"io.kubernetes.container.restartCount": "8",
"io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
"io.kubernetes.container.terminationMessagePolicy": "File",
"io.kubernetes.cri-o.Annotations": "{\"io.kubernetes.container.hash\":\"781c3325\",\"io.kubernetes.container.ports\":\"[{\\\"hostPort\\\":4420,\\\"containerPort\\\":4420,\\\"protocol\\\":\\\"TCP\\\"}]\",\"io.kubernetes.container.preStopHandler\":\"{\\\"exec\\\":{\\\"command\\\":[\\\"bin/sh\\\",\\\"-c\\\",\\\"/root/nvmeof_pre_stop.sh\\\"]}}\",\"io.kubernetes.container.restartCount\":\"8\",\"io.kubernetes.container.terminationMessagePath\":\"/dev/termination-log\",\"io.kubernetes.container.terminationMessagePolicy\":\"File\",\"io.kubernetes.pod.terminationGracePeriod\":\"30\"}",
"io.kubernetes.cri-o.ContainerID": "28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a",
"io.kubernetes.cri-o.ContainerType": "container",
"io.kubernetes.cri-o.Created": "2025-02-11T15:37:50.485499973+08:00",
"io.kubernetes.cri-o.Image": "3498a3ffb2a83fbd11f07f3f16f40cd24320756e7e0f53b97a4a1b895b60507b",
"io.kubernetes.cri-o.ImageName": "image.cestc.cn/ccos-ceastor/engine-cds-blktarget-nvmeof:test-no-check-tag-CeaStor-3.2.5-6013-20250109223410-012302",
"io.kubernetes.cri-o.ImageRef": "3498a3ffb2a83fbd11f07f3f16f40cd24320756e7e0f53b97a4a1b895b60507b",
"io.kubernetes.cri-o.Labels": "{\"io.kubernetes.container.name\":\"engine-cds-blktarget-nvmeof-cont\",\"io.kubernetes.pod.name\":\"engine-cds-blktarget-t6z4q\",\"io.kubernetes.pod.namespace\":\"product-storage\",\"io.kubernetes.pod.uid\":\"fd673ed5-bc0a-4ef4-aca6-3083d1ee619f\"}",
"io.kubernetes.cri-o.LogPath": "/var/log/pods/product-storage_engine-cds-blktarget-t6z4q_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/engine-cds-blktarget-nvmeof-cont/8.log",
"io.kubernetes.cri-o.Metadata": "{\"name\":\"engine-cds-blktarget-nvmeof-cont\",\"attempt\":8}",
"io.kubernetes.cri-o.MountPoint": "/var/lib/containers/storage/overlay/e78205534a801579cc2ebf41848a7f59686806754ead40525d993b1168b0fc3e/merged",
"io.kubernetes.cri-o.Name": "k8s_engine-cds-blktarget-nvmeof-cont_engine-cds-blktarget-t6z4q_product-storage_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f_8",
"io.kubernetes.cri-o.ResolvPath": "/run/containers/storage/overlay-containers/9c7aa779b2d0c0804aa967a527c69deba0978ccf0a218d10087d685257c5b20a/userdata/resolv.conf",
"io.kubernetes.cri-o.SandboxID": "9c7aa779b2d0c0804aa967a527c69deba0978ccf0a218d10087d685257c5b20a",
"io.kubernetes.cri-o.SandboxName": "k8s_engine-cds-blktarget-t6z4q_product-storage_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f_0",
"io.kubernetes.cri-o.SeccompProfilePath": "",
"io.kubernetes.cri-o.Stdin": "false",
"io.kubernetes.cri-o.StdinOnce": "false",
"io.kubernetes.cri-o.TTY": "false",
"io.kubernetes.cri-o.Volumes": "[{\"container_path\":\"/mnt\",\"host_path\":\"/mnt\",\"readonly\":false,\"propagation\":2,\"selinux_relabel\":false},{\"container_path\":\"/tmp\",\"host_path\":\"/tmp\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/dev\",\"host_path\":\"/dev\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/hugepages-2Mi\",\"host_path\":\"/var/lib/kubelet/pods/fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/volumes/kubernetes.io~empty-dir/hugepage-2mi\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/localtime\",\"host_path\":\"/usr/share/zoneinfo/Asia/Shanghai\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/crash\",\"host_path\":\"/var/crash\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/storage/\",\"host_path\":\"/etc/storage/cbd/target\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/dev/shm\",\"host_path\":\"/dev/shm\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/dev/termination-log\",\"host_path\":\"/var/lib/kubelet/pods/fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/containers/engine-cds-blktarget-nvmeof-cont/d106b2d5\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/tmp\",\"host_path\":\"/var/tmp\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/run\",\"host_path\":\"/var/lib/kubelet/pods/fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/volumes/kubernetes.io~empty-dir/runpath\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/hosts\",\"host_path\":\"/var/lib/kubelet/pods/fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/etc-hosts\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/opt/storage/\",\"host_path\":\"/opt/storage\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/etc/cpu_set/\",\"host_path\":\"/etc/storage\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/log/storage\",\"host_path\":\"/var/log/storage/blktarget\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/usr/lib/modules\",\"host_path\":\"/usr/lib/modules\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/sys/fs/cgroup\",\"host_path\":\"/sys/fs/cgroup\",\"readonly\":false,\"propagation\":0,\"selinux_relabel\":false},{\"container_path\":\"/var/run/secrets/kubernetes.io/serviceaccount\",\"host_path\":\"/var/lib/kubelet/pods/fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/volumes/kubernetes.io~projected/kube-api-access-rtsx4\",\"readonly\":true,\"propagation\":0,\"selinux_relabel\":false}]",
"io.kubernetes.pod.name": "engine-cds-blktarget-t6z4q",
"io.kubernetes.pod.namespace": "product-storage",
"io.kubernetes.pod.terminationGracePeriod": "30",
"io.kubernetes.pod.uid": "fd673ed5-bc0a-4ef4-aca6-3083d1ee619f",
"kubernetes.io/config.seen": "2025-02-11T15:37:43.833381092+08:00",
"kubernetes.io/config.source": "api",
"org.systemd.property.After": "['crio.service']",
"org.systemd.property.CollectMode": "'inactive-or-failed'",
"org.systemd.property.DefaultDependencies": "true",
"org.systemd.property.TimeoutStopUSec": "uint64 30000000",
"workload.ccos.io/skip-cpumanager-management": "true"
},
"owner": ""
}
[root@oss38 ~]# ps -ef |grep 28a39c6a840cc
root 21944 1 0 15:37 ? 00:00:00 /usr/bin/conmon -b /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata -c 28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a --exit-dir /var/run/crio/exits -l /var/log/pods/product-storage_engine-cds-blktarget-t6z4q_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/engine-cds-blktarget-nvmeof-cont/8.log --log-level info -n k8s_engine-cds-blktarget-nvmeof-cont_engine-cds-blktarget-t6z4q_product-storage_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f_8 -P /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata -r /usr/bin/runc --runtime-arg --root=/run/runc --socket-dir-path /var/run/crio --syslog -u 28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a -s
root 2753044 3465958 0 18:17 pts/0 00:00:00 grep 28a39c6a840cc
[root@oss38 ~]# pstree -plTS 21944
conmon(21944)───sh(23438,ipc,mnt,pid,uts)───sleep(2761934)
[root@oss38 ~]# ps ufS 21944 23438 2761934
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 21944 0.0 0.0 81308 2428 ? Ssl 15:37 0:00 /usr/bin/conmon -b /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata -c 28a39c6a840cc5f388ac395ae
root 23438 0.2 0.0 4188 3308 ? Ss 15:37 0:27 \_ sh /root/start_nvmeof_tgt.sh
[root@oss38 ~]# strace -p 21944
strace: Process 21944 attached
restart_syscall(<... resuming interrupted restart_syscall ...>
[root@oss38 21944]# cat /proc/21944/stack
[<0>] poll_schedule_timeout.constprop.13+0x42/0x70
[<0>] do_sys_poll+0x3d6/0x590
[<0>] do_restart_poll+0x46/0x80
[<0>] do_syscall_64+0x5f/0x220
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[root@oss38 ~]# lsof -p 21944
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
conmon 21944 root cwd DIR 0,22 180 6904 /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata
conmon 21944 root rtd DIR 8,88 4096 128 /
conmon 21944 root txt REG 8,88 151944 403762523 /usr/bin/conmon
conmon 21944 root mem REG 8,88 149624 19239 /usr/lib64/libgpg-error.so.0.29.0
conmon 21944 root mem REG 8,88 14424 18807 /usr/lib64/libdl-2.28.so
conmon 21944 root mem REG 8,88 604504 18852 /usr/lib64/libpcre2-8.so.0.10.0
conmon 21944 root mem REG 8,88 14232 18393 /usr/lib64/libsecurity.so.0.0.0
conmon 21944 root mem REG 8,88 96224 143 /usr/lib64/libgcc_s-7.3.0-20220207.so.1
conmon 21944 root mem REG 8,88 1191944 19272 /usr/lib64/libgcrypt.so.20.2.6
conmon 21944 root mem REG 8,88 166176 18749 /usr/lib64/libselinux.so.1
conmon 21944 root mem REG 8,88 231800 19811 /usr/lib64/liblz4.so.1.9.2
conmon 21944 root mem REG 8,88 161832 19227 /usr/lib64/liblzma.so.5.2.5
conmon 21944 root mem REG 8,88 39384 18823 /usr/lib64/librt-2.28.so
conmon 21944 root mem REG 8,88 116328 18819 /usr/lib64/libpthread-2.28.so
conmon 21944 root mem REG 8,88 469104 19404 /usr/lib64/libpcre.so.1.2.12
conmon 21944 root mem REG 8,88 1791192 18805 /usr/lib64/libc-2.28.so
conmon 21944 root mem REG 8,88 688520 879794 /usr/lib64/libsystemd.so.0.27.0
conmon 21944 root mem REG 8,88 1229920 872307 /usr/lib64/libglib-2.0.so.0.6600.8
conmon 21944 root mem REG 8,88 26398 268657446 /usr/lib64/gconv/gconv-modules.cache
conmon 21944 root mem REG 8,88 162592 18798 /usr/lib64/ld-2.28.so
conmon 21944 root 0r CHR 1,3 0t0 5 /dev/null
conmon 21944 root 1w CHR 1,3 0t0 5 /dev/null
conmon 21944 root 2w CHR 1,3 0t0 5 /dev/null
conmon 21944 root 3u unix 0x00000000d7f14577 0t0 195617 type=STREAM
conmon 21944 root 4r CHR 1,3 0t0 5 /dev/null
conmon 21944 root 5u a_inode 0,13 0 13347 [eventfd]
conmon 21944 root 6w REG 8,85 87458 33675717 /var/log/pods/product-storage_engine-cds-blktarget-t6z4q_fd673ed5-bc0a-4ef4-aca6-3083d1ee619f/engine-cds-blktarget-nvmeof-cont/8.log
conmon 21944 root 7w CHR 1,3 0t0 5 /dev/null
conmon 21944 root 8r FIFO 0,12 0t0 195007 pipe
conmon 21944 root 10r FIFO 0,12 0t0 195008 pipe
conmon 21944 root 11r REG 0,36 0 30049 /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfd673ed5_bc0a_4ef4_aca6_3083d1ee619f.slice/crio-28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a.scope/memory.oom_control
conmon 21944 root 12r FIFO 0,22 0t0 7122 /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/ctl
conmon 21944 root 13u unix 0x00000000c3298c9e 0t0 195009 type=DGRAM
conmon 21944 root 14u unix 0x00000000c1ec6e2d 0t0 195010 /proc/self/fd/12/attach type=SEQPACKET
conmon 21944 root 15w FIFO 0,22 0t0 7122 /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/ctl
conmon 21944 root 16r FIFO 0,22 0t0 7123 /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/winsz
conmon 21944 root 17w FIFO 0,22 0t0 7123 /run/containers/storage/overlay-containers/28a39c6a840cc5f388ac395aea5f969f91074aea718fd983b2475eb7a1e8fe4a/userdata/winsz
conmon 21944 root 18u a_inode 0,13 0 13347 [eventfd]
conmon 21944 root 19u a_inode 0,13 0 13347 [eventfd]
Could you please help to provide more information, for example:
And I want to know which cgroup path the container's init process is in.
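To answer that last question directly: the init pid is the second column of runc list, and /proc/<pid>/cgroup lists its per-controller paths. A sketch (with "<ctrID>" as a placeholder):
PID=$(runc --root /run/runc list | awk -v id="<ctrID>" '$1 == id {print $2}')
cat /proc/$PID/cgroup   # the name=systemd line is the path systemd tracks for the unit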
The container was restarted due to a probe timeout. Next time, I will provide the config.json file. Regarding issues 2 and 3, the crio-containerID.scope no longer exists. I'm also puzzled: since crio-containerID.scope no longer exists, the container should have exited. I want to understand this issue as well.
# Only the crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope remains
[root@oss38 ~]# ls -alh /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod207aaad1_659f_4db8_88e0_46e0cdcc3004.slice |grep 32da98c8
drwxr-xr-x 2 root root 0 Feb 11 15:37 crio-conmon-32da98c8522068be05a7510a3eddb09cad6ed2f03286180d0e931d0d0b6224a6.scope
Yes, cat /proc/$CONTAINER_INIT_PID/cgroup will be very interesting to take a look at ($CONTAINER_INIT_PID is the second column in the runc list output). Any journalctl entries related to the container ID, too. Also, see if you can reproduce it with a more recent version (runc v1.1.4 is a tad old). So far I see that the container is not gone but somehow systemd thinks it is (perhaps because the container init pid is not in the systemd cgroup). One reason for that might be that the systemd cgroup manager is not always used (and the fs cgroup manager ignores the absence of the systemd cgroup). Note we're not really interested in the parent (conmon) cgroups -- only in the container's cgroups. Also, it would be interesting to see the contents of the container's init (which is a shell script, /root/start_nvmeof_tgt.sh). I see there's also cri-o in the mix; @haircommander maybe you can remember something similar (container is not in systemd cgroup so systemd thinks it's gone).
Another container [root@oss37 ~]# runc --root /run/runc exec -t 5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c date
ERRO[0000] exec failed: unable to start container process: error adding pid 259782 to cgroups: failed to write 259782: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope/cgroup.procs: no such file or directory
[root@oss37 ~]# ls /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
ls: cannot access '/sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope': No such file or directory
[root@oss37 ~]# crictl ps -a |grep 5eed917563583
5eed917563583 722f1e134a0169ff99658855480f5435779b04f5e8c2823f20f7da5366d2b762 13 hours ago Running calico-node 49 753485c65e7e7 calico-node-hx7cw
[root@oss37 ~]# runc --root /run/runc list |grep 5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c
5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c 21689 running /run/containers/storage/overlay-containers/5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c/userdata 2025-02-12T16:13:05.746965462Z root
[root@oss37 ~]# pstree -plTS 21689
runsvdir(21689,ipc,mnt,pid,uts)─┬─runsv(23266)───calico-node(23276)
├─runsv(23267)───calico-node(23277)
├─runsv(23268)───calico-node(23275)
├─runsv(23269)───calico-node(23281)
├─runsv(23270)───bird(24697)
├─runsv(23271)───bird6(24638)
├─runsv(23272)───calico-node(23285)
└─runsv(23273)───calico-node(23283)
[root@oss37 ~]# ps ufS 21689 23266 23267 23268 23269 23270 23271 23272 23273
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 21689 0.0 0.0 2660 968 ? Ss 00:13 0:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root 23266 0.0 0.0 2508 944 ? Ss 00:13 0:00 \_ runsv felix
root 23267 0.0 0.0 2508 944 ? Ss 00:13 0:00 \_ runsv monitor-addresses
root 23268 0.0 0.0 2508 920 ? Ss 00:13 0:00 \_ runsv allocate-tunnel-addrs
root 23269 0.0 0.0 2508 980 ? Ss 00:13 0:00 \_ runsv node-status-reporter
root 23270 0.0 0.0 2508 924 ? Ss 00:13 0:00 \_ runsv bird
root 23271 0.0 0.0 2508 972 ? Ss 00:13 0:00 \_ runsv bird6
root 23272 0.0 0.0 2508 988 ? Ss 00:13 0:00 \_ runsv confd
root 23273 0.0 0.0 2508 948 ? Ss 00:13 0:00 \_ runsv cni
[root@oss37 ~]# cat /proc/21689/cgroup
13:freezer:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
12:cpuset:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
11:files:/
10:hugetlb:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
9:memory:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
8:blkio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
7:devices:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
6:rdma:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
5:net_cls,net_prio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
4:perf_event:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
3:pids:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
2:cpu,cpuacct:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
1:name=systemd:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice/crio-conmon-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
0::/
@lifubang @kolyshkin
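If this is the failure mode, it should be detectable across all running containers. A rough scan sketch (assumes cgroup v1 with the named systemd hierarchy and the /run/runc root):
runc --root /run/runc list | awk 'NR > 1 && $3 == "running" {print $1, $2}' |
while read id pid; do
  sysd=$(grep name=systemd /proc/$pid/cgroup)
  case "$sysd" in
    *crio-conmon-*) echo "MISMATCH $id pid=$pid -> $sysd" ;;   # init pid tracked under the conmon scope
  esac
done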
Part of config.json:
[root@oss37 ~]# cat /var/run/containers/storage/overlay-containers/5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c/userdata/config.json
......
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"cgroupsPath": "kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice:crio:5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c",
"namespaces": [
{
"type": "pid"
},
{
"type": "ipc",
"path": "/var/run/ipcns/404d1246-1ffb-40bc-8d8e-b51f500ef898"
},
{
"type": "uts",
"path": "/var/run/utsns/404d1246-1ffb-40bc-8d8e-b51f500ef898"
},
{
"type": "mount"
}
],
......
Use the systemd-cgls command.
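For example, a sketch ("<ctrID>" is a placeholder; --no-pager just keeps the output greppable):
systemd-cgls --no-pager /kubepods.slice | grep -B2 -A20 "<ctrID>"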
What does this mean? I think it should be crio-<ctrID>.scope.
Yes. For a normal container, the init process's cgroup should be crio-<ctrID>.scope, but here it is actually crio-conmon-<ctrID>.scope; the container's init process has ended up in the crio-conmon cgroup.
v1
So, if you see the content of
Yes, you can see it in the systemd-cgls command output.
This is the output of the systemd-cgls command:
# Abnormal
├─kubepods-burstable-poda5c1f8d5_1919_434d_add7_4afaafdaef89.slice
│ └─crio-conmon-5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c.scope
│ ├─17283 /usr/bin/conmon -b /run/containers/storage/overlay-containers/5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c/userdata -c 5eed917563583925ffdbd40a64efb2aaffe58638e57add8becaf460fe74f5b1c --exit-dir>
│ ├─21689 /usr/local/bin/runsvdir -P /etc/service/enabled
│ ├─23266 runsv felix
│ ├─23267 runsv monitor-addresses
│ ├─23268 runsv allocate-tunnel-addrs
│ ├─23269 runsv node-status-reporter
│ ├─23270 runsv bird
│ ├─23271 runsv bird6
│ ├─23272 runsv confd
│ ├─23273 runsv cni
│ ├─23275 calico-node -allocate-tunnel-addrs
│ ├─23276 calico-node -felix
│ ├─23277 calico-node -monitor-addresses
│ ├─23281 calico-node -status-reporter
│ ├─23283 calico-node -monitor-token
│ ├─23285 calico-node -confd
│ ├─24638 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
│ └─24697 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
So, maybe
Who moved this process from crio-<ctrID>.scope to crio-conmon-<ctrID>.scope?
I'm a bit confused about the relationship between the crio-conmon-.scope and crio-.scope cgroups. These two should be siblings (a parallel relationship). My other doubt is whether crio-conmon-.scope causes crio-.scope to exit, or whether crio-.scope exits first and crio-conmon-.scope takes over the container's init process.
Correct. Sorry, there's a lot to catch up on. When we see the cgroup.procs error, is the corresponding conmon still running? As in, crio-conmon.scope still running despite crio-.scope not? If so, then it may be a conmon issue of missing the container's exit. From the logs above, it looks like conmon may still be running, though that would be very strange. If conmon missed the exit, then it wouldn't notify crio that the container exited, which means crio's state wouldn't be updated, causing the kubelet to continue to do exec probes on the container (and crictl to allow you to run exec on a container whose process is gone).
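One way to test the missed-exit theory next time it reproduces. This is a sketch; the --exit-dir and conmon-pidfile paths are taken from the conmon command line shown earlier, and "<ctrID>" is a placeholder:
ls -l /var/run/crio/exits/ | grep "<ctrID>"    # an exit file here means conmon did record the container's exit
CONMON_PID=$(cat /run/containers/storage/overlay-containers/<ctrID>/userdata/conmon-pidfile)
pstree -plTS "$CONMON_PID"                     # is the container init still a child of conmon?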
Yes, crio-conmon.scope is still running even though crio-.scope is not.
I'm also puzzled by these two issues. Could you help answer them? @haircommander
Hmm, so before runc moves the container process to the correct cgroup, it would be in conmon's cgroup (as the container process is run by conmon). Is it always the calico pod that does this? And what version of cri-o/conmon?
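For context, on cgroup v1 the attach step that fails is essentially runc writing the exec pid into cgroup.procs of the container's path under every mounted controller. Roughly, as a sketch with placeholder names (EXEC_PID, <podSlice>, <ctrID> are not from the original; the controller list matches the ones seen in the errors above):
CGPATH="kubepods.slice/kubepods-burstable.slice/<podSlice>/crio-<ctrID>.scope"
for ctrl in systemd cpu,cpuacct memory blkio devices pids freezer; do
  echo "$EXEC_PID" > "/sys/fs/cgroup/$ctrl/$CGPATH/cgroup.procs"
done
# once systemd removes crio-<ctrID>.scope, the open() fails with ENOENT --
# exactly the "no such file or directory" error runc reports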
It occurs from time to time, but it's not guaranteed to happen every time. I want to add some print statements to crio or conmon to capture more information when it reproduces.
conmon --version
conmon version 2.0.30
[root@oss38 systemd]# crio --version
crio version 1.26.4
Version: 1.26.4
GitCommit: 7b4340777efe2eaf4a6ec7541a0a178cec30ca64
I encountered another situation where both crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope and crio-conmon-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope have exited, and the process has been taken over by the crio.service cgroup (under system.slice).
[root@node73 658f5d294301497b93b492d130ef4b78]# runc --debug --root /run/runc exec -t 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6 date
DEBU[0000]libcontainer/cgroups/file.go:95 github.com/opencontainers/runc/libcontainer/cgroups.prepareOpenat2.func1() openat2 not available, falling back to securejoin
DEBU[0000] nsexec[2914244]: => nsexec container setup
DEBU[0000] nsexec[2914244]: update /proc/self/oom_score_adj to '-997'
DEBU[0000] nsexec[2914244]: set process as non-dumpable
DEBU[0000] nsexec-0[2914244]: ~> nsexec stage-0
DEBU[0000] nsexec-0[2914244]: spawn stage-1
DEBU[0000] nsexec-0[2914244]: -> stage-1 synchronisation loop
DEBU[0000] nsexec-1[2914267]: ~> nsexec stage-1
DEBU[0000] nsexec-1[2914267]: setns(0x8000000) into ipc namespace (with path /proc/12657/ns/ipc)
DEBU[0000] nsexec-1[2914267]: setns(0x4000000) into uts namespace (with path /proc/12657/ns/uts)
DEBU[0000] nsexec-1[2914267]: setns(0x20000000) into pid namespace (with path /proc/12657/ns/pid)
DEBU[0000] nsexec-1[2914267]: setns(0x20000) into mnt namespace (with path /proc/12657/ns/mnt)
DEBU[0000] nsexec-1[2914267]: unshare remaining namespace (except cgroupns)
DEBU[0000] nsexec-1[2914267]: spawn stage-2
DEBU[0000] nsexec-1[2914267]: request stage-0 to forward stage-2 pid (2914268)
DEBU[0000] nsexec-0[2914244]: stage-1 requested pid to be forwarded
DEBU[0000] nsexec-0[2914244]: forward stage-1 (2914267) and stage-2 (2914268) pids to runc
DEBU[0000] nsexec-2[82644]: ~> nsexec stage-2
DEBU[0000] nsexec-1[2914267]: signal completion to stage-0
DEBU[0000] nsexec-1[2914267]: <~ nsexec stage-1
DEBU[0000] nsexec-0[2914244]: stage-1 complete
DEBU[0000] nsexec-0[2914244]: <- stage-1 synchronisation loop
DEBU[0000] nsexec-0[2914244]: -> stage-2 synchronisation loop
DEBU[0000] nsexec-0[2914244]: signalling stage-2 to run
DEBU[0000] nsexec-2[82644]: signal completion to stage-0
DEBU[0000] nsexec-2[82644]: <= nsexec container setup
DEBU[0000] nsexec-2[82644]: booting up go runtime ...
DEBU[0000] nsexec-0[2914244]: stage-2 complete
DEBU[0000] nsexec-0[2914244]: <- stage-2 synchronisation loop
DEBU[0000] nsexec-0[2914244]: <~ nsexec stage-0
ERRO[0000]utils.go:61 main.fatalWithCode() exec failed: unable to start container process: error adding pid 2914268 to cgroups: failed to write 2914268: open /sys/fs/cgroup/blkio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory
[root@node73 658f5d294301497b93b492d130ef4b78]# runc --root /run/runc/ list |grep 5421387b
5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6 12657 running /run/containers/storage/overlay-containers/5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6/userdata 2025-02-17T14:07:04.607184545Z root
[root@node73 658f5d294301497b93b492d130ef4b78]# cat /proc/12657/cgroup
12:rdma:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
11:memory:/system.slice/crio.service
10:cpuset:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
9:pids:/system.slice/crio.service
8:blkio:/system.slice/crio.service
7:perf_event:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
6:freezer:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
5:net_cls,net_prio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
4:hugetlb:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
3:cpu,cpuacct:/system.slice/crio.service
2:devices:/system.slice/crio.service
1:name=systemd:/system.slice/crio.service
0::/
[root@node73 658f5d294301497b93b492d130ef4b78]# pstree -plTS 12657
runsvdir(12657,ipc,mnt,pid,uts)─┬─runsv(15797)───calico-node(15839)
├─runsv(15798)───calico-node(15812)
├─runsv(15799)───calico-node(15813)
├─runsv(15800)───calico-node(15816)
├─runsv(15801)───bird(17186)
├─runsv(15802)───bird6(17176)
├─runsv(15804)───calico-node(15838)
└─runsv(15806)───calico-node(15825)
[root@node73 658f5d294301497b93b492d130ef4b78]# runc --root /run/runc exec -t 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6 /date
ERRO[0000] exec failed: unable to start container process: error adding pid 2878132 to cgroups: failed to write 2878132: open /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory
[root@node73 658f5d294301497b93b492d130ef4b78]# runc --root /run/runc exec -t 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6 date
ERRO[0000] exec failed: unable to start container process: error adding pid 2882552 to cgroups: failed to write 2882552: open /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory
This is the container info-level log:
Feb 17 22:07:02 node73 systemd[1]: Started crio-conmon-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope.
Feb 17 22:07:02 node73 conmon[12299]: conmon 5421387b6d8ed001ef38 <ninfo>: addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Feb 17 22:07:02 node73 conmon[12299]: conmon 5421387b6d8ed001ef38 <ninfo>: terminal_ctrl_fd: 12
Feb 17 22:07:02 node73 conmon[12299]: conmon 5421387b6d8ed001ef38 <ninfo>: winsz read side: 16, winsz write side: 16
Feb 17 22:07:03 node73 systemd[1]: Started libcontainer container 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.
Feb 17 22:07:04 node73 systemd[1]: crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope: Succeeded.
Feb 17 22:07:04 node73 systemd[1]: crio-conmon-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope: Succeeded.
Feb 17 22:07:04 node73 crio[4725]: time="2025-02-17 22:07:04.818322213+08:00" level=info msg="Created container 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6: ccos-calico/calico-node-7dzpv/calico-node" id=c6330067-646a-42ee-b777-a7304fcecb4b name=/runtime.v1.RuntimeService/CreateContainer
Feb 17 22:07:04 node73 hyperkube[4968]: I0217 22:07:04.820126 4968 remote_runtime.go:446] "[RemoteRuntimeService] CreateContainer" podSandboxID="880a6c5f8b44d0fa974c7ea590d108f861caaff8b466f817aebc629c23bf0a76" containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6"
Feb 17 22:07:04 node73 hyperkube[4968]: I0217 22:07:04.820275 4968 remote_runtime.go:459] "[RemoteRuntimeService] StartContainer" containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" timeout="2m0s"
Feb 17 22:07:04 node73 crio[4725]: time="2025-02-17 22:07:04.821601196+08:00" level=info msg="Starting container: 5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" id=4230152e-7bcd-4ff9-aec4-4e3a6c6847e8 name=/runtime.v1.RuntimeService/StartContainer
Feb 17 22:07:04 node73 crio[4725]: time="2025-02-17 22:07:04.871483427+08:00" level=info msg="Started container" PID=12657 containerID=5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6 description=ccos-calico/calico-node-7dzpv/calico-node id=4230152e-7bcd-4ff9-aec4-4e3a6c6847e8 name=/runtime.v1.RuntimeService/StartContainer sandboxID=880a6c5f8b44d0fa974c7ea590d108f861caaff8b466f817aebc629c23bf0a76
Feb 17 22:07:04 node73 hyperkube[4968]: I0217 22:07:04.945011 4968 remote_runtime.go:477] "[RemoteRuntimeService] StartContainer Response" containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6"
Feb 17 22:07:05 node73 hyperkube[4968]: I0217 22:07:05.251898 4968 kubelet.go:2250] "SyncLoop (PLEG): event for pod" pod="ccos-calico/calico-node-7dzpv" event=&{ID:cafb7d2e-c99d-4f7a-88fc-dd55531fd545 Type:ContainerStarted Data:5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6}
Feb 17 22:07:06 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:06+08:00" level=error msg="exec failed: unable to start container process: error adding pid 13638 to cgroups: failed to write 13638: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:06 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:07 node73 systemd[12682]: run-runc-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6-runc.HPpMuB.mount: Succeeded.
Feb 17 22:07:07 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:07+08:00" level=error msg="exec failed: unable to start container process: error adding pid 13953 to cgroups: failed to write 13953: open /sys/fs/cgroup/devices/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:07 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14025 to cgroups: failed to write 14025: open /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:08 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14025 to cgroups: failed to write 14025: open /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14387 to cgroups: failed to write 14387: open /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:08 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14448 to cgroups: failed to write 14448: open /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:08 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14584 to cgroups: failed to write 14584: open /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
Feb 17 22:07:08 node73 hyperkube[4968]: > containerID="5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6" cmd=[/bin/calico-node -felix-ready]
Feb 17 22:07:08 node73 hyperkube[4968]: rpc error: code = Unknown desc = command error: time="2025-02-17T22:07:08+08:00" level=error msg="exec failed: unable to start container process: error adding pid 14584 to cgroups: failed to write 14584: open /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcafb7d2e_c99d_4f7a_88fc_dd55531fd545.slice/crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope/cgroup.procs: no such file or directory"
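(Editor's note, a hedged sketch using the unit name from the logs above: the scope lifecycle can be lined up against the exec failures directly from the journal.)

# The transient scope's own journal entries (Started / Succeeded):
journalctl -u crio-5421387b6d8ed001ef38d5461a4675e4822561082d3115c254c11addad43c1a6.scope
# ...and the exec failures from the same window:
journalctl --since "2025-02-17 22:07" | grep -E 'scope: Succeeded|exec failed'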
Description
failed to write 296291: open /sys/fs/cgroup/systemd/kubepods.slice/.../crio-66d6d4ac9851cfcab8400277ad96770ce52c1d75eeac29046753875056eacaed.scope/cgroup.procs: no such file or directory
Through the system logs I discovered that when this issue occurs, systemd reports that the corresponding containerID.scope has succeeded, meaning the scope unit has exited, yet the container still exists. Under normal circumstances you would not see containerID.scope succeed while the container is running; if you do see it, that should imply the container has exited. I have no idea what causes this.
The corresponding service no longer exists, but the container is still running.
[root@ceashare23-node-3 kubepods-burstable-podf921ac10_59ef_4825_ac61_248f1989c789.slice]# systemctl status crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope
Unit crio-0644511bc2734f5ff9f9532fdecddb9f1b92597fe269168b137ada040c3a3118.scope could not be found.
Under normal circumstances, it should look like this:
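(The original attachment is not preserved; as a hedged sketch, a healthy container's transient scope typically reports something like the following, with all IDs illustrative.)

systemctl status crio-<containerID>.scope
# ● crio-<containerID>.scope - libcontainer container <containerID>
#    Loaded: loaded (/run/systemd/transient/crio-<containerID>.scope; transient)
#    Active: active (running) since ...
#    CGroup: /kubepods.slice/.../crio-<containerID>.scope
#            └─<PID> <container entrypoint>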
Steps to reproduce the issue
It happens intermittently, not every time; I do not have a reliable way to reproduce it.
Describe the results you received and expected
Received: containerID.scope reported succeeded, meaning the scope service exited, yet the container still exists. Expected: containerID.scope should only report succeeded once the container itself has exited.
What version of runc are you using?
runc --version
runc version 1.1.12
commit: v1.1.12-0-g51d5e946
spec: 1.0.2-dev
go: go1.20.13
Host OS information
No response
Host kernel information
Linux compute-node1 4.19.90-52.39 x86_64
[root@compute-node1 cgroup]# systemctl --version
systemd 243
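(Editor's note, an added sanity check: the failing paths are v1-style, so it may help to confirm the host's cgroup mode.)

# "tmpfs" indicates a cgroup v1 hierarchy; "cgroup2fs" indicates unified v2.
stat -fc %T /sys/fs/cgroup/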