Add a lock around cgroupd communication. #2620
Conversation
Force-pushed from 5fe25b7 to ae55633
PTAL
There are three empty lines that have tabs at the beginning, so the linter is complaining.
I think the commit title should be "restorer: Add a lock around cgroupd communication" to follow the project's convention.
These are nits. Overall, the approach LGTM.
Force-pushed from ae55633 to a361b71
Thanks, PTAL. I tried running 'make lint', but it wants to run 'ruff', which isn't available on Ubuntu 24.
Threads are put into cgroups through the cgroupd thread, which communicates with other threads using a socketpair. Previously, each thread received a dup'd copy of the socket and did the following:

    sendmsg(socket_dup_fd, my_cgroup_set);

    // wait for ack.
    while (1) {
        recvmsg(socket_dup_fd, &h, MSG_PEEK);
        if (h.pid != my_pid)
            continue;
        recvmsg(socket_dup_fd, &h, 0);
        break;
    }
    close(socket_dup_fd);

When restoring many threads, most of them would be spinning in the loop above, waiting for their PID to appear. In my test case, restoring a process with an 11.5G heap and 491 threads could take anywhere between 10 and 60 seconds to complete.

To avoid the spinning, we drop the loop and MSG_PEEK, and add a lock around the above code. This does not decrease parallelism, as the cgroupd daemon uses a single thread anyway.

With the lock in place, the same restore consistently takes around 10 seconds on my machine (Thinkpad P14s, AMD Ryzen 8840HS).

There is a similar "daemon" thread for user namespaces. That is already protected by a similar userns_sync_lock in __userns_call().

Fixes checkpoint-restore#2614

Signed-off-by: Han-Wen Nienhuys <[email protected]>
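For illustration, a rough sketch of the locked variant described above, kept in the same pseudocode style as the snippet in the commit message. The lock name (cgroupd_sk_lock) and the mutex helpers are placeholders, not necessarily the identifiers used in the actual patch; the point is that the request and its ack become one critical section, so the MSG_PEEK retry loop is no longer needed:

    /* Sketch only: serialize the request/ack exchange with cgroupd.
     * cgroupd_sk_lock is a placeholder name for the new lock. */
    mutex_lock(&cgroupd_sk_lock);

    sendmsg(socket_dup_fd, my_cgroup_set);

    /* The daemon answers one request at a time, so the next message on
     * this socket is necessarily the ack for our request; no MSG_PEEK
     * filtering by PID is required. */
    recvmsg(socket_dup_fd, &h, 0);

    mutex_unlock(&cgroupd_sk_lock);
    close(socket_dup_fd);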
Force-pushed from a361b71 to 3240a17
Have you tried …
Merged. Thanks a lot.
Threads are put into cgroups through the cgroupd thread, which communicates with other threads using a socketpair. Each thread received a dup'd copy of the socket and did the sendmsg()/recvmsg(MSG_PEEK) sequence shown in the snippet above.
When restoring many threads, most threads would be spinning in the above loop waiting for their PID to appear.
In my test case, restoring a process with an 11.5G heap and 491 threads could take anywhere between 10 and 60 seconds to complete.
To avoid the spinning, we drop the loop and MSG_PEEK, and add a lock around the above code. This does not decrease parallelism, as the cgroupd daemon uses a single thread anyway (see the daemon-loop sketch after this description).
With the lock in place, the same restore takes a consistent 9.2 seconds on my machine (Thinkpad P14s, AMD Ryzen 8840HS).
There is a similar "daemon" thread for user namespaces. That is already protected by a similar userns_sync_lock in __userns_call().
Fixes #2614
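To illustrate why the lock costs no parallelism: on the other end of the socketpair, the cgroupd daemon already services requests one at a time. The loop below is a sketch in the same pseudocode style, not CRIU's actual code; the socket name and the apply_cg_set() helper are illustrative placeholders:

    /* Illustrative daemon loop: one request at a time on the socketpair. */
    for (;;) {
        recvmsg(cgroupd_sk, &h, 0);       /* next thread's cg_set request */
        apply_cg_set(h.pid, h.cg_set);    /* hypothetical helper: move the pid into its cgroup */
        sendmsg(cgroupd_sk, &ack);        /* ack back to that thread */
    }

Since requests are handled strictly sequentially here anyway, letting the client threads queue on a lock instead of spinning on MSG_PEEK changes only who waits, not how much work runs in parallel.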