Commit 2d3fb7b

freeze_processes: implement kludges for cgroup v1
Cgroup v1 freezer has always been problematic, failing to freeze a cgroup. In runc, we have implemented a few kludges to increase the chance of succeeding, but those are used when runc freezes a cgroup for its own purposes (for "runc pause" and to modify device properties for cgroup v1).

When criu is used, it fails to freeze a cgroup from time to time (see [1], [2]). Let's try adding kludges similar to ones in runc.

Alas, I have absolutely no way to test this, so please review carefully.

[1]: opencontainers/runc#4273
[2]: opencontainers/runc#4457

Signed-off-by: Kir Kolyshkin <[email protected]>
1 parent 46e0a0f commit 2d3fb7b

1 file changed, +28 -0 lines changed

criu/seize.c

@@ -599,6 +599,34 @@ static int freeze_processes(void)
 			goto err;
 		}
 		nanosleep(&req, NULL);
+
+		if (cgroup_v2)
+			continue;
+
+		/* As per older kernel docs (freezer-subsystem.txt before
+		 * the kernel commit ef9fe980c6fcc1821), if FREEZING is seen,
+		 * userspace should either retry or thaw. While current
+		 * kernel cgroup v1 docs no longer mention a need to retry,
+		 * even recent kernels can't reliably freeze a cgroup v1.
+		 *
+		 * Let's keep asking the kernel to freeze from time to time.
+		 * In addition, do occasional thaw/sleep/freeze.
+		 *
+		 * This is still a game of chances (the real fix belongs to the kernel)
+		 * but these kludges might improve the probability of success.
+		 *
+		 * Cgroup v2 does not have this problem.
+		 */
+		switch (i%32) {
+		case 30:
+			freezer_write_state(fd, THAWED);
+			break;
+		case 9:
+		case 20:
+		case 31:
+			freezer_write_state(fd, FROZEN);
+			break;
+		}
 	}
 
 	if (i > nr_attempts) {
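
To make the retry schedule in the hunk above easier to review, here is a minimal standalone sketch of the `switch (i%32)` decision pulled out into a pure function. The enum and the name `kludge_for_attempt` are hypothetical and exist only for this illustration; they are not part of CRIU, and the real code calls `freezer_write_state()` directly instead of returning an action.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of CRIU. */
enum kludge_action {
	KLUDGE_NONE,   /* no extra write, just keep polling */
	KLUDGE_THAW,   /* corresponds to freezer_write_state(fd, THAWED) */
	KLUDGE_FREEZE, /* corresponds to freezer_write_state(fd, FROZEN) */
};

/*
 * Mirrors the switch (i%32) from the diff: reports which extra write,
 * if any, polling attempt i would trigger. Cgroup v2 skips the kludges
 * entirely, matching the "if (cgroup_v2) continue;" check.
 */
static enum kludge_action kludge_for_attempt(unsigned int i, bool cgroup_v2)
{
	if (cgroup_v2)
		return KLUDGE_NONE;

	switch (i % 32) {
	case 30:
		return KLUDGE_THAW;
	case 9:
	case 20:
	case 31:
		return KLUDGE_FREEZE;
	default:
		return KLUDGE_NONE;
	}
}
```

Read this way, every window of 32 polling attempts re-requests the freeze at attempts 9 and 20, thaws at attempt 30, and freezes again at attempt 31. Because `nanosleep()` runs at the end of each iteration before the switch, the 30/31 pair gives the thaw/sleep/freeze sequence the comment describes.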
