Commit 868e9fa

freeze_processes: implement kludges for cgroup v1

Cgroup v1 freezer has always been problematic, failing to freeze a cgroup. In runc, we have implemented a few kludges to increase the chance of succeeding, but those are used when runc freezes a cgroup for its own purposes (for "runc pause" and to modify device properties for cgroup v1).

When criu is used, it fails to freeze a cgroup from time to time (see [1], [2]). Let's try adding kludges similar to ones in runc.

Alas, I have absolutely no way to test this, so please review carefully.

[1]: opencontainers/runc#4273
[2]: opencontainers/runc#4457

Signed-off-by: Kir Kolyshkin <[email protected]>

1 parent 03d4c48 commit 868e9fa

1 file changed, +31 -0 lines changed

criu/seize.c

@@ -539,6 +539,34 @@ static int prepare_freezer_for_interrupt_only_mode(void)
 	return exit_code;
 }
 
+static void cgroupv1_freezer_kludges(int fd, int iter, const struct timespec *req) {
+	/* As per older kernel docs (freezer-subsystem.txt before
+	 * the kernel commit ef9fe980c6fcc1821), if FREEZING is seen,
+	 * userspace should either retry or thaw. While current
+	 * kernel cgroup v1 docs no longer mention a need to retry,
+	 * even recent kernels can't reliably freeze a cgroup v1.
+	 *
+	 * Let's keep asking the kernel to freeze from time to time.
+	 * In addition, do occasional thaw/sleep/freeze.
+	 *
+	 * This is still a game of chances (the real fix belongs to the kernel)
+	 * but these kludges might improve the probability of success.
+	 *
+	 * Cgroup v2 does not have this problem.
+	 */
+	switch (iter % 32) {
+	case 9:
+	case 20:
+		freezer_write_state(fd, FROZEN);
+		break;
+	case 31:
+		freezer_write_state(fd, THAWED);
+		nanosleep(req, NULL);
+		freezer_write_state(fd, FROZEN);
+		break;
+	}
+}
+
 static int freeze_processes(void)
 {
 	int fd, exit_code = -1;
@@ -597,6 +625,9 @@ static int freeze_processes(void)
 			if (state == FROZEN || i++ == nr_attempts || alarm_timeouted())
 				break;
 
+			if (!cgroup_v2)
+				cgroupv1_freezer_kludges(fd, i, &req);
+
 			nanosleep(&req, NULL);
 		}
 
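
Review note: to make the retry schedule easier to check, here is a minimal standalone sketch of the `iter % 32` dispatch from `cgroupv1_freezer_kludges()`. It is not part of the commit. The enum and the names `kludge_action`, `kludge_action_for` and `kludge_action_name` are hypothetical and exist only for illustration; the real function writes to the freezer through `freezer_write_state()` instead of returning a label.

```c
#include <stdio.h>

/* Hypothetical labels for what one polling iteration would do. */
enum kludge_action {
	KLUDGE_NONE,              /* just keep polling */
	KLUDGE_REFREEZE,          /* write FROZEN again */
	KLUDGE_THAW_AND_REFREEZE  /* write THAWED, sleep, write FROZEN */
};

/* Mirrors the switch (iter % 32) in the diff, minus the side effects. */
enum kludge_action kludge_action_for(int iter)
{
	switch (iter % 32) {
	case 9:
	case 20:
		return KLUDGE_REFREEZE;
	case 31:
		return KLUDGE_THAW_AND_REFREEZE;
	}
	return KLUDGE_NONE;
}

/* Human-readable name, handy when printing the schedule. */
const char *kludge_action_name(enum kludge_action action)
{
	switch (action) {
	case KLUDGE_REFREEZE:
		return "re-freeze";
	case KLUDGE_THAW_AND_REFREEZE:
		return "thaw, sleep, re-freeze";
	default:
		return "none";
	}
}
```

Calling `kludge_action_for()` for iterations 1 through 64 yields a re-freeze at 9, 20, 41 and 52 and a thaw/sleep/re-freeze at 31 and 63; every other iteration only polls. Assuming the loop counter `i` in `freeze_processes()` starts at 0 (the diff does not show its initialization), the first call receives `iter == 1`, because `i++` in the break condition runs before the kludge call. On a thaw iteration the loop sleeps twice, once inside the kludge and once in the existing `nanosleep(&req, NULL)` that follows.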
