freeze_processes: implement kludges for cgroup v1

kolyshkin · kolyshkin · commit 3bf115ccb9a4 · 2024-12-16T14:14:01.000-08:00
Cgroup v1 freezer has always been problematic, occasionally failing to freeze a cgroup. In runc, we have implemented a few kludges to increase the chance of succeeding, but those are only used when runc freezes a cgroup for its own purposes (for "runc pause" and to modify device properties for cgroup v1).

When criu is used, it fails to freeze a cgroup from time to time (see [1], [2]). Let's try adding kludges similar to the ones in runc.

Alas, I have absolutely no way to test this, so please review carefully.

[1]: opencontainers/runc#4273
[2]: opencontainers/runc#4457

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
diff --git a/criu/seize.c b/criu/seize.c
@@ -542,6 +542,7 @@ static int freeze_processes(void)
 	enum freezer_state state = THAWED;
 
 	static const unsigned long step_ms = 100;
+	/* Since opts.timeout is in seconds, multiply it by 1000 to convert to milliseconds. */
 	unsigned long nr_attempts = (opts.timeout * 1000) / step_ms;
 	unsigned long i = 0;
 
@@ -586,6 +587,7 @@ static int freeze_processes(void)
 		 * transition stage.
 		 */
 		for (; i <= nr_attempts; i++) {
+			nanosleep(&req, NULL);
 			state = get_freezer_state(fd);
 			if (state == FREEZER_ERROR) {
 				close(fd);
@@ -598,7 +600,35 @@ static int freeze_processes(void)
 				pr_err("Unable to freeze cgroup %s (timed out)\n", opts.freeze_cgroup);
 				goto err;
 			}
-			nanosleep(&req, NULL);
+
+			if (cgroup_v2)
+				continue;
+
+			/* As per older kernel docs (freezer-subsystem.txt before
+			 * the kernel commit ef9fe980c6fcc1821), if FREEZING is seen,
+			 * userspace should either retry or thaw. While current
+			 * kernel cgroup v1 docs no longer mention a need to retry,
+			 * even recent kernels can't reliably freeze a cgroup v1.
+			 *
+			 * Let's keep asking the kernel to freeze from time to time.
+			 * In addition, do occasional thaw/sleep/freeze.
+			 *
+			 * This is still a game of chance (the real fix belongs to the kernel)
+			 * but these kludges might improve the probability of success.
+			 *
+			 * Cgroup v2 does not have this problem.
+			 */
+			switch (i % 32) {
+			case 9:
+			case 20:
+				freezer_write_state(fd, FROZEN);
+				break;
+			case 31:
+				freezer_write_state(fd, THAWED);
+				nanosleep(&req, NULL);
+				freezer_write_state(fd, FROZEN);
+				break;
+			}
 		}
 
 		if (i > nr_attempts) {