Commit a988c30

kolyshkin authored and rst0git committed
freeze_processes: fix logic

There are a few issues with the freeze_processes logic:

1. Commit 9fae23f grossly (by 1000x) miscalculated the number of
   attempts required, as a result, we are seeing something like this:

   > (00.000340) freezing processes: 100000 attempts with 100 ms steps
   > (00.000351) freezer.state=THAWED
   > (00.000358) freezer.state=FREEZING
   > (00.100446) freezer.state=FREEZING
   > ...close to 100 lines skipped...
   > (09.915110) freezer.state=FREEZING
   > (10.000432) Error (criu/cr-dump.c:1467): Timeout reached. Try to interrupt: 0
   > (10.000563) freezer.state=FREEZING

   For 10s with 100ms steps we only need 100 attempts, not 100000.

2. When the timeout is hit, the "failed to freeze cgroup" error is not
   printed, and the log_unfrozen_stacks is not called either.

3. The nanosleep at the last iteration is useless (this was hidden by
   issue 1 above, as the timeout was hit first).

Fix all these.

While at it,

4. Amend the error message with the number of attempts, sleep duration,
   and timeout.

5. Modify the "freezing cgroup" debug message to be in sync with the
   above error.

   Was:
   > freezing processes: 100000 attempts with 100 ms steps

   Now:
   > freezing cgroup some/name: 100 x 100ms attempts, timeout: 10s

Signed-off-by: Kir Kolyshkin <[email protected]>

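The unit arithmetic behind issue 1 can be checked in isolation. The following is a minimal sketch, not CRIU code: `attempts_old` and `attempts_fixed` are hypothetical helper names that only mirror the two expressions from the patch, assuming the timeout is given in seconds and the step in milliseconds.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in CRIU): the pre-fix expression, which
 * scales seconds by 1000000 although the step is in milliseconds.
 */
unsigned long attempts_old(unsigned int timeout_s, unsigned long step_ms)
{
	assert(step_ms != 0);
	return ((unsigned long)timeout_s * 1000000) / step_ms;
}

/*
 * Hypothetical helper (not in CRIU): the fixed expression.
 * Seconds to milliseconds is a factor of 1000.
 */
unsigned long attempts_fixed(unsigned int timeout_s, unsigned long step_ms)
{
	assert(step_ms != 0);
	return ((unsigned long)timeout_s * 1000) / step_ms;
}
```

With a 10 s timeout and 100 ms steps, `attempts_old(10, 100)` gives 100000 and `attempts_fixed(10, 100)` gives 100, which matches the numbers quoted in the commit message.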
1 parent 39ab9c5 commit a988c30

1 file changed, +12 -13 lines changed

criu/seize.c

@@ -545,7 +545,8 @@ static int freeze_processes(void)
 	enum freezer_state state = THAWED;

 	static const unsigned long step_ms = 100;
-	unsigned long nr_attempts = (opts.timeout * 1000000) / step_ms;
+	/* Since opts.timeout is in seconds, multiply it by 1000 to convert to milliseconds. */
+	unsigned long nr_attempts = (opts.timeout * 1000) / step_ms;
 	unsigned long i = 0;

 	const struct timespec req = {
@@ -554,14 +555,12 @@ static int freeze_processes(void)
 	};

 	if (unlikely(!nr_attempts)) {
-		/*
-		 * If timeout is turned off, lets
-		 * wait for at least 10 seconds.
-		 */
-		nr_attempts = (10 * 1000000) / step_ms;
+		/* If the timeout is 0, wait for at least 10 seconds. */
+		nr_attempts = (10 * 1000) / step_ms;
 	}

-	pr_debug("freezing processes: %lu attempts with %lu ms steps\n", nr_attempts, step_ms);
+	pr_debug("freezing cgroup %s: %lu x %lums attempts, timeout: %us\n",
+		 opts.freeze_cgroup, nr_attempts, step_ms, opts.timeout);

 	fd = freezer_open();
 	if (fd < 0)
@@ -588,22 +587,22 @@ static int freeze_processes(void)
 	 * not read @tasks pids while freezer in
 	 * transition stage.
 	 */
-	for (; i <= nr_attempts; i++) {
+	while (1) {
 		state = get_freezer_state(fd);
 		if (state == FREEZER_ERROR) {
 			close(fd);
 			return -1;
 		}

-		if (state == FROZEN)
+		if (state == FROZEN || i++ == nr_attempts || alarm_timeouted())
 			break;
-		if (alarm_timeouted())
-			goto err;
+
 		nanosleep(&req, NULL);
 	}

-	if (i > nr_attempts) {
-		pr_err("Unable to freeze cgroup %s\n", opts.freeze_cgroup);
+	if (state != FROZEN) {
+		pr_err("Unable to freeze cgroup %s (%lu x %lums attempts, timeout: %us)\n",
+		       opts.freeze_cgroup, i, step_ms, opts.timeout);
 		if (!pr_quelled(LOG_DEBUG))
 			log_unfrozen_stacks(opts.freeze_cgroup);
 		goto err;
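
The control flow of the reworked loop can be exercised without a real freezer cgroup. The sketch below is an illustration under stated assumptions, not the CRIU implementation: `sketch_state`, `sketch_result`, `poll_until_frozen`, `never_frozen` and `frozen_on_third_read` are all hypothetical names, the state source is a plain callback instead of `get_freezer_state(fd)`, the `alarm_timeouted()` check is left out, and the sleep is only counted rather than performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the freezer states used in the patch. */
enum sketch_state { SK_THAWED, SK_FREEZING, SK_FROZEN };

struct sketch_result {
	bool frozen;          /* did we observe SK_FROZEN? */
	unsigned long polls;  /* how many times the state was read */
	unsigned long sleeps; /* how many times the loop would have slept */
};

/*
 * Same loop shape as the patched code: read the state, test all exit
 * conditions in one place, and sleep only if another read will follow.
 * poll(n) returns the state seen on the n-th read (0-based).
 */
struct sketch_result poll_until_frozen(enum sketch_state (*poll)(unsigned long),
				       unsigned long nr_attempts)
{
	struct sketch_result r = { false, 0, 0 };
	enum sketch_state state;
	unsigned long i = 0;

	assert(poll != NULL);

	while (1) {
		state = poll(r.polls++);

		/* i is incremented only when the state is not frozen yet */
		if (state == SK_FROZEN || i++ == nr_attempts)
			break;

		r.sleeps++; /* nanosleep(&req, NULL) in the real loop */
	}

	r.frozen = (state == SK_FROZEN);
	return r;
}

/* Hypothetical state sources for the two interesting cases. */
enum sketch_state never_frozen(unsigned long n)
{
	(void)n;
	return SK_FREEZING;
}

enum sketch_state frozen_on_third_read(unsigned long n)
{
	return n >= 2 ? SK_FROZEN : SK_FREEZING;
}
```

Because the exit test sits between the state read and the sleep, a run that never freezes ends on a read rather than on a sleep, and the caller decides success from the last observed state instead of from the loop counter, which is the same check the patch uses (`state != FROZEN`).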
