syzbot: lots of SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) crashes #5956

Open
a-nogikh opened this issue Apr 17, 2025 · 13 comments

@a-nogikh
Collaborator

Context

In the upstream Linux namespace, the SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) crash has been responsible for up to 10% of daily syzbot crashes.

Its frequency recently dropped by 10x, but then it went up again (though it's still less frequent than it used to be).

The rise is likely because another frequent net fuzzing crasher, unregister_netdevice: waiting for DEV to become free, got fixed in the net tree. Since that moment, there has been a surge of SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) crashes on ci-upstream-net-this-kasan-gce again.

In any case, we should figure out whether it's a manifestation of a real kernel bug or whether our checkpoint code in syz-executor must be changed.

The reproducers look similar:

#{"repeat":true,"procs":1,"slowdown":1,"sandbox":"none","tun":true,"netdev":true,"resetnet":true,"cgroups":true,"binfmt_misc":true,"close_fds":true,"usb":true,"vhci":true,"wifi":true,"ieee802154":true,"sysctl":true,"tmpdir":true,"segv":true}
bpf$BPF_TASK_FD_QUERY(0x14, &(0x7f0000000080)={0x0, 0xffffffffffffffff, 0x0, 0x7, &(0x7f0000000000)='cgroup\x00'}, 0x30)
r0 = bpf$PROG_LOAD(0x5, &(0x7f0000000080)={0x9, 0x4, &(0x7f0000000700)=ANY=[@ANYBLOB="18000000000000000000000000000000611200000000000095000000000000001383096e16281fd43e588cf7a1e65f316e5e5600f1fb642cb352b9d4c50ae8366e5cadf97f4e52fdb37bdab01f9f6cc297b10500c98ea973fbaf38f9d47c5702c2bd9ebf0134b54dbee7458404277462d8ac80053e629d28aa5b25e324fd54d237d7921ff7b52f78ad9692619113594630a9eb6490c61332499f4861a57120ea351e61ca79b452a2bffd133c9ce1b4049b537a6310d0ee13db80ad6553ed19a04679d0d66bf61277501f370105113bd565ae2e766f9a79e314ecbc4000b4702ecfcaed9cb384edf20b1d3e7011bd384577a5a78efdd8687e0574465e490aa62e217fa49e4167d7edcd030c20937155d065ee7bb686bffcf28ec73d58a1d795c358c5aee99cae4c959ba2b9a78b4e231c46f8030523faf5b79ef84c5201a69d776df2041ae3d19a3d03fb1f2913fdd3fef24c94f1e224f872c1bebc0a7622231b2be88508a13a5b74e417cdad2076dd0ccdf44daf7404337f84783856b8582065669a46c1d570cdf4d6ce259d39fdc6bc4f066eb27ba18fc0110ebf3eb081d09b8587c911260c2ca2f49825e10b20733735ec2f4a80308c92dac2cac1608cbd739d385703e2933fda0dde43f3270d7170a7f5ce1dad0a2ae4691cc8487e113b89df89fd1d3c51723d79966e8c2eae12cf2dcfa7c09b15de3f494c5bfc35a8ac8124fb66066b2b3c7db6585b2fe802e86d2794d885c779de4ab1a0999fcedaea0b0497927b536e120212681673509f2aa7ad0875d5be6fa5f5812dd7966978f435924026737b78156906c3faf9e84f0cfb70a8d326262ce7ceadf4f95a7afb2bdf8250af753f32"], &(0x7f0000000100)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, 0x0, 0xffffffffffffffff, 0x8, 0x0, 0x0, 0x10, 0x0}, 0x70)
r1 = openat$cgroup_root(0xffffffffffffff9c, &(0x7f0000000000), 0x200002, 0x0)
bpf$BPF_PROG_DETACH(0x8, &(0x7f0000000040)={@cgroup=r1, r0, 0x2}, 0x10)

I was unable to crash a v6.15-rc2 kernel running on qemu using the reproducer above.

Progress

In another related discussion, @FlorentRevest concluded that the bpf program included in the reproducer simply returns 0.

a-nogikh added the bug label Apr 17, 2025
@a-nogikh
Collaborator Author

A local patched qemu-based instance is able to trigger the bug.

After some debugging, it seems that the socket() call is rejected after calling BPF_CGROUP_RUN_PROG_INET_SOCK(sk) here: https://elixir.bootlin.com/linux/v6.15-rc1/source/net/ipv4/af_inet.c#L391

So it does indeed look as if it's the fuzzer itself that prohibits socket creation.
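
For reference, here is a minimal standalone sketch of that mechanism (an illustration under assumptions, not the reproducer's or the executor's code; it assumes root and a cgroup-v2 hierarchy mounted at /sys/fs/cgroup): a BPF_PROG_TYPE_CGROUP_SOCK program that just returns 0, attached with BPF_CGROUP_INET_SOCK_CREATE, makes every subsequent socket(AF_INET, ...) call in that cgroup fail with EPERM, which is exactly what the checkpoint code then reports as SYZFAIL.

// Minimal sketch: deny socket creation via a cgroup BPF program.
// Assumes root and a cgroup-v2 hierarchy mounted at /sys/fs/cgroup.
#include <fcntl.h>
#include <linux/bpf.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

static long sys_bpf(int cmd, union bpf_attr* attr, unsigned size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

int main(void)
{
	// "r0 = 0; exit" -- for BPF_CGROUP_INET_SOCK_CREATE, returning 0 means "deny".
	struct bpf_insn insns[] = {
	    {.code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0},
	    {.code = BPF_JMP | BPF_EXIT},
	};
	union bpf_attr load = {};
	load.prog_type = BPF_PROG_TYPE_CGROUP_SOCK;
	load.insns = (unsigned long)insns;
	load.insn_cnt = 2;
	load.license = (unsigned long)"GPL";
	load.expected_attach_type = BPF_CGROUP_INET_SOCK_CREATE;
	int prog = sys_bpf(BPF_PROG_LOAD, &load, sizeof(load));
	if (prog < 0)
		return perror("BPF_PROG_LOAD"), 1;

	int cg = open("/sys/fs/cgroup", O_RDONLY);
	union bpf_attr attach = {};
	attach.target_fd = cg;
	attach.attach_bpf_fd = prog;
	attach.attach_type = BPF_CGROUP_INET_SOCK_CREATE;
	attach.attach_flags = BPF_F_ALLOW_MULTI;
	if (sys_bpf(BPF_PROG_ATTACH, &attach, sizeof(attach)))
		return perror("BPF_PROG_ATTACH"), 1;

	// This now fails with EPERM, just like the executor's checkpoint socket().
	if (socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) < 0)
		perror("socket");
	return 0;
}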

@a-nogikh
Collaborator Author

Should we replace the socket() call by syz_init_net_socket() or something similar?

Creating a cgroup namespace per test seems like overkill. On the other hand, if a single program is able to prohibit such basic syscalls for the rest of the proc's lifetime, there's little value in letting the proc run further.

Cc @dvyukov

@dvyukov
Collaborator

dvyukov commented Apr 22, 2025

Should we replace the socket() call by syz_init_net_socket() or something similar?

The socket should be in the test net namespace to reset the right state.

Creating a cgroup namespace per test seems like an overkill.

Perhaps we could re-create it as part of the sandbox creation. Before we produce a SYZFAIL, we should try to re-create the sandbox several times.

Another possible option is to mark some set of syscalls that have dangerous global effects as "snapshot-mode only" and test them only in snapshot mode. We already have a bunch of issues with perf and create separate instances just for those to achieve a similar effect. There is also a bunch of syscalls that we simply disable entirely, or don't describe, for similar reasons (e.g. only in program sanitization). See #5308.
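
For illustration, the "retry before SYZFAIL" idea could take roughly this shape (hypothetical names and structure, not the actual executor code; recreate_sandbox()/checkpoint_net() are stand-ins for the real setup and checkpoint steps):

// Hypothetical sketch of retrying sandbox re-creation before reporting SYZFAIL.
#include <stdio.h>
#include <stdlib.h>

static int recreate_sandbox(void) { return 0; } // stand-in for the real sandbox setup
static int checkpoint_net(void) { return 0; }   // stand-in for the real net checkpoint

static void checkpoint_with_retries(void)
{
	for (int attempt = 0; attempt < 3; attempt++) {
		// If the checkpoint socket() fails, rebuild the sandbox and try again.
		if (recreate_sandbox() == 0 && checkpoint_net() == 0)
			return;
	}
	fprintf(stderr, "SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)\n");
	exit(1);
}

int main(void)
{
	checkpoint_with_retries();
	return 0;
}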

@a-nogikh
Collaborator Author

Perhaps we could re-create it as part of the sandbox creation.

You mean calling setup_cgroups() not in the runner executor process before creating the procs, but as part of the sandboxing setup of each proc? Or do you mean something else here?

There is also a bunch of syscalls that we simply disable entirely, or not describe, for similar reasons (e.g. the only in program sanitization).

Then it looks like we'd need to ban all of the BPF_PROG_TYPE_CGROUP_* bpf program types: https://docs.ebpf.io/linux/program-type/BPF_PROG_TYPE_CGROUP_SOCK/

or sanitize BPF_PROG_ATTACH to never attach to an EXP_CGROUP.
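
For illustration only (syzkaller's actual call sanitization is implemented elsewhere and not in this form), a check along these lines could flag the cgroup attach types before a BPF_PROG_ATTACH is issued; the list of attach types is a non-exhaustive assumption:

// Hypothetical helper: classify bpf_attach_type values that hook into cgroups.
// Non-exhaustive; shown only to illustrate the "sanitize BPF_PROG_ATTACH" option.
#include <linux/bpf.h>
#include <stdbool.h>

static bool is_cgroup_attach_type(unsigned int type)
{
	switch (type) {
	case BPF_CGROUP_INET_INGRESS:
	case BPF_CGROUP_INET_EGRESS:
	case BPF_CGROUP_INET_SOCK_CREATE:
	case BPF_CGROUP_INET_SOCK_RELEASE:
	case BPF_CGROUP_INET4_BIND:
	case BPF_CGROUP_INET6_BIND:
	case BPF_CGROUP_INET4_CONNECT:
	case BPF_CGROUP_INET6_CONNECT:
	case BPF_CGROUP_SOCK_OPS:
	case BPF_CGROUP_DEVICE:
	case BPF_CGROUP_SYSCTL:
		return true;
	default:
		return false;
	}
}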

@dvyukov
Collaborator

dvyukov commented Apr 22, 2025

You mean calling setup_cgroups() not in the runner executor process before creating the procs

Probably not as simple as moving the call, but, yes, something like that.

or sanitize BPF_PROG_ATTACH to never attach to a EXP_CGROUP.

Is it the worst type of BPF hook? It feels like any global hooks should be much worse than anything attached to a single cgroup.

@a-nogikh
Collaborator Author

Is it the worst type of BPF hooks?

I don't know. The documentation also mentions LSM hooks; those also sound like they may have global consequences.

@a-nogikh
Collaborator Author

There are also the CONFIG_CGROUP_BPF and CONFIG_BPF_LSM kernel options, but I am not sure whether we want to add this dimension to our kernel config generation. Still, that would be the most reliable way to prevent these crashes from happening.

@a-nogikh
Collaborator Author

As discussed offline: to resolve this specific case, we could do CLONE_NEWCGROUP and set up cgroups for each proc independently. Hopefully the total diff will be small.
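
A rough sketch of that direction (hypothetical; names and steps are assumptions, not the actual syz-executor change): on each proc restart, re-create the per-proc cgroup, which drops any BPF programs attached directly to it, join it, and then unshare the cgroup namespace so the proc only sees its own subtree. Error handling and cleanup of leftover processes/sub-cgroups are omitted.

// Hypothetical per-proc cgroup (re)setup; not the actual executor code.
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static void recreate_proc_cgroup(unsigned long long procid)
{
	char cgroupdir[64];
	snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
	// Removing the cgroup also drops BPF programs attached directly to it.
	if (rmdir(cgroupdir) && errno != ENOENT)
		perror("rmdir");
	if (mkdir(cgroupdir, 0777))
		perror("mkdir");
	// Join the fresh cgroup so that the new cgroup namespace is rooted at it.
	char procs[96];
	snprintf(procs, sizeof(procs), "%s/cgroup.procs", cgroupdir);
	FILE* f = fopen(procs, "w");
	if (f) {
		fprintf(f, "%d", (int)getpid());
		fclose(f);
	}
	if (unshare(CLONE_NEWCGROUP))
		perror("unshare(CLONE_NEWCGROUP)");
}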

@a-nogikh
Collaborator Author

FTR

How cgroups are currently configured:

runner(argv, argc);

// This does any one-time setup for the requested features on the machine.
// Note: this can be called multiple times and must be idempotent.
#if SYZ_HAVE_FEATURES
setup_sysctl();
setup_cgroups();
#endif

This is where we first set them up.
mount_cgroups2(unified_controllers, sizeof(unified_controllers) / sizeof(unified_controllers[0]));
mount_cgroups("/syzcgroup/net", net_controllers, sizeof(net_controllers) / sizeof(net_controllers[0]));
mount_cgroups("/syzcgroup/cpu", cpu_controllers, sizeof(cpu_controllers) / sizeof(cpu_controllers[0]));


During per-proc sandboxing

We make /syzcgroup available within the chroot'ed fs for each proc:

initialize_cgroups();

if (mount("/syzcgroup/unified", "./syz-tmp/newroot/syzcgroup/unified", NULL, bind_mount_flags, NULL)) {


Before the execution loop

After sandboxing, right before starting the execution loop, we also do some more cgroup configuration.

setup_loop();

setup_cgroups_loop();

static void setup_cgroups_loop()

That apparently configures some per-proc sub-cgroups:

snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
if (mkdir(cgroupdir, 0777)) {


And we also do some work for each program execution.

setup_test();

setup_cgroups_test();

It configures symlinks:

snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup")) {
debug("symlink(%s, ./cgroup) failed: %d\n", cgroupdir, errno);
}
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/cpu/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup.cpu")) {
debug("symlink(%s, ./cgroup.cpu) failed: %d\n", cgroupdir, errno);
}
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/net/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup.net")) {
debug("symlink(%s, ./cgroup.net) failed: %d\n", cgroupdir, errno);
}

@a-nogikh
Collaborator Author

The original SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) seems to be quite difficult to reliably reproduce, even though it happens quite frequently during fuzzing.

But SYZFAIL: ebtable: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) (*) is easy to reproduce:

bpf$BPF_TASK_FD_QUERY(0x14, &(0x7f0000000240)={0x0, 0xffffffffffffffff, 0x0, 0x7, &(0x7f0000000000)='cgroup\x00'}, 0x30)
r0 = openat$cgroup_root(0xffffffffffffff9c, &(0x7f0000000000), 0x200002, 0x0)
r1 = bpf$PROG_LOAD(0x5, &(0x7f0000000080)={0x9, 0x4, &(0x7f00000008c0)=ANY=[@ANYBLOB="180000000000000000000000000000006112000000000000950000000000000051fa7824c74186dc02ec0696c37b64e3b24da3180100000005165c0f63cdc2e82818254950ee03568b8809a1ff4c7c4750eabfafcb9531b31e6a86827d1010c5a909ab98e00e19644a88e95ba26d1c9eecddb2d11c541418ceeb29b9b6829c6e433822bdb3cc85244aab60c1aae1314d7381fcfeb970bea672cf1e926f6a51479343144648a07a975bd89dc398712376610f6254f12495b4658319684387f6f3543205d4bc4ce05b8b961103673dff7f158052e62b20f05fd24108d8363d44fcd0f8f3647899762a17282a1914452d11f557c28f396eebdc858558db0276d14f9035f2b5f703e5be7e4acf8b78c2834ae5805fffee38a9a0033d520bcf6b08ede50899d4b9bdf85c71c5de2503dab358f42a2624c7daa9ed44039aab46419496362e54cfad05a0004ac71a003d7b85d07191bed4e5a890826300214146f7ed569985439baa355c2766dd056f5d79e454f3d873095e7a237bc06d035a8d601f21746d886419f38b34a495040000000071c2f0cce8c93cc17e9afa314fcb2ba15d646c66b0f65021829f87d988b4e2d71753b1549fa734f0b2e56dbd21ed2e09d0cddad721971637f384eed3034597c93e1c52f42cad0ed09c395dc6e9703660fefa1c80f467367c006f25caf0cbcefd13d68839893e39c588eb032905f91cafa4996dbf0c9be9654db05fb918086cc8228d02a3092c0830b8f587a5624515298b2d4eb2bde6f9a2eb83d53f717f13fa7552d92c51dbd32ea50c490ecd085d2811a7555c538cffffff7f00000000dd872244bfa64779e0f43a9c277e2910b7ccdc3d6726d34ad2101033a623ca2a49ad344884289130bc71cee2b7de62bf48129ae1af052a2d46a61625735a9eea7f793946b3229e861d8ea49806b3f7d4295f6b000000000000f337b1ceb2d8a65dcdcd895d7ba37098d2593fdaaef445af5bee02019c00000099b13ecda2a5b37de0519e974cba92ebaf0f701611a9b027ce04340bda4594cc9049c3f101629ab028145e004209ebe71a6fe84af50804000000000000004a27213354964e250a98fe357676f94b6947383e320fbb1118f586d5b9b1b977e1e1a4490ff67703a9b5900f8a6f8a805879dd91ec5ff435b219c53680c0ae04dcc4ef69b98fcb0d6b6a03a8b71a66b4e2876dc4b610444bf10000000000b046b6ae5d68156bcbd6d8793ade9a22ac8fc7857e5bbc14adc4e12b08f350c6789283b9990c72e64372a1f79769a8bdc632fc1a0b3417855d8b7d25ca4d404c23631ad3d2f55dcd385371c86170a4bca58c2b2b4eabc365f45bd10bb45b0c5bc354456a52be18d9b44014d20a3c51c8f013dade83562e73278662829e4f5a9ac00fd91178468c737f0872d97d38d11a176be5a0d7294c51eb161eddcfefa8837c7430721851ec2a107af0df6d43e732bbc01e76c66895eb85d36798d61622773591ee21ad9f6a1b73fa9cf3ffeb8a00b63af800a81d0fb8aa29df8b8ad6fbafefb5802a23cbdeeabceda5bfc5ff2fa5c1d61d04a1324794c6ed000696d9f04010c35474e690545c3d9bd836d4cef2585ba616e01c3d000000000000000000470ebc6f3453ecbf3047e4547d7632d3ad21798e730cb5d1da059b5bdb8107815dff995c0788906790406dfb4f8ee9f24ff94233e2e6e581e6e5de33a5f254c9a8b612547473c3001df3928dac9203b744619082421a8da7c00000000000000000000000000000018a73ef40cca690fb7595c6962984f8276677be6f66cbdbccf1896433808c9c84d74ac4a7c186a04a2250972f7acb156b21f9826b6acb7db32c4e3b3ec8b59fd972975edb1da872d81a35e4fda2f5cbde6b40bea20418c6e9dad30b791eea58f53e80fee4dd7fe08373ea2784fcd3a65261de71eb866458d2c22a"], &(0x7f0000000100)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, @cgroup_sock, 0xffffffffffffffff, 0x8, 0x0, 0x0, 0x10, 0x0, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x70)
bpf$BPF_PROG_DETACH(0x8, &(0x7f0000000040)={@cgroup=r0, r1, 0x2, 0x2, 0x0, @void, @value}, 0x10)

After running it under strace we get:

bpf(BPF_TASK_FD_QUERY, {task_fd_query={pid=0, fd=-1, flags=0, buf_len=7, buf="cgroup", prog_id=0, fd_type=BPF_FD_TYPE_RAW_TRACEPOINT, probe_offset=0, probe_addr=0}}, 48)
openat(AT_FDCWD, "cgroup", O_RDWR|O_PATH)
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_CGROUP_SOCK, insn_cnt=4, insns=0x2000000008c0, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=-1, func_info_rec_size=8, func_info=NULL, func_info_cnt=0, line_info_rec_size=16, line_info=NULL, line_info_cnt=0, attach_btf_id=0}, 112 ...)
bpf(BPF_PROG_ATTACH, {target_fd=3, attach_bpf_fd=4, attach_type=BPF_CGROUP_INET_SOCK_CREATE, attach_flags=BPF_F_ALLOW_MULTI}, 16)

(so bpf$BPF_PROG_DETACH was misleading; it's in fact a BPF_PROG_ATTACH)

And ./cgroup is a symlink that we create here:

snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup")) {
debug("symlink(%s, ./cgroup) failed: %d\n", cgroupdir, errno);
}

So, judging by how cgroups are currently configured, we already have a separate cgroup for each proc. As mentioned, we could use a cgroup namespace to make sure that the cgroups are really recreated each time we restart the proc, but that doesn't change the fact that it's perfectly legal to configure bpf/cgroups to deny the socket() syscall.

If we recreate the proc on a SYZFAIL (which, afaik, we already do), we are going to hit the same problem, as the failing program will just configure it all again, even if we create a separate cgroup namespace for each proc and recreate the cgroup each time.

Are there any viable options other than disabling CONFIG_CGROUP_BPF / only allowing the operation on the snapshot instance?

@dvyukov @FlorentRevest


(*) This one is triggered here:

static void reset_loop()

static void reset_net_namespace(void)

static void reset_ebtables()

Given that the underlying scenario is the same, could it be that we only get SYZFAIL: ebtable checkpoint: socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) if reset_loop() for some reason was not called and the proc was restarted?

@dvyukov
Collaborator

dvyukov commented Apr 29, 2025

If we recreate the proc on the SYZFAIL (which we afaik already do), we are going to hit the same problem as the failing program will just configure it all again. Even if we create a separate cgroup namespace for each proc and recreate the cgroup each time.

Will it SYZFAIL again? As far as I can see, we call reset_loop after finishing the previous program, so it should succeed after the cgroup is recreated. When you test the program with syz-execprog, does it fail on the first run or on the second?
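
For example, running the reproducer a couple of times in a row with syz-execprog should show whether it's the first or a subsequent run that fails (repro.syz is a placeholder file name; adjust the flags to the local setup):

./syz-execprog -procs=1 -repeat=2 -sandbox=none repro.syz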

@a-nogikh
Collaborator Author

Is it that well synchronized? If reply_execute does not block until the parent process has processed the results, we will just immediately execute reset_loop() and print the SYZFAIL message / exit with a non-zero code.

reply_execute(0);

@dvyukov
Collaborator

dvyukov commented Apr 29, 2025

That's the child process that will exit. I don't immediately see why it should lead to any failures in the parent process.
