Description
System information
| Type | Version/Name |
| --- | --- |
| Distribution Name | Arch Linux |
| Distribution Version | rolling |
| Kernel Version | 6.11.5-zen1-1-zen |
| Architecture | amd64 |
| OpenZFS Version | 2.3.99.r34.g152ae5c9bc |
~ » cat /proc/cmdline
zfs=zroot/arch rw mitigations=off init_on_alloc=0 init_on_free=0 lsm=landlock,lockdown,yama,integrity,apparmor,bpf pcie_aspm=performance systemd.gpt_auto=0 spl.spl_hostid=0x00bab10c
~ » cat /etc/modprobe.d/zfs.conf | grep -v \#
options zfs zfs_vdev_max_active=1024
options zfs zfs_txg_timeout=5
options zfs zfs_vdev_scrub_min_active=1
options zfs zfs_vdev_scrub_max_active=2
options zfs zfs_vdev_sync_write_min_active=1
options zfs zfs_vdev_sync_write_max_active=128
options zfs zfs_vdev_sync_read_min_active=1
options zfs zfs_vdev_sync_read_max_active=128
options zfs zfs_vdev_async_read_min_active=1
options zfs zfs_vdev_async_read_max_active=128
options zfs zfs_vdev_async_write_min_active=1
options zfs zfs_vdev_async_write_max_active=128
options zfs zfs_vdev_scheduler=none
options zfs zio_taskq_batch_pct=25
options zfs zfs_sync_taskq_batch_pct=25
options zfs zfs_prefetch_disable=1
options zfs zfs_arc_sys_free=2000000000
options zfs zvol_use_blk_mq=1
options zfs zfs_abd_scatter_enabled=0
options zfs compressed_arc_enabled=0
options zfs zfs_arc_shrinker_limit=0
options zfs zfs_bclone_enabled=0
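To confirm that the loaded module actually picked up these settings, the configured values can be compared with the live ones under /sys/module/zfs/parameters. The following is only a sketch: the function name `check_zfs_params` is made up for illustration, and it assumes one `name=value` pair per `options zfs` line, as in the file above.

```shell
# check_zfs_params CONF [PARAM_DIR]
# Hypothetical helper: for every "options zfs name=value" line in CONF,
# print the configured value next to the value the loaded module reports.
# PARAM_DIR defaults to the sysfs directory of the zfs module.
check_zfs_params() {
    conf=$1
    param_dir=${2:-/sys/module/zfs/parameters}
    # Drop comment lines, then split each line into keyword, module, key=value.
    grep -v '^[[:space:]]*#' "$conf" | while read -r kw mod kv; do
        [ "$kw" = "options" ] && [ "$mod" = "zfs" ] || continue
        name=${kv%%=*}
        want=${kv#*=}
        if [ -r "$param_dir/$name" ]; then
            have=$(cat "$param_dir/$name")
        else
            # The loaded module does not expose this parameter.
            have="(missing)"
        fi
        if [ "$want" = "$have" ]; then status=ok; else status=DIFF; fi
        printf '%s configured=%s live=%s %s\n' "$name" "$want" "$have" "$status"
    done
}
```

Called as `check_zfs_params /etc/modprobe.d/zfs.conf`, it prints one line per option. A `DIFF` or `(missing)` entry would mean the setting is not in effect, which would help narrow down which options matter for the crash.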
Describe the problem you're observing
I'm seeing segmentation faults when running ZFS built from git (ZFS 2.2.6 is fine) with init_on_alloc=0 init_on_free=0 on the kernel cmdline.
- Nothing shows up in dmesg.
- I can trigger it with docker compose up with a few containers (Rails, MySQL). After that, most commands fail, and shortly after the first segfault the whole system goes down, including plasmashell and so on.
This is a system I need working, so I went back to 2.2.6, where everything is fine and stable. Dropping init_on_alloc=0 init_on_free=0 might help, but I'm not 100% sure. I'm not using zvols.
The system passes the BIOS memory test just fine. Hardware: Dell Latitude E5470 / i7-6820HQ
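Since the suspicion falls on init_on_alloc and init_on_free, it helps to record what the running kernel was actually booted with when comparing a good and a bad boot. A minimal sketch, assuming a hypothetical helper named `cmdline_value` that reads /proc/cmdline (or a saved copy of it):

```shell
# cmdline_value NAME [FILE]
# Hypothetical helper: print the value of NAME=... from the kernel command
# line and return 0, or return 1 if the parameter is not present.
cmdline_value() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++)
                if (index($i, name "=") == 1) {
                    # Print everything after "NAME=".
                    print substr($i, length(name) + 2)
                    found = 1
                    exit
                }
        }
        END { exit !found }' "${2:-/proc/cmdline}"
}
```

For example, `cmdline_value init_on_alloc || echo "not set"` reports the boot-time setting. When the parameter is absent, the kernel's build-time default applies, so "not set" is not the same as 0.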
Describe how to reproduce the problem
Good question. It may reproduce with the kmod options and cmdline listed above. For me it is triggered by a docker compose up, so it could be related to overlayfs. At least that's when I noticed it.
I assume the problem is related to my kmod config or cmdline settings; otherwise it would already have been found. I noticed similar behavior a few weeks ago and tried to pin it down but failed, so I thought I'd report it here.
Include any warning/errors/backtraces from the system logs
There is nothing in dmesg. Below are some journalctl entries about the crashes (they all look pretty random):
Okt 25 15:41:16 systemd[1]: incus.service: Main process exited, code=dumped, status=11/SEGV
Okt 25 15:41:16 systemd[1]: [email protected]: Deactivated successfully.
Okt 25 15:41:16 systemd-coredump[98503]: [🡕] Process 98494 (incusd) of user 0 dumped core.
Stack trace of thread 98494:
#0 0x000060139c936214 n/a (incusd + 0x579214)
#1 0x000060139c90af45 n/a (incusd + 0x54df45)
#2 0x000060139c8f9aea n/a (incusd + 0x53caea)
#3 0x000060139c8fa214 n/a (incusd + 0x53d214)
#4 0x000060139c8f71b6 n/a (incusd + 0x53a1b6)
#5 0x000060139c931551 n/a (incusd + 0x574551)
#6 0x000060139c8cc158 n/a (incusd + 0x50f158)
#7 0x000060139c8e68f3 n/a (incusd + 0x5298f3)
#8 0x000060139c8e6130 n/a (incusd + 0x529130)
#9 0x000060139c8e5bdc n/a (incusd + 0x528bdc)
#10 0x000060139c8e5b3b n/a (incusd + 0x528b3b)
#11 0x000060139c8d2e12 n/a (incusd + 0x515e12)
#12 0x000060139c8d2c85 n/a (incusd + 0x515c85)
#13 0x000060139c8d22b3 n/a (incusd + 0x5152b3)
#14 0x000060139c8cc785 n/a (incusd + 0x50f785)
#15 0x000060139c92bf6d n/a (incusd + 0x56ef6d)
#16 0x000060139c8cca45 n/a (incusd + 0x50fa45)
#17 0x000060139c8bffbe n/a (incusd + 0x502fbe)
#18 0x000060139c8bfa1d n/a (incusd + 0x502a1d)
#19 0x000060139c8fba09 n/a (incusd + 0x53ea09)
#20 0x000060139c937fe0 n/a (incusd + 0x57afe0)
#21 0x00007ad45137fecc __libc_start_main_impl (libc.so.6 + 0x25ecc)
#22 0x000060139c8bbdf5 n/a (incusd + 0x4fedf5)
ELF object binary architecture: AMD x86-64
Okt 25 15:41:17 systemd[1]: Starting Incus Container Hypervisor...
Okt 25 15:41:17 incusd[98550]: fatal error: arena already initialized
Okt 25 15:41:17 incusd[98550]: runtime stack:
Okt 25 15:41:17 incusd[98550]: runtime.throw({0x5642ef51278f?, 0x0?})
Okt 25 15:41:17 incusd[98550]: /usr/lib/go/src/runtime/panic.go:1067 +0x4a fp=0x7fff025e56f0 sp=0x7fff025e56c0 pc=0x5642edef356a
Okt 25 15:41:17 incusd[98550]: runtime.(*mheap).sysAlloc(0x5642f0c409e0, 0x0?, 0x5642f0c50be8, 0x1)
Okt 25 15:41:17 incusd[98550]: /usr/lib/go/src/runtime/malloc.go:768 +0x398 fp=0x7fff025e5790 sp=0x7fff025e56f0 pc=0x5642ede8e158
Okt 25 15:41:17 incusd[98550]: runtime.(*mheap).grow(0x5642f0c409e0, 0x0?)
Okt 25 15:41:20 systemd-coredump[98599]: [🡕] Process 98582 (containerd) of user 0 dumped core.
Stack trace of thread 98582:
#0 0x0000000000da081d n/a (containerd + 0x9a081d)
#1 0x0000000000d72d25 runtime.args (containerd + 0x972d25)
#2 0x0000000000da9a85 runtime.args.abi0 (containerd + 0x9a9a85)
#3 0x0000000000da0f32 runtime.rt0_go.abi0 (containerd + 0x9a0f32)
#4 0x00007bb13ce23ecc __libc_start_main_impl (libc.so.6 + 0x25ecc)
#5 0x0000000000d20455 _start (containerd + 0x920455)
ELF object binary architecture: AMD x86-64
Okt 25 15:41:20 systemd[1]: containerd.service: Main process exited, code=dumped, status=11/SEGV
Okt 25 15:41:20 systemd[1]: containerd.service: Failed with result 'core-dump'.
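Because the crashes look random, a per-service tally of segfaults from the journal can make good and bad boots easier to compare. This is only a sketch: `count_segv` is a made-up name, and it relies on the systemd "Main process exited ... status=11/SEGV" wording seen in the entries above.

```shell
# count_segv
# Hypothetical helper: read journal text on stdin and print how many times
# each .service unit's main process exited with SIGSEGV.
count_segv() {
    awk '/Main process exited/ && /status=11\/SEGV/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /\.service:$/) {
                sub(/:$/, "", $i)   # strip the trailing colon from the unit name
                n[$i]++
                break
            }
    }
    END { for (u in n) print n[u], u }' | sort -k2
}
```

Typical use would be `journalctl -b | count_segv` for the current boot.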