Tasks hanging on 6.12.0-4.el9ueknext.x86_64 #32

Open
@danielnorberg

Thank you for UEK-next; it is very useful.

We recently tried 6.12.0-4.el9ueknext.x86_64 and had issues with processes hanging. Some example kernel logs:

2025-01-16T17:24:58.365Z INFO: task khugepaged:931 blocked for more than 1228 seconds.
2025-01-16T17:24:58.365Z Tainted: P OE 6.12.0-4.el9ueknext.x86_64 #1
2025-01-16T17:24:58.374Z "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2025-01-16T17:24:58.384Z task:khugepaged state:D stack:0 pid:931 tgid:931 ppid:2 flags:0x00004002
2025-01-16T17:24:58.387Z Call Trace:
2025-01-16T17:24:58.390Z <TASK>
2025-01-16T17:24:58.394Z __schedule+0x266/0x720
2025-01-16T17:24:58.397Z schedule+0x27/0xa0
2025-01-16T17:24:58.403Z schedule_preempt_disabled+0x15/0x30
2025-01-16T17:24:58.408Z rwsem_down_write_slowpath+0x1d3/0x4e0
2025-01-16T17:24:58.412Z down_write+0x6a/0x70
2025-01-16T17:24:58.417Z collapse_huge_page+0x26d/0x7d0
2025-01-16T17:24:58.422Z hpage_collapse_scan_pmd+0x62b/0x750
2025-01-16T17:24:58.429Z khugepaged_scan_mm_slot.constprop.0+0x3c6/0x580
2025-01-16T17:24:58.432Z khugepaged+0xce/0x200
2025-01-16T17:24:58.437Z ? __pfx_khugepaged+0x10/0x10
2025-01-16T17:24:58.441Z kthread+0xcf/0x100
2025-01-16T17:24:58.445Z ? __pfx_kthread+0x10/0x10
2025-01-16T17:24:58.449Z ret_from_fork+0x31/0x50
2025-01-16T17:24:58.454Z ? __pfx_kthread+0x10/0x10
2025-01-16T17:24:58.458Z ret_from_fork_asm+0x1a/0x30
2025-01-16T17:24:58.461Z </TASK>
2025-01-16T17:38:19.036Z INFO: task tokio-runtime-w:27320 blocked for more than 1228 seconds.
2025-01-16T17:38:19.036Z Tainted: P OE 6.12.0-4.el9ueknext.x86_64 #1
2025-01-16T17:38:19.045Z "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2025-01-16T17:38:19.055Z task:tokio-runtime-w state:D stack:0 pid:27320 tgid:7967 ppid:7880 flags:0x00000002
2025-01-16T17:38:19.058Z Call Trace:
2025-01-16T17:38:19.061Z <TASK>
2025-01-16T17:38:19.065Z __schedule+0x266/0x720
2025-01-16T17:38:19.069Z schedule+0x27/0xa0
2025-01-16T17:38:19.074Z schedule_preempt_disabled+0x15/0x30
2025-01-16T17:38:19.080Z rwsem_down_write_slowpath+0x1d3/0x4e0
2025-01-16T17:38:19.085Z ? srso_return_thunk+0x5/0x5f
2025-01-16T17:38:19.089Z down_write+0x6a/0x70
2025-01-16T17:38:19.093Z vfs_unlink+0x48/0x2c0
2025-01-16T17:38:19.097Z do_unlinkat+0x2bc/0x340
2025-01-16T17:38:19.102Z __x64_sys_unlinkat+0x56/0xc0
2025-01-16T17:38:19.106Z do_syscall_64+0x8c/0x1b0
2025-01-16T17:38:19.113Z ? arch_exit_to_user_mode_prepare.isra.0+0x1e/0xd0
2025-01-16T17:38:19.118Z ? srso_return_thunk+0x5/0x5f
2025-01-16T17:38:19.123Z ? syscall_exit_to_user_mode+0x36/0x190
2025-01-16T17:38:19.128Z ? srso_return_thunk+0x5/0x5f
2025-01-16T17:38:19.133Z ? do_syscall_64+0xb9/0x1b0
2025-01-16T17:38:19.138Z ? syscall_exit_to_user_mode+0x36/0x190
2025-01-16T17:38:19.143Z ? srso_return_thunk+0x5/0x5f
2025-01-16T17:38:19.148Z ? do_syscall_64+0xb9/0x1b0
2025-01-16T17:38:19.154Z ? arch_exit_to_user_mode_prepare.isra.0+0xc0/0xd0
2025-01-16T17:38:19.160Z entry_SYSCALL_64_after_hwframe+0x76/0x7e
2025-01-16T17:38:19.165Z RIP: 0033:0x7f8b9beff42b
2025-01-16T17:38:19.174Z RSP: 002b:00007f8a8fdfc808 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
2025-01-16T17:38:19.182Z RAX: ffffffffffffffda RBX: 00007f8a8fdfc828 RCX: 00007f8b9beff42b
2025-01-16T17:38:19.191Z RDX: 0000000000000000 RSI: 00007f8b99e54470 RDI: 000000000000004b
2025-01-16T17:38:19.199Z RBP: 00007f8a8fdfc8b0 R08: 00007f8b71b30608 R09: 0000000000000000
2025-01-16T17:38:19.208Z R10: 000000000a463408 R11: 0000000000000246 R12: ffffffff00000003
2025-01-16T17:38:19.216Z R13: 00007f8b72014ce0 R14: 0000000000000048 R15: 000000000000004b
2025-01-16T17:38:19.219Z </TASK>
2025-01-16T17:02:18.849Z INFO: task exe:3502400 blocked for more than 124 seconds.
2025-01-16T17:02:18.863Z Tainted: P OE 6.12.0-4.el9ueknext.x86_64 #1
2025-01-16T17:02:18.872Z "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2025-01-16T17:02:18.884Z task:exe state:D stack:0 pid:3502400 tgid:3481472 ppid:3481460 flags:0x00000000
2025-01-16T17:02:18.887Z Call Trace:
2025-01-16T17:02:18.889Z <TASK>
2025-01-16T17:02:18.893Z __schedule+0x266/0x720
2025-01-16T17:02:18.897Z schedule+0x27/0xa0
2025-01-16T17:02:18.903Z schedule_preempt_disabled+0x15/0x30
2025-01-16T17:02:18.908Z rwsem_down_read_slowpath+0x25c/0x490
2025-01-16T17:02:18.912Z down_read+0x48/0xb0
2025-01-16T17:02:18.916Z do_madvise+0xdd/0x4e9
2025-01-16T17:02:18.921Z __x64_sys_madvise+0x2b/0x40
2025-01-16T17:02:18.925Z do_syscall_64+0x8c/0x1b0
2025-01-16T17:02:18.932Z ? arch_exit_to_user_mode_prepare.isra.0+0x1e/0xd0
2025-01-16T17:02:18.937Z ? syscall_exit_to_user_mode+0x36/0x190
2025-01-16T17:02:18.942Z ? do_syscall_64+0xb9/0x1b0
2025-01-16T17:02:18.947Z ? flush_tlb_func+0x1dd/0x220
2025-01-16T17:02:18.951Z ? sched_clock+0x10/0x30
2025-01-16T17:02:18.956Z ? sched_clock_cpu+0xf/0x1e0
2025-01-16T17:02:18.961Z ? irqtime_account_irq+0x46/0xd0
2025-01-16T17:02:18.965Z ? clear_bhb_loop+0x45/0xa0
2025-01-16T17:02:18.971Z entry_SYSCALL_64_after_hwframe+0x76/0x7e
2025-01-16T17:02:18.975Z RIP: 0033:0x48250e
2025-01-16T17:02:18.984Z RSP: 002b:000000c001a6d5c0 EFLAGS: 00000212 ORIG_RAX: 000000000000001c
2025-01-16T17:02:18.992Z RAX: ffffffffffffffda RBX: 00003f1b62c6d000 RCX: 000000000048250e
2025-01-16T17:02:19Z RDX: 0000000000000017 RSI: 0000000000002000 RDI: 00003f1b62c6d000
2025-01-16T17:02:19.008Z RBP: 000000c001a6d600 R08: 0000000000000000 R09: 0000000000000000
2025-01-16T17:02:19.017Z R10: 0000000000000000 R11: 0000000000000212 R12: 00000001ffc6d000
2025-01-16T17:02:19.025Z R13: ffffffffffffffff R14: 000000c005821340 R15: 0000000000000050
2025-01-16T17:02:19.028Z </TASK>
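
In case it helps with triage, here is a minimal sketch of how reports like the ones above could be summarized per task. It assumes the line format shown in these logs (a timestamp followed by the kernel message); the names `HungTask`, `parse_hung_tasks`, and `first_non_sched_frame` are hypothetical helpers, not part of any existing tool. Frames prefixed with `?` are skipped because the kernel marks them as unreliable.

```python
import re
from typing import NamedTuple, Optional

# Header line, e.g. "INFO: task khugepaged:931 blocked for more than 1228 seconds."
HEADER = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds\."
)
# Stack frame, e.g. "down_write+0x6a/0x70", optionally prefixed by "? " (unreliable).
FRAME = re.compile(
    r"(?P<unreliable>\? )?(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)
# Scheduler frames that appear at the top of every blocked task's trace.
SCHED_FRAMES = {"__schedule", "schedule", "schedule_preempt_disabled"}


class HungTask(NamedTuple):
    comm: str
    pid: int
    seconds: int
    frames: list  # reliable frames, innermost first


def parse_hung_tasks(lines):
    """Collect one HungTask per 'blocked for more than' report."""
    tasks = []
    current = None
    for line in lines:
        header = HEADER.search(line)
        if header:
            current = HungTask(header["comm"], int(header["pid"]), int(header["secs"]), [])
            tasks.append(current)
            continue
        if current is None:
            continue
        if "</TASK>" in line:
            current = None  # end of this task's call trace
            continue
        frame = FRAME.search(line)
        if frame and not frame["unreliable"]:
            current.frames.append(frame["func"])
    return tasks


def first_non_sched_frame(task) -> Optional[str]:
    """Return the innermost frame that is not a generic scheduler frame."""
    for func in task.frames:
        if func not in SCHED_FRAMES:
            return func
    return None
```

Feeding the log lines through `parse_hung_tasks` and printing `comm`, `pid`, `seconds`, and `first_non_sched_frame(task)` for each result gives a one-line view of which task is waiting in which lock path.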

We also saw ps, top, and similar tools hang on these hosts.

We have not seen similar issues with 6.10.0-2.el9ueknext.x86_64.

Is this a known issue with 6.12.0-4.el9ueknext.x86_64?
