Kernel Panic (GPF in usb_hcd_unlink_urb_from_ep) when stopping webcam on MBA9,1 (Ubuntu 24.04, Kernel 6.14.1-1-t2-noble) #130
Run this and restart |
@AdityaGarg8, unfortunately it didn't work :((( Here is the last journalctl output:
|
PS: I've just realized that there are strange stutters and freezes in mouse cursor movement during video streaming, but this is a different problem. AFAIK the apple-bce driver acts as the virtual USB host controller for multiple T2-internal devices, including both the webcam and the trackpad. While the fatal crash happens during the cleanup phase (stopping the stream), likely due to a specific bug in URB cancellation, the stuttering during the stream suggests the driver might also struggle to handle the concurrent load efficiently or without internal conflicts. Both symptoms point towards potential instability or a suboptimal implementation within the apple-bce driver's VHCI component. The issue appears to be triggered when cancelling an URB that has already been submitted to the hardware (i.e., where
this unlock/pause/relock sequence allows a small window where the state of the URB or its list linkage ( The fix likely requires ensuring atomicity or proper state synchronization during the cancellation of already-submitted URBs within Maybe we should check these points: 1)Can 2)What guarantees does 3)What is the necessity and handling of 4)Could the call to I'll try to apply a patch soon and share with you but any help is greatly appreciated. |
The camera has been unstable from the start. The command I sent should have made it a bit more stable, but nothing more can be done rn. I'd suggest you use an external camera or use macOS for video conferencing. As for your findings and questions regarding the code, please don't use AI here. |
@AdityaGarg8,
After adding these log messages, I compiled and ran the module and got the following log during the crash:
In this log, please look at the entries that I marked with '****' at the beginning. urb=00000000913fa41a is being processed by bce_vhci_urb_request_cancel() and, at the same time, by bce_vhci_transfer_queue_completion(). Just before the kernel oops message, it is unlinked by bce_vhci_urb_request_cancel(), and then the kernel oops occurs. To prevent this: at the beginning of the cancel function, we check whether the URB has already been marked as cancelled (i.e., its state equals BCE_VHCI_URB_CANCELLED). If it has, we exit early to prevent duplicate cleanup. Otherwise, we set the state to BCE_VHCI_URB_CANCELLED and take an extra reference using kref_get(). We also save the URB's previous state as old_state, so that we can still tell whether it was BCE_VHCI_URB_INIT_PENDING after the state has been overwritten with cancelled. The transfer.c and transfer.h files have been changed. Here is the new transfer.c:
Here is the new transfer.h:
Also, sorry for putting the whole transfer.c and transfer.h files here. I don't know how to create a diff file to share on GitHub, and I am not sure whether it is worth forking the repository for my changes. |
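The guard described above (exit early if the URB is already cancelled, otherwise remember the old state, mark it cancelled and take an extra reference) can be sketched in plain userspace C. This is only an illustration, not the posted patch: every `sketch_*` name is hypothetical, a C11 `atomic_flag` stands in for the driver's spinlock, and a plain integer stands in for the kref.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's URB states. */
enum sketch_urb_state {
    SKETCH_URB_INIT_PENDING,
    SKETCH_URB_SUBMITTED,
    SKETCH_URB_CANCELLED,
};

struct sketch_urb {
    atomic_flag lock;            /* stands in for the queue spinlock */
    enum sketch_urb_state state;
    int refcount;                /* stands in for a kref */
    int pause_count;             /* times the endpoint was paused/resumed */
    int unlink_count;            /* times the unlink/giveback cleanup ran */
};

static void sketch_lock(struct sketch_urb *u)
{
    while (atomic_flag_test_and_set(&u->lock))
        ; /* spin until the flag is released */
}

static void sketch_unlock(struct sketch_urb *u)
{
    atomic_flag_clear(&u->lock);
}

static void sketch_urb_init(struct sketch_urb *u, enum sketch_urb_state s)
{
    atomic_flag_clear(&u->lock);
    u->state = s;
    u->refcount = 1;
    u->pause_count = 0;
    u->unlink_count = 0;
}

/* Returns true if this call did the cancellation, false if it was a repeat. */
static bool sketch_urb_request_cancel(struct sketch_urb *u)
{
    enum sketch_urb_state old_state;

    sketch_lock(u);
    if (u->state == SKETCH_URB_CANCELLED) {
        sketch_unlock(u);        /* someone else already cleaned up */
        return false;
    }
    old_state = u->state;        /* remember it before overwriting */
    u->state = SKETCH_URB_CANCELLED;
    u->refcount++;               /* extra reference while we work on it */
    sketch_unlock(u);

    if (old_state != SKETCH_URB_INIT_PENDING)
        u->pause_count++;        /* placeholder for pause/resume of the endpoint */

    sketch_lock(u);
    u->unlink_count++;           /* placeholder for unlink + giveback */
    assert(u->refcount > 0);
    u->refcount--;               /* drop the extra reference */
    sketch_unlock(u);
    return true;
}
```

Calling `sketch_urb_request_cancel()` twice on the same object runs the cleanup once; the second call returns `false` without touching the counters.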
Are you able to reproduce this after removing the dkms driver as well?
And then restart. |
Nvm, I managed to reproduce this. So it definitely seems to fix the camera, but I am concerned about the rearranged spinlocks. Any reason why? |
Detailed logs and reason behind can be seen here: t2linux/T2-Debian-and-Ubuntu-Kernel#130 (comment)
Acquire the spinlock at the beginning here, as in the original code.
If the URB is already cancelled, release the lock and exit early.
If the URB was beyond the INIT_PENDING state, do the pause and resume. If I don't rearrange the spinlock layout here, I get lots of "BUG: scheduling while atomic: pipewire/1639/0x00000000" errors in journalctl. If I release the spinlock here, the errors disappear.
Re-acquire the spinlock briefly to update shared counters.
Unlink the URB from the device and give it back.
& keep going
Briefly, the spinlocks have been rearranged because of the "BUG: scheduling while atomic: pipewire/1639/0x00000000" error. This error doesn't end in the GPF, but it is concerning. With this layout I don't get this bug. |
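The lock layout listed above can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions, not the driver code: all `demo_*` names are hypothetical, a boolean stands in for the spinlock, and the pause/resume step is assumed to be able to sleep (which the "scheduling while atomic" message suggests).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a transfer queue guarded by one spinlock. */
struct demo_queue {
    bool lock_held;          /* stands in for the spinlock */
    bool urb_cancelled;
    int  active_urbs;        /* shared counter, only touched under the lock */
    int  atomic_violations;  /* sleeps attempted while the lock was held */
    int  givebacks;
};

static void demo_lock(struct demo_queue *q)
{
    assert(!q->lock_held);
    q->lock_held = true;
}

static void demo_unlock(struct demo_queue *q)
{
    assert(q->lock_held);
    q->lock_held = false;
}

/* Stand-in for a step that may sleep, e.g. waiting for a reply. */
static void demo_pause_and_resume(struct demo_queue *q)
{
    if (q->lock_held)
        q->atomic_violations++;  /* the kernel would log "scheduling while atomic" */
}

static void demo_cancel(struct demo_queue *q, bool past_init_pending)
{
    demo_lock(q);                    /* acquire the lock first */
    if (q->urb_cancelled) {          /* already cancelled: unlock and leave */
        demo_unlock(q);
        return;
    }
    q->urb_cancelled = true;
    demo_unlock(q);                  /* drop the lock before anything that may sleep */

    if (past_init_pending)
        demo_pause_and_resume(q);

    demo_lock(q);                    /* re-acquire briefly for the shared counter */
    q->active_urbs--;
    demo_unlock(q);

    q->givebacks++;                  /* placeholder for unlink + giveback */
}
```

In this model, moving `demo_pause_and_resume()` in front of the first `demo_unlock()` makes `atomic_violations` non-zero, which is the situation the rearranged layout is meant to avoid.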
I've opened a PR for others to review this code. Since it's AI generated, it's better to have some human reviewers as well. If it passes the review, it should be available in the kernels soon. |
Detailed logs and reason behind can be seen here: t2linux/T2-Debian-and-Ubuntu-Kernel#130 (comment)
Reopening since it has not been merged yet. |
Also, the code has memory leaks, so it needs to be fixed. It's not fit to be shipped. |
But it's better to avoid AI for this. |
System Details:
Problem Description:
The system experiences a hard freeze requiring a forced reboot immediately after stopping the built-in webcam's video stream. This occurs consistently when using applications like VLC (vlc v4l2:///dev/video0) or ffplay (ffplay /dev/video0) to access the webcam.
Steps to Reproduce:
Kernel Panic Details:
Oops: general protection fault, probably for non-canonical address 0xdead000000000108.
The fault is in usb_hcd_unlink_urb_from_ep + 0x2c/0x60.
The address (0xdead...) strongly suggests memory corruption, potentially a use-after-free or similar issue related to USB Request Block (URB) handling.
Process at the time of the crash: video_decoder.
Call Trace Summary:
The call trace indicates the following sequence leading to the crash:
1) v4l2_release
2) uvcvideo stops the stream (uvc_video_stop_streaming -> uvc_video_stop_transfer)
3) usb_poison_urb -> usb_hcd_unlink_urb
4) bce_vhci_urb_dequeue -> bce_vhci_urb_request_cancel [module apple_bce]
5) bce_vhci_urb_request_cancel calls into the core USB HCD function usb_hcd_unlink_urb_from_ep
Suspected Cause:
The bug seems related to the handling of URB cancellation when the webcam stream is stopped. The involvement of the tainted apple_bce module in the call stack just before the crash in the USB core suggests a potential issue within the apple_bce driver or its interaction with the standard USB stack's URB unlinking mechanism.
Logs:
Please find the relevant kernel Oops message and full call trace from journalctl below:
I hope I can find a solution because this is very annoying :((((
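A note on the faulting address in the Oops: 0xdead000000000108 is what you would get if a list unlink ran on an entry whose next pointer already held the kernel's list poison value. The sketch below only does that arithmetic. It assumes the x86_64 values from include/linux/poison.h (poison delta 0xdead000000000000, LIST_POISON1 at offset 0x100) and a two-pointer list head; the `SKETCH_*` / `sketch_*` names are made up for illustration and this is a reading of the address, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86_64 values, mirroring include/linux/poison.h. */
#define SKETCH_POISON_DELTA 0xdead000000000000ULL
#define SKETCH_LIST_POISON1 (SKETCH_POISON_DELTA + 0x100ULL)

/* Same layout as a 64-bit struct list_head: next first, then prev. */
struct sketch_list_head {
    uint64_t next;
    uint64_t prev;
};

/*
 * Unlinking an entry first stores into next->prev. Given the value found
 * in entry->next, return the address that store would hit.
 */
static uint64_t sketch_unlink_write_address(uint64_t next_value)
{
    assert(offsetof(struct sketch_list_head, prev) == 8);
    return next_value + offsetof(struct sketch_list_head, prev);
}
```

With `SKETCH_LIST_POISON1` as input the function returns 0xdead000000000108, the address in the Oops, which would fit an URB list entry being unlinked after it had already been removed once.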