Kernel Panic (GPF in usb_hcd_unlink_urb_from_ep) when stopping webcam on MBA9,1 (Ubuntu 24.04, Kernel 6.14.1-1-t2-noble) #130

Open
mnural opened this issue Apr 8, 2025 · 12 comments

@mnural

mnural commented Apr 8, 2025

System Details:

  • Hardware: MacBook Air 2020 (MacBookAir9,1)
  • OS: Ubuntu 24.04 LTS (Noble Numbat)
  • Kernel: 6.14.1-1-t2-noble (Tainted: G C)

Problem Description:

The system experiences a hard freeze requiring a forced reboot immediately after stopping the built-in webcam's video stream. This occurs consistently when using applications like VLC (vlc v4l2:///dev/video0) or ffplay (ffplay /dev/video0) to access the webcam.

Steps to Reproduce:

  1. Start webcam stream using VLC, ffplay, or similar V4L2 application. (Stream works fine).
  2. Stop the stream (e.g., close the capture window in VLC, press 'q' in ffplay, or terminate the application).
  3. The system freezes instantly upon stopping the stream.

Kernel Panic Details:

  • A kernel panic occurs with the message: Oops: general protection fault, probably for non-canonical address 0xdead000000000108.
  • The instruction pointer (RIP) at the time of the crash is within usb_hcd_unlink_urb_from_ep + 0x2c/0x60.
  • The faulting memory address (0xdead...) lies in the kernel's poison-pointer range, which suggests that a stale or already-deleted pointer was used: potentially a use-after-free or a double unlink related to USB Request Block (URB) handling.
  • The fault occurred on CPU 2, triggered by PID 4210 (video_decoder).
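
As a reading aid for the register dump in the logs: RCX and RDX hold 0xdead000000000100 and 0xdead000000000122, which match the kernel's LIST_POISON1 and LIST_POISON2 values on x86_64 with the default CONFIG_ILLEGAL_POINTER_VALUE. list_del() writes these values into an entry after unlinking it, so a later unlink of the same entry would first store to LIST_POISON1 + 8, which is the 0xdead000000000108 reported in the Oops. The sketch below is a user-space model of that arithmetic, not kernel code; `demo_fault_address` and `second_unlink_store_addr` are hypothetical helpers introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space model of the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Values from include/linux/poison.h, assuming the x86_64 default
 * CONFIG_ILLEGAL_POINTER_VALUE of 0xdead000000000000. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Like the kernel's list_del(): unlink the entry, then poison its pointers. */
static void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Address that a second unlink would write first (entry->next->prev).
 * Only computed here, never dereferenced. */
static uintptr_t second_unlink_store_addr(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* Hypothetical helper: link one entry, delete it once, and report where a
 * second unlink of the same entry would fault. */
uintptr_t demo_fault_address(void)
{
    struct list_head ep_list, urb_list;

    list_init(&ep_list);
    list_add_tail(&urb_list, &ep_list);
    list_del(&urb_list);
    return second_unlink_store_addr(&urb_list);
}
```

On a 64-bit build, demo_fault_address() evaluates to 0xdead000000000108. This only shows that the register contents are consistent with an entry that had already gone through list_del(); it does not say which code path deleted it.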

Call Trace Summary:

The call trace indicates the following sequence leading to the crash:

  1. The V4L2 device associated with the webcam is released (v4l2_release).
  2. The UVC video driver (uvcvideo) stops the stream (uvc_video_stop_streaming -> uvc_video_stop_transfer).
  3. USB URBs associated with the stream are cancelled (usb_poison_urb -> usb_hcd_unlink_urb).
  4. The cancellation process involves the Apple BCE VHCI driver (bce_vhci_urb_dequeue -> bce_vhci_urb_request_cancel [module apple_bce]).
  5. The crash occurs when bce_vhci_urb_request_cancel calls into the core USB HCD function usb_hcd_unlink_urb_from_ep.

Suspected Cause:

The bug seems related to the handling of URB cancellation when the webcam stream is stopped. The involvement of the tainted apple_bce module in the call stack just before the crash in the USB core suggests a potential issue within the apple_bce driver or its interaction with the standard USB stack's URB unlinking mechanism.

Logs:

Please find the relevant kernel Oops message and full call trace from journalctl below:

Apr 08 19:50:00 mnrl-MacBookAir kernel: Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP NOPTI
Apr 08 19:50:00 mnrl-MacBookAir kernel: CPU: 2 UID: 1000 PID: 4210 Comm: video_decoder Tainted: G         C         6.14.1-1-t2-noble #1
Apr 08 19:50:00 mnrl-MacBookAir kernel: Tainted: [C]=CRAP
Apr 08 19:50:00 mnrl-MacBookAir kernel: Hardware name: Apple Inc. MacBookAir9,1/Mac-0CFF9C7C2B63DF8D, BIOS 2075.101.2.0.0 (iBridge: 22.16.14248.0.0,0) 03/12/2025
Apr 08 19:50:00 mnrl-MacBookAir kernel: RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 19:50:00 mnrl-MacBookAir kernel: Code: 44 00 00 55 48 c7 c7 ac 74 1d 9f 48 89 e5 53 48 89 f3 e8 a7 24 4a 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 ac 74 1d 9f <48> 89 51 08 48 89 0a 48 89 43 18 48 89 43 20 e8 c0 25 4a 00 48 8b
Apr 08 19:50:00 mnrl-MacBookAir kernel: RSP: 0018:ffffae33c66976e8 EFLAGS: 00010046
Apr 08 19:50:00 mnrl-MacBookAir kernel: RAX: ffff993e0b9da198 RBX: ffff993e0b9da180 RCX: dead000000000100
Apr 08 19:50:00 mnrl-MacBookAir kernel: RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9f1d74ac
Apr 08 19:50:00 mnrl-MacBookAir kernel: RBP: ffffae33c66976f0 R08: 0000000000000000 R09: 0000000000000000
Apr 08 19:50:00 mnrl-MacBookAir kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff993e0b9da180
Apr 08 19:50:00 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: ffff993e033f8b78 R15: ffff993e070b0e80
Apr 08 19:50:00 mnrl-MacBookAir kernel: FS:  0000000000000000(0000) GS:ffff993f77f00000(0000) knlGS:0000000000000000
Apr 08 19:50:00 mnrl-MacBookAir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 08 19:50:00 mnrl-MacBookAir kernel: CR2: 000058e79d08c000 CR3: 000000013f022005 CR4: 0000000000772ef0
Apr 08 19:50:00 mnrl-MacBookAir kernel: PKRU: 55555554
Apr 08 19:50:00 mnrl-MacBookAir kernel: Call Trace:
Apr 08 19:50:00 mnrl-MacBookAir kernel:  <TASK>
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? show_regs+0x6c/0x80
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? die_addr+0x37/0xa0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? exc_general_protection+0x1d2/0x400
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? asm_exc_general_protection+0x27/0x30
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? usb_hcd_unlink_urb_from_ep+0x19/0x60
Apr 08 19:50:00 mnrl-MacBookAir kernel:  bce_vhci_urb_request_cancel+0x6b/0x150 [apple_bce]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  bce_vhci_urb_dequeue+0x2c/0x60 [apple_bce]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  unlink1+0x34/0x160
Apr 08 19:50:00 mnrl-MacBookAir kernel:  usb_hcd_unlink_urb+0x8a/0xf0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  usb_poison_urb+0x49/0xf0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? ktime_get+0x3e/0x100
Apr 08 19:50:00 mnrl-MacBookAir kernel:  uvc_video_stop_transfer+0x4a/0xc0 [uvcvideo]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  uvc_video_stop_streaming+0x17/0xa0 [uvcvideo]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  uvc_stop_streaming+0x27/0xd0 [uvcvideo]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  __vb2_queue_cancel+0x33/0x320 [videobuf2_common]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  vb2_core_queue_release+0x23/0x90 [videobuf2_common]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  vb2_queue_release+0xe/0x20 [videobuf2_v4l2]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  uvc_queue_release+0x26/0x40 [uvcvideo]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  uvc_v4l2_release+0x9c/0xf0 [uvcvideo]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  v4l2_release+0x104/0x120 [videodev]
Apr 08 19:50:00 mnrl-MacBookAir kernel:  __fput+0xea/0x2d0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ____fput+0x15/0x20
Apr 08 19:50:00 mnrl-MacBookAir kernel:  task_work_run+0x5d/0xa0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  do_exit+0x31f/0xab0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? __pfx_futex_wake_mark+0x10/0x10
Apr 08 19:50:00 mnrl-MacBookAir kernel:  do_group_exit+0x34/0x90
Apr 08 19:50:00 mnrl-MacBookAir kernel:  get_signal+0x9e3/0x9f0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  arch_do_signal_or_restart+0x42/0x260
Apr 08 19:50:00 mnrl-MacBookAir kernel:  syscall_exit_to_user_mode+0x146/0x1d0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  do_syscall_64+0x8a/0x170
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? wake_up_q+0x50/0xa0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? futex_wake+0x167/0x190
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? do_futex+0x18e/0x260
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? __x64_sys_futex+0x12a/0x200
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? futex_wake+0x89/0x190
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? do_futex+0x18e/0x260
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? __x64_sys_futex+0x12a/0x200
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? __rseq_handle_notify_resume+0xa4/0x520
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0xc8/0xd0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? irqentry_exit_to_user_mode+0x2d/0x1d0
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? irqentry_exit+0x43/0x50
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 19:50:00 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 19:50:00 mnrl-MacBookAir kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 08 19:50:00 mnrl-MacBookAir kernel: RIP: 0033:0x7b7b01498d71
Apr 08 19:50:00 mnrl-MacBookAir kernel: Code: Unable to access opcode bytes at 0x7b7b01498d47.
Apr 08 19:50:00 mnrl-MacBookAir kernel: RSP: 002b:00007b7ab57f9150 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
Apr 08 19:50:00 mnrl-MacBookAir kernel: RAX: fffffffffffffe00 RBX: 00007b7ab801e080 RCX: 00007b7b01498d71
Apr 08 19:50:00 mnrl-MacBookAir kernel: RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007b7ab801e0a8
Apr 08 19:50:00 mnrl-MacBookAir kernel: RBP: 00007b7ab57f9190 R08: 0000000000000000 R09: 00000000ffffffff
Apr 08 19:50:00 mnrl-MacBookAir kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Apr 08 19:50:00 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: 00007b7ab801e058 R15: 00007b7ab801e0a8
Apr 08 19:50:00 mnrl-MacBookAir kernel:  </TASK>
Apr 08 19:50:00 mnrl-MacBookAir kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr cmac algif_hash algif_skcipher af_alg bnep joydev input_leds hid_appletb_bl hid_magicmouse hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio binfmt_misc hid_sensor_hub hid_apple nls_iso8859_1 cdc_mbim cdc_wdm hid_generic uvcvideo videobuf2_vmalloc uvc videobuf2_memops videobuf2_v4l2 videobuf2_common cdc_ncm videodev usbhid cdc_ether usbnet hid mc mii apple_mfi_fastcharge snd_sof_pci_intel_icl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_cadence snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_hda_codec_hdmi snd_sof_pci snd_sof_xtensa_dsp intel_uncore_frequency intel_uncore_frequency_common intel_pmc_core_pltdrv snd_sof intel_pmc_core pmt_telemetry snd_sof_utils pmt_class snd_soc_acpi_intel_match intel_vsec snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_acpi soundwire_bus snd_soc_sdca snd_soc_avs
Apr 08 19:50:00 mnrl-MacBookAir kernel:  x86_pkg_temp_thermal intel_powerclamp snd_soc_hda_codec snd_hda_ext_core coretemp snd_soc_core kvm_intel brcmfmac_wcc processor_thermal_device_pci_legacy snd_compress mei_pxp intel_rapl_msr spi_nor i915 kvm processor_thermal_device mei_hdcp ac97_bus mtd processor_thermal_wt_hint iTCO_wdt polyval_clmulni polyval_generic ghash_clmulni_intel intel_pmc_bxt iTCO_vendor_support sha256_ssse3 sha1_ssse3 snd_pcm_dmaengine aesni_intel brcmfmac processor_thermal_rfim crypto_simd applesmc cryptd brcmutil snd_hda_intel drm_buddy snd_intel_dspcfg hci_bcm4377 processor_thermal_rapl snd_intel_sdw_acpi rapl ttm cfg80211 intel_cstate bluetooth intel_rapl_common snd_hda_codec drm_display_helper processor_thermal_wt_req processor_thermal_power_floor cec sbs processor_thermal_mbox rc_core spi_intel_pci i2c_i801 spi_intel int340x_thermal_zone mei_me snd_hda_core i2c_smbus mei i2c_algo_bit i2c_mux intel_soc_dts_iosf snd_hwdep sbshc intel_lpss_acpi intel_lpss acpi_tad idma64 mac_hid sch_fq_codel apple_bce(C) snd_pcm snd_seq_midi
Apr 08 19:50:00 mnrl-MacBookAir kernel:  snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq nvme nvme_core thunderbolt nvme_auth video wmi
Apr 08 19:50:00 mnrl-MacBookAir kernel: ---[ end trace 0000000000000000 ]---
Apr 08 19:50:00 mnrl-MacBookAir kernel: RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 19:50:00 mnrl-MacBookAir kernel: Code: 44 00 00 55 48 c7 c7 ac 74 1d 9f 48 89 e5 53 48 89 f3 e8 a7 24 4a 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 ac 74 1d 9f <48> 89 51 08 48 89 0a 48 89 43 18 48 89 43 20 e8 c0 25 4a 00 48 8b
Apr 08 19:50:00 mnrl-MacBookAir kernel: RSP: 0018:ffffae33c66976e8 EFLAGS: 00010046
Apr 08 19:50:00 mnrl-MacBookAir kernel: RAX: ffff993e0b9da198 RBX: ffff993e0b9da180 RCX: dead000000000100
Apr 08 19:50:00 mnrl-MacBookAir kernel: RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9f1d74ac
Apr 08 19:50:00 mnrl-MacBookAir kernel: RBP: ffffae33c66976f0 R08: 0000000000000000 R09: 0000000000000000
Apr 08 19:50:00 mnrl-MacBookAir kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff993e0b9da180
Apr 08 19:50:00 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: ffff993e033f8b78 R15: ffff993e070b0e80
Apr 08 19:50:00 mnrl-MacBookAir kernel: FS:  0000000000000000(0000) GS:ffff993f77f00000(0000) knlGS:0000000000000000
Apr 08 19:50:00 mnrl-MacBookAir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 08 19:50:00 mnrl-MacBookAir kernel: CR2: 000058e79d08c000 CR3: 000000013f022005 CR4: 0000000000772ef0
Apr 08 19:50:00 mnrl-MacBookAir kernel: PKRU: 55555554
Apr 08 19:50:00 mnrl-MacBookAir kernel: note: video_decoder[4210] exited with irqs disabled
Apr 08 19:50:00 mnrl-MacBookAir kernel: note: video_decoder[4210] exited with preempt_count 2
Apr 08 19:50:00 mnrl-MacBookAir kernel: Fixing recursive fault but reboot is needed!

I hope I can find a solution because this is very annoying :((((

@AdityaGarg8
Member

sudo apt update
sudo apt install apple-bce

Run this and restart

@mnural
Author

mnural commented Apr 8, 2025

@AdityaGarg8, unfortunately that didn't work :(((
I am still experiencing freezes.
It doesn't always happen; sometimes I can close the webcam stream without a freeze, but it usually happens after 4 or 5 video call sessions. It makes online meetings impossible :((

Here is the last journalctl output:

Apr 08 23:54:45 mnrl-MacBookAir kernel: Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP NOPTI
Apr 08 23:54:45 mnrl-MacBookAir kernel: CPU: 2 UID: 0 PID: 5601 Comm: QXcbEventQueue Tainted: G           OE      6.14.1-1-t2-noble #1
Apr 08 23:54:45 mnrl-MacBookAir kernel: Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Apr 08 23:54:45 mnrl-MacBookAir kernel: Hardware name: Apple Inc. MacBookAir9,1/Mac-0CFF9C7C2B63DF8D, BIOS 2075.101.2.0.0 (iBridge: 22.16.14248.0.0,0) 03/12/2025
Apr 08 23:54:45 mnrl-MacBookAir kernel: RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 23:54:45 mnrl-MacBookAir kernel: Code: 44 00 00 55 48 c7 c7 ac 74 dd 9a 48 89 e5 53 48 89 f3 e8 a7 24 4a 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 ac 74 dd 9a <48> 89>
Apr 08 23:54:45 mnrl-MacBookAir kernel: RSP: 0018:ffffa24ec564f818 EFLAGS: 00010046
Apr 08 23:54:45 mnrl-MacBookAir kernel: RAX: ffff963e435d8618 RBX: ffff963e435d8600 RCX: dead000000000100
Apr 08 23:54:45 mnrl-MacBookAir kernel: RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9add74ac
Apr 08 23:54:45 mnrl-MacBookAir kernel: RBP: ffffa24ec564f820 R08: 0000000000000000 R09: 0000000000000000
Apr 08 23:54:45 mnrl-MacBookAir kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff963e435d8600
Apr 08 23:54:45 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: ffff963e43874b78 R15: ffff963e573b1880
Apr 08 23:54:45 mnrl-MacBookAir kernel: FS:  0000000000000000(0000) GS:ffff963fb7f00000(0000) knlGS:0000000000000000
Apr 08 23:54:45 mnrl-MacBookAir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 08 23:54:45 mnrl-MacBookAir kernel: CR2: 000017e982b5a000 CR3: 0000000109216004 CR4: 0000000000772ef0
Apr 08 23:54:45 mnrl-MacBookAir kernel: PKRU: 55555554
Apr 08 23:54:45 mnrl-MacBookAir kernel: Call Trace:
Apr 08 23:54:45 mnrl-MacBookAir kernel:  <TASK>
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? show_regs+0x6c/0x80
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? die_addr+0x37/0xa0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? exc_general_protection+0x1d2/0x400
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? asm_exc_general_protection+0x27/0x30
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? usb_hcd_unlink_urb_from_ep+0x19/0x60
Apr 08 23:54:45 mnrl-MacBookAir kernel:  bce_vhci_urb_request_cancel+0x6b/0x150 [apple_bce]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  bce_vhci_urb_dequeue+0x2c/0x60 [apple_bce]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  unlink1+0x34/0x160
Apr 08 23:54:45 mnrl-MacBookAir kernel:  usb_hcd_unlink_urb+0x8a/0xf0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  usb_poison_urb+0x49/0xf0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? fsnotify_destroy_marks+0x2a/0x190
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? ktime_get+0x3e/0x100
Apr 08 23:54:45 mnrl-MacBookAir kernel:  uvc_video_stop_transfer+0x4a/0xc0 [uvcvideo]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  uvc_video_stop_streaming+0x17/0xa0 [uvcvideo]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  uvc_stop_streaming+0x27/0xd0 [uvcvideo]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  __vb2_queue_cancel+0x33/0x320 [videobuf2_common]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  vb2_core_queue_release+0x23/0x90 [videobuf2_common]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  vb2_queue_release+0xe/0x20 [videobuf2_v4l2]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  uvc_queue_release+0x26/0x40 [uvcvideo]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  uvc_v4l2_release+0x9c/0xf0 [uvcvideo]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  v4l2_release+0x104/0x120 [videodev]
Apr 08 23:54:45 mnrl-MacBookAir kernel:  __fput+0xea/0x2d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ____fput+0x15/0x20
Apr 08 23:54:45 mnrl-MacBookAir kernel:  task_work_run+0x5d/0xa0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  do_exit+0x31f/0xab0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  do_group_exit+0x34/0x90
Apr 08 23:54:45 mnrl-MacBookAir kernel:  get_signal+0x9e3/0x9f0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  arch_do_signal_or_restart+0x42/0x260
Apr 08 23:54:45 mnrl-MacBookAir kernel:  syscall_exit_to_user_mode+0x146/0x1d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? __sys_recvmsg+0x9a/0xf0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0xc8/0xd0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? do_syscall_64+0x8a/0x170
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 23:54:45 mnrl-MacBookAir kernel:  ? clear_bhb_loop+0x15/0x70
Apr 08 23:54:45 mnrl-MacBookAir kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 08 23:54:45 mnrl-MacBookAir kernel: RIP: 0033:0x793f9351b4cd
Apr 08 23:54:45 mnrl-MacBookAir kernel: Code: Unable to access opcode bytes at 0x793f9351b4a3.
Apr 08 23:54:45 mnrl-MacBookAir kernel: RSP: 002b:0000793f7602cce0 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
Apr 08 23:54:45 mnrl-MacBookAir kernel: RAX: fffffffffffffdfc RBX: 000000003275dff0 RCX: 0000793f9351b4cd
Apr 08 23:54:45 mnrl-MacBookAir kernel: RDX: 00000000ffffffff RSI: 0000000000000001 RDI: 0000793f7602cd28
Apr 08 23:54:45 mnrl-MacBookAir kernel: RBP: 0000793f7602cd00 R08: 0000000000000000 R09: 0000000000000001
Apr 08 23:54:45 mnrl-MacBookAir kernel: R10: 0000793f70002290 R11: 0000000000000293 R12: 0000793f7602cd28
Apr 08 23:54:45 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 000000003275e008
Apr 08 23:54:45 mnrl-MacBookAir kernel:  </TASK>
Apr 08 23:54:45 mnrl-MacBookAir kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr cmac algif_hash algif_skcipher af_alg bnep input_leds joydev brcmfmac_wcc hid_magi>
Apr 08 23:54:45 mnrl-MacBookAir kernel:  snd_soc_acpi kvm_intel soundwire_bus snd_soc_sdca snd_soc_avs snd_soc_hda_codec snd_hda_ext_core kvm snd_soc_core snd_compress polyval_clm>
Apr 08 23:54:45 mnrl-MacBookAir kernel:  snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sys>
Apr 08 23:54:45 mnrl-MacBookAir kernel: ---[ end trace 0000000000000000 ]---
Apr 08 23:54:45 mnrl-MacBookAir kernel: RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
Apr 08 23:54:45 mnrl-MacBookAir kernel: Code: 44 00 00 55 48 c7 c7 ac 74 dd 9a 48 89 e5 53 48 89 f3 e8 a7 24 4a 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 ac 74 dd 9a <48> 89>
Apr 08 23:54:45 mnrl-MacBookAir kernel: RSP: 0018:ffffa24ec564f818 EFLAGS: 00010046
Apr 08 23:54:45 mnrl-MacBookAir kernel: RAX: ffff963e435d8618 RBX: ffff963e435d8600 RCX: dead000000000100
Apr 08 23:54:45 mnrl-MacBookAir kernel: RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9add74ac
Apr 08 23:54:45 mnrl-MacBookAir kernel: RBP: ffffa24ec564f820 R08: 0000000000000000 R09: 0000000000000000
Apr 08 23:54:45 mnrl-MacBookAir kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff963e435d8600
Apr 08 23:54:45 mnrl-MacBookAir kernel: R13: 0000000000000000 R14: ffff963e43874b78 R15: ffff963e573b1880
Apr 08 23:54:45 mnrl-MacBookAir kernel: FS:  0000000000000000(0000) GS:ffff963fb7f00000(0000) knlGS:0000000000000000
Apr 08 23:54:45 mnrl-MacBookAir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 08 23:54:45 mnrl-MacBookAir kernel: CR2: 000017e982b5a000 CR3: 0000000109216004 CR4: 0000000000772ef0
Apr 08 23:54:45 mnrl-MacBookAir kernel: PKRU: 55555554
Apr 08 23:54:45 mnrl-MacBookAir kernel: note: QXcbEventQueue[5601] exited with irqs disabled
Apr 08 23:54:45 mnrl-MacBookAir kernel: note: QXcbEventQueue[5601] exited with preempt_count 2
Apr 08 23:54:45 mnrl-MacBookAir kernel: Fixing recursive fault but reboot is needed!

@mnural
Author

mnural commented Apr 9, 2025

  1. I tried manually compiling the apple_bce module on my MacBook, but it had no effect.
  2. I also tried the LTS kernel, the Xanmod kernel and the generic kernel, but again nothing changed :(((

PS: I've just realized that there are strange stutters and freezes in mouse cursor movement during video streaming, but this is a different problem.

AFAIK the apple-bce driver acts as the virtual USB host controller for multiple T2-internal devices, including both the webcam and trackpad. While the fatal crash happens during the cleanup phase (stopping the stream) likely due to a specific bug in URB cancellation, the stuttering during the stream suggests the driver might also struggle with handling the concurrent load efficiently or without internal conflicts. Both symptoms point towards potential instability or suboptimal implementation within the apple-bce driver's VHCI component.
Further review of vhci/transfer.c points to a potential race condition or state management issue within the bce_vhci_urb_request_cancel function.

The issue appears to be triggered when cancelling a URB that has already been submitted to the hardware (i.e., where vurb->state != BCE_VHCI_URB_INIT_PENDING). On this code path, the function performs these steps:

  1. Temporarily unlocks the queue's spinlock (q->urb_lock)
  2. Calls bce_vhci_transfer_queue_pause()
  3. Re-acquires the spinlock
  4. Increments q->remaining_active_requests
  5. Calls the core HCD function usb_hcd_unlink_urb_from_ep() inside which the crash occurs.

This unlock/pause/relock sequence opens a small window in which the state of the URB or its list linkage (urb->urb_list) might become inconsistent due to concurrent operations (e.g., interrupts, other threads, or an incomplete pause state).
When usb_hcd_unlink_urb_from_ep() is subsequently called, it may operate on corrupted list pointers, leading to the observed general protection fault when attempting to dereference the 0xdead... address.

The fix likely requires ensuring atomicity or proper state synchronization during the cancellation of already-submitted URBs within bce_vhci_urb_request_cancel.
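
To make the suspected window concrete, here is a minimal single-threaded sketch of the pattern: a cancel path drops its lock, something else completes and unlinks the request in the meantime, and the cancel path then either unlinks blindly or re-checks first. This is not apple-bce code and not a proposed patch; `struct request`, `window_fn`, and every function name below are hypothetical, and the lock is represented only by comments so the behaviour stays deterministic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a queued request (not apple-bce's structures). */
struct request {
    bool linked;       /* still on the endpoint's list? */
    int unlink_count;  /* how many times it has been unlinked */
};

/* Hypothetical hook for whatever runs while the lock is dropped,
 * for example a completion handler. */
typedef void (*window_fn)(struct request *);

static void unlink_request(struct request *r)
{
    r->linked = false;
    r->unlink_count++;
}

/* Models a completion that finishes the request inside the window. */
static void complete_in_window(struct request *r)
{
    if (r->linked)
        unlink_request(r);
}

/* Unchecked variant: assumes the request is still linked after the window. */
static void cancel_unchecked(struct request *r, window_fn window)
{
    /* lock held here; the request was linked when it was checked */
    /* unlock */
    window(r);
    /* relock */
    unlink_request(r); /* second unlink if the window already removed it */
}

/* Re-validating variant: checks the state again after relocking.
 * Returns true if this call performed the unlink. */
static bool cancel_revalidated(struct request *r, window_fn window)
{
    /* unlock */
    window(r);
    /* relock */
    if (!r->linked)
        return false; /* someone else already finished the request */
    unlink_request(r);
    return true;
}

int unchecked_unlink_count(void)
{
    struct request r = { .linked = true, .unlink_count = 0 };

    cancel_unchecked(&r, complete_in_window);
    return r.unlink_count;
}

int revalidated_unlink_count(void)
{
    struct request r = { .linked = true, .unlink_count = 0 };

    cancel_revalidated(&r, complete_in_window);
    return r.unlink_count;
}
```

In this model the unchecked path unlinks the same request twice, while the re-validating path unlinks it once. Whether a check like this is sufficient in the real driver depends on what bce_vhci_transfer_queue_pause() guarantees, which is exactly what the questions below ask.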

Maybe we should check these points:

1) Can bce_vhci_transfer_queue_pause() be safely called while holding q->urb_lock to eliminate the unlock/relock window? Are there potential deadlocks?

2) What guarantees does bce_vhci_transfer_queue_pause() provide regarding hardware state and list stability before returning? Is the state fully consistent when the lock is re-acquired?

3) What is the necessity and handling of q->remaining_active_requests within this critical section?

4) Could the call to usb_hcd_unlink_urb_from_ep() be deferred or handled differently based on the state returned by the pause/hardware interaction?

I'll try to apply a patch soon and share it with you, but any help is greatly appreciated.
Another possibility, which I think is more likely, is that there is a hardware problem with my MacBook that I don't understand. Why is this problem only happening to me? Why haven't other users reported a similar bug?

@AdityaGarg8
Member

The camera has been unstable from the start. The command I sent should have made it a bit more stable, but nothing more can be done rn.

I'd suggest using an external camera, or using macOS for video conferencing.

Regarding your findings and questions about the code, please don't use AI here.

@mnural
Author

mnural commented Apr 12, 2025

@AdityaGarg8,
Sorry, I wish I didn't need AI to identify and solve the problem. To be honest, I'm not experienced in debugging Linux kernel modules, but AI really helped me understand the complex mechanism of the apple-bce module, even if only a little. I finally figured out what the problem is and how to fix it.
Let me share the solution and the steps.
If any application (e.g., cheese, skype or chrome) is closed during webcam streaming, a GPF can occur. In the GPF details in journalctl, bce_vhci_urb_request_cancel() and usb_hcd_unlink_urb_from_ep() appear at the top of the call trace. I added the following log messages to the bce_vhci_urb_request_cancel() and bce_vhci_transfer_queue_completion() functions to understand what is going on. I use %p instead of %pK for the pointers. Note that the kernel hashes %p values by default, so the log shows hashed identifiers rather than raw addresses, but the same pointer prints the same value within a boot, which is enough to match URBs across log lines.

  1. Logging in bce_vhci_transfer_queue_completion()
static void bce_vhci_transfer_queue_completion(struct bce_queue_sq *sq)
{
    unsigned long flags;
    struct bce_sq_completion_data *c;
    struct urb *urb;
    struct bce_vhci_transfer_queue *q = sq->userdata;

    // 1. FUNCTION ENTRY
    pr_emerg("[BCE_DBG:%s] Handler Entry: ep=0x%02x\n", __func__, q->endp_addr);

    spin_lock_irqsave(&q->urb_lock, flags);
    // 2. LOCK ACQUIRED
    pr_emerg("[BCE_DBG:%s] Handler Lock Acquired: ep=0x%02x\n", __func__, q->endp_addr);

    while ((c = bce_next_completion(sq))) {
        if (c->status == BCE_COMPLETION_ABORTED) { /* We flushed the queue */
            pr_emerg("[BCE_DBG:%s] Abort completion: ep=0x%02x\n", __func__, q->endp_addr);
            bce_notify_submission_complete(sq);
            continue;
        }
        if (list_empty(&q->endp->urb_list)) {
            pr_emerg("[BCE_DBG:%s] ERROR: Completion but URB list empty: ep=0x%02x\n", __func__, q->endp_addr);

            continue;
        }
        pr_debug("bce-vhci: [%02x] Got a transfer queue completion\n", q->endp_addr);

        urb = list_first_entry(&q->endp->urb_list, struct urb, urb_list);

        // ************************************
        // **** THIS URB IS PROCESSING NOW ****
        // ************************************
        pr_emerg("[BCE_DBG:%s] Processing completion for URB %p (hcpriv=%p)\n", __func__, urb, urb->hcpriv);
        bce_vhci_urb_transfer_completion(urb->hcpriv, c);
        pr_emerg("[BCE_DBG:%s] After transfer_completion call for URB %p\n", __func__, urb);

        bce_notify_submission_complete(sq);
    }
    pr_emerg("[BCE_DBG:%s] Handler Loop Done: ep=0x%02x\n", __func__, q->endp_addr);

    bce_vhci_transfer_queue_deliver_pending(q);

    // 3. LOCK RELEASING
    spin_unlock_irqrestore(&q->urb_lock, flags);
    pr_emerg("[BCE_DBG:%s] Handler Lock Released: ep=0x%02x\n", __func__, q->endp_addr);

    // 4. HANDLER EXIT
    bce_vhci_transfer_queue_giveback(q);
    pr_emerg("[BCE_DBG:%s] Handler Exit: ep=0x%02x\n", __func__, q->endp_addr);

}
  2. Logging in bce_vhci_urb_request_cancel()
int bce_vhci_urb_request_cancel(struct bce_vhci_transfer_queue *q, struct urb *urb, int status)
{
    struct bce_vhci_urb *vurb;
    unsigned long flags;
    int ret;
    
    pr_emerg("[BCE_DBG:%s] THIS URB IS PROCESSING NOW: urb=%p, ep=0x%02x, status=%d\n", __func__, urb, q->endp_addr, status);

    spin_lock_irqsave(&q->urb_lock, flags);
    
    if ((ret = usb_hcd_check_unlink_urb(q->vhci->hcd, urb, status))) {
        pr_emerg("[BCE_DBG:%s] Check failed (already unlinked?): urb=%p, ret=%d\n", __func__, urb, ret);
        spin_unlock_irqrestore(&q->urb_lock, flags);
        return ret;
    }
    
    vurb = urb->hcpriv;
   
    /* If the URB wasn't posted to the device yet, we can still remove it on the host without pausing the queue. */
    if (vurb->state != BCE_VHCI_URB_INIT_PENDING) {  // Complex path

        pr_emerg("[BCE_DBG:%s] Complex path: state=%d\n", __func__, vurb->state);
        pr_debug("bce-vhci: [%02x] Cancelling URB\n", q->endp_addr);

        pr_emerg("[BCE_DBG:%s] Unlocking for pause: urb=%p\n", __func__, urb);
        spin_unlock_irqrestore(&q->urb_lock, flags);

        bce_vhci_transfer_queue_pause(q, BCE_VHCI_PAUSE_INTERNAL_WQ);

        pr_emerg("[BCE_DBG:%s] Pause returned, relocking: urb=%p\n", __func__, urb);
        spin_lock_irqsave(&q->urb_lock, flags);

        pr_emerg("[BCE_DBG:%s] Relocked: urb=%p\n", __func__, urb);

        ++q->remaining_active_requests;
        pr_emerg("[BCE_DBG:%s] remaining_active_requests=%d\n", __func__, q->remaining_active_requests);
    }

    pr_emerg("[BCE_DBG:%s] Calling unlink: urb=%p\n", __func__, urb);
    usb_hcd_unlink_urb_from_ep(q->vhci->hcd, urb);

    pr_emerg("[BCE_DBG:%s] IF IT DOESNT CRASH THIS WILL BE LOGGED ---> Unlink returned: urb=%p\n", __func__, urb);

    spin_unlock_irqrestore(&q->urb_lock, flags);

    usb_hcd_giveback_urb(q->vhci->hcd, urb, status);

    if (vurb->state != BCE_VHCI_URB_INIT_PENDING)
        bce_vhci_transfer_queue_resume(q, BCE_VHCI_PAUSE_INTERNAL_WQ);

    pr_emerg("[BCE_DBG:%s] Freeing vurb: vurb=%p (for urb=%p)\n", __func__, vurb, urb);
    kfree(vurb);

    pr_emerg("[BCE_DBG:%s] Exiting: urb=%p\n", __func__, urb);
    return 0;
}

After adding these log messages, I compiled and ran the module and got the following log during the crash:

[  156.104757] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 0000000090c19135 (hcpriv=00000000050c65f2)
[  156.104759] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 0000000090c19135
[  156.104761] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.104763] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.104766] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
****[  156.104831] [BCE_DBG:bce_vhci_urb_request_cancel] THIS URB IS PROCESSING NOW: urb=00000000913fa41a
****[  156.104835] [BCE_DBG:bce_vhci_urb_request_cancel] Complex path: state=2
****[  156.104837] [BCE_DBG:bce_vhci_urb_request_cancel] Unlocking for pause: urb=00000000913fa41a
[  156.104939] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.104942] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
[  156.104944] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 0000000072bfac4b (hcpriv=0000000062de7f9b)
[  156.104947] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 0000000072bfac4b
[  156.104949] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.104952] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.104955] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
[  156.105032] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.105034] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
[  156.105036] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 00000000da80ef51 (hcpriv=00000000f4f82d7b)
[  156.105039] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 00000000da80ef51
[  156.105041] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.105044] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.105047] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
[  156.105171] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.105173] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
****[  156.105175] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 00000000913fa41a (hcpriv=000000000b887228)
****[  156.105178] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 00000000913fa41a
[  156.105180] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.105182] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.105187] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
[  156.105360] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.105367] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
[  156.105369] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 000000009752fe82 (hcpriv=00000000892219ae)
[  156.105372] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 000000009752fe82
[  156.105374] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.105377] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.105380] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
[  156.105454] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.105457] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
[  156.105458] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 0000000090c19135 (hcpriv=00000000a5b4ba3a)
[  156.105460] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 0000000090c19135
[  156.105462] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.105463] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.105466] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
[  156.105616] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Entry: ep=0x81
[  156.105619] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Acquired: ep=0x81
[  156.105621] [BCE_DBG:bce_vhci_transfer_queue_completion] Processing completion for URB 0000000072bfac4b (hcpriv=000000007aa4d11c)
[  156.105623] [BCE_DBG:bce_vhci_transfer_queue_completion] After transfer_completion call for URB 0000000072bfac4b
[  156.105625] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Loop Done: ep=0x81
[  156.105627] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Lock Released: ep=0x81
[  156.105630] [BCE_DBG:bce_vhci_transfer_queue_completion] Handler Exit: ep=0x81
****[  156.107831] [BCE_DBG:bce_vhci_urb_request_cancel] Pause returned, relocking: urb=00000000913fa41a
****[  156.107840] [BCE_DBG:bce_vhci_urb_request_cancel] Relocked: urb=00000000913fa41a
****[  156.107843] [BCE_DBG:bce_vhci_urb_request_cancel] remaining_active_requests=1
****[  156.107845] [BCE_DBG:bce_vhci_urb_request_cancel] Calling unlink: urb=00000000913fa41a
[  156.107855] Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP NOPTI
[  156.107860] CPU: 1 UID: 1000 PID: 4056 Comm: pipewire Tainted: G           OE      6.12.23-1-t2-noble #1
[  156.107866] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  156.107867] Hardware name: Apple Inc. MacBookAir9,1/Mac-0CFF9C7C2B63DF8D, BIOS 2075.101.2.0.0 (iBridge: 22.16.14248.0.0,0) 03/12/2025
[  156.107870] RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
[  156.107877] Code: 44 00 00 55 48 c7 c7 8c de 9a 9d 48 89 e5 53 48 89 f3 e8 c7 46 51 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 8c de 9a 9d <48> 89 51 08 48 89 0a 48 89 43 18 48 89 43 20 e8 e0 47 51 00 48 8b
[  156.107880] RSP: 0000:ffffb98a4230b808 EFLAGS: 00010046
[  156.107884] RAX: ffff9d4704741198 RBX: ffff9d4704741180 RCX: dead000000000100
[  156.107887] RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9d9ade8c
[  156.107889] RBP: ffffb98a4230b810 R08: 0000000000000000 R09: 0000000000000000
[  156.107891] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9d47010eab28
[  156.107894] R13: 0000000000000000 R14: ffff9d4736cd3440 R15: ffff9d47010eab78
[  156.107896] FS:  0000702f4b680740(0000) GS:ffff9d4877e80000(0000) knlGS:0000000000000000
[  156.107899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  156.107902] CR2: 00007b3477d6c9d0 CR3: 0000000146724006 CR4: 0000000000772ef0
[  156.107904] PKRU: 55555554
[  156.107906] Call Trace:
[  156.107909]  <TASK>
[  156.107913]  bce_vhci_urb_request_cancel+0xb2/0x290 [apple_bce]
[  156.107925]  bce_vhci_urb_dequeue+0x2c/0x60 [apple_bce]
[  156.107934]  unlink1+0x34/0x160
[  156.107939]  usb_hcd_unlink_urb+0x8a/0xf0
[  156.107943]  usb_poison_urb+0x49/0xf0
[  156.107947]  ? aa_file_perm+0x134/0x550
[  156.107951]  ? ep_poll_callback+0x249/0x2a0
[  156.107956]  ? ktime_get+0x3f/0xf0
[  156.107963]  uvc_video_stop_transfer+0x4a/0xc0 [uvcvideo]
[  156.107972]  uvc_video_stop_streaming+0x17/0xa0 [uvcvideo]
[  156.107981]  uvc_stop_streaming+0x27/0xd0 [uvcvideo]
[  156.107989]  __vb2_queue_cancel+0x33/0x320 [videobuf2_common]
[  156.107998]  vb2_core_streamoff+0x16/0xa0 [videobuf2_common]
[  156.108006]  vb2_streamoff+0x18/0x60 [videobuf2_v4l2]
[  156.108011]  uvc_queue_streamoff+0x2e/0x50 [uvcvideo]
[  156.108017]  uvc_ioctl_streamoff+0x3f/0x70 [uvcvideo]
[  156.108024]  v4l_streamoff+0x1d/0x30 [videodev]
[  156.108042]  __video_do_ioctl+0x3f1/0x5c0 [videodev]
[  156.108058]  video_usercopy+0x300/0x8a0 [videodev]
[  156.108073]  ? __pfx___video_do_ioctl+0x10/0x10 [videodev]
[  156.108089]  video_ioctl2+0x15/0x30 [videodev]
[  156.108106]  v4l2_ioctl+0x69/0xb0 [videodev]
[  156.108122]  __x64_sys_ioctl+0x9d/0xe0
[  156.108127]  x64_sys_call+0x11ad/0x25f0
[  156.108131]  do_syscall_64+0x7e/0x170
[  156.108135]  ? eventfd_read+0xdc/0x200
[  156.108139]  ? security_file_permission+0x8e/0x170
[  156.108143]  ? vfs_read+0x2a2/0x380
[  156.108149]  ? ksys_read+0xe6/0x100
[  156.108153]  ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
[  156.108157]  ? syscall_exit_to_user_mode+0x38/0x1d0
[  156.108161]  ? do_syscall_64+0x8a/0x170
[  156.108163]  ? do_syscall_64+0x8a/0x170
[  156.108166]  ? arch_exit_to_user_mode_prepare.isra.0+0xc8/0xd0
[  156.108170]  ? syscall_exit_to_user_mode+0x38/0x1d0
[  156.108173]  ? clear_bhb_loop+0x15/0x70
[  156.108178]  ? clear_bhb_loop+0x15/0x70
[  156.108181]  ? clear_bhb_loop+0x15/0x70
[  156.108184]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  156.108188] RIP: 0033:0x702f4b524ded
[  156.108192] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[  156.108194] RSP: 002b:00007fffc7eca020 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  156.108198] RAX: ffffffffffffffda RBX: 00007fffc7eca094 RCX: 0000702f4b524ded
[  156.108201] RDX: 00007fffc7eca094 RSI: 0000000040045613 RDI: 0000000000000045
[  156.108203] RBP: 00007fffc7eca070 R08: 0000000000000000 R09: 0000000000008000
[  156.108205] R10: 0000000000000000 R11: 0000000000000246 R12: 000060807924b4c8
[  156.108207] R13: 0000000000000045 R14: 000060807924b6b0 R15: ba06e70243de1100
[  156.108212]  </TASK>
[  156.108214] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr cmac algif_hash algif_skcipher af_alg bnep input_leds joydev hid_appletb_bl hid_magicmouse hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub hid_apple cdc_mbim cdc_wdm hid_generic uvcvideo videobuf2_vmalloc uvc videobuf2_memops videobuf2_v4l2 videobuf2_common usbhid videodev cdc_ncm cdc_ether hid usbnet mc mii apple_mfi_fastcharge intel_uncore_frequency snd_sof_pci_intel_icl intel_uncore_frequency_common snd_sof_pci_intel_cnl intel_pmc_core_pltdrv snd_sof_intel_hda_generic intel_pmc_core soundwire_intel intel_vsec soundwire_cadence pmt_telemetry pmt_class snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_hda_codec_hdmi snd_sof_pci snd_sof_xtensa_dsp snd_sof x86_pkg_temp_thermal intel_powerclamp snd_sof_utils coretemp snd_soc_acpi_intel_match soundwire_generic_allocation snd_soc_acpi kvm_intel soundwire_bus snd_soc_avs snd_soc_hda_codec
[  156.108278]  snd_hda_ext_core kvm snd_soc_core iTCO_wdt spi_nor intel_pmc_bxt snd_compress crct10dif_pclmul ac97_bus polyval_clmulni brcmfmac_wcc binfmt_misc iTCO_vendor_support intel_rapl_msr mei_hdcp mei_pxp mtd snd_pcm_dmaengine i915 snd_hda_intel snd_intel_dspcfg polyval_generic processor_thermal_device_pci_legacy processor_thermal_device ghash_clmulni_intel sha256_ssse3 snd_intel_sdw_acpi nls_iso8859_1 sha1_ssse3 drm_buddy applesmc snd_hda_codec brcmfmac ttm aesni_intel processor_thermal_wt_hint processor_thermal_rfim crypto_simd processor_thermal_rapl cryptd drm_display_helper i2c_i801 snd_hda_core brcmutil rapl hci_bcm4377 intel_rapl_common cec i2c_mux intel_cstate snd_hwdep spi_intel_pci processor_thermal_wt_req bluetooth cfg80211 spi_intel i2c_smbus mei_me processor_thermal_power_floor rc_core mei processor_thermal_mbox i2c_algo_bit int340x_thermal_zone intel_soc_dts_iosf intel_lpss_acpi sbs acpi_tad intel_lpss sbshc idma64 mac_hid sch_fq_codel apple_bce(OE) snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi
[  156.108355]  snd_seq snd_seq_device snd_timer snd soundcore msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c nvme nvme_core crc32_pclmul nvme_auth thunderbolt video wmi
[  156.108383] ---[ end trace 0000000000000000 ]---
[  156.605423] ieee80211 phy0: brcmf_msgbuf_query_dcmd: Timeout on response for query command
[  156.605432] ieee80211 phy0: brcmf_dongle_scantime: Scan assoc time error (-5)
[  156.943002] RIP: 0010:usb_hcd_unlink_urb_from_ep+0x2c/0x60
[  156.943012] Code: 44 00 00 55 48 c7 c7 8c de 9a 9d 48 89 e5 53 48 89 f3 e8 c7 46 51 00 48 8b 4b 18 48 8b 53 20 48 8d 43 18 48 c7 c7 8c de 9a 9d <48> 89 51 08 48 89 0a 48 89 43 18 48 89 43 20 e8 e0 47 51 00 48 8b
[  156.943014] RSP: 0000:ffffb98a4230b808 EFLAGS: 00010046
[  156.943017] RAX: ffff9d4704741198 RBX: ffff9d4704741180 RCX: dead000000000100
[  156.943019] RDX: dead000000000122 RSI: 0000000000000000 RDI: ffffffff9d9ade8c
[  156.943021] RBP: ffffb98a4230b810 R08: 0000000000000000 R09: 0000000000000000
[  156.943022] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9d47010eab28
[  156.943023] R13: 0000000000000000 R14: ffff9d4736cd3440 R15: ffff9d47010eab78
[  156.943025] FS:  0000702f4b680740(0000) GS:ffff9d4877e80000(0000) knlGS:0000000000000000
[  156.943027] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  156.943028] CR2: 00007b3477d6c9d0 CR3: 0000000146724006 CR4: 0000000000772ef0
[  156.943030] PKRU: 55555554
[  156.943031] note: pipewire[4056] exited with irqs disabled
[  156.943114] note: pipewire[4056] exited with preempt_count 2

In this log, please look at the entries that I marked with '****' at the beginning.

urb=00000000913fa41a is being handled by bce_vhci_urb_request_cancel() and, at the same time, by bce_vhci_transfer_queue_completion(). While the cancel path has released urb_lock to pause the queue, the completion handler processes the same URB. As soon as the cancel path relocks and calls usb_hcd_unlink_urb_from_ep(), the kernel oops occurs.
Most probably, the URB had already been unlinked by the completion path (bce_vhci_urb_complete() calls usb_hcd_unlink_urb_from_ep()) before bce_vhci_urb_request_cancel() called unlink. The result is a double unlink, and this causes the GPF.
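
As a cross-check on the double-unlink reading, the faulting address can be reproduced from the register dump. The sketch below is a userspace calculation, not driver code. It assumes the list poison values from include/linux/poison.h together with the x86_64 default CONFIG_ILLEGAL_POINTER_VALUE of 0xdead000000000000; the struct name list_head64 and the helper first_unlink_store() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of the kernel's struct list_head on a 64-bit build. */
struct list_head64 {
    uint64_t next;
    uint64_t prev;
};

/* prev sits 8 bytes into the struct, which is where the +0x08 comes from. */
static_assert(offsetof(struct list_head64, prev) == 8, "unexpected layout");

/* Assumed values: include/linux/poison.h combined with the x86_64 default
 * CONFIG_ILLEGAL_POINTER_VALUE of 0xdead000000000000. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1 (0x100ULL + POISON_POINTER_DELTA)
#define LIST_POISON2 (0x122ULL + POISON_POINTER_DELTA)

/* Address touched by the first store of an unlink: next->prev = prev. */
uint64_t first_unlink_store(uint64_t next)
{
    return next + offsetof(struct list_head64, prev);
}
```

With these assumptions, first_unlink_store(LIST_POISON1) gives 0xdead000000000108, the address reported in the oops, and RCX and RDX in the dump hold exactly the two poison values. That pattern would be consistent with urb->urb_list having already gone through list_del(), for example in bce_vhci_transfer_queue_giveback(), before the cancel path tried to unlink it again.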

To prevent this:
1) Added a new state (BCE_VHCI_URB_CANCELLED) to enum bce_vhci_urb_state in transfer.h.
This state marks a URB as cancelled, so that any subsequent processing, such as transfer completion, is skipped.
2) Added reference counting to struct bce_vhci_urb (kref ref) to prevent double-free issues.

At the beginning of the cancel function, we check whether the URB has already been marked as cancelled (i.e., its state equals BCE_VHCI_URB_CANCELLED). If so, we exit early to prevent duplicate cleanup. Otherwise, we save the URB's current state as old_state, set the state to BCE_VHCI_URB_CANCELLED, and take an extra reference using kref_get(). The saved old_state is needed because the later checks for BCE_VHCI_URB_INIT_PENDING can no longer read vurb->state once it has been overwritten.
We then proceed with the cancel operations. For URBs that were not in the INIT_PENDING state, we pause the transfer queue, adjust the active request count, and later resume the queue.
Finally, instead of calling kfree() directly, we call kref_put(). This ensures that the URB's memory is freed only once the reference count reaches zero.
In the transfer completion function, we added a check that skips processing for any URB whose state is BCE_VHCI_URB_CANCELLED. This prevents the completion path from accessing or cleaning up a URB that has already been cancelled.
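
The interplay between the cancelled state and the reference count can be sketched in plain userspace C. This is a simplified, single-threaded model and not the driver code: the names vurb_model, vurb_try_cancel() and the freed test hook are hypothetical, a plain int stands in for struct kref, and in the driver the state checks would run under q->urb_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum vurb_state { VURB_ACTIVE, VURB_CANCELLED };

/* Hypothetical stand-in for struct bce_vhci_urb. */
struct vurb_model {
    int refcount;          /* plays the role of struct kref */
    enum vurb_state state;
    bool *freed;           /* test hook: set when the object is released */
};

/* Like kref_init(): the creator starts out holding one reference. */
struct vurb_model *vurb_create(bool *freed)
{
    struct vurb_model *v = calloc(1, sizeof(*v));
    if (!v)
        return NULL;
    v->refcount = 1;
    v->state = VURB_ACTIVE;
    v->freed = freed;
    return v;
}

/* Like kref_get(): only valid while a reference is still held. */
void vurb_get(struct vurb_model *v)
{
    assert(v->refcount > 0);
    v->refcount++;
}

/* Like kref_put(): frees on the last reference, returns true if freed. */
bool vurb_put(struct vurb_model *v)
{
    assert(v->refcount > 0);
    if (--v->refcount > 0)
        return false;
    if (v->freed)
        *v->freed = true;
    free(v);
    return true;
}

/* Cancel path: only the first caller wins and may do the cleanup. */
bool vurb_try_cancel(struct vurb_model *v)
{
    if (v->state == VURB_CANCELLED)
        return false;
    v->state = VURB_CANCELLED;
    return true;
}

/* Completion path: skip anything the cancel path already claimed. */
bool vurb_should_complete(const struct vurb_model *v)
{
    return v->state != VURB_CANCELLED;
}
```

In this model, a second vurb_try_cancel() on the same object returns false, and vurb_should_complete() returns false once a cancel has claimed it. Note that the object is released only when every reference has been dropped, including the initial one taken at creation, so each get needs a matching put somewhere on the path.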

The transfer.c and transfer.h files have been changed.

Here is the new transfer.c:

#include "transfer.h"
#include "../queue.h"
#include "vhci.h"
#include "../apple_bce.h"
#include <linux/usb/hcd.h>
#include <linux/kref.h>

static void bce_vhci_transfer_queue_completion(struct bce_queue_sq *sq);
static void bce_vhci_transfer_queue_giveback(struct bce_vhci_transfer_queue *q);
static void bce_vhci_transfer_queue_remove_pending(struct bce_vhci_transfer_queue *q);

static int bce_vhci_urb_init(struct bce_vhci_urb *vurb);
static int bce_vhci_urb_update(struct bce_vhci_urb *urb, struct bce_vhci_message *msg);
static int bce_vhci_urb_transfer_completion(struct bce_vhci_urb *urb, struct bce_sq_completion_data *c);

static void bce_vhci_transfer_queue_reset_w(struct work_struct *work);

void bce_vhci_create_transfer_queue(struct bce_vhci *vhci, struct bce_vhci_transfer_queue *q,
        struct usb_host_endpoint *endp, bce_vhci_device_t dev_addr, enum dma_data_direction dir)
{
    char name[0x21];
    INIT_LIST_HEAD(&q->evq);
    INIT_LIST_HEAD(&q->giveback_urb_list);
    spin_lock_init(&q->urb_lock);
    mutex_init(&q->pause_lock);
    q->vhci = vhci;
    q->endp = endp;
    q->dev_addr = dev_addr;
    q->endp_addr = (u8) (endp->desc.bEndpointAddress & 0x8F);
    q->state = BCE_VHCI_ENDPOINT_ACTIVE;
    q->active = true;
    q->stalled = false;
    q->max_active_requests = 1;
    if (usb_endpoint_type(&endp->desc) == USB_ENDPOINT_XFER_BULK)
        q->max_active_requests = BCE_VHCI_BULK_MAX_ACTIVE_URBS;
    q->remaining_active_requests = q->max_active_requests;
    q->cq = bce_create_cq(vhci->dev, 0x100);
    INIT_WORK(&q->w_reset, bce_vhci_transfer_queue_reset_w);
    q->sq_in = NULL;
    if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
        snprintf(name, sizeof(name), "VHC1-%i-%02x", dev_addr, 0x80 | usb_endpoint_num(&endp->desc));
        q->sq_in = bce_create_sq(vhci->dev, q->cq, name, 0x100, DMA_FROM_DEVICE,
                                 bce_vhci_transfer_queue_completion, q);
    }
    q->sq_out = NULL;
    if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) {
        snprintf(name, sizeof(name), "VHC1-%i-%02x", dev_addr, usb_endpoint_num(&endp->desc));
        q->sq_out = bce_create_sq(vhci->dev, q->cq, name, 0x100, DMA_TO_DEVICE,
                                  bce_vhci_transfer_queue_completion, q);
    }
}

void bce_vhci_destroy_transfer_queue(struct bce_vhci *vhci, struct bce_vhci_transfer_queue *q)
{
    bce_vhci_transfer_queue_giveback(q);
    bce_vhci_transfer_queue_remove_pending(q);
    if (q->sq_in)
        bce_destroy_sq(vhci->dev, q->sq_in);
    if (q->sq_out)
        bce_destroy_sq(vhci->dev, q->sq_out);
    bce_destroy_cq(vhci->dev, q->cq);
}

static inline bool bce_vhci_transfer_queue_can_init_urb(struct bce_vhci_transfer_queue *q)
{
    return q->remaining_active_requests > 0;
}

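static void bce_vhci_transfer_queue_defer_event(struct bce_vhci_transfer_queue *q, struct bce_vhci_message *msg)
{
    struct bce_vhci_list_message *lm;
    /* Callers hold q->urb_lock with interrupts disabled, so the allocation must not sleep. */
    lm = kmalloc(sizeof(struct bce_vhci_list_message), GFP_ATOMIC);
    if (!lm) {
        pr_err("bce-vhci: [%02x] Failed to allocate a deferred event\n", q->endp_addr);
        return;
    }
    INIT_LIST_HEAD(&lm->list);
    lm->msg = *msg;
    list_add_tail(&lm->list, &q->evq);
}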

static void bce_vhci_transfer_queue_giveback(struct bce_vhci_transfer_queue *q)
{
    unsigned long flags;
    struct urb *urb;
    spin_lock_irqsave(&q->urb_lock, flags);
    while (!list_empty(&q->giveback_urb_list)) {
        urb = list_first_entry(&q->giveback_urb_list, struct urb, urb_list);
        list_del(&urb->urb_list);

        spin_unlock_irqrestore(&q->urb_lock, flags);
        usb_hcd_giveback_urb(q->vhci->hcd, urb, urb->status);
        spin_lock_irqsave(&q->urb_lock, flags);
    }
    spin_unlock_irqrestore(&q->urb_lock, flags);
}

static void bce_vhci_transfer_queue_init_pending_urbs(struct bce_vhci_transfer_queue *q);

static void bce_vhci_transfer_queue_deliver_pending(struct bce_vhci_transfer_queue *q)
{
    struct urb *urb;
    struct bce_vhci_list_message *lm;

    while (!list_empty(&q->endp->urb_list) && !list_empty(&q->evq)) {
        urb = list_first_entry(&q->endp->urb_list, struct urb, urb_list);

        lm = list_first_entry(&q->evq, struct bce_vhci_list_message, list);
        if (bce_vhci_urb_update(urb->hcpriv, &lm->msg) == -EAGAIN)
            break;
        list_del(&lm->list);
        kfree(lm);
    }

    /* some of the URBs could have been completed, so initialize more URBs if possible */
    bce_vhci_transfer_queue_init_pending_urbs(q);
}

static void bce_vhci_transfer_queue_remove_pending(struct bce_vhci_transfer_queue *q)
{
    unsigned long flags;
    struct bce_vhci_list_message *lm;
    spin_lock_irqsave(&q->urb_lock, flags);
    while (!list_empty(&q->evq)) {
        lm = list_first_entry(&q->evq, struct bce_vhci_list_message, list);
        list_del(&lm->list);
        kfree(lm);
    }
    spin_unlock_irqrestore(&q->urb_lock, flags);
}

void bce_vhci_transfer_queue_event(struct bce_vhci_transfer_queue *q, struct bce_vhci_message *msg)
{
    unsigned long flags;
    struct bce_vhci_urb *turb;
    struct urb *urb;
    spin_lock_irqsave(&q->urb_lock, flags);
    bce_vhci_transfer_queue_deliver_pending(q);

    if (msg->cmd == BCE_VHCI_CMD_TRANSFER_REQUEST &&
        (!list_empty(&q->evq) || list_empty(&q->endp->urb_list))) {
        bce_vhci_transfer_queue_defer_event(q, msg);
        goto complete;
    }
    if (list_empty(&q->endp->urb_list)) {
        pr_err("bce-vhci: [%02x] Unexpected transfer queue event\n", q->endp_addr);
        goto complete;
    }
    urb = list_first_entry(&q->endp->urb_list, struct urb, urb_list);
    turb = urb->hcpriv;
    if (bce_vhci_urb_update(turb, msg) == -EAGAIN) {
        bce_vhci_transfer_queue_defer_event(q, msg);
    } else {
        bce_vhci_transfer_queue_init_pending_urbs(q);
    }

complete:
    spin_unlock_irqrestore(&q->urb_lock, flags);
    bce_vhci_transfer_queue_giveback(q);
}

static void bce_vhci_transfer_queue_completion(struct bce_queue_sq *sq)
{
    unsigned long flags;
    struct bce_sq_completion_data *c;
    struct urb *urb;
    struct bce_vhci_transfer_queue *q = sq->userdata;
    spin_lock_irqsave(&q->urb_lock, flags);
    while ((c = bce_next_completion(sq))) {
        if (c->status == BCE_COMPLETION_ABORTED) { /* We flushed the queue */
            pr_debug("bce-vhci: [%02x] Got an abort completion\n", q->endp_addr);
            bce_notify_submission_complete(sq);
            continue;
        }
        if (list_empty(&q->endp->urb_list)) {
            pr_err("bce-vhci: [%02x] Got a completion while no requests are pending\n", q->endp_addr);
            continue;
        }
        pr_debug("bce-vhci: [%02x] Got a transfer queue completion\n", q->endp_addr);
        urb = list_first_entry(&q->endp->urb_list, struct urb, urb_list);
        {
            struct bce_vhci_urb *vurb = urb->hcpriv;
            if (vurb->state == BCE_VHCI_URB_CANCELLED) {
                pr_info("[BCE_DBG:%s] SKIPPING COMPLETION PROCESSING FOR CANCELLED URB: %p\n", __func__, urb);
                continue;  // Skipping if URB is already cancelled
            }
        }
        bce_vhci_urb_transfer_completion(urb->hcpriv, c);
        bce_notify_submission_complete(sq);
    }
    bce_vhci_transfer_queue_deliver_pending(q);
    spin_unlock_irqrestore(&q->urb_lock, flags);
    bce_vhci_transfer_queue_giveback(q);
}

int bce_vhci_transfer_queue_do_pause(struct bce_vhci_transfer_queue *q)
{
    unsigned long flags;
    int status;
    u8 endp_addr = (u8) (q->endp->desc.bEndpointAddress & 0x8F);
    spin_lock_irqsave(&q->urb_lock, flags);
    q->active = false;
    spin_unlock_irqrestore(&q->urb_lock, flags);
    if (q->sq_out) {
        pr_err("bce-vhci: Not implemented: wait for pending output requests\n");
    }
    bce_vhci_transfer_queue_remove_pending(q);
    if ((status = bce_vhci_cmd_endpoint_set_state(
            &q->vhci->cq, q->dev_addr, endp_addr, BCE_VHCI_ENDPOINT_PAUSED, &q->state)))
        return status;
    if (q->state != BCE_VHCI_ENDPOINT_PAUSED)
        return -EINVAL;
    if (q->sq_in)
        bce_cmd_flush_memory_queue(q->vhci->dev->cmd_cmdq, (u16) q->sq_in->qid);
    if (q->sq_out)
        bce_cmd_flush_memory_queue(q->vhci->dev->cmd_cmdq, (u16) q->sq_out->qid);
    return 0;
}

static void bce_vhci_urb_resume(struct bce_vhci_urb *urb);

int bce_vhci_transfer_queue_do_resume(struct bce_vhci_transfer_queue *q)
{
    unsigned long flags;
    int status;
    struct urb *urb, *urbt;
    struct bce_vhci_urb *vurb;
    u8 endp_addr = (u8) (q->endp->desc.bEndpointAddress & 0x8F);
    if ((status = bce_vhci_cmd_endpoint_set_state(
            &q->vhci->cq, q->dev_addr, endp_addr, BCE_VHCI_ENDPOINT_ACTIVE, &q->state)))
        return status;
    if (q->state != BCE_VHCI_ENDPOINT_ACTIVE)
        return -EINVAL;
    spin_lock_irqsave(&q->urb_lock, flags);
    q->active = true;
    list_for_each_entry_safe(urb, urbt, &q->endp->urb_list, urb_list) {
        vurb = urb->hcpriv;
        if (vurb->state == BCE_VHCI_URB_INIT_PENDING) {
            if (!bce_vhci_transfer_queue_can_init_urb(q))
                break;
            bce_vhci_urb_init(vurb);
        } else {
            bce_vhci_urb_resume(vurb);
        }
    }
    bce_vhci_transfer_queue_deliver_pending(q);
    spin_unlock_irqrestore(&q->urb_lock, flags);
    return 0;
}

int bce_vhci_transfer_queue_pause(struct bce_vhci_transfer_queue *q, enum bce_vhci_pause_source src)
{
    int ret = 0;
    mutex_lock(&q->pause_lock);
    if ((q->paused_by & src) != src) {
        if (!q->paused_by)
            ret = bce_vhci_transfer_queue_do_pause(q);
        if (!ret)
            q->paused_by |= src;
    }
    mutex_unlock(&q->pause_lock);
    return ret;
}

int bce_vhci_transfer_queue_resume(struct bce_vhci_transfer_queue *q, enum bce_vhci_pause_source src)
{
    int ret = 0;
    mutex_lock(&q->pause_lock);
    if (q->paused_by & src) {
        if (!(q->paused_by & ~src))
            ret = bce_vhci_transfer_queue_do_resume(q);
        if (!ret)
            q->paused_by &= ~src;
    }
    mutex_unlock(&q->pause_lock);
    return ret;
}

static void bce_vhci_transfer_queue_reset_w(struct work_struct *work)
{
    unsigned long flags;
    struct bce_vhci_transfer_queue *q = container_of(work, struct bce_vhci_transfer_queue, w_reset);

    mutex_lock(&q->pause_lock);
    spin_lock_irqsave(&q->urb_lock, flags);
    if (!q->stalled) {
        spin_unlock_irqrestore(&q->urb_lock, flags);
        mutex_unlock(&q->pause_lock);
        return;
    }
    q->active = false;
    spin_unlock_irqrestore(&q->urb_lock, flags);
    q->paused_by |= BCE_VHCI_PAUSE_INTERNAL_WQ;
    bce_vhci_transfer_queue_remove_pending(q);
    if (q->sq_in)
        bce_cmd_flush_memory_queue(q->vhci->dev->cmd_cmdq, (u16) q->sq_in->qid);
    if (q->sq_out)
        bce_cmd_flush_memory_queue(q->vhci->dev->cmd_cmdq, (u16) q->sq_out->qid);
    bce_vhci_cmd_endpoint_reset(&q->vhci->cq, q->dev_addr, (u8) (q->endp->desc.bEndpointAddress & 0x8F));
    spin_lock_irqsave(&q->urb_lock, flags);
    q->stalled = false;
    spin_unlock_irqrestore(&q->urb_lock, flags);
    mutex_unlock(&q->pause_lock);
    bce_vhci_transfer_queue_resume(q, BCE_VHCI_PAUSE_INTERNAL_WQ);
}

void bce_vhci_transfer_queue_request_reset(struct bce_vhci_transfer_queue *q)
{
    queue_work(q->vhci->tq_state_wq, &q->w_reset);
}

static void bce_vhci_transfer_queue_init_pending_urbs(struct bce_vhci_transfer_queue *q)
{
    struct urb *urb, *urbt;
    struct bce_vhci_urb *vurb;
    list_for_each_entry_safe(urb, urbt, &q->endp->urb_list, urb_list) {
        vurb = urb->hcpriv;
        if (!bce_vhci_transfer_queue_can_init_urb(q))
            break;
        if (vurb->state == BCE_VHCI_URB_INIT_PENDING)
            bce_vhci_urb_init(vurb);
    }
}



static int bce_vhci_urb_data_start(struct bce_vhci_urb *urb, unsigned long *timeout);

int bce_vhci_urb_create(struct bce_vhci_transfer_queue *q, struct urb *urb)
{
    unsigned long flags;
    int status = 0;
    struct bce_vhci_urb *vurb;
    vurb = kzalloc(sizeof(struct bce_vhci_urb), GFP_KERNEL);
    
    if (!vurb)
        return -ENOMEM;

    /*Added: Starting reference count here */
    kref_init(&vurb->ref);

    urb->hcpriv = vurb;
    vurb->q = q;
    vurb->urb = urb;
    vurb->dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
    vurb->is_control = (usb_endpoint_num(&urb->ep->desc) == 0);

    spin_lock_irqsave(&q->urb_lock, flags);
    status = usb_hcd_link_urb_to_ep(q->vhci->hcd, urb);
    if (status) {
        spin_unlock_irqrestore(&q->urb_lock, flags);
        urb->hcpriv = NULL;
        kfree(vurb);
        return status;
    }

    if (q->active) {
        if (bce_vhci_transfer_queue_can_init_urb(vurb->q))
            status = bce_vhci_urb_init(vurb);
        else
            vurb->state = BCE_VHCI_URB_INIT_PENDING;
    } else {
        if (q->stalled)
            bce_vhci_transfer_queue_request_reset(q);
        vurb->state = BCE_VHCI_URB_INIT_PENDING;
    }
    if (status) {
        usb_hcd_unlink_urb_from_ep(q->vhci->hcd, urb);
        urb->hcpriv = NULL;
        kfree(vurb);
    } else {
        bce_vhci_transfer_queue_deliver_pending(q);
    }
    spin_unlock_irqrestore(&q->urb_lock, flags);
    pr_debug("bce-vhci: [%02x] URB enqueued (dir = %s, size = %i)\n", q->endp_addr,
            usb_urb_dir_in(urb) ? "IN" : "OUT", urb->transfer_buffer_length);
    return status;
}

static int bce_vhci_urb_init(struct bce_vhci_urb *vurb)
{
    int status = 0;

    if (vurb->q->remaining_active_requests == 0) {
        pr_err("bce-vhci: cannot init request (remaining_active_requests = 0)\n");
        return -EINVAL;
    }

    if (vurb->is_control) {
        vurb->state = BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_REQUEST;
    } else {
        status = bce_vhci_urb_data_start(vurb, NULL);
    }

    if (!status) {
        --vurb->q->remaining_active_requests;
    }
    return status;
}

static void bce_vhci_urb_complete(struct bce_vhci_urb *urb, int status)
{
    struct bce_vhci_transfer_queue *q = urb->q;
    struct bce_vhci *vhci = q->vhci;
    struct urb *real_urb = urb->urb;
    pr_debug("bce-vhci: [%02x] URB complete %i\n", q->endp_addr, status);
    usb_hcd_unlink_urb_from_ep(vhci->hcd, real_urb);
    real_urb->hcpriv = NULL;
    real_urb->status = status;
    if (urb->state != BCE_VHCI_URB_INIT_PENDING)
        ++urb->q->remaining_active_requests;
    kfree(urb);
    list_add_tail(&real_urb->urb_list, &q->giveback_urb_list);
}

static void vurb_release(struct kref *kref)
{
    struct bce_vhci_urb *vurb = container_of(kref, struct bce_vhci_urb, ref);
    kfree(vurb);
}

int bce_vhci_urb_request_cancel(struct bce_vhci_transfer_queue *q, struct urb *urb, int status)
{
    struct bce_vhci_urb *vurb;
    unsigned long flags;
    int ret;
    enum bce_vhci_urb_state old_state; // save old state to use later because we'll set state as cancelled

    pr_info("[BCE_DBG:%s] THIS URB IS PROCESSING NOW: urb=%p, ep=0x%02x, status=%d\n", __func__, urb, q->endp_addr, status);

    spin_lock_irqsave(&q->urb_lock, flags);
    if ((ret = usb_hcd_check_unlink_urb(q->vhci->hcd, urb, status))) {
        pr_emerg("[BCE_DBG:%s] Check failed (already unlinked?): urb=%p, ret=%d\n", __func__, urb, ret);
        spin_unlock_irqrestore(&q->urb_lock, flags);
        return ret;
    }

    vurb = urb->hcpriv;
    
    old_state = vurb->state; // save old state to use later because we'll set state as cancelled
    
    if (old_state == BCE_VHCI_URB_CANCELLED) {
        spin_unlock_irqrestore(&q->urb_lock, flags);
        pr_info("bce-vhci: URB %p IS ALREADY CANCELLED, SKIPPING\n", urb);
        return 0;
    }
    
    vurb->state = BCE_VHCI_URB_CANCELLED;
    
    /* Let's take an extra reference so that the cancellation process cleanup is not done twice. */
    kref_get(&vurb->ref);
    spin_unlock_irqrestore(&q->urb_lock, flags);
    
    /* If the URB wasn't posted to the device yet, we can still remove it on the host without pausing the queue. */
    if (old_state != BCE_VHCI_URB_INIT_PENDING) { // we're using old state here
        pr_debug("bce-vhci: [%02x] Cancelling URB\n", q->endp_addr);
        
        bce_vhci_transfer_queue_pause(q, BCE_VHCI_PAUSE_INTERNAL_WQ);
        spin_lock_irqsave(&q->urb_lock, flags);
        ++q->remaining_active_requests;
        spin_unlock_irqrestore(&q->urb_lock, flags);
    }

    usb_hcd_unlink_urb_from_ep(q->vhci->hcd, urb);
    usb_hcd_giveback_urb(q->vhci->hcd, urb, status);

    if (old_state != BCE_VHCI_URB_INIT_PENDING) // we're using old state here again
        bce_vhci_transfer_queue_resume(q, BCE_VHCI_PAUSE_INTERNAL_WQ);

    kref_put(&vurb->ref, vurb_release);

    return 0;
}

static int bce_vhci_urb_data_transfer_in(struct bce_vhci_urb *urb, unsigned long *timeout)
{
    struct bce_vhci_message msg;
    struct bce_qe_submission *s;
    u32 tr_len;
    int reservation1, reservation2 = -EFAULT;

    pr_debug("bce-vhci: [%02x] DMA from device %llx %x\n", urb->q->endp_addr,
             (u64) urb->urb->transfer_dma, urb->urb->transfer_buffer_length);

    /* Reserve both a message and a submission, so we don't run into issues later. */
    reservation1 = bce_reserve_submission(urb->q->vhci->msg_asynchronous.sq, timeout);
    if (!reservation1)
        reservation2 = bce_reserve_submission(urb->q->sq_in, timeout);
    if (reservation1 || reservation2) {
        pr_err("bce-vhci: Failed to reserve a submission for URB data transfer\n");
        if (!reservation1)
            bce_cancel_submission_reservation(urb->q->vhci->msg_asynchronous.sq);
        return -ENOMEM;
    }

    urb->send_offset = urb->receive_offset;

    tr_len = urb->urb->transfer_buffer_length - urb->send_offset;

    spin_lock(&urb->q->vhci->msg_asynchronous_lock);
    msg.cmd = BCE_VHCI_CMD_TRANSFER_REQUEST;
    msg.status = 0;
    msg.param1 = ((urb->urb->ep->desc.bEndpointAddress & 0x8Fu) << 8) | urb->q->dev_addr;
    msg.param2 = tr_len;
    bce_vhci_message_queue_write(&urb->q->vhci->msg_asynchronous, &msg);
    spin_unlock(&urb->q->vhci->msg_asynchronous_lock);

    s = bce_next_submission(urb->q->sq_in);
    bce_set_submission_single(s, urb->urb->transfer_dma + urb->send_offset, tr_len);
    bce_submit_to_device(urb->q->sq_in);

    urb->state = BCE_VHCI_URB_WAITING_FOR_COMPLETION;
    return 0;
}

static int bce_vhci_urb_data_start(struct bce_vhci_urb *urb, unsigned long *timeout)
{
    if (urb->dir == DMA_TO_DEVICE) {
        if (urb->urb->transfer_buffer_length > 0)
            urb->state = BCE_VHCI_URB_WAITING_FOR_TRANSFER_REQUEST;
        else
            urb->state = BCE_VHCI_URB_DATA_TRANSFER_COMPLETE;
        return 0;
    } else {
        return bce_vhci_urb_data_transfer_in(urb, timeout);
    }
}

static int bce_vhci_urb_send_out_data(struct bce_vhci_urb *urb, dma_addr_t addr, size_t size)
{
    struct bce_qe_submission *s;
    unsigned long timeout = 0;
    if (bce_reserve_submission(urb->q->sq_out, &timeout)) {
        pr_err("bce-vhci: Failed to reserve a submission for URB data transfer\n");
        return -EPIPE;
    }

    pr_debug("bce-vhci: [%02x] DMA to device %llx %zx\n", urb->q->endp_addr, (u64) addr, size);

    s = bce_next_submission(urb->q->sq_out);
    bce_set_submission_single(s, addr, size);
    bce_submit_to_device(urb->q->sq_out);
    return 0;
}

static int bce_vhci_urb_data_update(struct bce_vhci_urb *urb, struct bce_vhci_message *msg)
{
    u32 tr_len;
    int status;
    if (urb->state == BCE_VHCI_URB_WAITING_FOR_TRANSFER_REQUEST) {
        if (msg->cmd == BCE_VHCI_CMD_TRANSFER_REQUEST) {
            tr_len = min(urb->urb->transfer_buffer_length - urb->send_offset, (u32) msg->param2);
            if ((status = bce_vhci_urb_send_out_data(urb, urb->urb->transfer_dma + urb->send_offset, tr_len)))
                return status;
            urb->send_offset += tr_len;
            urb->state = BCE_VHCI_URB_WAITING_FOR_COMPLETION;
            return 0;
        }
    }

    /* 0x1000 in out queues aren't really unexpected */
    if (msg->cmd == BCE_VHCI_CMD_TRANSFER_REQUEST && urb->q->sq_out != NULL)
        return -EAGAIN;
    pr_err("bce-vhci: [%02x] %s URB unexpected message (state = %x, msg: %x %x %x %llx)\n",
            urb->q->endp_addr, (urb->is_control ? "Control (data update)" : "Data"), urb->state,
            msg->cmd, msg->status, msg->param1, msg->param2);
    return -EAGAIN;
}

static int bce_vhci_urb_data_transfer_completion(struct bce_vhci_urb *urb, struct bce_sq_completion_data *c)
{
    if (urb->state == BCE_VHCI_URB_WAITING_FOR_COMPLETION) {
        urb->receive_offset += c->data_size;
        if (urb->dir == DMA_FROM_DEVICE || urb->receive_offset >= urb->urb->transfer_buffer_length) {
            urb->urb->actual_length = (u32) urb->receive_offset;
            urb->state = BCE_VHCI_URB_DATA_TRANSFER_COMPLETE;
            if (!urb->is_control) {
                bce_vhci_urb_complete(urb, 0);
                return -ENOENT;
            }
        }
    } else {
        pr_err("bce-vhci: [%02x] Data URB unexpected completion\n", urb->q->endp_addr);
    }
    return 0;
}


static int bce_vhci_urb_control_check_status(struct bce_vhci_urb *urb)
{
    struct bce_vhci_transfer_queue *q = urb->q;
    if (urb->received_status == 0)
        return 0;
    if (urb->state == BCE_VHCI_URB_DATA_TRANSFER_COMPLETE ||
        (urb->received_status != BCE_VHCI_SUCCESS && urb->state != BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_REQUEST &&
        urb->state != BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_COMPLETION)) {
        urb->state = BCE_VHCI_URB_CONTROL_COMPLETE;
        if (urb->received_status != BCE_VHCI_SUCCESS) {
            pr_err("bce-vhci: [%02x] URB failed: %x\n", urb->q->endp_addr, urb->received_status);
            urb->q->active = false;
            urb->q->stalled = true;
            bce_vhci_urb_complete(urb, -EPIPE);
            if (!list_empty(&q->endp->urb_list))
                bce_vhci_transfer_queue_request_reset(q);
            return -ENOENT;
        }
        bce_vhci_urb_complete(urb, 0);
        return -ENOENT;
    }
    return 0;
}

static int bce_vhci_urb_control_update(struct bce_vhci_urb *urb, struct bce_vhci_message *msg)
{
    int status;
    if (msg->cmd == BCE_VHCI_CMD_CONTROL_TRANSFER_STATUS) {
        urb->received_status = msg->status;
        return bce_vhci_urb_control_check_status(urb);
    }

    if (urb->state == BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_REQUEST) {
        if (msg->cmd == BCE_VHCI_CMD_TRANSFER_REQUEST) {
            if (bce_vhci_urb_send_out_data(urb, urb->urb->setup_dma, sizeof(struct usb_ctrlrequest))) {
                pr_err("bce-vhci: [%02x] Failed to start URB setup transfer\n", urb->q->endp_addr);
                return 0; /* TODO: fail the URB? */
            }
            urb->state = BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_COMPLETION;
            pr_debug("bce-vhci: [%02x] Sent setup %llx\n", urb->q->endp_addr, urb->urb->setup_dma);
            return 0;
        }
    } else if (urb->state == BCE_VHCI_URB_WAITING_FOR_TRANSFER_REQUEST ||
               urb->state == BCE_VHCI_URB_WAITING_FOR_COMPLETION) {
        if ((status = bce_vhci_urb_data_update(urb, msg)))
            return status;
        return bce_vhci_urb_control_check_status(urb);
    }

    /* 0x1000 in out queues aren't really unexpected */
    if (msg->cmd == BCE_VHCI_CMD_TRANSFER_REQUEST && urb->q->sq_out != NULL)
        return -EAGAIN;
    pr_err("bce-vhci: [%02x] Control URB unexpected message (state = %x, msg: %x %x %x %llx)\n", urb->q->endp_addr,
            urb->state, msg->cmd, msg->status, msg->param1, msg->param2);
    return -EAGAIN;
}

static int bce_vhci_urb_control_transfer_completion(struct bce_vhci_urb *urb, struct bce_sq_completion_data *c)
{
    int status;
    unsigned long timeout;

    if (urb->state == BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_COMPLETION) {
        if (c->data_size != sizeof(struct usb_ctrlrequest))
            pr_err("bce-vhci: [%02x] transfer complete data size mismatch for usb_ctrlrequest (%llx instead of %zx)\n",
                   urb->q->endp_addr, c->data_size, sizeof(struct usb_ctrlrequest));

        timeout = 1000;
        status = bce_vhci_urb_data_start(urb, &timeout);
        if (status) {
            bce_vhci_urb_complete(urb, status);
            return -ENOENT;
        }
        return 0;
    } else if (urb->state == BCE_VHCI_URB_WAITING_FOR_TRANSFER_REQUEST ||
               urb->state == BCE_VHCI_URB_WAITING_FOR_COMPLETION) {
        if ((status = bce_vhci_urb_data_transfer_completion(urb, c)))
            return status;
        return bce_vhci_urb_control_check_status(urb);
    } else {
        pr_err("bce-vhci: [%02x] Control URB unexpected completion (state = %x)\n", urb->q->endp_addr, urb->state);
    }
    return 0;
}

static int bce_vhci_urb_update(struct bce_vhci_urb *urb, struct bce_vhci_message *msg)
{
    if (urb->state == BCE_VHCI_URB_INIT_PENDING)
        return -EAGAIN;
    if (urb->is_control)
        return bce_vhci_urb_control_update(urb, msg);
    else
        return bce_vhci_urb_data_update(urb, msg);
}

static int bce_vhci_urb_transfer_completion(struct bce_vhci_urb *urb, struct bce_sq_completion_data *c)
{
    if (urb->is_control)
        return bce_vhci_urb_control_transfer_completion(urb, c);
    else
        return bce_vhci_urb_data_transfer_completion(urb, c);
}

static void bce_vhci_urb_resume(struct bce_vhci_urb *urb)
{
    int status = 0;
    if (urb->state == BCE_VHCI_URB_WAITING_FOR_COMPLETION) {
        status = bce_vhci_urb_data_transfer_in(urb, NULL);
    }
    if (status)
        bce_vhci_urb_complete(urb, status);
}
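
The cancel path at the top of this file combines two ideas: a CANCELLED state that makes a second cancel request back off, and an extra reference that keeps the vurb alive while the cancel is in progress. A minimal userspace sketch of that pattern follows. All names in it (`model_urb`, `model_urb_cancel`, and so on) are hypothetical stand-ins, the plain `int` counter only imitates `struct kref`, and the sketch is single-threaded, so it leaves out the locking the driver needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the driver. */
enum model_urb_state { MODEL_URB_ACTIVE, MODEL_URB_CANCELLED };

struct model_urb {
    enum model_urb_state state;
    int refs;          /* plays the role of struct kref */
    bool *freed_flag;  /* set to true when the last reference is dropped */
};

static struct model_urb *model_urb_create(bool *freed_flag)
{
    struct model_urb *u = malloc(sizeof(*u));
    if (!u)
        return NULL;
    u->state = MODEL_URB_ACTIVE;
    u->refs = 1;               /* like kref_init(): the creator owns one reference */
    u->freed_flag = freed_flag;
    *freed_flag = false;
    return u;
}

static void model_urb_get(struct model_urb *u)
{
    u->refs++;
}

/* Like kref_put(): frees the object when the count reaches zero. */
static void model_urb_put(struct model_urb *u)
{
    assert(u->refs > 0);
    if (--u->refs == 0) {
        *u->freed_flag = true;
        free(u);
    }
}

/* Returns true if this call performed the cancellation, false if it was skipped. */
static bool model_urb_cancel(struct model_urb *u)
{
    if (u->state == MODEL_URB_CANCELLED)
        return false;          /* a second canceller backs off */
    u->state = MODEL_URB_CANCELLED;
    model_urb_get(u);          /* pin the object while we work on it */
    /* ... unlink and giveback would happen here ... */
    model_urb_put(u);          /* drop the temporary reference */
    return true;
}
```

In this model, calling `model_urb_cancel` twice on the same object performs the work once, and the object is freed only when the creator's reference is dropped with `model_urb_put`. Whether the real driver drops that last reference on every path is exactly what a review of the patch would need to check.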

Here is the new transfer.h:

#ifndef BCEDRIVER_TRANSFER_H
#define BCEDRIVER_TRANSFER_H

#include <linux/usb.h>
#include "queue.h"
#include "command.h"
#include "../queue.h"
#include <linux/kref.h>

struct bce_vhci_list_message {
    struct list_head list;
    struct bce_vhci_message msg;
};
enum bce_vhci_pause_source {
    BCE_VHCI_PAUSE_INTERNAL_WQ = 1,
    BCE_VHCI_PAUSE_FIRMWARE = 2,
    BCE_VHCI_PAUSE_SUSPEND = 4,
    BCE_VHCI_PAUSE_SHUTDOWN = 8
};
struct bce_vhci_transfer_queue {
    struct bce_vhci *vhci;
    struct usb_host_endpoint *endp;
    enum bce_vhci_endpoint_state state;
    u32 max_active_requests, remaining_active_requests;
    bool active, stalled;
    u32 paused_by;
    bce_vhci_device_t dev_addr;
    u8 endp_addr;
    struct bce_queue_cq *cq;
    struct bce_queue_sq *sq_in;
    struct bce_queue_sq *sq_out;
    struct list_head evq;
    struct spinlock urb_lock;
    struct mutex pause_lock;
    struct list_head giveback_urb_list;

    struct work_struct w_reset;
};
enum bce_vhci_urb_state {
    BCE_VHCI_URB_INIT_PENDING,

    BCE_VHCI_URB_WAITING_FOR_TRANSFER_REQUEST,
    BCE_VHCI_URB_WAITING_FOR_COMPLETION,
    BCE_VHCI_URB_DATA_TRANSFER_COMPLETE,

    BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_REQUEST,
    BCE_VHCI_URB_CONTROL_WAITING_FOR_SETUP_COMPLETION,
    BCE_VHCI_URB_CONTROL_COMPLETE,
    
    BCE_VHCI_URB_CANCELLED   /* New state */
};
struct bce_vhci_urb {
    struct urb *urb;
    struct bce_vhci_transfer_queue *q;
    enum dma_data_direction dir;
    bool is_control;
    enum bce_vhci_urb_state state;
    int received_status;
    u32 send_offset;
    u32 receive_offset;
    struct kref ref;  /* Added: urb refcount */
};

void bce_vhci_create_transfer_queue(struct bce_vhci *vhci, struct bce_vhci_transfer_queue *q,
        struct usb_host_endpoint *endp, bce_vhci_device_t dev_addr, enum dma_data_direction dir);
void bce_vhci_destroy_transfer_queue(struct bce_vhci *vhci, struct bce_vhci_transfer_queue *q);
void bce_vhci_transfer_queue_event(struct bce_vhci_transfer_queue *q, struct bce_vhci_message *msg);
int bce_vhci_transfer_queue_do_pause(struct bce_vhci_transfer_queue *q);
int bce_vhci_transfer_queue_do_resume(struct bce_vhci_transfer_queue *q);
int bce_vhci_transfer_queue_pause(struct bce_vhci_transfer_queue *q, enum bce_vhci_pause_source src);
int bce_vhci_transfer_queue_resume(struct bce_vhci_transfer_queue *q, enum bce_vhci_pause_source src);
void bce_vhci_transfer_queue_request_reset(struct bce_vhci_transfer_queue *q);

int bce_vhci_urb_create(struct bce_vhci_transfer_queue *q, struct urb *urb);
int bce_vhci_urb_request_cancel(struct bce_vhci_transfer_queue *q, struct urb *urb, int status);

#endif //BCEDRIVER_TRANSFER_H
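
The values in `enum bce_vhci_pause_source` are powers of two and the queue stores a `u32 paused_by`, which suggests that pause sources are tracked as a bitmask. The sketch below shows how such a mask could behave, under the assumption that the queue is paused on the device when the first source sets its bit and resumed only when the last bit is cleared. The pause and resume implementations are not shown in this thread, so `model_queue`, `model_pause`, and `model_resume` are hypothetical and may differ from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from enum bce_vhci_pause_source in transfer.h. */
enum pause_source {
    PAUSE_INTERNAL_WQ = 1,
    PAUSE_FIRMWARE = 2,
    PAUSE_SUSPEND = 4,
    PAUSE_SHUTDOWN = 8
};

/* Hypothetical model of a queue; only paused_by mirrors a real field. */
struct model_queue {
    uint32_t paused_by;  /* bitmask of currently active pause sources */
    int hw_pauses;       /* how often the modelled device was paused */
    int hw_resumes;      /* how often the modelled device was resumed */
};

static void model_pause(struct model_queue *q, enum pause_source src)
{
    assert(src != 0);
    if (q->paused_by == 0)
        q->hw_pauses++;              /* first source actually pauses the device */
    q->paused_by |= (uint32_t)src;
}

static void model_resume(struct model_queue *q, enum pause_source src)
{
    if (!(q->paused_by & (uint32_t)src))
        return;                      /* this source was not holding a pause */
    q->paused_by &= ~(uint32_t)src;
    if (q->paused_by == 0)
        q->hw_resumes++;             /* last source gone: resume the device */
}

static bool model_is_paused(const struct model_queue *q)
{
    return q->paused_by != 0;
}
```

With this scheme, a cancel that pauses with `PAUSE_INTERNAL_WQ` while a suspend is also holding the queue does not resume the device early: the queue stays paused until both sources have called `model_resume`.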

Also, sorry for pasting the full transfer.c and transfer.h files here. I don't know how to create a diff to share on GitHub, and I am not sure it is worth forking the repository for my changes.
It works on my Mac, but I am not 100% sure that my changes are safe to apply or that this is the right way to fix the issue.
Please review my code and let me know if anything is missing or wrong.
Thanks.

@AdityaGarg8
Member

AdityaGarg8 commented Apr 13, 2025

> I hope I can find a solution because this is very annoying :((((

Are you able to reproduce this after removing the dkms driver as well?

sudo apt purge apple-bce

And then restart.

@AdityaGarg8
Member

> > I hope I can find a solution because this is very annoying :((((
>
> Are you able to reproduce this after removing the dkms driver as well?
>
> sudo apt purge apple-bce
>
> And then restart.

Never mind, I managed to reproduce this. So it definitely seems to fix the camera, but I am concerned about the rearranged spinlocks. Any reason for that change?

AdityaGarg8 pushed a commit to AdityaGarg8/apple-bce-drv that referenced this issue Apr 13, 2025
@mnural
Author

mnural commented Apr 13, 2025

int bce_vhci_urb_request_cancel(struct bce_vhci_transfer_queue *q, struct urb *urb, int status)
{
    struct bce_vhci_urb *vurb;
    unsigned long flags;
    int ret;
    enum bce_vhci_urb_state old_state;

Acquire the spinlock at the beginning, as in the original code:

    spin_lock_irqsave(&q->urb_lock, flags);

    if ((ret = usb_hcd_check_unlink_urb(q->vhci->hcd, urb, status))) {
        pr_emerg("[BCE_DBG:%s] Check failed (already unlinked?): urb=%p, ret=%d\n",
                 __func__, urb, ret);
        spin_unlock_irqrestore(&q->urb_lock, flags);
        return ret;
    }

    vurb = urb->hcpriv;
    old_state = vurb->state;

If the URB is already cancelled, release the lock and exit early.

    if (old_state == BCE_VHCI_URB_CANCELLED) {
        spin_unlock_irqrestore(&q->urb_lock, flags);
        pr_info("bce-vhci: URB %p IS ALREADY CANCELLED, SKIPPING\n", urb);
        return 0;
    }

    vurb->state = BCE_VHCI_URB_CANCELLED;
    kref_get(&vurb->ref);

If the URB was beyond the INIT_PENDING state, pause and resume the queue. If I don't rearrange the spinlock layout here, I get lots of "BUG: scheduling while atomic: pipewire/1639/0x00000000" errors in journalctl. If I release the spinlock here, the errors disappear.

    spin_unlock_irqrestore(&q->urb_lock, flags);

    if (old_state != BCE_VHCI_URB_INIT_PENDING) {
        pr_debug("bce-vhci: [%02x] Cancelling URB (complex path)\n", q->endp_addr);
        bce_vhci_transfer_queue_pause(q, BCE_VHCI_PAUSE_INTERNAL_WQ);

Re-acquire the spinlock briefly to update shared counters.

        spin_lock_irqsave(&q->urb_lock, flags);
        ++q->remaining_active_requests;
        spin_unlock_irqrestore(&q->urb_lock, flags);
    }

Unlink the URB from the endpoint and give it back.
Again, if I hold the spinlock here, I get the "BUG: scheduling while atomic: pipewire/1639/0x00000000" error.
I couldn't figure out the exact cause of this error.
That's why I decided to release the lock before this point.

    usb_hcd_unlink_urb_from_ep(q->vhci->hcd, urb);
    usb_hcd_giveback_urb(q->vhci->hcd, urb, status);

Then resume the queue if it was paused and drop the extra reference:

    if (old_state != BCE_VHCI_URB_INIT_PENDING)
        bce_vhci_transfer_queue_resume(q, BCE_VHCI_PAUSE_INTERNAL_WQ);

    kref_put(&vurb->ref, vurb_release);
    
    pr_emerg("[BCE_RACE_DBG:%s] Exiting: urb=%p\n", __func__, urb);
    return 0;
}

static void vurb_release(struct kref *kref)
{
    struct bce_vhci_urb *vurb = container_of(kref, struct bce_vhci_urb, ref);
    kfree(vurb);
}

Briefly, the spinlocks have been rearranged because of the "BUG: scheduling while atomic: pipewire/1639/0x00000000" error. This error doesn't end in the GPF, but it is concerning. With this layout I don't get it.
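
For readers unfamiliar with this warning: the kernel reports "scheduling while atomic" when code that may sleep runs while a spinlock is held. One plausible explanation here, stated as an assumption, is that `bce_vhci_transfer_queue_pause` can sleep (the queue carries a `struct mutex pause_lock`), so it must not be called under `urb_lock`. The userspace sketch below models only the bookkeeping; `model_spin_lock`, `model_might_sleep`, and the two `cancel_*` functions are hypothetical and are not kernel APIs.

```c
#include <assert.h>

/* Hypothetical userspace model; these are not kernel APIs. */
static int atomic_depth;      /* > 0 while a modelled spinlock is held */
static int sleep_violations;  /* counts modelled "scheduling while atomic" events */

static void model_spin_lock(void)
{
    atomic_depth++;
}

static void model_spin_unlock(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Stand-in for any call that may sleep, for example taking a mutex. */
static void model_might_sleep(void)
{
    if (atomic_depth > 0)
        sleep_violations++;
}

/* Problem pattern: the sleeping call is made while the lock is held. */
static int cancel_holding_lock(void)
{
    sleep_violations = 0;
    model_spin_lock();
    model_might_sleep();      /* modelled queue pause, still in atomic context */
    model_spin_unlock();
    return sleep_violations;
}

/* Layout described above: drop the lock, pause, then re-take it briefly. */
static int cancel_dropping_lock(int *counter)
{
    sleep_violations = 0;
    model_spin_lock();
    /* state would be read and updated under the lock here */
    model_spin_unlock();

    model_might_sleep();      /* modelled queue pause with no lock held */

    model_spin_lock();
    ++*counter;               /* short critical section for the shared counter */
    model_spin_unlock();
    return sleep_violations;
}
```

In the model, `cancel_holding_lock` records one violation and `cancel_dropping_lock` records none. The trade-off is that shared state can change between the unlock and the re-lock, which is why the patch marks the URB as cancelled and takes an extra reference before releasing the lock.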

@AdityaGarg8
Member

I've opened a PR for others to review this code. Since it's AI-generated, it's better to have some human reviewers as well. If it passes the review, it should be available in the kernels soon.

AdityaGarg8 pushed a commit to AdityaGarg8/apple-bce-drv that referenced this issue Apr 14, 2025
@mnural mnural closed this as completed Apr 14, 2025
@AdityaGarg8
Member

Reopening since it has not been merged yet.

@AdityaGarg8 AdityaGarg8 reopened this Apr 14, 2025
@AdityaGarg8
Member

Also, the code has memory leaks that need to be fixed. It's not fit to be shipped yet.

@AdityaGarg8
Member

But it's better to avoid AI for this.
