Skip to content

fix(dcd): Fixed race condition on device disconnect #55

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

tore-espressif
Copy link
Collaborator

@tore-espressif tore-espressif commented Apr 9, 2025

TinyUSB does not provide any locking means to protect the DCD variables. This can lead to race conditions when the user is trying to submit a transfer while the device is being disconnected. This can cause the device to be in an inconsistent state, leading to a crash or undefined behavior.

This commit adds a spin-lock to protect the DCD variables during device disconnect.

@igi540
Copy link

igi540 commented Apr 9, 2025

LGTM.

@roma-jam
Copy link
Collaborator

roma-jam commented Apr 9, 2025

I think that for better compatibility with the upstream, we need to provide this mechanism in the dwc2_esp32.h file with default definition for all other controllers. In other words, make a minimum changes in the upstream dcd_dwc2.c file.

Otherwise we just break all the builds for other controllers family which potentially will lead to difficulties during next synchronizations.

@tore-espressif tore-espressif force-pushed the fix/disconnect_race_condition branch from e9ba17b to fa392ab Compare April 9, 2025 12:58
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
@tore-espressif tore-espressif force-pushed the fix/disconnect_race_condition branch from fa392ab to d4a39f5 Compare April 9, 2025 13:33
@tore-espressif
Copy link
Collaborator Author

I think that for better compatibility with the upstream, we need to provide this mechanism in the dwc2_esp32.h file with default definition for all other controllers. In other words, make a minimum changes in the upstream dcd_dwc2.c file.

Otherwise we just break all the builds for other controllers family which potentially will lead to difficulties during next synchronizations.

@roma-jam Thank you for the review!

I added new file dwc2_critical.h because adding it to dwc2_esp32.h produced too many compiler warnings

@tore-espressif
Copy link
Collaborator Author

@roma-jam do you agree with the changes now? can we merge and release?

@roma-jam
Copy link
Collaborator

@tore-espressif give me a moment, I want to check it on a hardware.

@roma-jam
Copy link
Collaborator

@tore-espressif Unfortunately, I'm not able to reproduce the problem neither with tusb_hid, nor with tusb_msc examples. I will try more.

May I ask you to share the log, how the application handled the situation, while we return false; // Endpoint is closed at the line 593?

@tore-espressif
Copy link
Collaborator Author

The bug can be 100% reproduced in https://github.com/espressif/esp-usb/tree/master/host/class/msc/usb_host_msc/test_app test number 2 and esp-idf release/v5.2 branch.

@roma-jam
Copy link
Collaborator

The bug can be 100% reproduced in https://github.com/espressif/esp-usb/tree/master/host/class/msc/usb_host_msc/test_app test number 2 and esp-idf release/v5.2 branch.

but only in espressif/esp-usb#131, not in master branch.

Copy link
Collaborator

@roma-jam roma-jam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing: is there could be a possible situation when GINTSTS_USBRST and GINTSTS_OEPINT or GINTSTS_IEPINT will be set in the same time, we could get the division by zero again.

Maybe we need to cover all the calls of edpt_schedule_packets() in this case to prevent such logic scenario.

Otherwise, LGTM.

@tore-espressif
Copy link
Collaborator Author

One thing: is there could be a possible situation when GINTSTS_USBRST and GINTSTS_OEPINT or GINTSTS_IEPINT will be set in the same time, we could get the division by zero again.

I see. However, practically, having both Rest and EP interrupt at the same time should never happen if we handle interrupts reasonably fast...

I'll merge as it is and will continue watching the long-term stability

@tore-espressif tore-espressif merged commit 4095aba into release/v0.18 Apr 11, 2025
91 checks passed
@tore-espressif tore-espressif deleted the fix/disconnect_race_condition branch April 11, 2025 08:55
@roma-jam roma-jam added this to the v0.18.0.2 milestone Apr 11, 2025
@tore-espressif tore-espressif restored the fix/disconnect_race_condition branch April 30, 2025 12:34
@tore-espressif tore-espressif deleted the fix/disconnect_race_condition branch May 2, 2025 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants