fix(usb_host): Give semaphore on attempted close of non-opened device (IDFGH-14893) #15608

Conversation

mykmelez
Contributor

Description

If you call usb_host_device_close() for a device that isn't open, the function exits early, without giving back the semaphore it took, which causes any other call that tries to take that semaphore to hang indefinitely.

Strangely, there's redundant handling of this condition, with two checks in a row that both handle the case where _check_client_opened_device(client_obj, dev_addr) returns false:

    HOST_CHECK_FROM_CRIT(_check_client_opened_device(client_obj, dev_addr), ESP_ERR_NOT_FOUND);
    if (!_check_client_opened_device(client_obj, dev_addr)) {
        // Client never opened this device
        ret = ESP_ERR_INVALID_STATE;
        HOST_EXIT_CRITICAL();
        goto exit;
    }
…
exit:
    xSemaphoreGive(p_host_lib_obj->constant.mux_lock);
    return ret;

The first line is the one that exits early: when its first parameter is false, HOST_CHECK_FROM_CRIT exits the critical section and returns its second parameter from the enclosing function, without giving back the semaphore.
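To make the failure mode concrete, here is a minimal host-side sketch of a check macro that returns from the calling function while a lock is still held. Every name in it (`CHECK_FROM_CRIT`, `buggy_close`, `lock_take`, `mux_lock_held`, and so on) is a hypothetical stand-in invented for illustration; it models the behavior described above and does not reproduce the ESP-IDF macro or the FreeRTOS semaphore.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the ESP-IDF definitions. */
typedef int err_t;
#define ERR_OK        0
#define ERR_NOT_FOUND 1

int mux_lock_held = 0;  /* models the mux_lock semaphore */
int crit_depth = 0;     /* models critical section nesting */

/* A real semaphore take would block while the lock is held;
 * this model flags that situation with an assertion instead. */
void lock_take(void)      { assert(!mux_lock_held); mux_lock_held = 1; }
void lock_give(void)      { assert(mux_lock_held);  mux_lock_held = 0; }
void enter_critical(void) { crit_depth++; }
void exit_critical(void)  { crit_depth--; }

/* Leaves the critical section, then returns from the CALLING function. */
#define CHECK_FROM_CRIT(cond, ret_val) do { \
        if (!(cond)) {                      \
            exit_critical();                \
            return (ret_val);               \
        }                                   \
    } while (0)

err_t buggy_close(bool client_opened_device)
{
    lock_take();
    enter_critical();
    /* On failure this returns immediately, skipping lock_give() below. */
    CHECK_FROM_CRIT(client_opened_device, ERR_NOT_FOUND);
    exit_critical();
    lock_give();
    return ERR_OK;
}
```

In this model, calling `buggy_close(false)` leaves `mux_lock_held` set even though the critical section has been exited, so the next `lock_take()` would trip the assertion, which corresponds to the indefinite hang on the real semaphore.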

The subsequent block handles the exact same case, except that it ensures the semaphore is given back before returning. Currently, this block is never reached.

Perhaps the first check was added, then someone noticed the issue and added the second check, but they forgot to remove the first one.

In any case, this PR removes the first check, so the second check can properly handle this case by giving back the semaphore before returning.
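The resulting control flow can be sketched as follows, again as a host-side model with hypothetical stand-in names (`fixed_close`, `lock_take`, `mux_lock_held`, and the error constants are assumptions for illustration, not the library's code). With the early-returning check gone, every path funnels through a single `exit` label that gives the lock back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the ESP-IDF definitions. */
typedef int err_t;
#define ERR_OK            0
#define ERR_INVALID_STATE 2

int mux_lock_held = 0;  /* models the mux_lock semaphore */
int crit_depth = 0;     /* models critical section nesting */

void lock_take(void)      { assert(!mux_lock_held); mux_lock_held = 1; }
void lock_give(void)      { assert(mux_lock_held);  mux_lock_held = 0; }
void enter_critical(void) { crit_depth++; }
void exit_critical(void)  { crit_depth--; }

err_t fixed_close(bool client_opened_device)
{
    err_t ret = ERR_OK;

    lock_take();
    enter_critical();
    if (!client_opened_device) {
        /* Client never opened this device: record the error,
         * leave the critical section, and take the common exit path. */
        ret = ERR_INVALID_STATE;
        exit_critical();
        goto exit;
    }
    /* The actual close work would happen here. */
    exit_critical();

exit:
    /* Single exit point: the lock is released on success and on error. */
    lock_give();
    return ret;
}
```

Going by the snippet in the description, one visible side effect is that this case now reports `ESP_ERR_INVALID_STATE` rather than `ESP_ERR_NOT_FOUND`, since the remaining check sets a different error code than the removed one.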

This bug appears to have been present in the initial commit of the USB Host library to the ESP-IDF repo: accbaee

Of course, if you never try to close a non-opened device, then you won't encounter it! Unfortunately, I have some code that tried to do that, which is how I found the issue.

Related

Testing

Before this change, I stepped through usb_host_device_close() in GDB to confirm that it was exiting early without giving back the semaphore when my code tried to close a device that wasn't open. After this change, I stepped through the function again to confirm that it was giving back the semaphore before exiting.

Also, before this change, I observed that threads calling other functions that take the same semaphore, such as usb_host_client_register() and usb_host_client_deregister(), hung indefinitely waiting for it. After this change, those threads no longer hung.


Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • All CI checks (GH Actions) pass.
  • Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • Code is well-commented, especially in complex areas.
  • Git history is clean — commits are squashed to the minimum necessary.

github-actions bot commented Mar 20, 2025

Messages
📖 🎉 Good Job! All checks are passing!

👋 Hello mykmelez, we appreciate your contribution to this project!


📘 Please review the project's Contributions Guide for key guidelines on code, documentation, testing, and more.

🖊️ Please also make sure you have read and signed the Contributor License Agreement for this project.

This automated output is generated by the PR linter DangerJS, which checks if your Pull Request meets the project's requirements and helps you fix potential issues.

DangerJS is triggered with each push event to a Pull Request and modifies the contents of this comment.

Please consider the following:
- Danger mainly focuses on the PR structure and formatting and can't understand the meaning behind your code or changes.
- Danger is not a substitute for human code reviews; it's still important to request a code review from your colleagues.
- To manually retry these Danger checks, please navigate to the Actions tab and re-run the last Danger workflow.

Review and merge process you can expect ...


We do welcome contributions in the form of bug reports, feature requests and pull requests via this public GitHub repository.

This GitHub project is a public mirror of our internal git repository.

1. An internal issue has been created for the PR, we assign it to the relevant engineer.
2. They review the PR and either approve it or ask you for changes or clarifications.
3. Once the GitHub PR is approved, we synchronize it into our internal git repository.
4. In the internal git repository we do the final review, collect approvals from core owners and make sure all the automated tests are passing.
   - At this point we may make some adjustments to the proposed change, or extend it by adding tests or documentation.
5. If the change is approved and passes the tests, it is merged into the default branch.
6. On the next sync from the internal git repository, the merged change will appear in this public GitHub repository.

Generated by 🚫 dangerJS against f560629

@mykmelez mykmelez force-pushed the usb-host-device-close-err-give-semaphore branch from 1bc06ab to f560629 Compare March 20, 2025 01:04
@mykmelez mykmelez changed the title Give semaphore on attempted close of non-opened device fix(usb_host): Give semaphore on attempted close of non-opened device Mar 20, 2025
@github-actions github-actions bot changed the title fix(usb_host): Give semaphore on attempted close of non-opened device fix(usb_host): Give semaphore on attempted close of non-opened device (IDFGH-14893) Mar 20, 2025
@espressif-bot espressif-bot added the Status: Opened Issue is new label Mar 20, 2025
@tore-espressif
Collaborator

@mykmelez Thank you very much for the fix!

We will merge internally and backport to all active release branches

@espressif-bot espressif-bot added Status: Selected for Development Issue is selected for development and removed Status: Opened Issue is new labels Mar 20, 2025
@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Selected for Development Issue is selected for development labels Apr 1, 2025
@igi540
Collaborator

igi540 commented Apr 2, 2025

sha=a3864c088dafb0b8ce94dba272685b850b46c837

@tore-espressif
Collaborator

Thanks again, the commit is already in the master branch: aa669fa

The PR was not closed automatically, so closing now!

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: In Progress Work is in progress labels Apr 11, 2025
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

4 participants