Skip to content

fix(tusb_msc): Cache sector metrics, enable storage buffering #131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

igi540
Copy link
Collaborator

@igi540 igi540 commented Feb 5, 2025

πŸš€ Optimize USB MSC Performance on ESP32-S3, ESP32-P4, and ESP32-S2


βœ… Requirements

Why is this needed?

  • ESP32-S3 USB Mass Storage (MSC) implementation showed suboptimal performance (slow file copy speed).
  • Observed issues:
    • ⚠️ SPI Flash write speed was severely limited due to architectural constraints.
    • πŸ“‰ SD card speeds varied significantly based on FIFO size, but were still below expectations.
  • Goal: Optimize data transfer efficiency for both SPI Flash and SD cards.

πŸ“ Description

This PR improves USB Mass Storage (MSC) performance by:

  • βœ… Caching get_sector_size and get_sector_count to reduce redundant queries and improve efficiency.
  • βœ… Optimizing buffering logic:
    • A single-buffer approach is used for all targets (ESP32-S3, ESP32-P4, ESP32-S2)
    • Buffering significantly improves SD card write speed, while SPI Flash shows minimal improvement due to hardware constraints.

πŸ“Š Observed Results

ESP32-S3

SD Card Performance

FIFO Size Read Speed (Old) Read Speed (New) Write Speed (Old) Write Speed (New)
512B 0.561 MB/s 0.566 MB/s 0.191 MB/s 0.236 MB/s
8192B 0.925 MB/s 0.925 MB/s 0.726 MB/s 0.928 MB/s

βœ… Write speed improved significantly with 8192B FIFO (~27% increase).
βœ… Read speed remains stable at ~0.9 MB/s.

SPI Flash Performance

FIFO Size Write Speed (Old) Write Speed (New)
512B 8.21 KB/s 8.46 KB/s
8192B 30.31 KB/s 30.2 KB/s

⚠ Minimal improvement for SPI Flash (limited by write speed).


ESP32-P4

SD Card Performance

FIFO Size Read Speed (Old) Read Speed (New) Write Speed (Old) Write Speed (New)
512B 1.175 MB/s 1.174 MB/s 0.234 MB/s 0.238 MB/s
8192B 4.786 MB/s 4.744 MB/s 2.021 MB/s 2.157 MB/s
32768B 5.833 MB/s 5.998 MB/s 3.673 MB/s 4.485 MB/s

βœ… Larger FIFO sizes (32KB) significantly improved SD card write speed (~22% increase).

SPI Flash Performance

FIFO Size Write Speed (Old) Write Speed (New)
512B 7.44 KB/s 7.57 KB/s
8192B 28.44 KB/s 28.42 KB/s
32768B 28.12 KB/s 28.24 KB/s

πŸ”Έ No major changes in SPI Flash performance on ESP32-P4.


ESP32-S2

SPI Flash Performance (Corrected Values)

FIFO Size Write Speed (Old) Write Speed (New)
512B 5.50 KB/s 5.59 KB/s
8192B 21.52 KB/s 21.54 KB/s

πŸ”Έ Minimal improvements observed, similar to ESP32-S3 results.
⚠ SPI Flash performance remains limited due to architectural constraints.

🚫 Note: SD card is currently not supported in MSC device mode for ESP32-S2.


πŸ§ͺ Measurement Method

  • SD card performance was measured using SpeedOut on Windows (applies to all supported SoCs).
  • SPI Flash performance was measured using a custom Windows Bash script (applies to all supported SoCs).

🚩 Why SPI Flash Performance is Limited?

The core limitation is that program execution and storage reside on the same flash chip:

  • 🧠 During a flash write operation, program execution from flash is suspended.
  • ⏱ Interrupts not loaded into IRAM cannot run, severely slowing down USB transfers.
  • ⚠️ This explains why SPI Flash remains slow, even with buffering optimizations.

SD cards, on the other hand, are not affected by this limitation, which is why they show significant improvements with optimized buffering.


πŸ”— Related


πŸ“Œ Conclusion

βœ… SD card performance now approaches USB Full-Speed theoretical limit (~1 MB/s on ESP32-S3).
βœ… Write speed improvements are significant for SD cards, especially with larger FIFO buffers (up to ~4.5 MB/s on ESP32-P4).
⚠ SPI Flash remains speed-limited due to architectural constraints, with only minor improvements observed.

πŸš€ This PR significantly enhances SD card performance and optimizes MSC efficiency across ESP32 targets!

@igi540 igi540 self-assigned this Feb 5, 2025
@igi540 igi540 force-pushed the fix/msc_device_performance branch from 4571365 to 92e6469 Compare February 5, 2025 15:30
Copy link
Collaborator

@roma-jam roma-jam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR! Description is perfect! Nice idea to use the resource of the same task, which runs tud_task().

I pointed some general things that affects the logic. Based on the that, may I ask you to try and remove one buffer? The logic will be much simpler: get write10 callback, memcpy data, usbd_defer_func write to storage.

As far as we don't use both buffers at the same time, this change could help us:

  • simplify the logic
  • decrease memory usage (twice)
  • do not loose in speed comparing to your current results

@igi540 igi540 force-pushed the fix/msc_device_performance branch from 92e6469 to 851d69d Compare February 7, 2025 16:18
@igi540 igi540 changed the title fix(tusb_msc): Cache sector metrics, enable ping-pong buffering on ESP32S3 fix(tusb_msc): Cache sector metrics, enable buffering on ESP32S3 Feb 7, 2025
@igi540 igi540 force-pushed the fix/msc_device_performance branch 2 times, most recently from 83da0d2 to 22cf787 Compare February 9, 2025 20:21
@igi540 igi540 changed the title fix(tusb_msc): Cache sector metrics, enable buffering on ESP32S3 fix(tusb_msc): Cache sector metrics, enable storage buffering Feb 9, 2025
Copy link
Collaborator

@tore-espressif tore-espressif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igi540 Overall LGTM.

Only 1 major question about adding the storage_buffer into tinyusb_msc_storage_handle struct. This way, we would have 1 handle for 1 storage instance

@roma-jam
Copy link
Collaborator

@igi540 and please check the CI fail: APP: Write data after disconnect

@igi540 igi540 force-pushed the fix/msc_device_performance branch 2 times, most recently from 44c38b9 to 1b7b065 Compare February 10, 2025 18:32
@tore-espressif tore-espressif added this to the esp_tinyusb v1.7.2 milestone Feb 11, 2025
@tore-espressif
Copy link
Collaborator

tore-espressif commented Feb 11, 2025

@igi540 and please check the CI fail: APP: Write data after disconnect

This is really weird because it fails on v5.0 v5.1 and v5.2.
It passes on v5.3, v5.4 and latest...

EDIT: Might be related to espressif/esp-idf#9691

@roma-jam
Copy link
Collaborator

@igi540 Not it is not related, as in this error the espressif dcd layer is used: dcd_esp32sx.c:356

We are not using it since v0.15.

@tore-espressif
Copy link
Collaborator

@roma-jam sorry, I didn't provide any details. Very similar code with the same race condition and by-zero-division is also in the new DCD layer. However, I can't confirm it is the same problem now. Will look into it tomorrow

@igi540 igi540 force-pushed the fix/msc_device_performance branch 3 times, most recently from 694c5a0 to b0f400b Compare February 11, 2025 22:30
@igi540 igi540 marked this pull request as draft February 12, 2025 11:08
@igi540 igi540 force-pushed the fix/msc_device_performance branch from b3c8a1e to ad3272d Compare February 12, 2025 11:38
@roma-jam roma-jam removed this from the esp_tinyusb v1.7.2 milestone Feb 21, 2025
tore-espressif added a commit to espressif/tinyusb that referenced this pull request Apr 9, 2025
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
@tore-espressif
Copy link
Collaborator

@igi540 The problem with race condition is fixed espressif/tinyusb#55

Please update this PR, so we can merge it after the fix is released

tore-espressif added a commit to espressif/tinyusb that referenced this pull request Apr 9, 2025
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
tore-espressif added a commit to espressif/tinyusb that referenced this pull request Apr 9, 2025
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
@igi540 igi540 force-pushed the fix/msc_device_performance branch 3 times, most recently from f0b2f30 to 342a9b8 Compare April 9, 2025 21:12
@roma-jam roma-jam self-requested a review April 11, 2025 08:49
Copy link
Collaborator

@roma-jam roma-jam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one thing that unleash the race condition - is immediate return from the tud_msc_write10_cb().

Maybe we could think of adding the test_app with emulated ram disk to:

  1. measure the raw speed of mscd driver speed
  2. test sudden disconnection in this case (as in this case reset comes sooner than we prepare_cbw(), so the reset flush the xfer->max_size)

Otherwise, LGTM.

Please, don't forget to address all the rest notes before merge.

@igi540 igi540 marked this pull request as ready for review April 11, 2025 09:34
Cached `get_sector_size` and `get_sector_count` to improve performance and avoid redundant calls.
Storage buffering mechanism is enabled.

Closes espressif/esp-idf#11110
@igi540 igi540 force-pushed the fix/msc_device_performance branch from 342a9b8 to c30af29 Compare April 11, 2025 11:58
@igi540 igi540 merged commit b0a7ab1 into master Apr 11, 2025
27 of 36 checks passed
@igi540 igi540 deleted the fix/msc_device_performance branch April 11, 2025 13:20
tore-espressif added a commit to espressif/tinyusb that referenced this pull request May 2, 2025
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
hathach pushed a commit to hathach/tinyusb that referenced this pull request May 19, 2025
TinyUSB does not provide any locking means to protect the DCD variables.
This can lead to race conditions when the user is trying to submit
a transfer while the device is being disconnected. This can cause
the device to be in an inconsistent state, leading to a crash or
undefined behavior.

This commit adds a spin-lock to protect the DCD variables during
device disconnect.

Closes espressif/esp-idf#9691
Also reported in espressif/esp-usb#131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ESP32S3 USB MSC speed lower than expected (IDFGH-9776)
4 participants