[Bug?]: ZipFs not cached in Pnpm mode causing Couldn't allocate enough memory (not node but wasm memory) #6722

Closed
1 task done
dannyvv opened this issue Mar 10, 2025 · 1 comment · Fixed by #6723
Labels
bug Something isn't working

Comments

Contributor

dannyvv commented Mar 10, 2025

Self-service

  • I'd be willing to implement a fix

Describe the bug

It seems that our repo is hitting a race condition in the cache: multiple ZipFs instances are created for the same downloaded .tgz file (same cache path and same locator hash).

I could not reproduce this in pnp mode, but I can in pnpm mode. Not all of our tooling is ready to handle pnp mode yet.
The 'link' phase runs in parallel, which is great. The cache does have mutex protection, but the mutex ONLY covers the descriptor creation, i.e. it guarantees a single instance of the function that creates the ZipFs object. Since the ZipFs creation itself still sits behind the LazyFs implementation, there is still a chance that multiple ZipFs instances get created, which seems to be what is happening.
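
One way to picture the failure mode described above is the simplified TypeScript model below. It is not Yarn's code: `LazyHandle` and `openArchive` are hypothetical stand-ins, and the assumption that each caller receives its own lazy wrapper for the same cache path is mine, not something confirmed in the Yarn sources. Under that assumption, every wrapper runs its own factory, so the same archive is opened once per wrapper.

```typescript
// Hypothetical stand-in for a lazily initialised wrapper (not Yarn's LazyFS).
class LazyHandle<T> {
  private instance: T | null = null;
  private readonly factory: () => T;

  constructor(factory: () => T) {
    this.factory = factory;
  }

  // The factory runs at most once per wrapper, on first access.
  get value(): T {
    if (this.instance === null) {
      this.instance = this.factory();
    }
    return this.instance;
  }
}

// Counts how many times the (pretend) archive is opened.
let archivesOpened = 0;
const openArchive = (path: string): {path: string} => {
  archivesOpened += 1;
  return {path};
};

// Assumption: two callers each get their own wrapper for the same path.
const first = new LazyHandle(() => openArchive("/cache/pkg.zip"));
const second = new LazyHandle(() => openArchive("/cache/pkg.zip"));

const a1 = first.value;  // opens the archive
const a2 = first.value;  // reuses the instance held by `first`
const b1 = second.value; // opens the same archive a second time

console.log(archivesOpened, a1 === a2, a1 === b1);
```

Laziness only deduplicates work inside a single wrapper; nothing in this model shares the opened archive across wrappers.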

In the repro steps below, I describe the logging lines I added and the output they produce.
This clearly shows the wasm memory growing and growing until it gets close to the 2 GB limit.
This is simplified logging where I only report the calls that create ZipFS instances for a single large package, but there are many, many packages that get more than one ZipFS instance.

I haven't fully wrapped my head around why this happens, but it seems to be due to transitive dependencies, where this package is laid out on disk as siblings in parallel.

The other aspect that seems to go wrong is the 'release' function in

I'd love to provide the fix for this. I'm filing this issue to get early feedback on which approach would be preferred.
Possible avenues:

  1. Create ref-counted map of ZipFs instances.
  2. Extend the mutex scope in Cache.ts to also cover the LazyFs and FetchResult creation, so that the result is a singleton.

    Prototyped; did not work (I still ran into multiple instances).

  3. Place a Mutex inside LazyFs's baseFs function where it calls the factory.

In my prototype I pursued the first option (1), which consisted of 2 changes:

  1. I created a ref-counted Map class.
  2. I use that class for the ZipFS instances created in cache.ts, and wrap the release function rather than hanging onto a local variable that gets rebound.
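
A ref-counted map along the lines of option (1) could look roughly like the TypeScript sketch below. This is an illustration under assumptions, not the prototype itself: `RefCountedMap`, `acquire`, `FakeZip`, and the `dispose` callback are hypothetical names, and the fake archive object stands in for a real ZipFS.

```typescript
// Hypothetical helper: names and shape are assumptions, not Yarn's API.
class RefCountedMap<K, V> {
  private readonly entries = new Map<K, {value: V; refs: number}>();
  private readonly dispose: (value: V) => void;

  constructor(dispose: (value: V) => void) {
    this.dispose = dispose;
  }

  // Returns the shared value for `key`, creating it on first use, plus a
  // release function that is safe to call more than once.
  acquire(key: K, create: () => V): {value: V; release: () => void} {
    let entry = this.entries.get(key);
    if (entry === undefined) {
      entry = {value: create(), refs: 0};
      this.entries.set(key, entry);
    }
    entry.refs += 1;

    const held = entry;
    let released = false;
    const release = (): void => {
      if (released) return; // ignore double releases
      released = true;
      held.refs -= 1;
      if (held.refs === 0) {
        // Last holder is gone: drop the entry and free the value.
        this.entries.delete(key);
        this.dispose(held.value);
      }
    };
    return {value: held.value, release};
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage with a fake archive object standing in for a ZipFS instance.
type FakeZip = {path: string; closed: boolean};
let created = 0;
const zips = new RefCountedMap<string, FakeZip>(zip => {
  zip.closed = true;
});
const open = (path: string) =>
  zips.acquire(path, () => {
    created += 1;
    return {path, closed: false};
  });

const a = open("/cache/pkg.zip");
const b = open("/cache/pkg.zip"); // shares the instance created for `a`
a.release(); // one holder left, archive stays open
b.release(); // last holder gone, archive is disposed
```

Each caller gets its own release closure, so no shared variable needs to be rebound, and the value is only disposed when the last holder lets go.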

To reproduce

I've been trying to create a smaller repro but have not managed to trigger the OOM error. I tried adding dependencies on a large package like ep_latex in the yarn repo itself and using pnpm mode, but to no avail.
The project that demonstrates this is an internal repo that relies on an internal feed with internal-only packages, which I can't share. I have shared the particular details below. If you want to see the problem in the yarn v4 repo itself, I have created a branch that shows the issue:

https://github.com/dannyvv/yarn-berry/pull/new/repro/ShowDuplicateZipFsCreation

It shows that in the yarn repo only a few packages end up with duplicate ZipFs instances. In our repo the count is a lot higher, for many packages. Note that my devbox has 32 threads.

➤ YN0000: ┌ Link step
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@typescript-eslint-typescript-estree-npm-8.24.0-0efb13ddb1-89e451f5d2.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@typescript-eslint-utils-npm-8.24.0-ff783b2b9b-773a4085e4.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/ts-api-utils-npm-2.0.1-03c1d3773a-2e68938cd5.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/babel-preset-current-node-syntax-npm-1.0.1-849ec71e32-94561959cb.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-async-generators-npm-7.8.4-d10cf993c9-7ed1c1d9b9.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-bigint-npm-7.8.3-b05d971e6c-3a10849d83.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-class-properties-npm-7.12.13-002ee9d930-24f34b196d.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-import-meta-npm-7.10.4-4a0a0158bc-166ac1125d.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-json-strings-npm-7.8.3-6dc7848179-bf5aea1f31.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-logical-assignment-operators-npm-7.10.4-72ae00fdf6-aff3357703.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-nullish-coalescing-operator-npm-7.8.3-8a723173b5-87aca49189.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-numeric-separator-npm-7.10.4-81444be605-01ec5547bd.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-object-rest-spread-npm-7.8.3-60bd05b6ae-fddcf581a5.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-optional-catch-binding-npm-7.8.3-ce337427d8-910d90e72b.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-optional-chaining-npm-7.8.3-f3f3c79579-eef94d53a1.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@babel-plugin-syntax-top-level-await-npm-7.14.5-60a0a2e83b-bbd1a56b09.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/jest-config-npm-29.2.1-af59d671b3-b951bcf37f.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@docusaurus-utils-npm-3.4.0-bde3d29fc3-49f926eedb.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@docusaurus-utils-common-npm-3.4.0-df70ac70c8-a3d17e3e50.zip
 ⚠️ > 1: /home/danny/src/yarn-berry/.yarn/cache/@algolia-autocomplete-shared-npm-1.9.3-e918a6f29f-2332d12268.zip

In my repro I made a few changes to Yarn:

In cache.ts:

     const zipFsBuilder = shouldMock
       ? () => makeMockPackage()
-      : () => new ZipFS(cachePath, {baseFs, readOnly: true});
+      : () => {
+        if (locator.name == "canvas-contextual") {
+          console.trace("new ZipFS for a large 168 Mb package" + cachePath + " with mutex locator hash: " + locator.locatorHash);
+        }
+        return new ZipFS(cachePath, {baseFs, readOnly: true});
+      }

In libzipSync.js:

    function _emscripten_resize_heap(requestedSize) {
+      (requestedSize < 0 ? console.trace : console.error)("libzipSync:: _emscripten_resize_heap to " + requestedSize);

      var oldSize = HEAPU8.length;

In libzipAsync.js:

    function _emscripten_resize_heap(requestedSize) {
+      (requestedSize < 0 ? console.trace : console.error)("libzipASync:: _emscripten_resize_heap to " + requestedSize);

      var oldSize = HEAPU8.length;

Running on our repo that has that large package with tons of dependencies I get:

➤ YN0000: └ Completed
➤ YN0000: ┌ Fetch step
➤ YN0000: ⠧ --------------------------------------------------------------------------------
➤ YN0000: ⠇ --------------------------------------------------------------------------------
libzipSync:: _emscripten_resize_heap to 42024960
➤ YN0000: ⠇ ==================--------------------------------------------------------------
➤ YN0000: ⠧ ===================================---------------------------------------------
libzipSync:: _emscripten_resize_heap to 188698624
libzipSync:: _emscripten_resize_heap to 188747776
➤ YN0013: │ 2451 packages were added to the project (+ 1.73 GiB).
➤ YN0000: └ Completed in 1m 40s

At this point we have pulled in 1.7 GiB of packages, and the node memory is already at ~3 GB.
That is not a problem: we all have machines with at least 32 threads and 64 GB of memory.
The problem is the WASM limit of 2 GB. Downloading alone has only asked it to grow to about 200 MB.
But as you'll see below, it soon balloons to over 2 GB and is asked to grow to negative values.

➤ YN0000: ┌ Link step
libzipSync:: _emscripten_resize_heap to 373198848
libzipSync:: _emscripten_resize_heap to 446156800
libzipSync:: _emscripten_resize_heap to 557699072
libzipSync:: _emscripten_resize_heap to 683163648
libzipSync:: _emscripten_resize_heap to 793341952
libzipSync:: _emscripten_resize_heap to 894115840
libzipSync:: _emscripten_resize_heap to 1000157184
libzipSync:: _emscripten_resize_heap to 1101004800
libzipSync:: _emscripten_resize_heap to 1218220032
Trace: new ZipFS for a large 168 Mb package/D:/temp/YARN4TEst/build/cache/global/cache/@office-iss-canvas-contextual-npm-66.0.10-b209b2e00c-10c0.zip with mutex locator hash: b209b2e00c4dfa91e56ece2ec8d0dcf31bb679ea6f942f5e34fd5af63d8c89a472776ce6fc39ba215488e0cc64de2d75f6dbf038e63d4b564492e631d294f756
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53450:23)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:23)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at _Manifest.loadFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31410:40)
    at _Manifest.fromFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31390:26)
libzipSync:: _emscripten_resize_heap to 1470582784
libzipSync:: _emscripten_resize_heap to 1571713024
Trace: new ZipFS for a large 168 Mb package/D:/temp/YARN4TEst/build/cache/global/cache/@office-iss-canvas-contextual-npm-66.0.10-b209b2e00c-10c0.zip with mutex locator hash: b209b2e00c4dfa91e56ece2ec8d0dcf31bb679ea6f942f5e34fd5af63d8c89a472776ce6fc39ba215488e0cc64de2d75f6dbf038e63d4b564492e631d294f756
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53450:23)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:23)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at _Manifest.loadFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31410:40)
    at _Manifest.fromFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31390:26)
libzipSync:: _emscripten_resize_heap to 1757843456
Trace: new ZipFS for a large 168 Mb package/D:/temp/YARN4TEst/build/cache/global/cache/@office-iss-canvas-contextual-npm-66.0.10-b209b2e00c-10c0.zip with mutex locator hash: b209b2e00c4dfa91e56ece2ec8d0dcf31bb679ea6f942f5e34fd5af63d8c89a472776ce6fc39ba215488e0cc64de2d75f6dbf038e63d4b564492e631d294f756
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53450:23)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:23)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at _Manifest.loadFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31410:40)
    at _Manifest.fromFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31390:26)
libzipSync:: _emscripten_resize_heap to 1976565760
Trace: new ZipFS for a large 168 Mb package/D:/temp/YARN4TEst/build/cache/global/cache/@office-iss-canvas-contextual-npm-66.0.10-b209b2e00c-10c0.zip with mutex locator hash: b209b2e00c4dfa91e56ece2ec8d0dcf31bb679ea6f942f5e34fd5af63d8c89a472776ce6fc39ba215488e0cc64de2d75f6dbf038e63d4b564492e631d294f756
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53450:23)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:23)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
    at _Manifest.loadFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31410:40)
    at _Manifest.fromFile (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:31390:26)
Trace: libzipSync:: _emscripten_resize_heap to -2092892160
    at _emscripten_resize_heap (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:41971:64)
    at wasm://wasm/0007b852:wasm-function[41]:0x3f86
    at wasm://wasm/0007b852:wasm-function[9]:0x14be
    at ZipFS.allocateBuffer (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43061:39)
    at ZipFS.allocateUnattachedSource (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43070:56)
    at new ZipFS (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:42561:35)
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53452:20)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
Trace: libzipSync:: _emscripten_resize_heap to -2092822528
    at _emscripten_resize_heap (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:41971:64)
    at wasm://wasm/0007b852:wasm-function[41]:0x3f86
    at wasm://wasm/0007b852:wasm-function[9]:0x1574
    at ZipFS.allocateBuffer (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43061:39)
    at ZipFS.allocateUnattachedSource (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43070:56)
    at new ZipFS (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:42561:35)
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53452:20)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
➤ YN0001: │ Error: Failed to open the cache entry for @office-iss/canvas-contextual@npm:66.0.10: Couldn't allocate enough memory
    at ZipFS.allocateBuffer (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43063:19)
    at ZipFS.allocateUnattachedSource (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:43070:56)
    at new ZipFS (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:42561:35)
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53452:20)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:23)
    at AliasFS.readFilePromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1517:30)
➤ YN0000: └ Completed in 5s 356ms
➤ YN0000: · Failed with errors in 1m 47s
Trace: new ZipFS for a large 168 Mb package/D:/temp/YARN4TEst/build/cache/global/cache/@office-iss-canvas-contextual-npm-66.0.10-b209b2e00c-10c0.zip with mutex locator hash: b209b2e00c4dfa91e56ece2ec8d0dcf31bb679ea6f942f5e34fd5af63d8c89a472776ce6fc39ba215488e0cc64de2d75f6dbf038e63d4b564492e631d294f756
    at zipFsBuilder (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53450:23)
    at D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53455:28
    at prettifySyncErrors (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:26992:14)
    at LazyFS.factory (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:53454:43)
    at get baseFs (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:2129:34)
    at LazyFS.lstatPromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1405:23)
    at AliasFS.lstatPromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1405:30)
    at AliasFS.lstatPromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1405:30)
    at copyPromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:461:108)
    at NodeFS.copyPromise (D:\temp\YARN4TEst\.yarn\releases\yarn-wsl.cjs:1101:24)
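
The negative sizes in the trace above are most likely 32-bit values printed as signed integers: the added logging runs before any unsigned conversion, so a request above 2^31 bytes shows up as a negative number. A minimal TypeScript sketch (the helper name `asUnsigned` is made up for illustration) recovers the sizes that were requested:

```typescript
// Sizes copied from the two negative log lines above.
const loggedSizes = [-2092892160, -2092822528];

// `>>> 0` reinterprets a signed 32-bit integer as an unsigned one.
const asUnsigned = (n: number): number => n >>> 0;

const GiB = 1024 ** 3;
for (const size of loggedSizes) {
  const bytes = asUnsigned(size);
  console.log(`${size} -> ${bytes} bytes (${(bytes / GiB).toFixed(2)} GiB)`);
}
```

Both requests come out at roughly 2.05 GiB, just past the 2 GiB boundary.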

The total of .tgz files is already close to the WebAssembly limit:

Environment

System:
    OS: Windows 11 10.0.26100
    CPU: (32) x64 AMD Ryzen Threadripper PRO 3955WX 16-Cores
  Binaries:
    Node: 22.13.0 - ~\AppData\Local\Temp\xfs-192a2d33\node.CMD
    Yarn: 4.7.0-git.20250305.hash-3c8a90ace - ~\AppData\Local\Temp\xfs-192a2d33\yarn.CMD
    npm: 10.9.2 - C:\Program Files\nodejs\npm.CMD

Additional context

No response

@dannyvv dannyvv added the bug Something isn't working label Mar 10, 2025
Contributor Author

dannyvv commented Mar 11, 2025

Note: I noticed a bunch of other bugs reporting Out Of Memory errors. I have not yet had the bandwidth to confirm whether they are caused by the same issue.
