Skip to content

Use RefCountedCache for ZipFs instances in Cache to avoid duplicate allocations #6723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 28, 2025

Conversation

dannyvv
Copy link
Contributor

@dannyvv dannyvv commented Mar 11, 2025

What's the problem this PR addresses?

In pnpm mode we are seeing ZipFS instances created multiple times for the same package file in the cache. On large projects this easily leads to Out Of Memory Errors.

This is because the 'Link' phase runs worker threads and the LazyFs instance creates ZipFS on demand. Due to multipel threads accessing the FileSystem asyncroneously in the Link phase, threading issues causing the factory inside LazyFs to be called and ZipFS instances to be created multiple times.
We have a bunch of large internal packages (between 100Mb and 200Mb) that are referenced by many packages causing sometimes up to 6 instances of ZipFS for a single package. This makes the WebAssembly instance to run out of memory very fast :)

#6722 Has more details on the conclusion and a discussion of approachers

Resolves #6722
...

How did you fix it?

I created a RefCountedCache class that has an addOrCreate function that takes a key. (In our use case we pass the zip's cached path as the key) and a function to create create the ZipFS when needed. internally that class uses reference counting for handles and will return the cached object when there is already one live. It will also return a release function that will decrease the refcount and will call the destructor passed into the constructor of the RefCountedCache. In this case the destructor called will be ZipFS.discardAndClose
In the out of range cases the addOrCreate and the release call will throw an error.

I provided a unittest to cover the basic RefCountedCache implementation and relied on integration tests and local testing in our repo for validation.

...

Checklist

  • I have set the packages that need to be released for my changes to be effective.
  • I will check that all automated PR checks pass before the PR gets reviewed.

@dannyvv dannyvv force-pushed the pr/AddRefCountCacheForZipFSCreation branch 2 times, most recently from ad4801b to 0e2e5e0 Compare March 11, 2025 04:25
@dannyvv dannyvv changed the title Option 1: Use RefCountedCache for ZipFs instances in Cache Use RefCountedCache for ZipFs instances in Cache to avoid duplicate allocations Mar 11, 2025
@dannyvv dannyvv force-pushed the pr/AddRefCountCacheForZipFSCreation branch 3 times, most recently from ba08db2 to b035c40 Compare March 11, 2025 05:06
@dannyvv
Copy link
Contributor Author

dannyvv commented Mar 11, 2025

If you think it is worth it I'd be more than happy to put this behind a feature flag: i.e. enableZipFileSystemCaching

@dannyvv dannyvv marked this pull request as ready for review March 11, 2025 05:39
@dannyvv dannyvv force-pushed the pr/AddRefCountCacheForZipFSCreation branch from b035c40 to 36f966a Compare March 11, 2025 17:05
@dannyvv dannyvv force-pushed the pr/AddRefCountCacheForZipFSCreation branch from 36f966a to 4b4289a Compare March 11, 2025 17:08
@arcanis arcanis merged commit 092f059 into yarnpkg:master Mar 28, 2025
25 of 26 checks passed
@arcanis
Copy link
Member

arcanis commented Mar 28, 2025

Thanks for digging @dannyvv, the fix makes a lot of sense !

@goloveychuk
Copy link
Contributor

goloveychuk commented Mar 28, 2025

I've tried to bump from 4.7.0 to 4.8.0 on highly inefficient project :)
and got

YN0001: │ Error: Failed to open the cache entry for SOME _PACKAGE.1.0.61.tgz: Couldn't allocate enough memory
[13:06:56.771]     at Zi.allocateBuffer (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:149:204603)
[13:06:56.771]     at Zi.allocateUnattachedSource (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:149:204845)
[13:06:56.771]     at new Zi (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:149:193892)
[13:06:56.771]     at /home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:199:3471
[13:06:56.771]     at PQ.addOrCreate (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:196:599)
[13:06:56.771]     at z (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:199:3447)
[13:06:56.771]     at /home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:199:3533
[13:06:56.771]     at iO (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:140:53951)
[13:06:56.771]     at pm.factory (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:199:3522)
[13:06:56.771]     at get baseFs (/home/builduser/work/9a203e1840704986/.yarn/releases/yarn-4.8.0.cjs:9:16898)

I think this PR could be related.
Let's wait for more reports, I'll probably debug later when have time

@dannyvv
Copy link
Contributor Author

dannyvv commented Mar 28, 2025

@goloveychuk : Is this a public project or are you able to share a smaller subset of the repo (package.jsons, lock & yarn config?) Assuming it uses a public feed of course.

@goloveychuk
Copy link
Contributor

No. It's private.
I will think how to reproduce. Probably easiest is to me to debug. I'll check later

@goloveychuk
Copy link
Contributor

goloveychuk commented Mar 29, 2025

@dannyvv image
DiscardAndClose is not called in 76 line

@arcanis arcanis mentioned this pull request Mar 29, 2025
3 tasks
@arcanis
Copy link
Member

arcanis commented Mar 29, 2025

@goloveychuk can you run and try yarn set version from sources --branch 6743 (#6743)?

@goloveychuk
Copy link
Contributor

goloveychuk commented Mar 29, 2025

it works
Cc @arcanis

arcanis added a commit that referenced this pull request Mar 30, 2025
## What's the problem this PR addresses?

It seems we forgot to call `discardAndClose` in
#6723.

Ref: #6723 (comment)

## How did you fix it?

Calls the function.

## Checklist

<!--- Don't worry if you miss something, chores are automatically
tested. -->
<!--- This checklist exists to help you remember doing the chores when
you submit a PR. -->
<!--- Put an `x` in all the boxes that apply. -->
- [x] I have read the [Contributing
Guide](https://yarnpkg.com/advanced/contributing).

<!-- See
https://yarnpkg.com/advanced/contributing#preparing-your-pr-to-be-released
for more details. -->
<!-- Check with `yarn version check` and fix with `yarn version check
-i` -->
- [x] I have set the packages that need to be released for my changes to
be effective.

<!-- The "Testing chores" workflow validates that your PR follows our
guidelines. -->
<!-- If it doesn't pass, click on it to see details as to what your PR
might be missing. -->
- [x] I will check that all automated PR checks pass before the PR gets
reviewed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug?]: ZipFs not cached in Pnpm mode causing Couldn't allocate enough memory (not node but wasm memory)
3 participants