fs: enable chunked reading for large files in readFileHandle #56022

Open
wants to merge 16 commits into base: main

mertcanaltin
Member

Added chunked reading support to readFileHandle to handle files larger than 2 GiB, resolving size limitations while preserving existing functionality.

#55864
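
As a rough illustration of the chunked approach described above (not the code in this PR), a whole-file read can be split into several bounded reads that fill one preallocated buffer. The helper name `readFileInChunks` and the constant `MAX_CHUNK` are assumptions made for this sketch; the PR itself compares against Node's internal `kIoMaxLength`.

```javascript
'use strict';
const { open } = require('node:fs/promises');

// Hypothetical per-read cap used only in this sketch.
const MAX_CHUNK = 2 ** 31 - 1;

// Reads a whole file by issuing several bounded reads into one buffer.
async function readFileInChunks(path, chunkSize = MAX_CHUNK) {
  const handle = await open(path, 'r');
  try {
    const { size } = await handle.stat();
    const buffer = Buffer.allocUnsafe(size);
    let offset = 0;
    while (offset < size) {
      // Never ask for more than chunkSize bytes in a single read call.
      const length = Math.min(chunkSize, size - offset);
      const { bytesRead } = await handle.read(buffer, offset, length, offset);
      if (bytesRead === 0) break; // The file shrank while we were reading.
      offset += bytesRead;
    }
    // Only return the bytes that were actually read.
    return buffer.subarray(0, offset);
  } finally {
    await handle.close();
  }
}
```

The result is still one in-memory buffer, so the total size remains bounded by `buffer.constants.MAX_LENGTH` and by available memory; only the size of each individual read is capped.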

@nodejs-github-bot nodejs-github-bot added errors Issues and PRs related to JavaScript errors originated in Node.js core. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Nov 27, 2024
@mertcanaltin
Member Author

mertcanaltin commented Nov 27, 2024

I wonder whether I should also apply this change in fs.js; right now I have only done it for the promises API.

In addition, there are still places that use ERR_FS_FILE_TOO_LARGE, so I did not delete this error message in order not to break them.


codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.24%. Comparing base (8a5a849) to head (2160d30).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56022      +/-   ##
==========================================
+ Coverage   90.22%   90.24%   +0.01%     
==========================================
  Files         629      629              
  Lines      184847   184844       -3     
  Branches    36207    36207              
==========================================
+ Hits       166780   166813      +33     
+ Misses      11015    11011       -4     
+ Partials     7052     7020      -32     
Files with missing lines Coverage Δ
lib/fs.js 98.27% <ø> (-0.01%) ⬇️
lib/internal/fs/promises.js 98.25% <100.00%> (+<0.01%) ⬆️

... and 27 files with indirect coverage changes

@mertcanaltin
Member Author

I removed the limit test because the restriction on reading files larger than 2 GiB with fs.readFile has been removed.

@BridgeAR BridgeAR added the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Dec 9, 2024
@BridgeAR
Member

BridgeAR commented Dec 9, 2024

I added the tsc-agenda label so we can discuss whether we want to allow users to read such big files into memory, or whether it would be better to point them to streams instead.
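
For context on the streaming alternative mentioned here, a minimal sketch follows: it processes a file chunk by chunk with `fs.createReadStream()` so memory use stays bounded regardless of file size. The helper `countByte` is a hypothetical example task, not something from this PR.

```javascript
'use strict';
const { createReadStream } = require('node:fs');

// Counts occurrences of one byte value without loading the file into memory.
async function countByte(path, byte) {
  let count = 0;
  // Each chunk is a Buffer; only one chunk is held at a time.
  for await (const chunk of createReadStream(path)) {
    for (const b of chunk) {
      if (b === byte) count++;
    }
  }
  return count;
}
```

For example, `countByte(file, 0x0a)` counts newline bytes in `file`, which works the same way for a file of a few bytes or of several GiB.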

Member

@mcollina mcollina left a comment

lgtm

Member

@BridgeAR BridgeAR left a comment

The error is removed from the promise version, but the callback readFile implementation is still missing. Once that is done, the error itself is no longer needed and also has to be removed.
This has to be addressed before we can land this.

We discussed in the TSC meeting that reading beyond that limit is generally not a good idea, although it is acceptable in some cases.
We also discussed emitting a warning when that limit is reached instead. We have not reached consensus on it yet, but we'll discuss it again next week to finalize the decision.

@gireeshpunathil
Member

just wondering what is the next action here - @BridgeAR

@BridgeAR
Member

@gireeshpunathil I believe you wanted to think about the warning again.

I kept my change request since the implementation should also include the callback version next to the warning. That's currently missing :)

@mertcanaltin
Member Author

> @gireeshpunathil I believe you wanted to think about the warning again.
>
> I kept my change request since the implementation should also include the callback version next to the warning. That's currently missing :)

Thanks a lot for your comments. Unfortunately I saw this late; I have now pushed a commit for the callback version.

Member

@BridgeAR BridgeAR left a comment

The original reason for the limit was AFAIK something about some systems on some versions not being able to read that file size. If that's the case, we'd have to handle chunking on our side. I am just not certain if that still applies or if it's a legacy issue.

Please also add the warning instead of the error.

@gireeshpunathil
Member

> @gireeshpunathil I believe you wanted to think about the warning again.
>
> I kept my change request since the implementation should also include the callback version next to the warning. That's currently missing :)

I thought we were going to continue the discussion in the TSC on the necessity of the warning, as we didn't converge on that IIRC.

Member

@BridgeAR BridgeAR left a comment

Please add the warning to all occurrences and test for the warning.

BridgeAR

This comment was marked as duplicate.

Member

@mcollina mcollina left a comment

lgtm

@mertcanaltin
Member Author

This test passed on my local Linux machine.

@mcollina mcollina self-requested a review March 12, 2025 06:12
if (size > kIoMaxLength) {
  process.emitWarning(
    `Warning: Detected \`fs.readFile()\` to read a huge file in memory (${size} bytes). Please consider using \`fs.createReadStream()\` instead to minimize memory overhead and increase the performance.`,
  );
Member

It would likely be worthwhile to assign a warning code to this warning so that it can be suppressed with the --disable-warning CLI flag.
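
A minimal sketch of what assigning a code could look like, using the options form of `process.emitWarning()`. The threshold `LARGE_READ_THRESHOLD`, the code `EXAMPLE_LARGE_FILE_READ`, and the helper `warnIfLarge` are all hypothetical names chosen for illustration; which code the PR should actually use is an open question in this thread.

```javascript
'use strict';

// Hypothetical values, chosen only for this sketch.
const LARGE_READ_THRESHOLD = 2 ** 31 - 1;
const WARNING_CODE = 'EXAMPLE_LARGE_FILE_READ';

// Emits a coded warning for large reads; returns whether it warned.
function warnIfLarge(size) {
  if (size <= LARGE_READ_THRESHOLD) return false;
  process.emitWarning(
    `fs.readFile() is reading a large file into memory (${size} bytes). ` +
      'Consider fs.createReadStream() instead.',
    { type: 'Warning', code: WARNING_CODE },
  );
  return true;
}
```

With a code attached, a user who has made an informed choice could then silence it on Node.js versions that support the flag, for example with `node --disable-warning=EXAMPLE_LARGE_FILE_READ app.js`.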


Member Author

@mertcanaltin mertcanaltin Mar 16, 2025

Thanks. I tried using the ERR_FS_FILE_TOO_LARGE error code for this; do you think that is right?


function createVirtualLargeFile() {
  return Buffer.alloc(LARGE_FILE_SIZE, 'A');
}
Member

This is quite a large allocation that may fail on some of our more resource-constrained builders. I believe there's a utility in test/common for skipping if there's not enough available memory? I can't recall exactly what it is, but we should likely be safer here.

Member Author

Thanks, I found enoughTestMem for this.
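
To illustrate the kind of guard being discussed, here is a standalone sketch of a memory check that a test could run before a multi-GiB allocation. It deliberately does not reproduce the test/common helper; `hasEnoughMemory`, its `free` and `headroom` options, and the 512 MiB default are assumptions made for this example.

```javascript
'use strict';
const os = require('node:os');

// Hypothetical helper: true when free memory covers `bytes` plus headroom.
// `free` is injectable so the logic can be exercised deterministically.
function hasEnoughMemory(bytes, options = {}) {
  const {
    free = os.freemem(),
    headroom = 512 * 1024 * 1024,
  } = options;
  return free >= bytes + headroom;
}
```

A test would call something like `hasEnoughMemory(LARGE_FILE_SIZE)` first and skip itself when the result is `false`, rather than letting `Buffer.alloc()` fail on a small builder.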


const virtualFile = createVirtualLargeFile();

await writeFile(FILE_PATH, virtualFile);
Member

It's also possible that the file system on test runners could completely run out of space with a file this large.

Member Author

I added a check for this too, thanks for the suggestion.
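
The check added in the PR is not shown in this thread. One possible shape for such a guard, assuming a Node.js version that provides `fs.statfsSync()`, is sketched below; `availableBytes` and `hasEnoughDisk` are hypothetical helper names.

```javascript
'use strict';
const fs = require('node:fs');

// Hypothetical helper: bytes available to unprivileged users on the
// file system that contains `dir` (free blocks times block size).
function availableBytes(dir) {
  const { bavail, bsize } = fs.statfsSync(dir);
  return bavail * bsize;
}

// True when the file system holding `dir` can fit `bytes` more data.
function hasEnoughDisk(dir, bytes) {
  return availableBytes(dir) >= bytes;
}
```

A large-file test could call `hasEnoughDisk(tmpDir, LARGE_FILE_SIZE)` before `writeFile()` and skip when it returns `false`, so a nearly full runner is not pushed out of space.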

@mertcanaltin mertcanaltin requested a review from jasnell March 16, 2025 14:41
@mertcanaltin
Member Author

Hello everyone, is there any update on this? @jasnell @BridgeAR 🙏

Labels
errors Issues and PRs related to JavaScript errors originated in Node.js core. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. review wanted PRs that need reviews.
7 participants