-
Notifications
You must be signed in to change notification settings - Fork 258
chunked: use temporary file for tar-split data #2312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chunked: use temporary file for tar-split data #2312
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
cebc9ff
to
13cbaef
Compare
94a8a23
to
b0fd638
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
4f0de4e
to
6342175
Compare
f04c803
to
187c56b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found this PR while I happened to be following a few threads related to this, and noticed that using buffered i/o would likely be worthwhile - feel free to disregard, of course!
187c56b
to
d70254a
Compare
extract the blob checksum validation logic from decodeAndValidateBlob into a separate function. Signed-off-by: Giuseppe Scrivano <[email protected]>
Signed-off-by: Giuseppe Scrivano <[email protected]>
d70254a
to
0084bc1
Compare
6472d74
to
dfd505c
Compare
Replace the in-memory buffer with a O_TMPFILE file. This reduces the memory requirements for a partial pull since the tar-split data can be written to disk. Signed-off-by: Giuseppe Scrivano <[email protected]>
Replace the direct call to unix.Open with the O_TMPFILE flag with the dedicated openTmpFile helper function. Signed-off-by: Giuseppe Scrivano <[email protected]>
dfd505c
to
729821c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m afraid I’m way behind, just a quick note.
I’m also generally confused about the lifetime of DriverWithDifferOutput.TarSplit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super late — I’ll file a PR for the nits.
I’d love a discussion about placing the files in /run
, though.
@@ -2550,10 +2550,14 @@ func (r *layerStore) applyDiffFromStagingDirectory(id string, diffOutput *driver | |||
if err != nil { | |||
compressor = pgzip.NewWriter(&tsdata) | |||
} | |||
if _, err := diffOutput.TarSplit.Seek(0, 0); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io.SeekStart
, throughout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #2328 .
// tmpDir is a directory where the tar-split temporary file is written to. The file is opened with | ||
// O_TMPFILE so that it is automatically removed when it is closed. | ||
// Returns (manifest blob, parsed manifest, tar-split file or nil, manifest offset). The opened tar-split file | ||
// points to the end of the file (equivalent to Seek(0, 2)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io.SeekEnd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#2328 removes that instead.
return err | ||
} | ||
|
||
decoder, err := zstd.NewReader(bytes.NewReader(blob)) //nolint:contextcheck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the lint suppression match anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #2328 .
@@ -157,10 +159,33 @@ func readEstargzChunkedManifest(blobStream ImageSourceSeekable, blobSize int64, | |||
return manifestUncompressed, tocOffset, nil | |||
} | |||
|
|||
func openTmpFile(tmpDir string) (*os.File, error) { | |||
file, err := os.OpenFile(tmpDir, unix.O_TMPFILE|unix.O_RDWR|unix.O_CLOEXEC|unix.O_EXCL, 0o600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Contra #2312 (comment) ) this seems to work, but AFAIK nothing promises that passing syscall.O_…
flags to os.OpenFile
is supported — let alone flags from a “third-party” x/sys/unix.O_…
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #2328 .
@@ -333,13 +339,16 @@ func makeConvertFromRawDiffer(store storage.Store, blobDigest digest.Digest, blo | |||
// makeZstdChunkedDiffer sets up a chunkedDiffer for a zstd:chunked layer. | |||
// It may return an error matching ErrFallbackToOrdinaryLayerDownload / errFallbackCanConvert. | |||
func makeZstdChunkedDiffer(store storage.Store, blobSize int64, tocDigest digest.Digest, annotations map[string]string, iss ImageSourceSeekable, pullOptions pullOptions) (*chunkedDiffer, error) { | |||
manifest, toc, tarSplit, tocOffset, err := readZstdChunkedManifest(iss, tocDigest, annotations) | |||
manifest, toc, tarSplit, tocOffset, err := readZstdChunkedManifest(store.RunRoot(), iss, tocDigest, annotations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RunRoot
is, by default, in /run
… so the net total on this code path is that it moves the uncompressed tar-split from process memory (which can be swapped out, and is limited by swap size) to a tmpfs
(also limited by swap size).
It’s still a bit of an improvement, because we have a bit of a more direct control over allocation / uses / lifetime of the memory — but it does it substantially change the actual limit?
Fix nits from #2312
Replace the in-memory buffer with a O_TMPFILE file. This reduces the memory requirements for a partial pull since the tar-split data can be written to disk.
c/image needs the following patch: