-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-129005: Align FileIO.readall between _pyio and _io #129705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Utilize `bytearray.resize()` and `os.readinto()` to reduce copies and match behavior of `_io.FileIO.readall()`. There is still an extra copy which means twice the memory required compared to FileIO because there isn't a zero-copy path from `bytearray` -> `bytes` currently. On my system reading a 2GB file `./python -m test -M8g -uall test_largefile -m test.test_largefile.PyLargeFileTest.test_large_read -v` Goes from ~2.7 seconds -> ~2.2 seconds
result += chunk | ||
|
||
bytes_read += n | ||
result.resize(bytes_read) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result = memoryview(result)[bytes_read:]
would avoid a truncation which can imply a memory copy in the worst case, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the resize "shrink" in bytearray doesn't actually resize unless the buffer's "capacity" is 2x the requested size (https://github.com/python/cpython/blob/main/Objects/bytearrayobject.c#L201-L214). Just updates its internal "this is how long the bytes is" counter (which for things like full-file readall with known size, this should already be just one byte over the right size).
My plan currently is to make it so bytes(bytearray(10))
and bytearray(b'\0' * 10)
both don't copy (Ongoing discussion in https://discuss.python.org/t/add-zero-copy-conversion-of-bytearray-to-bytes-by-providing-bytes/79164). Having a memoryview
would mean there's more than one reference to the bytearray, and I couldn't do / use that optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm fine with using result.resize()
here.
Co-authored-by: Victor Stinner <[email protected]>
Hypothesis test failure in |
Lib/_pyio.py
Outdated
bufsize += max(bufsize, DEFAULT_BUFFER_SIZE) | ||
n = bufsize - len(result) | ||
if bytes_read >= bufsize: | ||
# Parallels _io/fileio.c new_buffersize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the C code, new_buffersize() argument is bytes_read, not bufsize. You may keep new_buffersize() as a private module-level function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the loop to no longer use bufsize
at all, this is the only line that used it, and it feels more Pythonic to me to just use len(result)
.
That enables rewriting to:
try:
# Read until EOF (n == 0)
while n := os.readinto(self._fd, memoryview(result)[bytes_read:]):
bytes_read += n
if bytes_read >= len(result):
result.resize(_new_buffersize(bytes_read))
except BlockingIOError:
if not bytes_read:
return None
assert len(result) - bytes_read >= 1, \
"os.readinto buffer size 0 will result in erroneous EOF / returns 0"
result.resize(bytes_read)
return bytes(result)
which feels cleaner, but also starts changing structure relative to _io
version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decided to refactor to this. Control flow feels a lot simpler to me and a lot more readable than the branches and breaks.
Tests / Windows / build and test (Win32) failure is a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merged, thank you. |
Utilize
bytearray.resize()
andos.readinto()
to reduce copies and match behavior of_io.FileIO.readall()
.There is still an extra copy which means twice the memory required compared to FileIO because there isn't a zero-copy path from
bytearray
->bytes
currently.On my system reading a 2GB file
./python -m test -M8g -uall test_largefile -m test.test_largefile.PyLargeFileTest.test_large_read -v
Goes from ~2.7 seconds -> ~2.2 seconds. The C
_io
implementation is ~1.2 seconds, so still some performance gap, but less.