-
Notifications
You must be signed in to change notification settings - Fork 590
feat(bindings/python): Enhance Reader and Writer #6086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@@ -472,6 +475,20 @@ class AsyncFile: | |||
Args: | |||
bs (bytes): The content to write. | |||
""" | |||
async def write_from(self, other: AsyncFile) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if it's a good idea for us to provide something like shutil.copyfileobj
. Is this more friendly for python users?
cc @messense and @erickguan for comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operator
works with shutil.copyfileobj
but not AsyncOperator
. Perhaps something like this as a utility function:
async def async_copyfileobj(reader, writer, bufsize: int = 64 * 1024):
while True:
chunk = await reader.read(bufsize)
if not chunk:
break
await writer.write(chunk)
It would be great if python offers an async version of shutil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the rust based write_from seems to have an improvement of 18~28% copy throughput compared to the naive python loop based on the benchmark shared here: #5943 (comment)
is the suggestion here to provide a function with similar function signature as shutil.copyfileobj but copying is still done on the rust side?
but on the rust side, the chunk size itself is already an option for both the reader and writer,
not sure if there is a need to ask for bufsize again.. (I assume the bufsize would be most optimal if set based on the chunk size options set on the reader and writer)
|
||
#[pyclass(module = "opendal")] | ||
#[derive(FromPyObject, Default)] | ||
pub struct ReaderOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we don't change the class name.
range_end = randint(range_start, len(content) - 1) | ||
|
||
async with await async_operator.open( | ||
filename, "rb", range_start=range_start, range_end=range_end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"range_start" and "range_end" don't seem ideal to me, since Python users may not be familiar with the term "range" in this context. How about using "offset" and "size" instead?
Which issue does this PR close?
Closes #5943
What changes are included in this PR?
open(...)
AsyncFile::write_from
Are there any user-facing changes?
No breaking changes, but users can use the new features.