-
Notifications
You must be signed in to change notification settings - Fork 591
feat(bindings/python): Enhance Reader and Writer #6086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
chitralverma
wants to merge
9
commits into
apache:main
Choose a base branch
from
chitralverma:options-for-rw
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+386
−164
Open
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
f4033cf
add reader and writer options to async open
chitralverma 91c0b9d
Update test_write.py
chitralverma 7157ffa
Merge branch 'main' into options-for-rw
chitralverma c5684d4
Add test_async_writer_options test stub
chitralverma 7d6789d
Enhance capability struct with additional conditional operations
chitralverma 614ad25
Merge branch 'apache:main' into options-for-rw
chitralverma 98e5847
Implement AsyncFile::write_from
chitralverma 7ebf04c
Merge branch 'apache:main' into options-for-rw
chitralverma 85446eb
Merge branch 'main' into options-for-rw
chitralverma File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,17 +15,106 @@ | |
// specific language governing permissions and limitations | ||
// under the License. | ||
|
||
use crate::format_pyerr; | ||
use crate::ocore::{Operator, Reader, Writer}; | ||
use dict_derive::FromPyObject; | ||
use pyo3::prelude::PyResult; | ||
use pyo3::pyclass; | ||
use std::collections::HashMap; | ||
|
||
use std::ops::Bound as RangeBound; | ||
|
||
#[pyclass(module = "opendal")] | ||
#[derive(FromPyObject, Default)] | ||
pub struct ReaderOptions { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest that we don't change the class name. |
||
pub version: Option<String>, | ||
pub concurrent: Option<usize>, | ||
pub chunk: Option<usize>, | ||
pub gap: Option<usize>, | ||
pub range_start: Option<usize>, | ||
pub range_end: Option<usize>, | ||
} | ||
|
||
impl ReaderOptions { | ||
pub fn make_range(&self) -> (RangeBound<u64>, RangeBound<u64>) { | ||
let start_bound = self | ||
.range_start | ||
.map_or(RangeBound::Unbounded, |s| RangeBound::Included(s as u64)); | ||
let end_bound = self | ||
.range_end | ||
.map_or(RangeBound::Unbounded, |e| RangeBound::Excluded(e as u64)); | ||
|
||
(start_bound, end_bound) | ||
} | ||
|
||
pub async fn create_reader(&self, op: &Operator, path: String) -> PyResult<Reader> { | ||
let mut fr = op.reader_with(&path); | ||
|
||
if let Some(version) = &self.version { | ||
fr = fr.version(version); | ||
}; | ||
if let Some(concurrent) = self.concurrent { | ||
fr = fr.concurrent(concurrent); | ||
}; | ||
if let Some(chunk) = self.chunk { | ||
fr = fr.chunk(chunk); | ||
}; | ||
if let Some(gap) = self.gap { | ||
fr = fr.gap(gap); | ||
}; | ||
|
||
let reader = fr.await.map_err(format_pyerr)?; | ||
Ok(reader) | ||
} | ||
} | ||
|
||
#[pyclass(module = "opendal")] | ||
#[derive(FromPyObject, Default)] | ||
pub struct WriteOptions { | ||
pub struct WriterOptions { | ||
pub append: Option<bool>, | ||
pub chunk: Option<usize>, | ||
pub concurrent: Option<usize>, | ||
pub cache_control: Option<String>, | ||
pub content_type: Option<String>, | ||
pub content_disposition: Option<String>, | ||
pub cache_control: Option<String>, | ||
pub content_encoding: Option<String>, | ||
pub if_not_exists: Option<bool>, | ||
pub user_metadata: Option<HashMap<String, String>>, | ||
} | ||
|
||
impl WriterOptions { | ||
pub async fn create_writer(&self, op: &Operator, path: String) -> PyResult<Writer> { | ||
let mut fw = op.writer_with(&path); | ||
|
||
if let Some(append) = self.append { | ||
fw = fw.append(append); | ||
}; | ||
if let Some(chunk) = self.chunk { | ||
fw = fw.chunk(chunk); | ||
}; | ||
if let Some(concurrent) = self.concurrent { | ||
fw = fw.concurrent(concurrent); | ||
}; | ||
if let Some(cache_control) = &self.cache_control { | ||
fw = fw.cache_control(cache_control); | ||
}; | ||
if let Some(content_type) = &self.content_type { | ||
fw = fw.content_type(content_type); | ||
}; | ||
if let Some(content_disposition) = &self.content_disposition { | ||
fw = fw.content_disposition(content_disposition); | ||
}; | ||
if let Some(content_encoding) = &self.content_encoding { | ||
fw = fw.content_encoding(content_encoding); | ||
}; | ||
if let Some(if_not_exists) = self.if_not_exists { | ||
fw = fw.if_not_exists(if_not_exists); | ||
}; | ||
if let Some(user_metadata) = &self.user_metadata { | ||
fw = fw.user_metadata(user_metadata.clone()); | ||
}; | ||
|
||
let writer = fw.await.map_err(format_pyerr)?; | ||
Ok(writer) | ||
} | ||
} |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if it's a good idea for us to provide something like
shutil.copyfileobj
. Is this more friendly for python users?cc @messense and @erickguan for comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operator
works withshutil.copyfileobj
but notAsyncOperator
. Perhaps something like this as a utility function:It would be great if python offers an async version of
shutil
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the rust based write_from seems to have an improvement of 18~28% copy throughput compared to the naive python loop based on the benchmark shared here: #5943 (comment)
is the suggestion here to provide a function with similar function signature as shutil.copyfileobj but copying is still done on the rust side?
but on the rust side, the chunk size itself is already an option for both the reader and writer,
not sure if there is a need to ask for bufsize again.. (I assume the bufsize would be most optimal if set based on the chunk size options set on the reader and writer)