Skip to content

Adds traits for the lazy reader API to abstract over format #596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 11, 2023

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Jul 10, 2023

The lazy reader API introduced in #546 only supports binary Ion. In order to eventually support text Ion alongside binary, this PR adds a new family of traits (LazyFormat) and makes the existing binary reader API an implementation of those traits. Providing a text implementation of the traits will happen in one or more follow-on PRs.

The LazyFormat trait requires implementations to specify each type they will use to represent various parts of the lazy reader API.

pub trait LazyFormat<'data>: Sized + Debug + Clone {
    type Reader: LazyRawReader<'data, Self>;
    type Sequence: LazyRawSequence<'data, Self>;
    type Struct: LazyRawStruct<'data, Self>;
    type Value: LazyRawValue<'data, Self>;
    type AnnotationsIterator: Iterator<Item = IonResult<RawSymbolTokenRef<'data>>>;
}

Types that do not specify a format name (for example: LazyValue) have been renamed to reflect that they are specific to binary (now: LazyBinaryValue). The format-agnostic names are now used for the implementation that is generic over a LazyFormat impl.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link

codecov bot commented Jul 10, 2023

Codecov Report

Patch coverage: 72.80% and project coverage change: -0.14 ⚠️

Comparison is base (93dce09) 82.53% compared to head (11664eb) 82.39%.

❗ Current head 11664eb differs from pull request most recent head 5347a4a. Consider uploading reports for the commit 5347a4a to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #596      +/-   ##
==========================================
- Coverage   82.53%   82.39%   -0.14%     
==========================================
  Files         110      111       +1     
  Lines       19486    19550      +64     
  Branches    19486    19550      +64     
==========================================
+ Hits        16082    16108      +26     
- Misses       1861     1899      +38     
  Partials     1543     1543              
Impacted Files Coverage Δ
src/lazy/binary/encoded_value.rs 93.33% <ø> (ø)
src/lazy/binary/format.rs 0.00% <0.00%> (ø)
src/lazy/raw_stream_item.rs 52.94% <ø> (ø)
src/lazy/system_stream_item.rs 0.00% <ø> (ø)
src/lazy/value_ref.rs 73.79% <33.33%> (ø)
src/lazy/binary/raw/lazy_raw_sequence.rs 60.93% <60.00%> (-6.41%) ⬇️
src/lazy/struct.rs 66.66% <60.86%> (ø)
src/lazy/binary/raw/struct.rs 62.22% <62.22%> (ø)
src/lazy/sequence.rs 73.07% <62.50%> (ø)
src/lazy/reader.rs 66.66% <64.28%> (ø)
... and 7 more

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@zslayton zslayton marked this pull request as ready for review July 11, 2023 14:54
Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour

Comment on lines -17 to +20
use ion_rs::lazy::binary::lazy_reader::LazyReader;
use ion_rs::lazy::binary::system::lazy_sequence::LazySequence;
use ion_rs::lazy::binary::system::lazy_struct::LazyStruct;
use ion_rs::lazy::binary::system::lazy_value::LazyValue;
use ion_rs::lazy::r#struct::LazyBinaryStruct;
use ion_rs::lazy::reader::LazyBinaryReader;
use ion_rs::lazy::sequence::LazyBinarySequence;
use ion_rs::lazy::value::LazyBinaryValue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This file (lazy_read_all_values) is an example program that demonstrates the lazy reader API. We currently only have a binary implementation, so the names of the various types have been changed to reflect that.

The names that don't mention a format are generic, while the names that do are just type aliases over a specific implementation of that generic type.

For example, LazyValue is generic over a format:

pub struct LazyValue<'top, 'data, F: LazyFormat<'data>> {
    pub(crate) raw_value: F::Value,
    pub(crate) symbol_table: &'top SymbolTable,
}

while LazyBinaryValue is a type alias over a specific implementation of LazyValue:

pub type LazyBinaryValue<'top, 'data> = LazyValue<'top, 'data, BinaryFormat>;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ An implementation of the LazyFormat trait is just a collection of concrete types that collectively form an API for that format.

I'm open to better names than LazyFormat/BinaryFormat.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LazyDecoder for BinaryEncoding? I don't know. 🤷‍♂️

Comment on lines 29 to 33
impl<'data> LazyContainerPrivate<'data, BinaryFormat> for LazyRawBinarySequence<'data> {
fn from_value(value: LazyRawBinaryValue<'data>) -> Self {
LazyRawBinarySequence { value }
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ There are a few *Private traits that house implementation details we need but do not wish to expose to users. See the src/lazy/format.rs file for an explanation of how this works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This functionality was moved, not deleted. However, LazyRawStruct was also renamed to LazyRawBinaryStruct, so it seems that git couldn't keep track.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Some of the module names in the hierarchy were getting pretty repetitive. I removed name elements that were already reflected in parent modules. e.g. crate::lazy::binary::raw::lazy_raw_reader became crate::lazy::binary::raw::reader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ As called out prior, git thought this was file was deleted but it was just moved and its types renamed.

Comment on lines 53 to 63
fn ion_type(&self) -> IonType {
self.ion_type()
}

fn annotations(&self) -> RawBinaryAnnotationsIterator<'data> {
self.annotations()
}

fn read(&self) -> IonResult<RawValueRef<'data, BinaryFormat>> {
self.read()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ These impls just forward the method call to the inherent implementation on the type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The 'system' types no longer live in the binary module, as the system level is format-agnostic.

}

impl<'data> LazyReader<'data> {
pub fn new(ion_data: &'data [u8]) -> IonResult<LazyReader<'data>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ As LazyReader is now generic over a LazyFormat, the new() method now lives in the concretely-typed LazyBinaryReader below.

symbol_table: self.symbol_table,
initial_offset: self
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The initial_offset member field was a copy/pasted holdover from the raw implementation. I've removed it in each of the user-level types that had erroneously retained it.

@zslayton zslayton requested review from almann and popematt July 11, 2023 15:17
pub mod lazy_raw_value;
pub mod raw_annotations_iterator;
pub mod reader;
pub mod r#struct;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you could call it strukt. (But actually, please don't.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LazyDecoder for BinaryEncoding? I don't know. 🤷‍♂️

@@ -41,7 +41,7 @@ mod lazy_reader_example {
// We're going to recursively visit and read every value in the input stream, counting
// them as we go.
let mut count = 0;
let mut reader = LazyReader::new(ion_data)?;
let mut reader = LazyBinaryReader::new(ion_data)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the user required to know whether the data is binary or text ahead of time? (Or this just a temporary thing because the lazy text reader is not implemented yet?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or this just a temporary thing because the lazy text reader is not implemented yet?)

This one!

@zslayton zslayton merged commit 420a276 into main Jul 11, 2023
@zslayton zslayton deleted the lazy-traits branch July 11, 2023 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants