-
Notifications
You must be signed in to change notification settings - Fork 15
Make stats
resilient to incomplete streams
#192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Also adds a `--count` option for simpler output
54adae2
to
87394fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments, you can choose whether to address them or just open a TODO
.
src/bin/ion/commands/stats.rs
Outdated
match current_value.read().unwrap() { | ||
Struct(s) => { | ||
for field in s { | ||
stack.push((field.unwrap().value(), depth + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably be good to return an anyhow::Result
instead of .unwrap()
ping all of the errors in this method.
src/bin/ion/commands/stats.rs
Outdated
VersionMarker(_) | EncodingDirective(_) => continue, | ||
SymbolTable(_) => symtab_count += 1, | ||
system_value @ Value(raw_value) => { | ||
let size = system_value.raw_stream_item().unwrap().span().bytes().len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Ion 1.1, you can have system values that are not backed by a raw stream item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do I do about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's an ephemeral/computed value, then it doesn't have a size. You can choose whether that means size=0
bytes or if you don't even count it. The unwrap()
will cause a panic, though.
} | ||
// SystemStreamItem is non_exhaustive | ||
unsupported => panic!("Unsupported system stream item: {unsupported:?}"), | ||
}, | ||
} | ||
} | ||
// Reduce the number of shared symbols. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This took me a second—it's subtracting the number of system symbols in Ion 1.0. This will produce inaccurate results in Ion 1.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even look at that. reader.detected_encoding().version().system_symbol_table().len()
would be more accurate, but that doesn't really tell me if the system symbols were visible/relevant in a 1.1 stream... right?
What would you advise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 1.1, the symbol table size is already what you want. The system symbols that are automatically pre-populated in the user symbol table are optional, so you can consider them the same as the user selecting them themselves.
While we're here, this:
let symbols_count = reader.symbol_table().symbols().iter().len() - 10;
could be:
let symbols_count = reader.symbol_table().len() - 10;
No need to count them one by one.
unparseable_count += 1; | ||
if matches!(e, IonError::Incomplete(..)) { | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this logic encounters a malformed (vs incomplete) item, I think it will go into an infinite loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After offline conversation, next_item
if it errors will always error. next_item() => Err(_)
means the stream is irretrievable by the SystemReader from this point on.
I'll fix this.
4d71fc6
to
e28eb61
Compare
--count
option for simpler outputBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.