-
Notifications
You must be signed in to change notification settings - Fork 451
Migrate file format to protobuf #8995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Web viewer built successfully. If applicable, you should also test it:
Note: This comment is updated whenever you push a commit. |
b1ebf6e
to
cb6f4f8
Compare
cb6f4f8
to
ea90dd8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
We will need to fire a nightly ASAP once this is merged -- this will break all developers' welcome screen for a while. Let's try to do that at a not too inconvenient time (marking as DNM for safety).
// NOTE: This is `size_of` on a `repr(Rust)` struct, | ||
// which is fine because we control the layout | ||
// in the definition above, and tests would quickly | ||
// catch any sort of regression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not switching to the C ABI if we need control over layout though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this a bit more, and maybe this should just be a literal constant. We need control over the encoding, not the layout of the struct.
Alternatively we can control the layout of the struct, and use something like bytemuck
to cast to/from a slice of bytes. But that's an extra dependency that may not be worth it for something so simple
// NOTE: We use little-endian encoding, because we live in | ||
// the 21st century. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this just after reviewing an arrow PR that encodes stuff as BE to be spec compliant lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is our format, and it doesn't go over the network so nobody can claim it should be network-endian! 😄
...well it technically does if you download it over http. But we must resist the CPU cycles wasted on swapping bytes.
Related
rrd
to protobuf #8993MsgPack
serializer #8994What
protobuf
encoding for our file format.msgpack
code from the repository.Tested by opening running the
clock
example and attempting to load the data in both the native and web viewers.There are still some pieces left in the file format that have now been made redundant. One example is the per-recording
compression
field. We don't use that anymore, compression is now specified perArrowMsg
. There are a few other things like that, but I believe that refactoring is better left to the time when we decide to get rid ofLogMsg
, which would be another breaking change for the file format.