-
Notifications
You must be signed in to change notification settings - Fork 451
Tagged columnar updates: Rust #8764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/// Note that this API ignores any stateful time set on the log stream via the | ||
/// [`Self::set_timepoint`]/[`Self::set_time_nanos`]/etc. APIs. | ||
/// Furthermore, this will _not_ inject the default timelines `log_tick` and `log_time` timeline columns. | ||
pub fn send_columns_v2( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll rename this to send_columns
and remove all the legacy stuff in a follow-up, I don't want the diffs to be unnecessarily hard to follow.
92b66c3
to
434d32c
Compare
Web viewer built successfully. If applicable, you should also test it:
Note: This comment is updated whenever you push a commit. |
Ha, interestingly this won't work with I'll auto-generate plural methods for these instances in a follow-up. |
Related: I also intend to introduce a |
/// This generates a correctly sized [`SerializedComponentColumn`] for a given indicator, where the | ||
/// specified `num_rows` value represents the number of rows in the column. | ||
#[doc(hidden)] // public so we can access it from re_types too | ||
pub fn indicator_column<A: Archetype>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be greatly simplified, and much cheaper as well -> indicators are straight null arrays these days, no lists involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does mean making SerializedComponentColumn
more complicated just for the sake of indicators, i.e. legacy...
Annoying, but there isn't really a choice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just so I get this right, it would be more complicated because you have to keep the round count separate from the arrow data? (whereas right now you can just check the list array length)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm yeah we'd have the same problem in C++ as well (in Python all the indicator handling happens on the Rust side anyways iirc)
Is it really worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(re Cpp: ComponentColumn
there has a from_indicators
which does pretty much what we're doing here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more complicated because SerializedComponentColumn
would have to carry an erased Array instead, and have two paths for everything. I don't like it.
I'm gonna investigate an alternative: have send_columns_v2
automagically craft indicator columns as required on the fly. I don't like it either but objectively it is far less invasive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh god, this actually is an already existing problem, indicators are wrapped in a ListArray
when put into chunks (probably to avoid adding a ton of complexity for a dead feature) 🤦.
That means:
- Indicators take 4 bytes per row as of today (the outer list-array's offsets).
- What I did is actually the right thing to do, and it makes me very sad.
Man, indicators really gotta go...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, there might be a win to have here: automagically craft the indicators in send_columns_v2
, but then log them in their own independent chunk.
And while we're doing that, we should check if the same can be done with the absolute minimal efforts in the batcher, which would fix the existing problem for vanilla log calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Goes without saying, this will have to happen in yet another follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until (if) we get to that, I will need to introduce a hardcoded exception to not automatically generate indicators for scalars, otherwise that's going to be a pretty big regression compared to today (we guide users towards not sending an indicator in the exist column-oriented scalar examples).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very happy you found an ergonomic way without generating tons of with_
methods again. I like the new-new design you landed at :) 👍
I ended up with something slightly different than the original design, because it turned out to really not feel that great in practice.
In particular, it required duplicating a lot of generated code for little (no?) added value.
In short:
.columns
becomes a vanilla method rather than a static method.send_column
api #7167