-
Notifications
You must be signed in to change notification settings - Fork 930
Improve multi-column sorting for primitive arrays by avoiding multi-column sort #7532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
It gave me some rough code to work with and it's pretty promising: ~3x improvement for sorting two f32 colums!
|
Thinking out loud: an alternative would be improving the row format in such a way it more efficiently can create / sort fixed-width data (e.g. viewing the data as a |
I think the key here is avoiding a codegen explosion, a specialized fixed width row format sounds promising |
Another option would be to reuse existing |
I'd be wary of making arrow-ord depend on arrow-row, but there is definitely the possibility of copying the encoding mechanism used for integers and floats and using that to pack into u64 and u128 respectively. The complex logic in arrow-row is to handle variable width types, e.g. strings, which I suspect wouldn't be relevant here. |
Good to know - if there is considerable overlap it might be possible to extract it to some common place - we can start with just copying. |
Uh oh!
There was an error while loading. Please reload this page.
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
When sorting multiple columns, like two u32 columns, we take the approach of creating a
LexicalComparator
, and sort by that.However, this has higher overhead compared to sorting by a single primitive.
Describe the solution you'd like
We can try combining arrays (pack) into one or a smaller number of primitive arrays.
We can pack the items (based on sorting options) to the next biggest type:
(u16, u16) ->
u32
(u16, f32) ->
u64
(u8, u32) ->
u64
(u32, u32) ->
u64
(u64, u32) ->
u128
(u64, u64) ->
u128
...etc.
and sort on that. This is similar to the
RowFormat
, but should have much lower overhead of creating the arrays and sorting is faster on primitives compared to variable-sized data.Even when the columns can't be packed into one column, sorting can still be faster as it will create fewer columns to sort on.
I believe we can also extend this idea a bit further for larger types and more columns by converting the inputs into a fixed number of
u64
tuples / fixed array sizes, so the compiler can optimize that. E.g. for a 4 column sort on i64 types we can convert it toVec<[u64; 4]>
and sort based on this.Describe alternatives you've considered
Additional context
The text was updated successfully, but these errors were encountered: