Allow merging primitive dictionary values in concat and interleave kernels #7518

Open
asubiotto opened this issue May 16, 2025 · 0 comments · May be fixed by #7519
Labels
enhancement: Any new improvement worthy of an entry in the changelog

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, dictionary values are merged only if they are Utf8, LargeUtf8, Binary, or LargeBinary. For every other value type, the match that builds the pointer-equality closure in should_merge_dictionary_values falls through to its default arm, which returns false from the enclosing function:

pub fn should_merge_dictionary_values<K: ArrowDictionaryKeyType>(
    dictionaries: &[&DictionaryArray<K>],
    len: usize,
) -> bool {
    use DataType::*;
    let first_values = dictionaries[0].values().as_ref();
    let ptr_eq: Box<PtrEq> = match first_values.data_type() {
        Utf8 => Box::new(bytes_ptr_eq::<Utf8Type>),
        LargeUtf8 => Box::new(bytes_ptr_eq::<LargeUtf8Type>),
        Binary => Box::new(bytes_ptr_eq::<BinaryType>),
        LargeBinary => Box::new(bytes_ptr_eq::<LargeBinaryType>),
        _ => return false,

We've observed unnecessarily high memory usage when concatenating dictionaries with primitive values.
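To illustrate where the extra memory comes from, here is a minimal, std-only sketch. It does not use the arrow crate: `Dict`, `concat_unmerged`, and `concat_merged` are hypothetical stand-ins, and the unmerged path is assumed to simply append each input's values and offset its keys.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary array with i64 values:
/// logical element `i` is `values[keys[i]]`.
#[derive(Debug, Clone, PartialEq)]
pub struct Dict {
    pub keys: Vec<usize>,
    pub values: Vec<i64>,
}

impl Dict {
    /// Decode the dictionary back into plain values.
    pub fn decode(&self) -> Vec<i64> {
        self.keys.iter().map(|&k| self.values[k]).collect()
    }
}

/// Concatenate without merging: every input's values are appended
/// as-is and its keys are shifted by the running values offset.
pub fn concat_unmerged(dicts: &[Dict]) -> Dict {
    let mut keys = Vec::new();
    let mut values = Vec::new();
    for d in dicts {
        let offset = values.len();
        keys.extend(d.keys.iter().map(|&k| k + offset));
        values.extend_from_slice(&d.values);
    }
    Dict { keys, values }
}

/// Concatenate with merging: each distinct value is stored once
/// and keys are remapped to the deduplicated values.
pub fn concat_merged(dicts: &[Dict]) -> Dict {
    let mut keys = Vec::new();
    let mut values = Vec::new();
    let mut index: HashMap<i64, usize> = HashMap::new();
    for d in dicts {
        for &k in &d.keys {
            let v = d.values[k];
            let new_key = *index.entry(v).or_insert_with(|| {
                values.push(v);
                values.len() - 1
            });
            keys.push(new_key);
        }
    }
    Dict { keys, values }
}
```

With two inputs that each hold the values `[10, 20]`, the unmerged result stores four values while the merged result stores two, and both decode to the same sequence. The gap grows with the number of inputs that repeat the same values.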

Describe the solution you'd like

should_merge_dictionary_values should be able to return true for other value types as well, starting with primitives, instead of always returning false for them.
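One possible shape for a primitive counterpart to the byte-array pointer check, sketched with std types only. This is an assumption about what such a check could look like, not the arrow-rs implementation: `primitive_ptr_eq` and `all_share_values` are hypothetical names, and `Arc<[T]>` stands in for a shared values buffer.

```rust
use std::sync::Arc;

/// Hypothetical check: two value slices are "pointer equal" when they
/// start at the same address and have the same length.
pub fn primitive_ptr_eq<T>(a: &[T], b: &[T]) -> bool {
    a.as_ptr() == b.as_ptr() && a.len() == b.len()
}

/// Returns true if every dictionary in the slice refers to the same
/// underlying values buffer, in which case there is nothing to merge.
pub fn all_share_values<T>(values: &[Arc<[T]>]) -> bool {
    values
        .windows(2)
        .all(|pair| primitive_ptr_eq(&*pair[0], &*pair[1]))
}
```

Cloning an `Arc` shares the buffer, so clones compare as pointer equal, while two separately allocated buffers with identical contents do not. Only the second case would benefit from merging.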

@asubiotto added the enhancement label on May 16, 2025