> I did try signed->unsigned for 32 and 64 bit ints and there was no difference.
Ahh, the reason for this is that I32/64->U32/64 is handled above (around L171). I would think anything that falls through and relies on `arrow_cast::cast` is going to be potentially slow due to use of `unary_opt`, but a quick glance at the decimal code looks like it will figure out which casts are infallible and use `unary` instead. Perhaps other conversions do a similar optimization.
It might be worth exploring enumerating all of the allowed Parquet physical to logical type mappings and accounting for them here rather than relying on the `arrow_cast` machinery.
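To make the `unary_opt` versus `unary` distinction concrete, here is a minimal std-only sketch. It does not use the arrow crates. The function names are hypothetical, and `Option` stands in for the per-value validity tracking that a fallible kernel has to carry.

```rust
/// Fallible path (the shape of a `unary_opt`-style kernel): every value goes
/// through a checked conversion, and a failure becomes a "null" (None here).
fn cast_checked(values: &[i32]) -> Vec<Option<u32>> {
    values.iter().map(|&v| u32::try_from(v).ok()).collect()
}

/// Infallible path (the shape of a `unary`-style kernel): i32 -> i64 cannot
/// fail, so there is no per-value check and no extra validity bookkeeping.
fn widen_infallible(values: &[i32]) -> Vec<i64> {
    values.iter().map(|&v| i64::from(v)).collect()
}

/// Reinterpreting path: assumes the physical INT32 already holds the bit
/// pattern of an unsigned 32-bit logical value, so the conversion is a plain
/// bit-preserving `as` cast.
fn reinterpret_as_unsigned(values: &[i32]) -> Vec<u32> {
    values.iter().map(|&v| v as u32).collect()
}
```

With these definitions, `cast_checked(&[1, -1])` yields `[Some(1), None]`, while `reinterpret_as_unsigned(&[-1])` yields `[u32::MAX]`. If the set of physical to logical mappings is enumerated up front, each one can be routed to the cheapest of these shapes.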
I wonder if it is actually possible to avoid the cast (or some of the conversions) and do the conversion when building the value buffer in the reader? @etseidl
This should avoid a separate copy / conversion (and for smaller types also reduces memory usage somewhat).
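A rough std-only sketch of that idea, assuming PLAIN-encoded INT32 values (4 little-endian bytes each) whose logical type is an 8-bit signed integer. Both function names are hypothetical and this is not the reader's actual code path.

```rust
/// Two-step approach: decode into an i32 buffer, then cast into a second
/// buffer. This allocates 4 bytes per value for the intermediate buffer.
fn decode_then_cast(plain: &[u8]) -> Vec<i8> {
    let decoded: Vec<i32> = plain
        .chunks_exact(4)
        .map(|b| i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect();
    // Truncating cast; assumes the writer only stored values that fit in i8.
    decoded.iter().map(|&v| v as i8).collect()
}

/// Fused approach: convert each value while building the output buffer, so
/// the only allocation is the final 1-byte-per-value buffer.
fn decode_into_target(plain: &[u8]) -> Vec<i8> {
    plain
        .chunks_exact(4)
        .map(|b| i32::from_le_bytes([b[0], b[1], b[2], b[3]]) as i8)
        .collect()
}
```

Both functions return the same values. The fused version skips the intermediate `Vec<i32>`, which is where the saved copy and the lower peak memory for smaller types would come from.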
Originally posted by @etseidl in #7055 (comment)