-
Notifications
You must be signed in to change notification settings - Fork 927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement vectorized decompression for bools #7899
base: main
Are you sure you want to change the base?
Conversation
521c902
to
10771d9
Compare
6178b9a
to
96075ac
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7899 +/- ##
==========================================
+ Coverage 80.06% 81.85% +1.78%
==========================================
Files 190 250 +60
Lines 37181 46343 +9162
Branches 9450 11637 +2187
==========================================
+ Hits 29770 37935 +8165
- Misses 2997 3808 +811
- Partials 4414 4600 +186 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
d299691
to
00c9ffb
Compare
(.state_code == "XX000" | ||
and .error_severity != "LOG" | ||
and (.statement | contains("ignore this error in CI") | not)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to the change below. We do need to ignore this error message, because it is debug/test only and only used to verify the vectorized filtering works.
And yes, unrelated to the 'vectorized bool filtering'. It fixes a problem that we already had. I have no idea why it didn't cause issues before.
\set ON_ERROR_STOP 0 | ||
set timescaledb.debug_require_vector_qual to 'forbid'; | ||
select count(*) from vectorqual where metric4 > 4; | ||
select count(*), 'ignore this error in CI' as x from vectorqual where metric4 > 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI collects the error messages and is unhappy seeing the one generated by set timescaledb.debug_require_vector_qual to 'forbid';
because it thinks this is a new internal error message being introduced.
I was thinking either add this debug/test error message to the list of error messages or ignore it, as this is debug/test only. I think it is better to ignore it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this is because that code uses elog, i would prefer adjusting the code to not use elog and have proper sql state code instead of filtering this out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe ERRCODE_QUERY_CANCELED would be more appropriate here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the purpose of this check is that it should be impossible to trigger the ERRCODE_INTERNAL_ERROR during normal program operation. It is reserved for bugs or catalog corruption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE
, the ERRCODE_QUERY_CANCELED
is for operator intervention.
3c1d766
to
5fd8c2c
Compare
This change decompresses into BitArray and add the supporting code into Simple8bRleBitArray. With this the ArrowArray stores bits and not bytes. A new DecompressionType was added: DT_ArrowBits to inform the decompression logic about the data format. The TSL planner is also modified so it generates the vectorized decompression and filtering ops to take advantage of these. The Postgres planner creates plan nodes for boolean columns that we didn't handle before. For BoolCol = True it reduces the expression to a single Var, while for BoolCol = False it creates a BoolExpr(Not, Var) node. With this change we handle both cases.
5fd8c2c
to
9af059f
Compare
This change decompresses into BitArray and add the supporting code into Simple8bRleBitArray. With this the ArrowArray stores bits and not bytes. A new DecompressionType was added: DT_ArrowBits to inform the decompression logic about the data format.
The TSL planner is also modified so it generates the vectorized decompression and filtering ops to take advantage of these. The Postgres planner creates plan nodes for boolean columns that we didn't handle before. For BoolCol = True it reduces the expression to a single Var, while for BoolCol = False it creates a BoolExpr(Not, Var) node. With this change we handle both cases.