Vectorized hash grouping #7316

Draft
wants to merge 335 commits into
base: main

codecov bot commented Oct 9, 2024

Codecov Report

Attention: Patch coverage is 84.62998% with 81 lines in your changes missing coverage. Please review.

Project coverage is 81.90%. Comparing base (59f50f2) to head (7c03e46).
Report is 730 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...odes/vector_agg/hashing/hash_strategy_serialized.c | 82.19% | 10 Missing and 24 partials ⚠️ |
| ...des/vector_agg/hashing/hash_strategy_single_text.c | 82.08% | 7 Missing and 17 partials ⚠️ |
| tsl/src/import/ts_simplehash.h | 89.43% | 3 Missing and 12 partials ⚠️ |
| .../src/nodes/vector_agg/hashing/hash_strategy_impl.c | 60.00% | 2 Missing and 4 partials ⚠️ |
| ...rc/nodes/vector_agg/hashing/batch_hashing_params.h | 90.00% | 0 Missing and 1 partial ⚠️ |
| tsl/src/nodes/vector_agg/plan.c | 66.66% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7316      +/-   ##
==========================================
+ Coverage   80.06%   81.90%   +1.83%     
==========================================
  Files         190      245      +55     
  Lines       37181    45285    +8104     
  Branches     9450    11315    +1865     
==========================================
+ Hits        29770    37089    +7319     
- Misses       2997     3730     +733     
- Partials     4414     4466      +52     


akuzm and others added 29 commits October 29, 2024 12:59
This commit has various assorted refactorings and cosmetic changes:
* Various cosmetic things I don't know where to put.
* The definitions of aggregate functions and grouping columns in the
  vector agg node are now typed arrays and not lists.
* The aggregate function implementations always work with at most one
  filter bitmap. This reduces the amount of code and will help to
  support the aggregate FILTER clauses.
* Parts of the aggregate function implementations are restructured and
  renamed in a way that will make it easier to support hash grouping.
* EXPLAIN output is added for vector agg node that mentions the grouping
  policy that is being used.
* datum key in serialized policy
* specializations for no scalar in serialized policy
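
The "at most one filter bitmap" point above can be illustrated with a minimal sketch. It is not the code from this pull request: the names `row_passes`, `combine_bitmaps`, `sum_int32_one_filter` and `demo_filtered_sum` are hypothetical, and the 64-bit-word bitmap layout with NULL meaning "all rows pass" is an assumption made for the example. The idea is that the caller ANDs any per-row bitmaps (for example a validity bitmap and a batch filter) into one, so the aggregate itself only ever handles a single optional filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if row i is set in a bitmap stored as
 * 64-bit words. A NULL bitmap is assumed to mean "all rows pass". */
static inline int
row_passes(const uint64_t *bitmap, size_t i)
{
	return bitmap == NULL || ((bitmap[i / 64] >> (i % 64)) & 1);
}

/* Combine two optional bitmaps into a single one, so that the aggregate
 * function only has to deal with at most one filter. The scratch buffer
 * must hold (n_rows + 63) / 64 words and is used only when both inputs
 * are present. */
static const uint64_t *
combine_bitmaps(const uint64_t *a, const uint64_t *b, uint64_t *scratch, size_t n_rows)
{
	if (a == NULL)
		return b;
	if (b == NULL)
		return a;

	assert(scratch != NULL);
	const size_t n_words = (n_rows + 63) / 64;
	for (size_t w = 0; w < n_words; w++)
		scratch[w] = a[w] & b[w];
	return scratch;
}

/* A SUM over int32 values that sees exactly one (possibly NULL) filter. */
static int64_t
sum_int32_one_filter(const int32_t *values, const uint64_t *filter, size_t n_rows)
{
	int64_t sum = 0;
	for (size_t i = 0; i < n_rows; i++)
	{
		if (row_passes(filter, i))
			sum += values[i];
	}
	return sum;
}

/* Usage: row 3 is marked invalid, and the batch filter keeps rows 0-5,
 * so rows 0, 1, 2, 4 and 5 are summed. */
static int64_t
demo_filtered_sum(void)
{
	const int32_t values[] = { 10, 20, 30, 40, 50, 60, 70, 80 };
	const uint64_t validity[] = { 0xF7 };     /* bit 3 cleared */
	const uint64_t batch_filter[] = { 0x3F }; /* bits 0-5 set */
	uint64_t scratch[1];

	const uint64_t *filter = combine_bitmaps(validity, batch_filter, scratch, 8);
	return sum_int32_one_filter(values, filter, 8);
}
```

With this shape, supporting an additional row-level condition such as an aggregate FILTER clause would only mean ANDing one more bitmap in the caller, while the aggregate loop stays unchanged.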
We didn't properly resolve INDEX_VARs in the output targetlist of
DecompressChunk nodes, which are present when it uses a custom scan
targetlist. Fix this by always working with the targetlist where these
variables are resolved to uncompressed chunk variables, like we do
during execution.
Co-authored-by: Erik Nordström <[email protected]>
Signed-off-by: Alexander Kuzmenkov <[email protected]>
akuzm and others added 30 commits February 27, 2025 13:05
The continuous aggregate incremental refresh test accidentally used the
now() function, which makes it fail. Replace it with fixed dates.
This case was handled incorrectly and led to a segfault when grouping by
multiple columns, one of which is a UUID segmentby column.
2 participants