[Enhancement] Removes reserve() from array_agg(). (backport #56958) #57009

Merged
merged 1 commit into from
Mar 17, 2025

mergify[bot]
Contributor

@mergify mergify bot commented Mar 17, 2025

Why I'm doing:

While debugging a query that took a few minutes, we traced the slowness to skewed data, but it was not clear why the skew caused such high latency.
Further investigation showed that the cause was unnecessary reserve() calls in array_agg().

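Once a large array had been appended to the result column for one aggregated row, each subsequent finalize_to_column() call took a few ms even when it appended only a small number of elements, most likely because it had to allocate new memory and copy the existing values. Across thousands of aggregated rows, a few ms per call added up to a lot of latency.
Repeatedly calling reserve() with small increases is harmful: each call typically requests a capacity only slightly larger than the current one, so it can force a reallocation and a full copy every time. Without reserve(), the subsequent append calls let std::vector grow its capacity exponentially, so reallocations become rare and the appends stay cheap on average.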
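
To illustrate the effect described above, here is a minimal standalone sketch. It is not the StarRocks code: `count_reallocations` is a hypothetical helper, and a plain `std::vector<int>` stands in for the result column. It counts how often the vector's capacity changes while small batches are appended to an already large vector, with and without a `reserve()` before each batch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only, not taken from the PR).
// Appends `batches` batches of `batch_size` ints to `column` and returns how
// many times the capacity changed, i.e. how many reallocations took place.
std::size_t count_reallocations(std::vector<int>& column, std::size_t batches,
                                std::size_t batch_size, bool reserve_each_batch) {
    assert(batch_size > 0);
    std::size_t reallocations = 0;
    std::size_t capacity = column.capacity();

    // Record a reallocation whenever the capacity differs from the last one seen.
    auto note_capacity = [&]() {
        if (column.capacity() != capacity) {
            ++reallocations;
            capacity = column.capacity();
        }
    };

    for (std::size_t b = 0; b < batches; ++b) {
        if (reserve_each_batch) {
            // The pattern the PR removes: reserve just enough for the next batch.
            column.reserve(column.size() + batch_size);
            note_capacity();
        }
        for (std::size_t i = 0; i < batch_size; ++i) {
            // Without reserve(), push_back grows the capacity geometrically.
            column.push_back(static_cast<int>(i));
            note_capacity();
        }
    }
    return reallocations;
}
```

For example, starting from a vector of 100000 elements and appending 1000 batches of 4 elements, common standard library implementations reallocate about once per batch when `reserve_each_batch` is true, because `reserve()` usually allocates exactly the requested capacity. With it set to false, the same appends need only a reallocation or two. The exact counts depend on the standard library, so treat the numbers as indicative.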

What I'm doing:

Removes reserve().

In a test on skewed data, the latency of a query with array_agg() decreased from 2m45s to 21s.

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

@wanpengfei-git wanpengfei-git enabled auto-merge (squash) March 17, 2025 21:16
@wanpengfei-git wanpengfei-git merged commit c9272f3 into branch-3.3 Mar 17, 2025
36 checks passed
@wanpengfei-git wanpengfei-git deleted the mergify/bp/branch-3.3/pr-56958 branch March 17, 2025 21:39

3 participants