
Small optimization to avoid unnecessary Extents merge when growing cache entry #4026

Merged Apr 8, 2021 (7 commits)

Conversation

@alvinlin123 (Contributor) commented Mar 30, 2021

What this PR does:

Given two Extents that start at the same time but end at different times, it is better to place the Extent that ends later (the larger one) at the front of the slice when sorting by start time. Because the larger Extent "covers" the smaller one, the smaller Extent can simply be dropped during the merging phase.

If we put the smaller Extent before the larger Extent, then during the merge phase the algorithm needs to find the overlap point and merge the larger Extent into the smaller one, which is another O(log n) operation.

Before:

Extent 1: |---|
Extent 2: |----------------|

Merging the above results in finding intersection: 

Extent 1: |---|
Extent 2: |---x-------------|

Merged:|---|-------------| (first 3 sample points are from Extent 1, rest from Extent 2)

After:

Extent 2: |----------------|
Extent 1: |---|

Merging the above is easy, simply drop Extent 1.  

Merged: |----------------| (all sample points are from Extent 2)

Which issue(s) this PR fixes:
Didn't create an issue because it's a small optimization.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pstibrany (Contributor) left a comment

Nice, thanks. Is it worth adding a test for this to verify intended behavior?

@alvinlin123 (Contributor, Author) commented Mar 30, 2021

Nice, thanks. Is it worth adding a test for this to verify intended behavior?

I was pondering the same question, but I came to the conclusion that since this is a small optimization, as long as the existing tests pass the change is functionally correct, because it's not supposed to change any functional behaviour. In other words, I was thinking about how I could write a test that fails before this change but passes after it; I couldn't. If I could write a test that fails before this change, then something unrelated to this change would be wrong :-)

I also thought about refactoring the sort.Slice(...) call into a separate function and testing that function, but that kind of test seems to reach too far into the implementation details (testing too much of "how" rather than "what").

Let me know your thoughts on this as well :-)

@pstibrany (Contributor)

In other words, I was thinking about how I could write a test that fails before this change but passes after it; I couldn't.

I haven't looked into the details, but I was thinking that a unit test could verify that merging such extents returns the original big extent instead of two smaller ones?

@alvinlin123 alvinlin123 reopened this Mar 31, 2021
@alvinlin123 (Contributor, Author) commented Mar 31, 2021

I haven't looked into the details, but I was thinking that a unit test could verify that merging such extents returns the original big extent instead of two smaller ones?

Thanks for sharing your thoughts, I see your point. But to test this we might need to modify either Extent or handleHit, because the mergedExtents returned by handleHit() don't contain information about how things were merged. To the caller (the test), it's just a normal slice of Extents.

@pstibrany (Contributor) commented Mar 31, 2021

To the caller (the test), it's just a normal slice of Extents.

Oh, I thought that the test could check whether this slice is equal to the input slice (= the big one) or not (= a new merged slice), but slices in Go are not comparable. Perhaps it would be doable by constructing input slices with some big capacity, to be able to distinguish between them, or by checking the number of allocations (using https://golang.org/pkg/testing/#AllocsPerRun), which should be 0 if the input slice was returned unmodified.

@alvinlin123 force-pushed the sort-extends-optimized branch from d82315c to 67a714c on March 31, 2021 13:32
@alvinlin123 (Contributor, Author) commented Mar 31, 2021

or by checking the number of allocations (using https://golang.org/pkg/testing/#AllocsPerRun), which should be 0 if the input slice was returned unmodified.

Oh, this was interesting, I learned something new! But I did not go with this approach because I was afraid it would make the test brittle. However, your comments gave me an idea about how to test it; take a look at my new revision :-)

I confirm that the test fails in the absence of the optimization, with the error "response is nil".

@alvinlin123 (Contributor, Author)

@pstibrany any more comments on this PR?

@pracucci (Contributor) left a comment

Good job, LGTM. Thanks!

Signed-off-by: Marco Pracucci <[email protected]>
@pracucci (Contributor) commented Apr 8, 2021

Thanks a lot @alvinlin123 for working on this. The PR is good and I will merge it as soon as CI passes.

For the next optimization PR (I'm sure there will be many coming ;)) I would suggest testing the optimization with a benchmark (ideally an existing one, or a new one if none exists) in order to check its impact. 👏

@pracucci pracucci merged commit fc90bde into cortexproject:master Apr 8, 2021