-
Notifications
You must be signed in to change notification settings - Fork 812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue where cached entry size keeps increasing when making tiny query repeatedly #3968
Fix issue where cached entry size keeps increasing when making tiny query repeatedly #3968
Conversation
…uery repeatedly Signed-off-by: Alvin Lin <[email protected]>
b66e70f
to
0cbc752
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on it! I reviewed the code: good job spotting the issue, but I'm not sure this is correct.
The problem is that the extends returned by handleHit()
overwrite the cached response. Now think about running a query over "last 24h", then running the same query over "last 1h". With the changes proposed in this PR, the second query on "last 1h" would overwrite the cached extends for the "previous 23h" while we don't want to loose them.
Spot on, I have another idea, will submit a an update soon. |
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
a34a6c7
to
eb9cf54
Compare
I decided to go with different approach to fix the issue; instead of modifying the One thing I am not quite certain is the following code in for _, res := range responses {
promResponses = append(promResponses, res.(*PrometheusResponse))
resultsCacheGenNumberHeaderValues = append(resultsCacheGenNumberHeaderValues, getHeaderValuesWithName(res, ResultsCacheGenNumberHeaderName)...)
} Is there any special handling needed if the given |
Signed-off-by: Alvin Lin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating on my feedback! Your change looks good to me. I left few nits I would be glad if you could take a look. In the meanwhile I will ping another person expert on the query-frontend for a 2nd look, cause this stuff is tricky :)
func chopOffOverlapPortion(samples []cortexpb.Sample, choppingPointTs int64) []cortexpb.Sample { | ||
|
||
// assuming stream.Samples is sorted by by timestamp in ascending order. | ||
searchResult := sort.Search(len(samples), func(i int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should never happen with the current logic but, to be on the safe side, I would handle the case sort.Search()
returns -1
(in this cause I believe we should the input samples
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, better to be more defensive. But I took a look at the doc https://golang.org/pkg/sort/#Search sort.Search()
never returns -1
it returns [0, n], where if return value is n
it means not found.
func main() {
x := []int{3, 5, 6, 7}
// both prints "4"
fmt.Println(sort.Search(len(x), func(i int) bool {return x[i] > 100}))
fmt.Println(sort.Search(len(x), func(i int) bool {return x[i] < 0}))
}
https://play.golang.org/p/HBtNWh3_MsR
But I think I get your point, I will add special handling if minTs
is smaller than samples[0]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice investigation and good job on the PR. This code is very tricky (took me a while to reacquaint myself with the code). The fix looks good to me once you address @pracucci's comments. I'm looking forward to revendoring this into Loki :)
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
fb51b56
to
4f3cb5e
Compare
Fix duplicate CHANGELOG.md entry. Signed-off-by: Alvin Lin <[email protected]>
eda1d61
to
68890c4
Compare
Signed-off-by: Alvin Lin <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Signed-off-by: Marco Pracucci <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙈
What this PR does:
See my comment at #3920 (comment)
Which issue(s) this PR fixes:
This issue was found as part of #3920, not sure if this PR fixes the issue though.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]