-
Notifications
You must be signed in to change notification settings - Fork 812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize ingester Push() on blocks storage when the per-user or per-metric series limit is reached #3971
Optimize ingester Push() on blocks storage when the per-user or per-metric series limit is reached #3971
Conversation
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. FormatError
and global errors is not the idiomatic way of doing it, but using global error values helps to avoid allocation unless necessary. Good job!
pkg/ingester/ingester_v2.go
Outdated
// Caused by limits. | ||
if firstPartialErr == nil { | ||
firstPartialErr = ve | ||
if cause == errMaxSeriesPerUserLimitExceeded || cause == errMaxSeriesPerMetricLimitExceeded { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify the logic on lines 787 – 820 with:
updateFirstPartial := func(errFn func() error) {
if firstPartialErr == nil {
firstPartialErr = errFn()
}
}
switch cause {
case storage.ErrOutOfBounds:
sampleOutOfBoundsCount++
updateFirstPartial(func() error { return wrappedTSDBIngestErr(err, model.Time(s.TimestampMs), ts.Labels) })
continue
case storage.ErrOutOfOrderSample:
sampleOutOfOrderCount++
updateFirstPartial(func() error { return wrappedTSDBIngestErr(err, model.Time(s.TimestampMs), ts.Labels) })
continue
case storage.ErrDuplicateSampleForTimestamp:
newValueForTimestampCount++
updateFirstPartial(func() error { return wrappedTSDBIngestErr(err, model.Time(s.TimestampMs), ts.Labels) })
continue
case errMaxSeriesPerUserLimitExceeded:
perUserSeriesLimitCount++
updateFirstPartial(func() error { return makeLimitError(perUserSeriesLimit, i.limiter.FormatError(userID, cause)) })
continue
case errMaxSeriesPerMetricLimitExceeded:
perMetricSeriesLimitCount++
updateFirstPartial(func() error { return makeMetricLimitError(perMetricSeriesLimit, copiedLabels, i.limiter.FormatError(userID, cause)) })
continue
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
} | ||
|
||
var ve *validationError | ||
if errors.As(cause, &ve) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed because, following the code flow, I think the only errors that could be returned are per-user and per-metric series limit.
Signed-off-by: Marco Pracucci <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great job!
What this PR does:
Following up the PR #3969, in this PR I'm optimizing the
v2Push()
when an high volume of per-user/series limit is reached.In the profile I've seen much CPU and memory is consumed formatting the
Limiter
error. The idea in this PR is to return static errors fromLimiter
, offer aFormatError()
function and then format only the 1st partial error inv2Push()
.Metadata and chunks storage hasn't been changed.
Benchmark:
The benchmarks are on top of #3969 (I merged #3969 first and then worked on this PR):
Which issue(s) this PR fixes:
N/A
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]