-
Notifications
You must be signed in to change notification settings - Fork 661
feat: serve recording rules from config in tenant-settings #4299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
} | ||
|
||
type RecordingRules struct { | ||
cfg Config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused
pkg/settings/recording/recording.go
Outdated
return connect.NewResponse(&settingsv1.GetRecordingRuleResponse{Rule: r}), nil | ||
} | ||
} | ||
return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("no rule with id='%s' found", req.Msg.Id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case was considered connect.CodeInternal
error (in s.Get
). Rewrote this to be a 404 in case there's no storage/config rule hitting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah 404 feels correct
@@ -86,6 +86,9 @@ message RecordingRule { | |||
// This allows recording rules to focus on specific functions and calculate their "total" | |||
// resource usage. | |||
optional StacktraceFilter stacktrace_filter = 8; | |||
|
|||
// Provisioned rules are added by config and can't be Upsert or Deleted | |||
bool provisioned = 9; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should help UI to disable delete icon AND/OR to mark it as provisioned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this.
The main thing I like you to change before merging is what store takes priority.
We currently have it inconsistent and IMHO the overrides (or rules from config), should always win. As this allows us most operation flexibility.
pkg/settings/recording/recording.go
Outdated
res := &settingsv1.ListRecordingRulesResponse{ | ||
Rules: make([]*settingsv1.RecordingRule, 0, len(rules.Rules)), | ||
Rules: make([]*settingsv1.RecordingRule, 0, len(rulesFromStore.Rules)+len(rulesFromOverrides)), | ||
} | ||
for _, rule := range rules.Rules { | ||
for _, rule := range rulesFromStore.Rules { | ||
res.Rules = append(res.Rules, convertRuleToAPI(rule)) | ||
} | ||
res.Rules = append(res.Rules, rulesFromOverrides...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check if there is a conflict between a rule from the API and a rule from overrides.
I think rules from overrides should always "win". This would allow us to override a misbehaving rule fairly quickly without having to fire requests/go to the bucket.
So basically
- You have an API rule `the-killer-rule'
- At a later stage we add an overriding rule in the overrides called
the-killer-rule
, do nothing.
Now it behaves there would only be a provisioned the-killer-rule
, so the rule in the API store cannot be list/read/deleted/updated until we remove the override.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay as a recap I'll build a proper override system as follows:
- Rules coming from config may have an ID, this may be the same a one generated through API (UI).
- For consistency, rules coming from config without ID will be filled with a deterministic autogenerated ID.
- When calling Get, we will return overridden rules first, then API rules.
- When calling List, we will return rules with unique IDs, so we filter out any API rule that was overridden (this is important as this is the call the compaction-worker does).
- When calling Delete, we will only check for API rules (ignore overrides and straightly go to delete the OG id).
- When calling Upsert, we can do a similar logic as delete: only update API rules (ignoring overrides), and insert rules that doesn't repeat an ID in overrides.
The operational idea behind is:
- Given a broken rule, we can create an override. Then, we can fix it through bucket or Delete/Upsert operation.
- We'd lose the capability of call Delete/Upsert easily from the UI, as provisioned rules won't offer the delete 🗑️ icon. But we can still make a call by other means to fix/delete the rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this all sounds good to me. Thanks for writing this summary 👍
pkg/settings/recording/recording.go
Outdated
} | ||
return connect.NewResponse(res), nil | ||
|
||
// look for provisioned rules: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do this first and only return the one from the API store, when we do have no override.
pkg/settings/recording/recording.go
Outdated
return connect.NewResponse(&settingsv1.GetRecordingRuleResponse{Rule: r}), nil | ||
} | ||
} | ||
return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("no rule with id='%s' found", req.Msg.Id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah 404 feels correct
@@ -128,7 +155,7 @@ func (r *RecordingRules) DeleteRecordingRule(ctx context.Context, req *connect.R | |||
return nil, connect.NewError(connect.CodeInvalidArgument, fmt.Errorf("invalid request: %v", err)) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy the check from the Upsert method into here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted to leave upsert and delete as is. This way if a rule is overridden by config, we can still update/delete the original, so these two operations just ignore overrides and go directly to store
In this PR we provide tenant-settings the ability to load recording rules from config (apart from existing storage rules). So now it needs to handle rules from storage and rules from config. Note following properties:
I've detected some improvement points to do in a following PR (or maybe this one):
Not found
if requested rule Id is not present, it returns internal error instead.