-
Notifications
You must be signed in to change notification settings - Fork 812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add api to list all alertmanager configs and rule groups #3529
Conversation
Thanks @AllenzhLi for your PR. @gotjosh could you do the first pass review, please? |
Hi @AllenzhLi, apologies for the delay on this and thank you for your contribution - I have it on agenda to review this soon. Before I do, could you please add at least an unit test to this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! A couple of quick things I would add:
- A CHANGELOG entry
- Documentation of new endpoints to
docs/api/_index.md
- Unit tests
465a3fb
to
e776b50
Compare
Signed-off-by: allenzhli <[email protected]>
The above is done. |
Signed-off-by: allenzhli <[email protected]>
Signed-off-by: allenzhli <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on it, and apologies for super-slow review cycle.
I've tried to test this code (built Cortex, and started Alertmanager), but unfortunately it doesn't work. Reason is that StreamWriteResponse
expects that response writer implements http.Flusher
, but Cortex uses response writer internally, which doesn't do that. In that case StreamWriteResponse
returns error, and never drains the channel (iter
), which causes goroutines trying to call Put
(send to channel) block forever.
Please do the following changes:
- remove
RespIterator
- change
StreamWriteResponse
to not requirehttp.Flusher
. If response implementshttp.Flusher
, then this method can callFlush
method, but it should not require it. - change functions passed to
concurrency.ForEachUser
in bothListAllConfigs
andListAllRules
as follows: Instead of callingiter.Put(data)
, use the following idiom (assumingiter
is now channel):
select {
case iter <- data:
case <-done:
}
This allows both ListAllConfigs
and ListAllRules
to stop early, if sending response has already finished.
- Instead of using
done <- struct{}{}
to signal that response has been sent, please useclose(done)
instead. The difference is that closed channel can be read repeatedly without blocking (it returns default value, and optional bool to indicate if read succeeded or channel was closed), while if we send empty struct to channel, this can be only read once -- perhaps by the select as shown above, but then additional read<-done
at the end ofListAllConfigs
/ListAllRules
would never end.
Signed-off-by: allenzhli <[email protected]>
Only response implements If no Do you have any other suggestions? |
This sounds very suspicious. It’s perfectly valid to call |
Signed-off-by: allenzhli <[email protected]>
Signed-off-by: allenzhli <[email protected]>
Signed-off-by: allenzhli <[email protected]>
You are right. I have make a misstake. The new commit have implement stream write yaml to I also update I have make a three tenants alertmanger test, and use chrome access the
|
Just now, I have test one hundred tenants alertmanager instance, work well. When I increase the number of instance to three hundred, get err My test env is 8 core, 16G mem. Seems the problem is the |
When I update the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing my feedback. I have few more comments, but this PR is almost ready now.
Signed-off-by: allenzhli <[email protected]>
Signed-off-by: allenzhli <[email protected]>
Latest changes added two extra files:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is in a good state now (after fixing extra files and typos).
Thank you very much for working on this!
Signed-off-by: allenzhli <[email protected]>
Signed-off-by: allenzhli <[email protected]>
Aleady fixing extra files and typos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! LGTM modulo few last comments. Once addressed we can merge. Thanks for your patience! 🙏 🙇
Co-authored-by: Marco Pracucci <[email protected]> Update CHANGELOG.md Co-authored-by: Marco Pracucci <[email protected]> Update pkg/ruler/ruler_test.go Co-authored-by: Marco Pracucci <[email protected]> Update pkg/util/http_test.go Co-authored-by: Marco Pracucci <[email protected]> Update pkg/alertmanager/api_test.go Co-authored-by: Marco Pracucci <[email protected]> Update pkg/alertmanager/api_test.go Co-authored-by: Marco Pracucci <[email protected]> Update pkg/alertmanager/api.go Co-authored-by: Marco Pracucci <[email protected]> fix review comments Signed-off-by: allenzhli <[email protected]>
Thank you very much for your detailed review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to patiently address our feedback! ❤️ 🙏
What this PR does:
add api to list all user alertmangaer configs
Which issue(s) this PR fixes:
Fixes #3530
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]