Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing Stale alert manager templates files #4495

Merged
merged 5 commits into from
Oct 12, 2021

Conversation

alanprot
Copy link
Member

@alanprot alanprot commented Sep 23, 2021

What this PR does:

Currently, if we remove a template file from the alert manager config, this file will continue to be dangling in the filesystem. This PR just introduce a code to remove those dangling files.
Leaving these files there is problematic as removing some files from the config will not trigger the AlertManager reload - as the hasChanged will not be true - or when we are using patterns as below:

template_files:
  first.tpl: |
    {{ define "__alertmanager" }}AlertManager{{ end }}
    {{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver | urlquery }}{{ end }}
  second.tpl: |
    {{ define "__alertmanager" }}AlertManager{{ end }}
    {{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver | urlquery }}{{ end }}
alertmanager_config: |
  global:
    smtp_smarthost: 'localhost:25'
    smtp_from: '[email protected]'
  templates:
    - '*.tpl'
  route:
    receiver: example-email
  receivers:
    - name: example-email
      email_configs:
      - to: '[email protected]'

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • [NA] Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@alanprot alanprot force-pushed the remove-stale-template-file branch 5 times, most recently from e9a7172 to ca09f05 Compare September 24, 2021 00:03
Copy link
Contributor

@alvinlin123 alvinlin123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems ok; I'm not very familiar with this feature.
Is there any danger that there could be a file there for some other reason?
Could someone remove a template, then decide to use it again but you deleted it?

var userTemplateDir = filepath.Join(am.getTenantDirectory(cfg.User), templatesDir)
var filesToRemove = make(map[string]string)

if oldTemplateFiles, err := ioutil.ReadDir(userTemplateDir); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a comment like "first we list all the files, then we will remove those from the list that are in use, then we will remove the files that are never used".

Copy link
Member Author

@alanprot alanprot Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any danger that there could be a file there for some other reason?

I dont think so. Cortex expose an api to configure alert manager for a tenant and this API receives the template files and the prometheus AlertManager configuration on the same request. On a regular prometheus alert manager, the config file are stored in the filesystem and referenced by the configuration. Cortex implementation receives both on the same request, drops the templates files on the filesystem and reference them on the alert manager config. The folder where cortex drops the file is exclusive for a given tenant/alert manager.

Could someone remove a template, then decide to use it again but you deleted it?

As we configure the templates and the alert manager on the same request, we should never have a alert manager configuration that uses a template that is not defined in the same request.

I think this deserves a comment like "first we list all the files, then we will remove those from the list that are in use, then we will remove the files that are never used".

Make sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comments

Copy link
Contributor

@stevesg stevesg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a minor suggestion.

@alanprot alanprot force-pushed the remove-stale-template-file branch from f33bf7d to 412be4a Compare October 4, 2021 17:15
alanprot and others added 4 commits October 4, 2021 10:17
Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>
Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>
Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>
@alanprot alanprot force-pushed the remove-stale-template-file branch from 412be4a to c408466 Compare October 4, 2021 17:18
Signed-off-by: Alan Protasio <[email protected]>
@alanprot alanprot force-pushed the remove-stale-template-file branch from c408466 to 1b31f5a Compare October 4, 2021 17:24
@alanprot
Copy link
Member Author

alanprot commented Oct 4, 2021

@stevesg done! :D

Copy link
Contributor

@stevesg stevesg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks good to me.

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@alvinlin123 alvinlin123 merged commit 464c424 into cortexproject:master Oct 12, 2021
srijan55 pushed a commit to srijan55/cortex that referenced this pull request Nov 26, 2021
* Removing Stale alert manager templates files

Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Address comments

Signed-off-by: Alan Protasio <[email protected]>

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Manish Kumar Gupta <[email protected]>
alvinlin123 pushed a commit to ac1214/cortex that referenced this pull request Jan 14, 2022
* Removing Stale alert manager templates files

Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Update pkg/alertmanager/multitenant.go

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alan Protasio <[email protected]>

* Address comments

Signed-off-by: Alan Protasio <[email protected]>

Co-authored-by: Steve Simpson <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants