Commit 753f3b8

Allow to override Alertmanager receivers firewall settings on a per-tenant basis (#4143)
* Allow to override Alertmanager receivers firewall settings on a per-tenant basis

  Signed-off-by: Marco Pracucci <[email protected]>

* Updated doc

  Signed-off-by: Marco Pracucci <[email protected]>

1 parent c1d5418 commit 753f3b8

12 files changed, +153 -101 lines changed

CHANGELOG.md

+3
@@ -3,6 +3,9 @@
 ## master / unreleased

 * [CHANGE] Querier / ruler: deprecated `-store.query-chunk-limit` CLI flag (and its respective YAML config option `max_chunks_per_query`) in favour of `-querier.max-fetched-chunks-per-query` (and its respective YAML config option `max_fetched_chunks_per_query`). The new limit specifies the maximum number of chunks that can be fetched in a single query from ingesters and long-term storage: the total number of actual fetched chunks could be 2x the limit, being independently applied when querying ingesters and long-term storage. #4125
+* [CHANGE] Alertmanager: allowed to configure the experimental receivers firewall on a per-tenant basis. The following CLI flags (and their respective YAML config options) have been changed and moved to the limits config section: #4143
+  - `-alertmanager.receivers-firewall.block.cidr-networks` renamed to `-alertmanager.receivers-firewall-block-cidr-networks`
+  - `-alertmanager.receivers-firewall.block.private-addresses` renamed to `-alertmanager.receivers-firewall-block-private-addresses`

 ## 1.9.0 in progress

docs/blocks-storage/production-tips.md

+4-2
@@ -114,5 +114,7 @@ If the Alertmanager API is enabled, users with access to Cortex can autonomously

 Despite hardening the system is out of the scope of Cortex, Cortex provides a basic built-in firewall to block connections created by Alertmanager receiver integrations:

-- `-alertmanager.receivers-firewall.block.cidr-networks`
-- `-alertmanager.receivers-firewall.block.private-addresses`
+- `-alertmanager.receivers-firewall-block-cidr-networks`
+- `-alertmanager.receivers-firewall-block-private-addresses`
+
+_These settings can also be overridden on a per-tenant basis via overrides specified in the [runtime config](../configuration/arguments.md#runtime-configuration-file)._

docs/configuration/config-file-reference.md

+12-14
@@ -1849,20 +1849,6 @@ The `alertmanager_config` configures the Cortex alertmanager.
 # CLI flag: -alertmanager.max-recv-msg-size
 [max_recv_msg_size: <int> | default = 16777216]

-receivers_firewall:
-  block:
-    # Comma-separated list of network CIDRs to block in Alertmanager receiver
-    # integrations.
-    # CLI flag: -alertmanager.receivers-firewall.block.cidr-networks
-    [cidr_networks: <string> | default = ""]
-
-    # True to block private and local addresses in Alertmanager receiver
-    # integrations. It blocks private addresses defined by RFC 1918 (IPv4
-    # addresses) and RFC 4193 (IPv6 addresses), as well as loopback, local
-    # unicast and local multicast addresses.
-    # CLI flag: -alertmanager.receivers-firewall.block.private-addresses
-    [private_addresses: <boolean> | default = false]
-
 # Shard tenants across multiple alertmanager instances.
 # CLI flag: -alertmanager.sharding-enabled
 [sharding_enabled: <boolean> | default = false]
@@ -4108,6 +4094,18 @@ The `limits_config` configures default and per-tenant limits imposed by Cortex s
 # override is set, the encryption context will not be provided to S3. Ignored if
 # the SSE type override is not set.
 [s3_sse_kms_encryption_context: <string> | default = ""]
+
+# Comma-separated list of network CIDRs to block in Alertmanager receiver
+# integrations.
+# CLI flag: -alertmanager.receivers-firewall-block-cidr-networks
+[alertmanager_receivers_firewall_block_cidr_networks: <string> | default = ""]
+
+# True to block private and local addresses in Alertmanager receiver
+# integrations. It blocks private addresses defined by RFC 1918 (IPv4
+# addresses) and RFC 4193 (IPv6 addresses), as well as loopback, local unicast
+# and local multicast addresses.
+# CLI flag: -alertmanager.receivers-firewall-block-private-addresses
+[alertmanager_receivers_firewall_block_private_addresses: <boolean> | default = false]
 ```

 ### `redis_config`

pkg/alertmanager/alertmanager.go

+32-13
@@ -45,6 +45,7 @@ import (
 	"github.com/prometheus/common/route"

 	"github.com/cortexproject/cortex/pkg/alertmanager/alertstore"
+	"github.com/cortexproject/cortex/pkg/util/flagext"
 	util_net "github.com/cortexproject/cortex/pkg/util/net"
 	"github.com/cortexproject/cortex/pkg/util/services"
 )
@@ -61,13 +62,13 @@ const (

 // Config configures an Alertmanager.
 type Config struct {
-	UserID            string
-	Logger            log.Logger
-	Peer              *cluster.Peer
-	PeerTimeout       time.Duration
-	Retention         time.Duration
-	ExternalURL       *url.URL
-	ReceiversFirewall FirewallConfig
+	UserID      string
+	Logger      log.Logger
+	Peer        *cluster.Peer
+	PeerTimeout time.Duration
+	Retention   time.Duration
+	ExternalURL *url.URL
+	Limits      Limits

 	// Tenant-specific local directory where AM can store its state (notifications, silences, templates). When AM is stopped, entire dir is removed.
 	TenantDataDir string
@@ -97,7 +98,6 @@ type Alertmanager struct {
 	wg sync.WaitGroup
 	mux *http.ServeMux
 	registry *prometheus.Registry
-	firewallDialer *util_net.FirewallDialer

 	// The Dispatcher is the only component we need to recreate when we call ApplyConfig.
 	// Given its metrics don't have any variable labels we need to re-use the same metrics.
@@ -151,10 +151,6 @@ func New(cfg *Config, reg *prometheus.Registry) (*Alertmanager, error) {
 		cfg: cfg,
 		logger: log.With(cfg.Logger, "user", cfg.UserID),
 		stop: make(chan struct{}),
-		firewallDialer: util_net.NewFirewallDialer(util_net.FirewallDialerConfig{
-			BlockCIDRNetworks: cfg.ReceiversFirewall.Block.CIDRNetworks,
-			BlockPrivateAddresses: cfg.ReceiversFirewall.Block.PrivateAddresses,
-		}),
 		configHashMetric: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
 			Name: "alertmanager_config_hash",
 			Help: "Hash of the currently loaded alertmanager configuration.",
@@ -326,7 +322,10 @@ func (am *Alertmanager) ApplyConfig(userID string, conf *config.Config, rawCfg s
 		return d + waitFunc()
 	}

-	integrationsMap, err := buildIntegrationsMap(conf.Receivers, tmpl, am.firewallDialer, am.logger)
+	// Create a firewall binded to the per-tenant config.
+	firewallDialer := util_net.NewFirewallDialer(newFirewallDialerConfigProvider(userID, am.cfg.Limits))
+
+	integrationsMap, err := buildIntegrationsMap(conf.Receivers, tmpl, firewallDialer, am.logger)
 	if err != nil {
 		return nil
 	}
@@ -507,3 +506,23 @@ func (p *NilPeer) AddState(string, cluster.State, prometheus.Registerer) cluster
 type NilChannel struct{}

 func (c *NilChannel) Broadcast([]byte) {}
+
+type firewallDialerConfigProvider struct {
+	userID string
+	limits Limits
+}
+
+func newFirewallDialerConfigProvider(userID string, limits Limits) firewallDialerConfigProvider {
+	return firewallDialerConfigProvider{
+		userID: userID,
+		limits: limits,
+	}
+}
+
+func (p firewallDialerConfigProvider) BlockCIDRNetworks() []flagext.CIDR {
+	return p.limits.AlertmanagerReceiversBlockCIDRNetworks(p.userID)
+}
+
+func (p firewallDialerConfigProvider) BlockPrivateAddresses() bool {
+	return p.limits.AlertmanagerReceiversBlockPrivateAddresses(p.userID)
+}
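
The provider added above forwards every call to `Limits` with the tenant ID bound in, rather than copying the values once. The following standalone sketch shows the consequence of that design: a caller that asks the provider each time sees a changed limit without the provider being rebuilt. `mutableLimits` and `configProvider` are hypothetical stand-ins, only one of the two `Limits` methods is mirrored, and whether the real `util_net.FirewallDialer` queries its provider on every dial is an assumption, since that package is not part of the hunks shown here.

```go
package main

import "fmt"

// Limits mirrors one method of the interface introduced in this commit.
type Limits interface {
	AlertmanagerReceiversBlockPrivateAddresses(user string) bool
}

// mutableLimits is a hypothetical in-memory Limits whose values can change
// at runtime, standing in for overrides reloaded from the runtime config.
type mutableLimits struct {
	blockPrivate map[string]bool
}

func (l *mutableLimits) AlertmanagerReceiversBlockPrivateAddresses(user string) bool {
	return l.blockPrivate[user]
}

// configProvider binds a tenant ID to Limits, in the same shape as
// firewallDialerConfigProvider in the diff.
type configProvider struct {
	userID string
	limits Limits
}

// BlockPrivateAddresses looks the value up on every call instead of caching it.
func (p configProvider) BlockPrivateAddresses() bool {
	return p.limits.AlertmanagerReceiversBlockPrivateAddresses(p.userID)
}

func main() {
	limits := &mutableLimits{blockPrivate: map[string]bool{}}
	provider := configProvider{userID: "tenant-a", limits: limits}

	fmt.Println("before override:", provider.BlockPrivateAddresses())

	// Simulate a per-tenant override being applied later.
	limits.blockPrivate["tenant-a"] = true
	fmt.Println("after override:", provider.BlockPrivateAddresses())
}
```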

pkg/alertmanager/api_test.go

+1-1
@@ -546,7 +546,7 @@ receivers:
 	// Create the Multitenant Alertmanager.
 	reg := prometheus.NewPedanticRegistry()
 	cfg := mockAlertmanagerConfig(t)
-	am, err := createMultitenantAlertmanager(cfg, nil, nil, alertStore, nil, log.NewNopLogger(), reg)
+	am, err := createMultitenantAlertmanager(cfg, nil, nil, alertStore, nil, nil, log.NewNopLogger(), reg)
 	require.NoError(t, err)
 	require.NoError(t, services.StartAndAwaitRunning(context.Background(), am))
 	defer services.StopAndAwaitTerminated(context.Background(), am) //nolint:errcheck

pkg/alertmanager/firewall.go

-26
This file was deleted.

pkg/alertmanager/multitenant.go

+23-11
@@ -101,12 +101,11 @@ func init() {

 // MultitenantAlertmanagerConfig is the configuration for a multitenant Alertmanager.
 type MultitenantAlertmanagerConfig struct {
-	DataDir           string           `yaml:"data_dir"`
-	Retention         time.Duration    `yaml:"retention"`
-	ExternalURL       flagext.URLValue `yaml:"external_url"`
-	PollInterval      time.Duration    `yaml:"poll_interval"`
-	MaxRecvMsgSize    int64            `yaml:"max_recv_msg_size"`
-	ReceiversFirewall FirewallConfig   `yaml:"receivers_firewall"`
+	DataDir        string           `yaml:"data_dir"`
+	Retention      time.Duration    `yaml:"retention"`
+	ExternalURL    flagext.URLValue `yaml:"external_url"`
+	PollInterval   time.Duration    `yaml:"poll_interval"`
+	MaxRecvMsgSize int64            `yaml:"max_recv_msg_size"`

 	// Enable sharding for the Alertmanager
 	ShardingEnabled bool `yaml:"sharding_enabled"`
@@ -159,7 +158,6 @@ func (cfg *MultitenantAlertmanagerConfig) RegisterFlags(f *flag.FlagSet) {

 	cfg.AlertmanagerClient.RegisterFlagsWithPrefix("alertmanager.alertmanager-client", f)
 	cfg.Persister.RegisterFlagsWithPrefix("alertmanager", f)
-	cfg.ReceiversFirewall.RegisterFlagsWithPrefix("alertmanager.receivers-firewall", f)
 	cfg.ShardingRing.RegisterFlags(f)
 	cfg.Store.RegisterFlags(f)
 	cfg.Cluster.RegisterFlags(f)
@@ -215,6 +213,17 @@ func newMultitenantAlertmanagerMetrics(reg prometheus.Registerer) *multitenantAl
 	return m
 }

+// Limits defines limits used by Alertmanager.
+type Limits interface {
+	// AlertmanagerReceiversBlockCIDRNetworks returns the list of network CIDRs that should be blocked
+	// in the Alertmanager receivers for the given user.
+	AlertmanagerReceiversBlockCIDRNetworks(user string) []flagext.CIDR
+
+	// AlertmanagerReceiversBlockPrivateAddresses returns true if private addresses should be blocked
+	// in the Alertmanager receivers for the given user.
+	AlertmanagerReceiversBlockPrivateAddresses(user string) bool
+}
+
 // A MultitenantAlertmanager manages Alertmanager instances for multiple
 // organizations.
 type MultitenantAlertmanager struct {
@@ -257,6 +266,8 @@ type MultitenantAlertmanager struct {
 	peer *cluster.Peer
 	alertmanagerClientsPool ClientsPool

+	limits Limits
+
 	registry prometheus.Registerer
 	ringCheckErrors prometheus.Counter
 	tenantsOwned prometheus.Gauge
@@ -266,7 +277,7 @@ type MultitenantAlertmanager struct {
 }

 // NewMultitenantAlertmanager creates a new MultitenantAlertmanager.
-func NewMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, store alertstore.AlertStore, logger log.Logger, registerer prometheus.Registerer) (*MultitenantAlertmanager, error) {
+func NewMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, store alertstore.AlertStore, limits Limits, logger log.Logger, registerer prometheus.Registerer) (*MultitenantAlertmanager, error) {
 	err := os.MkdirAll(cfg.DataDir, 0777)
 	if err != nil {
 		return nil, fmt.Errorf("unable to create Alertmanager data directory %q: %s", cfg.DataDir, err)
@@ -326,10 +337,10 @@ func NewMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, store alerts
 		}
 	}

-	return createMultitenantAlertmanager(cfg, fallbackConfig, peer, store, ringStore, logger, registerer)
+	return createMultitenantAlertmanager(cfg, fallbackConfig, peer, store, ringStore, limits, logger, registerer)
 }

-func createMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, fallbackConfig []byte, peer *cluster.Peer, store alertstore.AlertStore, ringStore kv.Client, logger log.Logger, registerer prometheus.Registerer) (*MultitenantAlertmanager, error) {
+func createMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, fallbackConfig []byte, peer *cluster.Peer, store alertstore.AlertStore, ringStore kv.Client, limits Limits, logger log.Logger, registerer prometheus.Registerer) (*MultitenantAlertmanager, error) {
 	am := &MultitenantAlertmanager{
 		cfg: cfg,
 		fallbackConfig: string(fallbackConfig),
@@ -341,6 +352,7 @@ func createMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, fallbackC
 		store: store,
 		logger: log.With(logger, "component", "MultiTenantAlertmanager"),
 		registry: registerer,
+		limits: limits,
 		ringCheckErrors: promauto.With(registerer).NewCounter(prometheus.CounterOpts{
 			Name: "cortex_alertmanager_ring_check_errors_total",
 			Help: "Number of errors that have occurred when checking the ring for ownership.",
@@ -877,7 +889,7 @@ func (am *MultitenantAlertmanager) newAlertmanager(userID string, amConfig *amco
 		ReplicationFactor: am.cfg.ShardingRing.ReplicationFactor,
 		Store: am.store,
 		PersisterConfig: am.cfg.Persister,
-		ReceiversFirewall: am.cfg.ReceiversFirewall,
+		Limits: am.limits,
 	}, reg)
 	if err != nil {
 		return nil, fmt.Errorf("unable to start Alertmanager for user %v: %v", userID, err)
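
Because `Limits` is now an interface passed into the constructors, a caller can supply any implementation that resolves values per tenant. The sketch below is one possible minimal implementation with global defaults and whole-struct per-tenant overrides. `tenantLimits` and `staticLimits` are hypothetical names, CIDRs are plain strings here instead of Cortex's `flagext.CIDR`, and the real overrides mechanism in Cortex may merge defaults differently than this simple fallback.

```go
package main

import "fmt"

// tenantLimits holds the two firewall settings for one tenant.
// CIDRs are kept as strings in this sketch; Cortex uses flagext.CIDR.
type tenantLimits struct {
	BlockCIDRNetworks     []string
	BlockPrivateAddresses bool
}

// staticLimits is a hypothetical implementation with global defaults and
// optional per-tenant overrides.
type staticLimits struct {
	defaults  tenantLimits
	overrides map[string]tenantLimits
}

// forUser returns the override for the tenant if one exists, else the defaults.
func (l staticLimits) forUser(user string) tenantLimits {
	if o, ok := l.overrides[user]; ok {
		return o
	}
	return l.defaults
}

func (l staticLimits) AlertmanagerReceiversBlockCIDRNetworks(user string) []string {
	return l.forUser(user).BlockCIDRNetworks
}

func (l staticLimits) AlertmanagerReceiversBlockPrivateAddresses(user string) bool {
	return l.forUser(user).BlockPrivateAddresses
}

func main() {
	limits := staticLimits{
		defaults: tenantLimits{BlockPrivateAddresses: true},
		overrides: map[string]tenantLimits{
			// This tenant may reach private addresses but not one specific network.
			"tenant-b": {BlockCIDRNetworks: []string{"192.0.2.0/24"}},
		},
	}
	for _, user := range []string{"tenant-a", "tenant-b"} {
		fmt.Printf("%s: private=%v cidrs=%v\n", user,
			limits.AlertmanagerReceiversBlockPrivateAddresses(user),
			limits.AlertmanagerReceiversBlockCIDRNetworks(user))
	}
}
```

Note that in this sketch an override replaces the defaults as a whole, so `tenant-b` does not inherit `BlockPrivateAddresses: true`.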
