Saner defaults for configs #2344

Merged
merged 6 commits into from
Mar 30, 2020
Changes from 3 commits
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -42,6 +42,16 @@
* [CHANGE] Frontend worker in querier now starts after all Querier module dependencies are started. This fixes an issue where the frontend worker started to send queries to the querier before it was ready to serve them (mostly visible when using experimental blocks storage). #2246
* [CHANGE] Lifecycler component now enters Failed state on errors, and doesn't exit the process. (Important if you're vendoring Cortex and use Lifecycler) #2251
* [CHANGE] `/ready` handler now returns 200 instead of 204. #2330
* [CHANGE] Better defaults for the following options:
- `-<prefix>.consul.consistent-reads`: Old default: `true`, new default: `false`. This reduces the load on Consul.
- `-<prefix>.consul.watch-rate-limit`: Old default: `0`, new default: `1`. This rate-limits reads to 1 per second, which is good enough for ring watches.
- `-distributor.health-check-ingesters`: Old default: `false`, new default: `true`.
- `-ingester.max-stale-chunk-idle`: Old default: `0`, new default: `2m`. This lets us expire series that we know are stale early.
- `-ingester.num-tokens`: Old default: `128`, new default: `512`. This allows for better load distribution in large clusters.
- `-ingester.spread-flushes`: Old default: `false`, new default: `true`. This allows better de-duplication of data and uses less space.
- `-<prefix>.memcached.batchsize`: Old default: `0`, new default: `1024`. This batches requests and keeps the number of concurrent requests low.
- `-<prefix>.memcached.consistent-hash`: Old default: `false`, new default: `true`. This improves cache hit rates when the memcached servers are scaled up and down.
- `-querier.batch-iterators`: Old default: `false`, new default: `true`.
* [FEATURE] Added experimental storage API to the ruler service, enabled when `-experimental.ruler.enable-api` is set to `true`. #2269
* `-ruler.storage.type` flag now allows `s3`,`gcs`, and `azure` values
* `-ruler.storage.(s3|gcs|azure)` flags exist to allow the configuration of object clients set for rule storage
14 changes: 7 additions & 7 deletions docs/configuration/config-file-reference.md
@@ -312,7 +312,7 @@ pool:

# Run a health check on each ingester client during periodic cleanup.
# CLI flag: -distributor.health-check-ingesters
[health_check_ingesters: <boolean> | default = false]
[health_check_ingesters: <boolean> | default = true]

ha_tracker:
# Enable the distributors HA tracker so that it can accept samples from
@@ -505,7 +505,7 @@ lifecycler:

# Number of tokens for each ingester.
# CLI flag: -ingester.num-tokens
[num_tokens: <int> | default = 128]
[num_tokens: <int> | default = 512]

# Period at which to heartbeat to consul.
# CLI flag: -ingester.heartbeat-period
@@ -560,7 +560,7 @@ lifecycler:
# flushing. 0 disables it and a stale series is not flushed until the
# max-chunk-idle timeout is reached.
# CLI flag: -ingester.max-stale-chunk-idle
[max_stale_chunk_idle_time: <duration> | default = 0s]
[max_stale_chunk_idle_time: <duration> | default = 2m0s]

# Timeout for individual flush operations.
# CLI flag: -ingester.flush-op-timeout
@@ -581,7 +581,7 @@ lifecycler:
# If true, spread series flushes across the whole period of
# -ingester.max-chunk-age.
# CLI flag: -ingester.spread-flushes
[spread_flushes: <boolean> | default = false]
[spread_flushes: <boolean> | default = true]

# Period with which to update the per-user ingestion rates.
# CLI flag: -ingester.rate-update-period
@@ -609,7 +609,7 @@ The `querier_config` configures the Cortex querier.
# Use batch iterators to execute query, as opposed to fully materialising the
# series in memory. Takes precedent over the -querier.iterators flag.
# CLI flag: -querier.batch-iterators
[batch_iterators: <boolean> | default = false]
[batch_iterators: <boolean> | default = true]

# Use streaming RPCs to query ingester.
# CLI flag: -querier.ingester-streaming
@@ -1927,7 +1927,7 @@ The `consul_config` configures the consul client. The supported CLI flags `<pref

# Enable consistent reads to Consul.
# CLI flag: -<prefix>.consul.consistent-reads
[consistent_reads: <boolean> | default = true]
[consistent_reads: <boolean> | default = false]

# Rate limit when watching key or prefix in Consul, in requests per second. 0
# disables the rate limit.
@@ -2251,7 +2251,7 @@ The `memcached_client_config` configures the client used to connect to Memcached

# Use consistent hashing to distribute to memcache servers.
# CLI flag: -<prefix>.memcached.consistent-hash
[consistent_hash: <boolean> | default = false]
[consistent_hash: <boolean> | default = true]
```

### `fifo_cache_config`
2 changes: 1 addition & 1 deletion pkg/chunk/cache/memcached_client.go
@@ -75,7 +75,7 @@ func (cfg *MemcachedClientConfig) RegisterFlagsWithPrefix(prefix, description st
f.IntVar(&cfg.MaxIdleConns, prefix+"memcached.max-idle-conns", 16, description+"Maximum number of idle connections in pool.")
f.DurationVar(&cfg.Timeout, prefix+"memcached.timeout", 100*time.Millisecond, description+"Maximum time to wait before giving up on memcached requests.")
f.DurationVar(&cfg.UpdateInterval, prefix+"memcached.update-interval", 1*time.Minute, description+"Period with which to poll DNS for memcache servers.")
f.BoolVar(&cfg.ConsistentHash, prefix+"memcached.consistent-hash", false, description+"Use consistent hashing to distribute to memcache servers.")
f.BoolVar(&cfg.ConsistentHash, prefix+"memcached.consistent-hash", true, description+"Use consistent hashing to distribute to memcache servers.")
}

// NewMemcachedClient creates a new MemcacheClient that gets its server list
8 changes: 8 additions & 0 deletions pkg/distributor/distributor_test.go
@@ -897,6 +897,14 @@ type mockIngester struct {
queryDelay time.Duration
}

func (i *mockIngester) Check(ctx context.Context, in *grpc_health_v1.HealthCheckRequest, opts ...grpc.CallOption) (*grpc_health_v1.HealthCheckResponse, error) {
return &grpc_health_v1.HealthCheckResponse{}, nil
}

func (i *mockIngester) Close() error {
return nil
}

func (i *mockIngester) Push(ctx context.Context, req *client.WriteRequest, opts ...grpc.CallOption) (*client.WriteResponse, error) {
i.Lock()
defer i.Unlock()
2 changes: 1 addition & 1 deletion pkg/ingester/client/pool.go
@@ -39,7 +39,7 @@ type PoolConfig struct {
// RegisterFlags adds the flags required to config this to the given FlagSet.
func (cfg *PoolConfig) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.ClientCleanupPeriod, "distributor.client-cleanup-period", 15*time.Second, "How frequently to clean up clients for ingesters that have gone away.")
f.BoolVar(&cfg.HealthCheckIngesters, "distributor.health-check-ingesters", false, "Run a health check on each ingester client during periodic cleanup.")
f.BoolVar(&cfg.HealthCheckIngesters, "distributor.health-check-ingesters", true, "Run a health check on each ingester client during periodic cleanup.")
}

// Pool holds a cache of grpc_health_v1 clients.
4 changes: 2 additions & 2 deletions pkg/ingester/ingester.go
@@ -82,10 +82,10 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.RetainPeriod, "ingester.retain-period", 5*time.Minute, "Period chunks will remain in memory after flushing.")
f.DurationVar(&cfg.FlushOpTimeout, "ingester.flush-op-timeout", 1*time.Minute, "Timeout for individual flush operations.")
f.DurationVar(&cfg.MaxChunkIdle, "ingester.max-chunk-idle", 5*time.Minute, "Maximum chunk idle time before flushing.")
f.DurationVar(&cfg.MaxStaleChunkIdle, "ingester.max-stale-chunk-idle", 0, "Maximum chunk idle time for chunks terminating in stale markers before flushing. 0 disables it and a stale series is not flushed until the max-chunk-idle timeout is reached.")
f.DurationVar(&cfg.MaxStaleChunkIdle, "ingester.max-stale-chunk-idle", 2*time.Minute, "Maximum chunk idle time for chunks terminating in stale markers before flushing. 0 disables it and a stale series is not flushed until the max-chunk-idle timeout is reached.")
f.DurationVar(&cfg.MaxChunkAge, "ingester.max-chunk-age", 12*time.Hour, "Maximum chunk age before flushing.")
f.DurationVar(&cfg.ChunkAgeJitter, "ingester.chunk-age-jitter", 20*time.Minute, "Range of time to subtract from -ingester.max-chunk-age to spread out flushes")
f.BoolVar(&cfg.SpreadFlushes, "ingester.spread-flushes", false, "If true, spread series flushes across the whole period of -ingester.max-chunk-age.")
f.BoolVar(&cfg.SpreadFlushes, "ingester.spread-flushes", true, "If true, spread series flushes across the whole period of -ingester.max-chunk-age.")
f.IntVar(&cfg.ConcurrentFlushes, "ingester.concurrent-flushes", 50, "Number of concurrent goroutines flushing to dynamodb.")
f.DurationVar(&cfg.RateUpdatePeriod, "ingester.rate-update-period", 15*time.Second, "Period with which to update the per-user ingestion rates.")
}
2 changes: 1 addition & 1 deletion pkg/querier/querier.go
@@ -62,7 +62,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&promql.LookbackDelta, "promql.lookback-delta", promql.LookbackDelta, "Time since the last sample after which a time series is considered stale and ignored by expression evaluations.")
}
f.BoolVar(&cfg.Iterators, "querier.iterators", false, "Use iterators to execute query, as opposed to fully materialising the series in memory.")
f.BoolVar(&cfg.BatchIterators, "querier.batch-iterators", false, "Use batch iterators to execute query, as opposed to fully materialising the series in memory. Takes precedent over the -querier.iterators flag.")
f.BoolVar(&cfg.BatchIterators, "querier.batch-iterators", true, "Use batch iterators to execute query, as opposed to fully materialising the series in memory. Takes precedent over the -querier.iterators flag.")
f.BoolVar(&cfg.IngesterStreaming, "querier.ingester-streaming", false, "Use streaming RPCs to query ingester.")
f.IntVar(&cfg.MaxSamples, "querier.max-samples", 50e6, "Maximum number of samples a single query can load into memory.")
f.DurationVar(&cfg.QueryIngestersWithin, "querier.query-ingesters-within", 0, "Maximum lookback beyond which queries are not sent to ingester. 0 means all queries are sent to ingester.")
2 changes: 1 addition & 1 deletion pkg/ring/kv/consul/client.go
@@ -63,7 +63,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet, prefix string) {
f.StringVar(&cfg.Host, prefix+"consul.hostname", "localhost:8500", "Hostname and port of Consul.")
f.StringVar(&cfg.ACLToken, prefix+"consul.acl-token", "", "ACL Token used to interact with Consul.")
f.DurationVar(&cfg.HTTPClientTimeout, prefix+"consul.client-timeout", 2*longPollDuration, "HTTP timeout when talking to Consul")
f.BoolVar(&cfg.ConsistentReads, prefix+"consul.consistent-reads", true, "Enable consistent reads to Consul.")
f.BoolVar(&cfg.ConsistentReads, prefix+"consul.consistent-reads", false, "Enable consistent reads to Consul.")
f.Float64Var(&cfg.WatchKeyRateLimit, prefix+"consul.watch-rate-limit", 0, "Rate limit when watching key or prefix in Consul, in requests per second. 0 disables the rate limit.")
f.IntVar(&cfg.WatchKeyBurstSize, prefix+"consul.watch-burst-size", 1, "Burst size used in rate limit. Values less than 1 are treated as 1.")
}
2 changes: 1 addition & 1 deletion pkg/ring/lifecycler.go
@@ -82,7 +82,7 @@ func (cfg *LifecyclerConfig) RegisterFlagsWithPrefix(prefix string, f *flag.Flag
prefix = "ingester."
}

f.IntVar(&cfg.NumTokens, prefix+"num-tokens", 128, "Number of tokens for each ingester.")
f.IntVar(&cfg.NumTokens, prefix+"num-tokens", 512, "Number of tokens for each ingester.")
f.DurationVar(&cfg.HeartbeatPeriod, prefix+"heartbeat-period", 5*time.Second, "Period at which to heartbeat to consul.")
f.DurationVar(&cfg.JoinAfter, prefix+"join-after", 0*time.Second, "Period to wait for a claim from another member; will join automatically after this.")
f.DurationVar(&cfg.ObservePeriod, prefix+"observe-period", 0*time.Second, "Observe tokens after generating to resolve collisions. Useful when using gossiping ring.")
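Review note on raising `num-tokens` from 128 to 512: the changelog attributes this to better load distribution in large clusters. A small simulation can make the effect visible. The sketch below places random tokens on a 32-bit ring, assigns each token the range back to the previous token, and measures how unevenly the ring is shared between ingesters. The `token` type, the `imbalance` and `averageImbalance` helpers, the ownership rule, the seed, and the cluster size of 10 are all assumptions for illustration and are not taken from the `ring` package.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// token is one point on the ring together with the ingester that owns it.
type token struct {
	value uint32
	owner int
}

// imbalance builds a ring of random tokens and returns
// (largest share - smallest share) / mean share across ingesters.
func imbalance(rng *rand.Rand, ingesters, tokensPer int) float64 {
	ring := make([]token, 0, ingesters*tokensPer)
	for ing := 0; ing < ingesters; ing++ {
		for t := 0; t < tokensPer; t++ {
			ring = append(ring, token{value: rng.Uint32(), owner: ing})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].value < ring[j].value })

	// Each token owns the range from the previous token up to itself.
	owned := make([]uint64, ingesters)
	prev := ring[len(ring)-1].value
	for _, tk := range ring {
		owned[tk.owner] += uint64(tk.value - prev) // uint32 subtraction wraps around the ring
		prev = tk.value
	}

	min, max := owned[0], owned[0]
	for _, o := range owned[1:] {
		if o < min {
			min = o
		}
		if o > max {
			max = o
		}
	}
	mean := float64(uint64(1)<<32) / float64(ingesters)
	return float64(max-min) / mean
}

// averageImbalance repeats the simulation to smooth out random noise.
func averageImbalance(seed int64, ingesters, tokensPer, trials int) float64 {
	rng := rand.New(rand.NewSource(seed))
	total := 0.0
	for i := 0; i < trials; i++ {
		total += imbalance(rng, ingesters, tokensPer)
	}
	return total / float64(trials)
}

func main() {
	for _, n := range []int{128, 512} {
		fmt.Printf("tokens=%d average imbalance=%.3f\n", n, averageImbalance(1, 10, n, 20))
	}
}
```

The trade-off is a larger ring descriptor to store and exchange, since each ingester now registers four times as many tokens.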