Add "wait ring stability" to store-gateway and fix cold start issue #4271
Changes from all commits
69fd791
d5cb004
e6c6a1f
084556f
1f0ca99
c06ce07
7841a07
43ca5f2
827c8b9
4ca9139
6083de6
68cacc1
349afbb
05c4265
7c8af95
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -81,6 +81,14 @@ The store-gateway replication optionally supports [zone-awareness](../guides/zon | |
2. Enable blocks zone-aware replication via the `-store-gateway.sharding-ring.zone-awareness-enabled` CLI flag (or its respective YAML config option). Please be aware that this configuration option should be set on store-gateways, queriers and rulers. | ||
3. Roll out store-gateways, queriers and rulers to apply the new configuration | ||
|
||
### Waiting for stable ring at startup | ||
|
||
In the event of a cluster cold start, or a scale-up of 2 or more store-gateway instances at the same time, each new store-gateway instance may start at a slightly different time and therefore run its initial blocks sync based on a different state of the ring. For example, during a cold start the first store-gateway to join the ring may load all blocks, because the sharding logic runs against the current state of the ring, which contains a single store-gateway. | ||
Same as above. |
||
|
||
To reduce the likelihood of this happening, the store-gateway waits for a stable ring at startup. A ring is considered stable if no instance is added to or removed from the ring for at least `-store-gateway.sharding-ring.wait-stability-min-duration`. If the ring keeps changing after `-store-gateway.sharding-ring.wait-stability-max-duration`, the store-gateway stops waiting for a stable ring and proceeds to start up normally. | ||
|
||
To disable this waiting logic, you can start the store-gateway with `-store-gateway.sharding-ring.wait-stability-min-duration=0`. | ||
|
||
## Blocks index-header | ||
|
||
The [index-header](./binary-index-header.md) is a subset of the block index which the store-gateway downloads from the object storage and keeps on the local disk in order to speed up queries. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,8 +3,10 @@ package storegateway | |
import ( | ||
"context" | ||
|
||
"github.com/gogo/protobuf/types" | ||
"github.com/pkg/errors" | ||
"github.com/prometheus/prometheus/storage" | ||
"github.com/thanos-io/thanos/pkg/store/hintspb" | ||
"github.com/thanos-io/thanos/pkg/store/storepb" | ||
) | ||
|
||
|
@@ -19,6 +21,7 @@ type bucketStoreSeriesServer struct { | |
|
||
SeriesSet []*storepb.Series | ||
Warnings storage.Warnings | ||
Hints hintspb.SeriesResponseHints | ||
} | ||
|
||
func newBucketStoreSeriesServer(ctx context.Context) *bucketStoreSeriesServer { | ||
|
@@ -30,6 +33,13 @@ func (s *bucketStoreSeriesServer) Send(r *storepb.SeriesResponse) error { | |
s.Warnings = append(s.Warnings, errors.New(r.GetWarning())) | ||
} | ||
|
||
if rawHints := r.GetHints(); rawHints != nil { | ||
Is this change related?
Yes. It's just used by tests. I've added the ability to read hints too. |
||
// We expect only 1 hints entry so we just keep 1. | ||
if err := types.UnmarshalAny(rawHints, &s.Hints); err != nil { | ||
return errors.Wrap(err, "failed to unmarshal series hints") | ||
} | ||
} | ||
|
||
if recvSeries := r.GetSeries(); recvSeries != nil { | ||
// Thanos uses a pool for the chunks and may use other pools in the future. | ||
// Given we need to retain the reference after the pooled slices are recycled, | ||
|
Technically, shouldn't this be "greater than or equal to `replication_factor` store-gateway instances at the same time"?
It depends. For a cold start, yes (because if the number of replicas is <= RF then all replicas load all blocks). For the scale-up case, you may have RF=3 and scale up by +2, and this PR still improves it, because the 2 new replicas will not load extra blocks that they will no longer need once they are both ACTIVE in the ring (after the initial sync is completed).
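
For readers following the "wait ring stability" discussion, the waiting logic described in the docs change can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from this PR: `waitStable`, `sameMembers`, `runDemo`, the `listInstances` callback and the poll interval are all hypothetical names and parameters, and the ring is simulated by a function that returns instance IDs. The sketch treats a non-positive minimum duration as "waiting disabled", and returns an error once the maximum duration is exceeded so that the caller can log it and proceed with startup.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"time"
)

// errMaxWaitExceeded signals that the ring kept changing until maxWait elapsed.
// In this sketch the caller is expected to log it and continue starting up.
var errMaxWaitExceeded = errors.New("ring did not stabilize before max wait")

// sameMembers reports whether two lists hold the same set of instance IDs,
// regardless of order. Inputs are copied so the callers' slices are not mutated.
func sameMembers(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

// waitStable polls listInstances (a hypothetical stand-in for reading the ring)
// until membership has not changed for minStable, or until maxWait has elapsed.
func waitStable(ctx context.Context, listInstances func() []string, minStable, maxWait, poll time.Duration) error {
	if minStable <= 0 {
		return nil // Waiting disabled, mirroring min-duration=0 in the docs.
	}
	deadline := time.Now().Add(maxWait)
	last := listInstances()
	lastChange := time.Now()

	ticker := time.NewTicker(poll)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case now := <-ticker.C:
			curr := listInstances()
			if !sameMembers(last, curr) {
				// Membership changed: restart the stability window.
				last, lastChange = curr, now
			}
			if now.Sub(lastChange) >= minStable {
				return nil
			}
			if !now.Before(deadline) {
				return errMaxWaitExceeded
			}
		}
	}
}

// runDemo simulates a cold start: one more instance is visible on each of the
// first three reads, then the ring stops changing. It returns the number of
// reads performed and the result of waitStable.
func runDemo() (int, error) {
	calls := 0
	list := func() []string {
		calls++
		n := calls
		if n > 3 {
			n = 3
		}
		ids := make([]string, n)
		for i := range ids {
			ids[i] = fmt.Sprintf("store-gateway-%d", i)
		}
		return ids
	}
	err := waitStable(context.Background(), list, 30*time.Millisecond, 2*time.Second, 5*time.Millisecond)
	return calls, err
}

func main() {
	calls, err := runDemo()
	fmt.Printf("ring reads: %d, err: %v\n", calls, err)
}
```

In this simulation the stability window restarts each time a new instance appears, so the first instance does not run its initial sync against a ring of one. How the real implementation reads the ring and handles the timeout may differ from this sketch.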