
Store-gateway blocks resharding during rollout #2823

Closed
pracucci opened this issue Jul 1, 2020 · 9 comments
Labels: stale, storage/blocks (Blocks storage engine)

@pracucci
Contributor

pracucci commented Jul 1, 2020

When running the blocks storage, store-gateways reshard blocks whenever the ring topology changes. This means that during a rollout of the store-gateways (e.g. deploying a config change or a version upgrade), blocks are resharded across instances.

This is highly inefficient in a cluster with a large number of tenants or a few very large tenants. Ideally, no blocks resharding should occur during a rollout (if the blocks replication factor is > 1).

Rollouts

We could improve the system to avoid the blocks resharding when the following conditions are met:

  • -experimental.store-gateway.replication-factor is > 1 (so that while a store-gateway restarts, all of its blocks are replicated to at least one other instance)
  • -experimental.store-gateway.tokens-file-path is configured (so that previous tokens are picked up on restart)
  • The store-gateway instance ID is stable across restarts (e.g. Kubernetes StatefulSets)

To avoid blocks resharding during a store-gateways rollout, the restarting store-gateway instance must not be unregistered from the ring during the restart.

When a store-gateway shuts down, the instance could be left in the LEAVING state within the ring, and we could change BlocksReplicationStrategy.ShouldExtendReplicaSet() to not extend the replica set if an instance is in the LEAVING state.
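
For reference, a minimal sketch of the proposed check. The method name comes from this issue; the state type and signature here are local stand-ins, not the actual Cortex ring API:

```go
package main

import "fmt"

// InstanceState mirrors the ring states discussed in this issue
// (local stand-in, not the actual Cortex ring types).
type InstanceState int

const (
	ACTIVE InstanceState = iota
	JOINING
	LEAVING
	PENDING
)

// shouldExtendReplicaSet sketches the proposed change to
// BlocksReplicationStrategy.ShouldExtendReplicaSet(): stop extending
// the replica set when an instance is LEAVING, so a restarting
// store-gateway keeps its blocks assigned to it while it is down.
func shouldExtendReplicaSet(state InstanceState) bool {
	switch state {
	case ACTIVE:
		return false // healthy instance: no extension needed
	case LEAVING:
		return false // proposed: restarting instance, don't reshard its blocks
	default:
		return true // JOINING, PENDING: extend to a healthy replica
	}
}

func main() {
	fmt.Println(shouldExtendReplicaSet(LEAVING)) // false -> no resharding during rollout
}
```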

This means that during the rollout, for the blocks held by the restarting replica there will be N-1 replicas (instead of the N configured replicas). Once the instance restarts, it will have the same instance ID and the same tokens (assuming tokens-file-path is configured) and thus will switch its state from LEAVING to JOINING within the ring.

Scale down

There's no way to distinguish between a rollout and a scale down: the process just receives a termination signal.

This means that during a scale down, the instance would be left in the LEAVING state within the ring. However, the store-gateway has an auto-forget feature which removes unhealthy instances after 10x the heartbeat timeout (default: 1m timeout = 10m before an unhealthy instance is forgotten).

A scale down of fewer instances than the replication factor could leverage the auto-forget feature. However, there's no easy way to have a smooth scale down unless we have a way to signal the process whether it's shutting down because of a scale down or a rollout.
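
For reference, a minimal sketch of the auto-forget heuristic described above (local stand-ins, not the actual Cortex implementation):

```go
package main

import (
	"fmt"
	"time"
)

// forgetPeriod is 10x the heartbeat timeout, as described above:
// with the default 1m timeout, an unhealthy instance is forgotten
// after 10m.
func forgetPeriod(heartbeatTimeout time.Duration) time.Duration {
	return 10 * heartbeatTimeout
}

// shouldForget sketches the auto-forget check: an instance whose last
// heartbeat is older than the forget period is removed from the ring.
func shouldForget(lastHeartbeat time.Time, heartbeatTimeout time.Duration, now time.Time) bool {
	return now.Sub(lastHeartbeat) > forgetPeriod(heartbeatTimeout)
}

func main() {
	now := time.Now()
	last := now.Add(-11 * time.Minute)
	fmt.Println(shouldForget(last, time.Minute, now)) // true: forgotten after >10m
}
```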

Crashes

If a store-gateway crashes, there would be no difference compared to today.

@pracucci pracucci added the storage/blocks Blocks storage engine label Jul 1, 2020
@stale

stale bot commented Aug 30, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 30, 2020
@pracucci
Contributor Author

Still valid

@stale stale bot removed the stale label Sep 14, 2020
@gouthamve gouthamve added the keepalive Skipped by stale bot label Sep 28, 2020
@pracucci pracucci removed the keepalive Skipped by stale bot label Sep 28, 2020
@stale

stale bot commented Nov 27, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 27, 2020
@pracucci
Contributor Author

pracucci commented Dec 1, 2020

Still valid

@stale stale bot removed the stale label Dec 1, 2020
@andrejbranch

Hello, I'm working on this issue and have a pull request open here: #3604

I have a couple of questions:

  1. Should "extend writes" be a flag, or should we simply not extend writes whenever the replication factor is > 1?
  2. The current store-gateway config does not include a lifecycler config the way the ingester config does. Currently I'm putting the store-gateway's new flag, unregister_on_shutdown, under RingConfig (see the sketch after this comment). To be more consistent with @csmarchbanks' change we could add a lifecycler config to the store-gateway config, but this would be a breaking change.

I'd appreciate any input, thanks.
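
For reference, a minimal sketch of option 2 as described in the comment above. The field and flag names are hypothetical, not the actual Cortex code:

```go
package main

import (
	"flag"
	"fmt"
)

// RingConfig sketches where the new flag could live, per option 2
// above (field and flag names are hypothetical).
type RingConfig struct {
	UnregisterOnShutdown bool
}

func (cfg *RingConfig) RegisterFlags(f *flag.FlagSet) {
	f.BoolVar(&cfg.UnregisterOnShutdown, "store-gateway.sharding-ring.unregister-on-shutdown", true,
		"Unregister from the ring upon clean shutdown. Setting it to false keeps the instance in the ring (LEAVING) so blocks are not resharded during a rollout.")
}

func main() {
	var cfg RingConfig
	cfg.RegisterFlags(flag.CommandLine)
	flag.Parse()
	fmt.Println("unregister on shutdown:", cfg.UnregisterOnShutdown)
}
```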

@stale

stale bot commented Jun 16, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 16, 2021
@jtlisi
Contributor

jtlisi commented Jun 17, 2021

@pracucci Is this still valid after your recent PR #4271? It seems to me that it is, although some aspects of this may be more graceful with the new behavior.

@stale stale bot removed the stale label Jun 17, 2021
@pracucci
Contributor Author

> @pracucci Is this still valid after your recent PR #4271? It seems to me that it is, although some aspects of this may be more graceful with the new behavior.

It's still valid because the store-gateway unregisters from the ring at shutdown, which triggers a resharding. The work done in #4271 should be a good foundation to also solve this issue, adding the ability to disable "unregister from the ring at shutdown" like we did in the ingesters.
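
A minimal sketch of the shutdown path such a flag would gate (hypothetical stand-in types, not the code from #4271):

```go
package main

import "fmt"

// Local stand-ins (not the actual Cortex lifecycler API).
type lifecycler struct{ inRing bool }

func (l *lifecycler) unregisterFromRing() { l.inRing = false }

type storeGateway struct {
	unregisterOnShutdown bool
	lc                   *lifecycler
}

// stopping sketches the proposed shutdown path: only unregister from
// the ring when the flag allows it; otherwise stay in the ring in the
// LEAVING state so a rollout does not trigger a resharding.
func (g *storeGateway) stopping() {
	if g.unregisterOnShutdown {
		g.lc.unregisterFromRing() // scale down: tokens released, ownership moves
		return
	}
	// Rollout: remain registered as LEAVING; the restarted instance
	// re-uses its instance ID and persisted tokens and takes over again.
}

func main() {
	g := &storeGateway{unregisterOnShutdown: false, lc: &lifecycler{inRing: true}}
	g.stopping()
	fmt.Println("still in ring:", g.lc.inRing) // true: no resharding during rollout
}
```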

@stale

stale bot commented Sep 16, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 16, 2021
@stale stale bot closed this as completed Oct 1, 2021