Alternative to scaling down ingesters #6144
Comments
Setting querier.query-store-after=0 is definitely a bad idea; I mentioned it in #5121. So if I understand correctly, you could have many ingesters in READONLY and get rid of them all at once. Is that it? I think "/ingester/blocks" is probably not the right name. Something like local blocks, or maybe /ingester/blocks?local=true, etc. This endpoint returns empty results when the ingester is ready to be deleted, right? This requires more thought. I can see multiple edge cases and failure scenarios.
Agree, the Correct, the idea is to have ingesters in READONLY mode so that you can terminate them whenever you want. For example if your Basic usage of it would be T0: Ingester 5,6,7 are set to READONLY Any time after T2 it would be safe to remove ingesters without any service impact The idea of the
@danielblando thanks for explaining more.
Is your feature request related to a problem? Please describe.
Today, scaling down ingesters is a complicated and highly manual process. As described on the website, scaling down requires ensuring that blocks are flushed to storage and that queries are served from the stored data by setting querier.query-store-after to 0s. However, this approach does not suit every use case: in some scenarios we also want ingesters to serve queries, to improve request performance.
Describe the solution you'd like
Automating the scale down of ingesters is not a trivial task. It is desirable for ingesters to have a mechanism that allows users to scale them down gradually without missing data.
A proposed solution is to introduce a new state for ingesters called READONLY. In this state, ingesters cannot receive data, meaning all Push requests would fail, but they can still serve queries. Cortex would use the ring operation to filter ingesters by state, so that the distributor and the querier/ruler each use the appropriate set of ingesters.
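As a rough illustration of the state-based filtering described above, the sketch below selects ingesters per operation. The types `State`, `Op`, `Ingester` and the function `Eligible` are hypothetical names invented for this example; they are not the Cortex ring API, and the real ring has more states than the two shown here.

```go
package main

import "fmt"

// State is a simplified, hypothetical ring state.
// ReadOnly stands for the READONLY state proposed in this issue.
type State int

const (
	Active State = iota
	ReadOnly
)

// Op is a simplified, hypothetical ring operation.
type Op int

const (
	OpWrite Op = iota // used by the distributor for Push
	OpRead            // used by the querier/ruler for queries
)

// Ingester is a minimal stand-in for a ring entry.
type Ingester struct {
	ID    string
	State State
}

// Eligible returns the IDs of the ingesters that may serve the given operation:
// ACTIVE ingesters serve reads and writes, READONLY ingesters serve reads only.
func Eligible(ingesters []Ingester, op Op) []string {
	var ids []string
	for _, ing := range ingesters {
		switch {
		case ing.State == Active:
			ids = append(ids, ing.ID)
		case ing.State == ReadOnly && op == OpRead:
			ids = append(ids, ing.ID)
		}
	}
	return ids
}

func main() {
	ring := []Ingester{
		{ID: "ingester-1", State: Active},
		{ID: "ingester-5", State: ReadOnly},
	}
	fmt.Println("write set:", Eligible(ring, OpWrite))
	fmt.Println("read set: ", Eligible(ring, OpRead))
}
```

With this split, a READONLY ingester drops out of the write path immediately while the data it still holds stays queryable.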
To enable users to set an ingester to READONLY mode, ingesters would have a new API that allows them to transition to READONLY or ACTIVE. It would be permissible for an ingester to return to ACTIVE mode as a way to cancel a scale down if needed.
Furthermore, to allow ingesters to be safely removed from the ring, they would also have a new API that lets users know which blocks an ingester has loaded. The idea is that when an ingester has deleted all blocks, it can be stopped.
This approach introduces a new READONLY state for ingesters, enabling a controlled scale down process without data loss. Users can transition ingesters to READONLY mode, preventing new data ingestion while allowing queries on existing data. Once an ingester has deleted all its blocks, it can be safely stopped and removed from the ring.
Describe alternatives you've considered
Using the LEAVING state as READONLY. This was discarded because the LEAVING state already carries multiple behaviors and assumptions about why the pod is in that state, so reusing it could make the code more confusing.
Not having the /ingester/blocks endpoint and relying on the querier.query-store-after configuration to scale down ingesters. While this can still work, it adds complexity for the user, who would need to track the time, ensure the configuration hasn't changed, and account for failures in ingesters pushing blocks to storage.