ADOPTERS.md
+1
Original file line number
Diff line number
Diff line change
@@ -2,6 +2,7 @@
2
2
3
3
This is the list of organisations that are using Cortex in **production environments** to power their metrics and monitoring systems. Please send PRs to add or remove organisations.
4
4
5
+
* [Amazon Web Services (AWS)](https://aws.amazon.com/prometheus)
CHANGELOG.md
+8-1
@@ -6,17 +6,24 @@
6
6
*[CHANGE] Blocks storage: compactor is now required when running a Cortex cluster with the blocks storage, because it also keeps the bucket index updated. #3583
7
7
* [CHANGE] Blocks storage: block deletion marks are now also stored in a per-tenant global `markers/` location, in addition to the block location. The compactor, at startup, will copy deletion marks from the block location to the global location. This migration is required only once, so you can safely disable it via `-compactor.block-deletion-marks-migration-enabled=false` after the new compactor has successfully started at least once in your cluster. #3583
8
8
* [ENHANCEMENT] Blocks storage: introduced a per-tenant bucket index, periodically updated by the compactor, used to avoid full bucket scanning done by queriers and store-gateways. The bucket index is updated by the compactor during blocks cleanup, on every `-compactor.cleanup-interval`. #3553 #3555 #3561 #3583
9
+
*[ENHANCEMENT] Blocks storage: introduced an option `-blocks-storage.bucket-store.bucket-index.enabled` to enable the usage of the bucket index in the querier. When enabled, the querier will use the bucket index to find a tenant's blocks instead of running the periodic bucket scan. The following new metrics have been added: #3614
10
+
*`cortex_bucket_index_loads_total`
11
+
*`cortex_bucket_index_load_failures_total`
12
+
*`cortex_bucket_index_load_duration_seconds`
13
+
*`cortex_bucket_index_loaded`
9
14
*[ENHANCEMENT] Compactor: exported the following metrics. #3583
10
15
*`cortex_bucket_blocks_count`: Total number of blocks per tenant in the bucket. Includes blocks marked for deletion.
11
16
*`cortex_bucket_blocks_marked_for_deletion_count`: Total number of blocks per tenant marked for deletion in the bucket.
12
17
*`cortex_bucket_index_last_successful_update_timestamp_seconds`: Timestamp of the last successful update of a tenant's bucket index.
13
18
*[ENHANCEMENT] Ruler: Add `cortex_prometheus_last_evaluation_samples` to expose the number of samples generated by a rule group per tenant. #3582
14
19
*[ENHANCEMENT] Memberlist: add status page (/memberlist) with available details about memberlist-based KV store and memberlist cluster. It's also possible to view KV values in Go struct or JSON format, or download for inspection. #3575
15
-
*[ENHANCEMENT] Memberlist: client can now keep a size-bounded buffer with sent and received messages and display them in the admin UI (/memberlist) for troubleshooting. #3581
20
+
* [ENHANCEMENT] Memberlist: client can now keep a size-bounded buffer with sent and received messages and display them in the admin UI (/memberlist) for troubleshooting. #3581 #3602
21
+
* [BUGFIX] Allow `-querier.max-query-lookback` to use the `y|w|d` suffixes, like the deprecated `-store.max-look-back-period`. #3598
16
22
* [BUGFIX] Query-Frontend: `cortex_query_seconds_total` now returns seconds, not nanoseconds. #3589
17
23
* [ENHANCEMENT] Add API to list all tenant alertmanager configs and ruler rules. #3259
18
24
-`GET /multitenant_alertmanager/configs`
19
25
-`GET /ruler/rules`
26
+
* [BUGFIX] Memberlist: an entry in the ring should no longer reappear after using the "Forget" feature (unless it's still heartbeating). #3603
docs/blocks-storage/_index.md
+1-1
@@ -29,7 +29,7 @@ When running the Cortex blocks storage, the Cortex architecture doesn't signific
29
29
30
30
The **[store-gateway](./store-gateway.md)** is responsible for querying blocks and is used by the [querier](./querier.md) at query time. The store-gateway is required when running the blocks storage.
31
31
32
-
The **[compactor](./compactor.md)** is responsible to merge and deduplicate smaller blocks into larger ones, in order to reduce the number of blocks stored in the long-term storage for a given tenant and query them more efficiently. It also keeps the bucket index updated and, for this reason, it's a required component.
32
+
The **[compactor](./compactor.md)** is responsible for merging and deduplicating smaller blocks into larger ones, in order to reduce the number of blocks stored in the long-term storage for a given tenant and query them more efficiently. It also keeps the [bucket index](./bucket-index.md) updated and, for this reason, it's a required component.
33
33
34
34
Finally, the [**table-manager**](../chunks-storage/table-manager.md) and the [**schema config**](../chunks-storage/schema-config.md) are **not used** by the blocks storage.
The bucket index is a **per-tenant file containing the list of blocks and block deletion marks** in the storage. The bucket index itself is stored in the backend object storage, is periodically updated by the compactor and used by queriers to discover blocks in the storage.
9
+
10
+
The bucket index usage is **optional** and can be enabled via `-blocks-storage.bucket-store.bucket-index.enabled=true` (or its respective YAML config option).
11
+
12
+
## Benefits
13
+
14
+
The [querier](./querier.md) needs to have an almost up-to-date view over the entire storage bucket, in order to find the right blocks to look up at query time. Because of this, the querier needs to periodically scan the bucket to look for new blocks uploaded by ingesters or the compactor, and blocks deleted (or marked for deletion) by the compactor.
15
+
16
+
When the bucket index is enabled, the querier periodically looks up the per-tenant bucket index instead of scanning the bucket via "list objects" operations. This brings a few benefits:
17
+
18
+
1. Reduced number of API calls to the object storage by querier
19
+
2. No "list objects" storage API calls done by querier
20
+
3. The [querier](./querier.md) is up and running immediately after the startup (no need to run an initial bucket scan)
21
+
22
+
## Structure of the index
23
+
24
+
The `bucket-index.json.gz` contains:
25
+
26
+
-**`blocks`**<br />
27
+
List of complete blocks of a tenant, including blocks marked for deletion (partial blocks are excluded from the index).
28
+
-**`block_deletion_marks`**<br />
29
+
List of block deletion marks.
30
+
-**`updated_at`**<br />
31
+
Unix timestamp (seconds precision) of when the index was last updated (written to the storage).
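The three top-level fields above can be illustrated with a minimal Python sketch. It assumes the file is gzip-compressed JSON, as the `.json.gz` suffix suggests. The `block_id` key inside each entry is a hypothetical placeholder, since the per-entry schema is not described here.

```python
import gzip
import json

# Top-level fields named in this document.
REQUIRED_FIELDS = ("blocks", "block_deletion_marks", "updated_at")


def encode_index(index: dict) -> bytes:
    """Serialize an index dict to gzip-compressed JSON bytes."""
    return gzip.compress(json.dumps(index).encode("utf-8"))


def decode_index(raw: bytes) -> dict:
    """Decompress and parse an index, checking the top-level fields exist."""
    index = json.loads(gzip.decompress(raw).decode("utf-8"))
    for field in REQUIRED_FIELDS:
        if field not in index:
            raise ValueError(f"bucket index is missing field {field!r}")
    return index


# Hypothetical sample: "block_id" is an illustrative key, not the real schema.
sample = {
    "blocks": [{"block_id": "block-a"}, {"block_id": "block-b"}],
    "block_deletion_marks": [{"block_id": "block-b"}],
    "updated_at": 1608000000,
}
decoded = decode_index(encode_index(sample))
```

Note that a block marked for deletion (`block-b` here) appears in both lists, matching the description of `blocks` above.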
32
+
33
+
## How it gets updated
34
+
35
+
The [compactor](./compactor.md) periodically scans the bucket and uploads an updated bucket index to the storage. The frequency at which the bucket index is updated can be configured via `-compactor.cleanup-interval`.
36
+
37
+
Although using the bucket index is optional, the index itself is built and updated by the compactor even if `-blocks-storage.bucket-store.bucket-index.enabled` has **not** been enabled. This is intentional, so that once a Cortex cluster operator decides to enable the bucket index in a live cluster, the bucket index for every tenant already exists and query results consistency is guaranteed. The overhead introduced by keeping the bucket index updated is expected to be insignificant.
38
+
39
+
## How it's used by the querier
40
+
41
+
The [querier](./querier.md), at query time, checks whether the bucket index for the tenant has already been loaded in memory. If not, the querier downloads it from the storage and caches it in memory.
42
+
43
+
_Given it's a small file, lazily downloading it doesn't significantly affect first-query performance, and it allows a querier to get up and running without pre-downloading every tenant's bucket index. Moreover, if the [metadata cache](./querier.md#metadata-cache) is enabled, the bucket index is cached for a short time in a shared cache, reducing the latency and the number of API calls to the object storage when multiple queriers fetch the same tenant's bucket index within a short time._
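The lazy-loading behaviour described above can be sketched as follows. This is an illustration in Python, not the Cortex implementation: `LazyIndexCache` and `fake_fetch` are hypothetical names, and the fetch function stands in for a download from object storage.

```python
from typing import Callable, Dict, List


class LazyIndexCache:
    """Loads a tenant's bucket index on first use and keeps it in memory."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch  # stand-in for a download from object storage
        self._indexes: Dict[str, dict] = {}

    def get(self, tenant: str) -> dict:
        # Download only if this tenant's index is not already in memory.
        if tenant not in self._indexes:
            self._indexes[tenant] = self._fetch(tenant)
        return self._indexes[tenant]


calls: List[str] = []


def fake_fetch(tenant: str) -> dict:
    """Hypothetical fetcher that records which tenants were downloaded."""
    calls.append(tenant)
    return {"blocks": [], "block_deletion_marks": [], "updated_at": 0}


cache = LazyIndexCache(fake_fetch)
cache.get("tenant-a")
cache.get("tenant-a")  # served from memory, no second download
cache.get("tenant-b")
```

In this sketch nothing is downloaded at construction time, which mirrors why a querier can start without pre-fetching every tenant's index.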
<!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit -->
47
+
48
+
While in-memory, a background process will keep it **updated at periodic intervals**, so that subsequent queries from the same tenant to the same querier instance will use the cached (and periodically updated) bucket index. There are two config options involved:
If downloading a bucket index fails, the failure is cached for a short time in order to avoid hammering the backend storage. This option configures how frequently the querier retries loading a bucket index that previously failed to load.
54
+
55
+
If a bucket index is unused for a long time (configurable via `-blocks-storage.bucket-store.bucket-index.idle-timeout`), e.g. because that querier instance is not receiving any query from the tenant, the querier will offload it and stop updating it at regular intervals. This is particularly useful for tenants that are resharded to different queriers when [shuffle sharding](../guides/shuffle-sharding.md) is enabled.
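A rough sketch of the idle-timeout idea, in Python and under simplifying assumptions: `IdleTracker` is a hypothetical helper, timestamps are passed in explicitly as seconds, and only the bookkeeping of "last used" times is modelled.

```python
from typing import Dict, List


class IdleTracker:
    """Tracks when each tenant's index was last used and evicts idle ones."""

    def __init__(self, idle_timeout_seconds: float):
        self.idle_timeout_seconds = idle_timeout_seconds
        self._last_used: Dict[str, float] = {}

    def touch(self, tenant: str, now: float) -> None:
        # Called whenever a query uses the tenant's index.
        self._last_used[tenant] = now

    def evict_idle(self, now: float) -> List[str]:
        # Collect tenants unused for longer than the timeout, then drop them.
        idle = [
            tenant
            for tenant, used in self._last_used.items()
            if now - used > self.idle_timeout_seconds
        ]
        for tenant in idle:
            del self._last_used[tenant]
        return sorted(idle)


tracker = IdleTracker(idle_timeout_seconds=3600)
tracker.touch("tenant-a", now=0)
tracker.touch("tenant-b", now=3000)
evicted = tracker.evict_idle(now=4000)  # only tenant-a exceeded the timeout
```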
56
+
57
+
Finally, the querier, at query time, checks how old the bucket index is (based on its `updated_at`) and fails the query if the index is older than `-blocks-storage.bucket-store.bucket-index.max-stale-period`. This circuit breaker is used to ensure queriers will not return any partial query results due to a stale view over the long-term storage.
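The staleness check can be illustrated with a small Python sketch. `StaleBucketIndexError` and `check_freshness` are hypothetical names, and the periods are plain seconds rather than the duration strings the CLI flag would take.

```python
import time
from typing import Optional


class StaleBucketIndexError(Exception):
    """Raised when the bucket index is older than the allowed period."""


def check_freshness(
    updated_at: float,
    max_stale_period_seconds: float,
    now: Optional[float] = None,
) -> float:
    """Return the index age in seconds, or raise if it is too old."""
    current = time.time() if now is None else now
    age = current - updated_at
    if age > max_stale_period_seconds:
        raise StaleBucketIndexError(
            f"bucket index is {age:.0f}s old, limit is {max_stale_period_seconds:.0f}s"
        )
    return age


# An index updated at t=1000, checked at t=2000 with a one-hour limit.
age = check_freshness(updated_at=1000, max_stale_period_seconds=3600, now=2000)
```

Failing the query outright, rather than answering from an old block list, is what makes this act as a circuit breaker against partial results.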
docs/blocks-storage/compactor.md
+1-1
@@ -10,7 +10,7 @@ slug: compactor
10
10
The **compactor** is a service which is responsible to:
11
11
12
12
- Compact multiple blocks of a given tenant into a single optimized larger block. This helps to reduce storage costs (deduplication, index size reduction), and increase query speed (querying fewer blocks is faster).
13
-
- Keep the per-tenant bucket index updated. The bucket index is used by [queriers](./querier.md)and [store-gateways](./store-gateway.md) to discover new blocks in the storage.
13
+
- Keep the per-tenant bucket index updated. The [bucket index](./bucket-index.md) is used by [queriers](./querier.md) to discover new blocks in the storage.
docs/blocks-storage/compactor.template
+1-1
@@ -10,7 +10,7 @@ slug: compactor
10
10
The **compactor** is a service which is responsible to:
11
11
12
12
- Compact multiple blocks of a given tenant into a single optimized larger block. This helps to reduce storage costs (deduplication, index size reduction), and increase query speed (querying fewer blocks is faster).
13
-
- Keep the per-tenant bucket index updated. The bucket index is used by [queriers](./querier.md) and [store-gateways](./store-gateway.md) to discover new blocks in the storage.
13
+
- Keep the per-tenant bucket index updated. The [bucket index](./bucket-index.md) is used by [queriers](./querier.md) to discover new blocks in the storage.
docs/blocks-storage/querier.md
+64-4
@@ -13,12 +13,28 @@ The querier is **stateless**.
13
13
14
14
## How it works
15
15
16
-
At startup **queriers** iterate over the entire storage bucket to discover all tenants blocks and download the `meta.json` for each block. During this initial bucket scanning phase, a querier is not ready to handle incoming queries yet and its `/ready` readiness probe endpoint will fail.
16
+
The querier needs to have an almost up-to-date view over the entire storage bucket, in order to find the right blocks to look up at query time. The querier can keep the bucket view updated in two different ways:
17
+
18
+
1. Periodically scanning the bucket (default)
19
+
2. Periodically downloading the [bucket index](./bucket-index.md)
20
+
21
+
### Bucket index disabled (default)
22
+
23
+
At startup, **queriers** iterate over the entire storage bucket to discover all tenants' blocks and download the `meta.json` for each block. During this initial bucket scanning phase, a querier is not ready to handle incoming queries yet and its `/ready` readiness probe endpoint will fail.
17
24
18
25
While running, queriers periodically iterate over the storage bucket to discover new tenants and recently uploaded blocks. Queriers do **not** download any content from blocks except a small `meta.json` file containing the block's metadata (including the minimum and maximum timestamp of samples within the block).
19
26
20
27
Queriers use the metadata to compute the list of blocks that need to be queried at query time and fetch matching series from the [store-gateway](./store-gateway.md) instances holding the required blocks.
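As a rough illustration of how block metadata can narrow down the blocks to query, the Python sketch below keeps only blocks whose time range overlaps the query range. `BlockMeta`, `min_time` and `max_time` are hypothetical names, and both ranges are treated as closed intervals, which is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockMeta:
    """Hypothetical subset of a block's metadata (timestamps in ms)."""

    block_id: str
    min_time: int
    max_time: int


def blocks_for_query(
    blocks: List[BlockMeta], query_start: int, query_end: int
) -> List[BlockMeta]:
    """Return the blocks whose [min_time, max_time] overlaps the query range."""
    return [
        block
        for block in blocks
        if block.min_time <= query_end and block.max_time >= query_start
    ]


known_blocks = [
    BlockMeta("block-a", min_time=0, max_time=100),
    BlockMeta("block-b", min_time=100, max_time=200),
    BlockMeta("block-c", min_time=300, max_time=400),
]
selected = blocks_for_query(known_blocks, query_start=150, query_end=250)
```

Here only `block-b` overlaps the requested range, so it is the only block that would need to be fetched from the store-gateways.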
21
28
29
+
### Bucket index enabled
30
+
31
+
When the [bucket index](./bucket-index.md) is enabled, queriers lazily download the bucket index upon the first query received for a given tenant, cache it in memory and periodically keep it updated. The bucket index contains the list of blocks and block deletion marks of a tenant, which is later used during the query execution to find the set of blocks that need to be queried for the given query.
32
+
33
+
Given the bucket index removes the need to scan the bucket, it brings a few benefits:
34
+
35
+
1. The querier is expected to be ready shortly after startup.
36
+
2. Lower volume of API calls to object storage.
37
+
22
38
### Anatomy of a query request
23
39
24
40
When a querier receives a query range request, it contains the following parameters:
@@ -60,6 +76,7 @@ Caching is optional, but **highly recommended** in a production environment. Ple
60
76
- List of blocks per tenant
61
77
- Block's `meta.json` content
62
78
- Block's `deletion-mark.json` existence and content
79
+
- Tenant's `bucket-index.json.gz` content
63
80
64
81
Using the metadata cache can significantly reduce the number of API calls to object storage and prevents the number of these API calls from scaling linearly with the number of querier and store-gateway instances (because the bucket is periodically scanned and synched by each querier and store-gateway).
65
82
@@ -341,8 +358,8 @@ blocks_storage:
341
358
# CLI flag: -blocks-storage.filesystem.dir
342
359
[dir: <string> | default = ""]
343
360
344
-
# This configures how the store-gateway synchronizes blocks stored in the
345
-
# bucket.
361
+
# This configures how the querier and store-gateway discover and synchronize
362
+
# blocks stored in the bucket.
346
363
bucket_store:
347
364
# Directory to store synchronized TSDB index headers.