Support ingesting exemplars into TSDB when blocks storage is enabled #4124
Conversation
Just left a few comments that are mostly questions to other reviewers.
As for Marty's comments, those could also use some input:
- This is enabled with a new -blocks-storage.tsdb.max-exemplars= command line argument. This is available only to the ingester, but ideally there is a way to have the distributor be aware and skip validation of exemplars (currently it always validates any exemplars even if discarded by the ingester). Is there a recommended config location to have the param shared between both distributors and ingesters?
I know we just pass all config flags to all components when we deploy them, but I think that's strictly the single binary + our deployment strategy? Probably not something all users would want to do. Any recommendations here from other maintainers? AFAICT there aren't many common config flags used across services unless they're part of a common component/package like the KV store, but exemplars are not their own package yet. If someone can point me to an example of such a config flag I can make the changes here.
- Exemplars are counted in the rate limiting in the distributor. This seems good since exemplars have processing overhead, but wanted to double check if there is something else that should be done.
Exemplars should probably be their own rate limit, 1/10 or 1/100 of the sample rate limit IMO.
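For illustration, a minimal sketch of what a dedicated per-tenant exemplar rate limit could look like, assuming a limits struct along the lines of Cortex's validation.Limits; the field names, flag names and defaults below are placeholders, not the actual configuration:

```go
// A minimal sketch, not the actual Cortex configuration: a dedicated exemplar
// rate limit alongside the sample limit. All names and defaults are illustrative.
package validation

import "flag"

type Limits struct {
	IngestionRate         float64 `yaml:"ingestion_rate"`
	IngestionBurstSize    int     `yaml:"ingestion_burst_size"`
	ExemplarIngestionRate float64 `yaml:"exemplar_ingestion_rate"`
}

func (l *Limits) RegisterFlags(f *flag.FlagSet) {
	f.Float64Var(&l.IngestionRate, "distributor.ingestion-rate-limit", 25000, "Per-tenant sample ingestion rate limit (samples/s).")
	f.IntVar(&l.IngestionBurstSize, "distributor.ingestion-burst-size", 50000, "Per-tenant allowed ingestion burst size.")
	// Hypothetical: exemplars get their own, much smaller limit (roughly 1/100 of the sample limit).
	f.Float64Var(&l.ExemplarIngestionRate, "distributor.exemplar-ingestion-rate-limit", 250, "Per-tenant exemplar ingestion rate limit (exemplars/s).")
}
```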
d.receivedMetadata.DeleteLabelValues(userID)
d.incomingSamples.DeleteLabelValues(userID)
d.incomingExemplars.DeleteLabelValues(userID)
d.incomingMetadata.DeleteLabelValues(userID)
d.nonHASamples.DeleteLabelValues(userID)
should we potentially have nonHAExemplars as well?
I'm wondering if it's useful information. I'm a bit dubious about it.
@@ -1479,6 +1516,7 @@ func (i *Ingester) createTSDB(userID string) (*userTSDB, error) {
WALSegmentSize: i.cfg.BlocksStorageConfig.TSDB.WALSegmentSizeBytes,
SeriesLifecycleCallback: userDB,
BlocksToDelete: userDB.blocksToDelete,
MaxExemplars: i.cfg.BlocksStorageConfig.TSDB.MaxExemplars,
This is the only value we need to pass to enable the in-memory storage within TSDB, which we already have via vendoring. (nothing to do here, just pointing this out to other reviewers)
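For other reviewers, a rough self-contained sketch of what that single option does: a non-zero MaxExemplars in the TSDB options enables the in-memory exemplar storage. This approximates the Prometheus TSDB API of the time rather than the actual Cortex code, and exact option/function signatures may differ by vendored version:

```go
// Rough sketch, not actual Cortex code. tsdb.Options/Open signatures may differ
// slightly depending on the vendored Prometheus version.
package main

import (
	"github.com/go-kit/kit/log"
	"github.com/prometheus/prometheus/tsdb"
)

// openUserTSDB opens a per-tenant TSDB with in-memory exemplar storage enabled.
// A maxExemplars value of 0 leaves exemplar ingestion disabled.
func openUserTSDB(dir string, maxExemplars int64) (*tsdb.DB, error) {
	opts := tsdb.DefaultOptions()
	opts.MaxExemplars = maxExemplars // the only knob needed to enable the feature
	return tsdb.Open(dir, log.NewNopLogger(), nil, opts)
}
```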
pkg/ingester/ingester_v2.go (outdated)
// already exist. If it does not then drop.
if ref == 0 && len(ts.Exemplars) > 0 {
updateFirstPartial(func() error {
return wrappedTSDBIngestExemplarErr(errors.New("exemplars not ingested because series not already present"),
I think not ingesting the exemplar if we don't already know about the series is the correct thing to do, but I'm not sure what kind of error message we want to return to users here.
I think not ingesting the exemplar if we don't already know about the series is the correct thing to do
Agree.
but I'm not sure what kind of error message we want to return to users here
Do we want to return an error at all? Metadata ingestion is best-effort and we never return any error about it. Should exemplars ingestion be a best-effort as well?
Do we want to return an error at all? Metadata ingestion is best-effort and we never return any error about it. Should exemplars ingestion be a best-effort as well?
We should IMO, exemplars are more useful/important than metadata. Exemplar ingestion could be best-effort right now for the in-memory storage, though it shouldn't stay that way once we eventually have a better storage implementation. FWIW what Marty has done here is essentially exactly what would happen in Prometheus' exemplar storage if you tried to store an exemplar for a series that isn't already present: https://github.com/prometheus/prometheus/blob/main/tsdb/head.go#L1326-L1345
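To make the best-effort idea concrete, here is a small illustrative sketch (an assumed helper, not the actual ingester code) of soft-failing exemplar appends while letting the rest of the write succeed; the import paths and AppendExemplar signature follow the Prometheus packages of that era and may differ in the vendored copy:

```go
// Minimal sketch of best-effort exemplar ingestion, not the actual Cortex code.
package ingester

import (
	"errors"

	"github.com/prometheus/prometheus/pkg/exemplar"
	"github.com/prometheus/prometheus/pkg/labels"
	"github.com/prometheus/prometheus/storage"
)

var errExemplarSeriesMissing = errors.New("exemplars not ingested because series not already present")

// appendExemplarsBestEffort appends exemplars for a series whose reference was
// obtained from the sample append path. ref == 0 means the series is not in the
// TSDB head yet, so its exemplars are dropped and recorded as a soft error; the
// overall write request still succeeds.
func appendExemplarsBestEffort(app storage.Appender, ref uint64, lset labels.Labels, es []exemplar.Exemplar, softErrs *[]error) {
	if ref == 0 && len(es) > 0 {
		*softErrs = append(*softErrs, errExemplarSeriesMissing)
		return
	}
	for _, e := range es {
		// Out-of-order or duplicate exemplars also become soft errors.
		if _, err := app.AppendExemplar(ref, lset, e); err != nil {
			*softErrs = append(*softErrs, err)
		}
	}
}
```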
Can you please rebase against master to pull in #4137 and make sure to put the changelog entry at the top?
Very good job @mdisibio! This is a very high quality PR considering it's one of your first Cortex contributions 👏 I left a few comments I'd be glad if you could take a look at.
Some other notes:
- It's OK not having a CHANGELOG entry yet, given this is a WIP
- Please mark it experimental in docs/configuration/v1-guarantees.md
This is enabled with a new -blocks-storage.tsdb.max-exemplars= command line argument
Let me think a bit more on the config option.
Exemplars are counted in the rate limiting in the distributor.
LGTM. They're also counted towards the samples/s ingestion rate limit, which is what we do for metadata, and that LGTM too.
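For concreteness, a tiny sketch of that counting using a plain token-bucket limiter instead of Cortex's own limiter type; the real distributor code differs, this only illustrates that exemplars consume the same per-tenant budget as samples and metadata:

```go
// Illustrative only: exemplars consume the same per-tenant ingestion budget as
// samples and metadata. Cortex's real limiter type differs from x/time/rate.
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func checkIngestionRate(lim *rate.Limiter, samples, exemplars, metadata int) error {
	total := samples + exemplars + metadata
	if !lim.AllowN(time.Now(), total) {
		return fmt.Errorf("ingestion rate limit (%v/s) exceeded while adding %d samples, %d exemplars and %d metadata",
			lim.Limit(), samples, exemplars, metadata)
	}
	return nil
}
```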
// Exemplar labels, different than series labels
repeated LabelPair labels = 1 [(gogoproto.nullable) = false, (gogoproto.customtype) = "LabelAdapter"];
double value = 2;
int64 timestamp_ms = 3;
I would name this timestamp to keep it consistent with Prometheus.
We already use timestamp_ms for Sample in this file, and there's been discussion about the name of the field in the Prometheus Exemplar remote write PR: https://github.com/prometheus/prometheus/pull/8296/files#r611218904
My take is that the message should be equal to the Prometheus one (unless there's a good reason not to). I've seen in the Prometheus PR it's named timestamp, so I was wondering if there's a good reason to name it differently here.
Definitely not a blocker.
Samples: make([]Sample, 0, expectedSamplesPerSeries),
Labels: make([]LabelAdapter, 0, expectedLabels),
Samples: make([]Sample, 0, expectedSamplesPerSeries),
Exemplars: make([]Exemplar, 0, expectedExemplarsPerSeries),
Do we actually expect 1 exemplar per series? If a client has exemplars tracking enabled, do we actually expect an exemplar for each series in each remote write request? Could you share more thoughts about this?
No, each series will contain either samples or exemplars. The current Prometheus remote write implementation only populates 1 exemplar per series. Data on exemplars is limited, but we estimate that exemplars will typically be 1% of samples. It seems the decision is whether 1% is worth prealloc/pooling. Additionally, exemplars are larger objects than samples as they contain additional labels.
It seems the decision is whether 1% is worth prealloc/pooling
Exactly. I was wondering if it's worth preallocating. I'm wondering if it's actually better to initialise Exemplars to nil, adding a comment with the reason. Then we can reconsider it when profiling a prod cluster.
Thoughts?
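If we go the nil route, the pooled struct literal could look roughly like this (a sketch mirroring the quoted code above, with the reasoning captured in a comment so it can be revisited after profiling):

```go
// Sketch mirroring the quoted pool code; not the final implementation.
Labels:  make([]LabelAdapter, 0, expectedLabels),
Samples: make([]Sample, 0, expectedSamplesPerSeries),
// Exemplars intentionally left nil: exemplars are expected to be roughly 1% of
// samples and each carries its own labels, so preallocating for every series
// would mostly waste memory. Reconsider once a production cluster is profiled.
Exemplars: nil,
```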
Fixed a lot of the review comments from @pracucci's review, though for the life of me I can't figure out the per-user metrics registry. I added the new metrics to the TSDB metrics test, and the new metrics have the user labels attached properly, but the expected output says the user labels should not be attached.
@@ -496,6 +512,29 @@ func newTSDBMetrics(r prometheus.Registerer) *tsdbMetrics {
"Total number of TSDB checkpoint creations attempted.",
nil, nil),

tsdbExemplarsTotal: prometheus.NewDesc(
I see your point and I agree the more granularity the better from an observability perspective. We also have clusters with a large number of tenants per ingester (several thousand tenants / ingester), so this can easily end up adding a significant number of metrics exported by Cortex.
That's why I was asking if all of them are required. We typically start small and split by user later only if it turns out a metric is not useful unless split by user.
I think the following could be global to have a sense of the rate:
cortex_ingester_tsdb_exemplar_exemplars_appended_total
cortex_ingester_tsdb_exemplar_exemplars_in_storage
cortex_ingester_tsdb_exemplar_series_with_exemplars_in_storage
cortex_ingester_tsdb_exemplar_out_of_order_exemplars_total
That being said, up to you. Not a blocker!
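As a point of reference for the cardinality concern: the usual aggregation pattern is to gather every per-tenant registry and sum the selected series into a single global value, so keeping a metric per-user multiplies its series count by the tenant count. A generic sketch of that summing, not the actual tsdbMetrics collector:

```go
// Generic sketch, not the actual Cortex tsdbMetrics code: sum one metric across
// all per-tenant registries so it can be exported once, without a user label.
package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

func sumAcrossTenants(regs map[string]*prometheus.Registry, metricName string) (float64, error) {
	var total float64
	for _, reg := range regs {
		mfs, err := reg.Gather()
		if err != nil {
			return 0, err
		}
		for _, mf := range mfs {
			if mf.GetName() != metricName {
				continue
			}
			for _, m := range mf.GetMetric() {
				// Counters and gauges both handled; the getters are nil-safe.
				total += m.GetCounter().GetValue() + m.GetGauge().GetValue()
			}
		}
	}
	return total, nil
}
```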
Thanks a lot for addressing my feedback. LGTM! 🤘 🚀 👏
LGTM, thank you!
Should this PR have a changelog entry, or do you plan to add it after query-path is updated too?
Why is cortex_ingester_tsdb_exemplar_last_exemplars_timestamp_seconds useful to have per-user? (Asking to understand better.)
Compare e46bdf9 to f22d2d0
Thank you for addressing my feedback.
Commits:
- …to tsdb when blocks storage is enabled. Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- …rded per reason Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- …lars Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Callum Styan <[email protected]>
- Signed-off-by: Callum Styan <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- …. Count runes in exemplar labels instead of bytes Signed-off-by: Martin Disibio <[email protected]>
- …feedback about limiting cardinality. Update other code formatting and logic based on review feedback Signed-off-by: Martin Disibio <[email protected]>
- Simplify string formatting Co-authored-by: Peter Štibraný <[email protected]> Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
- Signed-off-by: Martin Disibio <[email protected]>
Head branch was pushed to by a user without write access
What this PR does:
Support for an in-memory buffer of exemplars was added to TSDB recently. This PR takes the first steps to supporting the same in cortex's ingest path by enabling the feature in TSDB and storing exemplars from the remote write data. Future PRs will add query support and integration with per-tenant limits.
This PR is marked WIP because there are several parts that could use some consideration:
- This is enabled with a new -blocks-storage.tsdb.max-exemplars=<n> command line argument (a flag-registration sketch follows this list). This is available only to the ingester, but ideally there is a way to have the distributor be aware and skip validation of exemplars (currently it always validates any exemplars even if discarded by the ingester). Is there a recommended config location to have the param shared between both distributors and ingesters?
- Exemplars are counted in the rate limiting in the distributor. This seems good since exemplars have processing overhead, but wanted to double check if there is something else that should be done.
- There are 5 exemplar metrics in TSDB and they are exposed per-tenant. However, a new cortex_ingester_ingested_exemplars_total global metric was also added which follows the pattern for samples, so although partially redundant it seems worthwhile.
- The PR for remote write of exemplars in Prometheus (Add Exemplar Remote Write support prometheus/prometheus#8296) is not merged yet, so the proto is still subject to change.
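As referenced in the first item above, a sketch of how the option might be surfaced as a CLI flag following the usual RegisterFlags pattern; the flag name comes from this PR, while the struct and field names here are illustrative:

```go
// Sketch of how the option is surfaced; struct and field names are illustrative,
// not the actual Cortex blocks-storage config.
package tsdb

import "flag"

type TSDBConfig struct {
	MaxExemplars int `yaml:"max_exemplars"`
}

func (cfg *TSDBConfig) RegisterFlags(f *flag.FlagSet) {
	f.IntVar(&cfg.MaxExemplars, "blocks-storage.tsdb.max-exemplars", 0,
		"Maximum number of exemplars held in each per-tenant TSDB's in-memory exemplar storage. 0 disables exemplar ingestion (experimental).")
}
```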
Note - Source branch was moved to the grafana/cortex repo so a new PR was created. Old PR is here: #4104