handle conflits in prw v2 #39570

perebaj · 2025-04-22T22:43:00Z

Description

Handle conflicts in prw v2

Link to tracking issue

Partially fixes #33661

jmichalek132

LGTM, thank you for picking this up.

krajorama

Hi, nice job. I have a comment on comments :) Also, it would be nice to test the collision handling somehow: either find a real conflict (I've started running a simple finder on 4 cores, maybe I'll get lucky) or make it possible to influence the signature function.

krajorama · 2025-04-24T07:21:20Z

pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go

+	if len(ts1.LabelsRefs) != len(ts2.LabelsRefs) {
+		return false
+	}
+	// As the labels are sorted as name, value, name, value, ... we can compare the labels by index jumping 2 steps at a time


Please update the comment on (new) line 134:

// TODO: Read the PRW spec to see if labels need to be sorted. If it is, then we need to sort in export code. If not, we can sort in the test. (@dashpole have more context on this)

since we're going to depend on this, it should say something like "We need to sort labels for..."

Sorry, I'm not sure I follow. Here we are comparing the LabelsRefs of two different time series, each of which is a list of integers arranged as name, value, name, value, ...

The sort that we do on L132-L135 applies to the []prompb.Label slice. Was the confusion caused by my use of the word "sorted"?

Yes, the word "sorted" is a bit ambiguous here. Let's say:

Suggested change

-	// As the labels are sorted as name, value, name, value, ... we can compare the labels by index jumping 2 steps at a time

+	// As the labels are ordered as name, value, name, value, ... we can compare the labels by index jumping 2 steps at a time

But anyway, isSameMetricV2 will only work correctly if the order of labels is consistent, since otherwise it will say "not the same" for {a="1", b="2"} vs {b="2", a="1"} label sets.

On lines 132-135 we sort the labels ([]prompb.Label), and then on lines 137-143 we convert these labels into references. So the order of the references is the same as the order of the sorted labels, which means the sort on 132-135 is what ensures a consistent order for isSameMetricV2.

I did take a look at where the labels come from: it's createAttributes, which will probably return a consistent ordering, config changes notwithstanding. So we could probably get away without the sort, but I feel like that's brittle.
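To make the ordering point concrete, here is a minimal sketch of a reference-by-reference comparison. It is a simplified stand-in for isSameMetricV2, not the code from this PR: sameLabelRefs and the symbol-table indices are hypothetical, and the references are assumed to be laid out as name, value, name, value, ...

```go
package main

import "fmt"

// sameLabelRefs reports whether two reference slices describe the same label
// set. Each slice is assumed to be laid out as name, value, name, value, ...
// where every entry is an index into a shared symbol table.
// Hypothetical helper, for illustration only.
func sameLabelRefs(a, b []uint32) bool {
	if len(a) != len(b) {
		return false
	}
	// Positional comparison: this is only correct if both slices were
	// built from labels in the same order.
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// Assumed symbol table: 1="a", 2="1", 3="b", 4="2".
	sorted := []uint32{1, 2, 3, 4}   // {a="1", b="2"}
	unsorted := []uint32{3, 4, 1, 2} // {b="2", a="1"}

	fmt.Println(sameLabelRefs(sorted, sorted))   // true
	fmt.Println(sameLabelRefs(sorted, unsorted)) // false, although the label sets are equal
}
```

The second call shows why the comparison depends on the earlier sort: without a consistent order, equal label sets are reported as different.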

krajorama · 2025-04-24T07:28:38Z

pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go

@@ -131,9 +143,39 @@ func (c *prometheusConverterV2) addSample(sample *writev2.Sample, lbls []prompb.
 		off = c.symbolTable.Symbolize(l.Value)
 		buf = append(buf, off)
 	}
-	ts := writev2.TimeSeries{
+
+	sig := timeSeriesSignature(lbls)


For a future PR: timeSeriesSignature also does a sort by label name.

Yep, I added it here because we were triggering a flaky test. I added more details here. I was trying to debug it with @dashpole before he left. It's a TODO for me to figure out how to fix it 🙃
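For context on the "sort by label name" remark, below is a rough sketch of an order-independent signature. It is not the actual timeSeriesSignature: the label type is a hypothetical stand-in for prompb.Label, FNV-1a is used only because it ships with the standard library, and the sketch sorts a copy rather than the caller's slice.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// label is a hypothetical stand-in for prompb.Label.
type label struct {
	Name, Value string
}

// signature hashes a label set after sorting a copy by name, so the result
// does not depend on the order in which the caller supplies the labels.
func signature(lbls []label) uint64 {
	sorted := make([]label, len(lbls))
	copy(sorted, lbls)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	h := fnv.New64a()
	sep := []byte{0xff} // 0xff never occurs in valid UTF-8, so it is a safe separator
	for _, l := range sorted {
		h.Write([]byte(l.Name))
		h.Write(sep)
		h.Write([]byte(l.Value))
		h.Write(sep)
	}
	return h.Sum64()
}

func main() {
	a := []label{{"a", "1"}, {"b", "2"}}
	b := []label{{"b", "2"}, {"a", "1"}}
	fmt.Println(signature(a) == signature(b)) // true
}
```

If the signature already sorts internally, a separate sort before building the references does the same work twice, which is presumably what the "future PR" note is about.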


perebaj · 2025-04-25T03:43:04Z

pkg/translator/prometheusremotewrite/metrics_to_prw_v2_test.go

+		require.Len(t, converter.unique, 1)
+		require.Len(t, converter.unique[timeSeriesSignature(labels)].Samples, 2)
+	})
+	// TODO: Test 3 Conflict - different metrics with same hash


I think the current set of tests was a bit thin. I'm struggling to implement this one... Maybe the tests I added are a good starting point. Let me know what you think.

aknuds1

I can see a bug in that the case is not handled when a conflicting time series already exists. Please instead follow the example of the PRW v1 code, and introduce a method getOrCreateTimeSeries for obtaining the time series to add the sample to. Implementing this method correctly should solve the bug and make for more understandable/performant code.


aknuds1 · 2025-04-27T07:31:24Z

pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go

@@ -131,9 +141,40 @@ func (c *prometheusConverterV2) addSample(sample *writev2.Sample, lbls []prompb.
 		off = c.symbolTable.Symbolize(l.Value)
 		buf = append(buf, off)
 	}
-	ts := writev2.TimeSeries{
+
+	sig := timeSeriesSignature(lbls)


I think you should follow the same logic here as in prometheusConverter.addSample, and use a method getOrCreateTimeSeries to obtain the time series to add the sample to. Creating a new time series for every sample even if the time series already exists is bad for performance.

+1

Let's keep the code consistent between v1 and v2.

Thanks for the review. I finally had time to come back to it.

Fixed here: 2afc6dc

aknuds1 · 2025-04-27T07:33:16Z

pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go

+		// if the time series is already in the unique map, check if it is the same metric
+		if !isSameMetricV2(existingTS, ts) {
+			// if the time series is not the same metric, add it to the conflicts map
+			c.conflicts[sig] = append(c.conflicts[sig], ts)


This is a bug since you're not checking whether c.conflicts[sig] already has a time series with the same labels.
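A minimal sketch of the get-or-create approach the reviewers describe, including the check against already recorded conflicts. The types are simplified, hypothetical stand-ins for the writev2 ones, and passing the signature in as a parameter is an assumption made here so that a collision can be forced; the real method may well compute it internally.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the writev2 types.
type sample struct {
	Value     float64
	Timestamp int64
}

type timeSeries struct {
	LabelsRefs []uint32
	Samples    []sample
}

type converter struct {
	unique    map[uint64]*timeSeries   // first series seen for each signature
	conflicts map[uint64][]*timeSeries // other series sharing that signature
}

func sameRefs(a, b []uint32) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// getOrCreateTimeSeries returns the series matching refs and reports whether
// it had to be created. An existing series is reused whether it lives in
// unique or in conflicts.
func (c *converter) getOrCreateTimeSeries(sig uint64, refs []uint32) (*timeSeries, bool) {
	existing, ok := c.unique[sig]
	if !ok {
		ts := &timeSeries{LabelsRefs: refs}
		c.unique[sig] = ts
		return ts, true
	}
	if sameRefs(existing.LabelsRefs, refs) {
		return existing, false
	}
	// Signature collision: look for an earlier conflicting series with the
	// same labels before recording a new one.
	for _, ts := range c.conflicts[sig] {
		if sameRefs(ts.LabelsRefs, refs) {
			return ts, false
		}
	}
	ts := &timeSeries{LabelsRefs: refs}
	c.conflicts[sig] = append(c.conflicts[sig], ts)
	return ts, true
}

func main() {
	c := &converter{
		unique:    map[uint64]*timeSeries{},
		conflicts: map[uint64][]*timeSeries{},
	}
	const sig = 42 // same signature for every series, to force a collision
	for _, refs := range [][]uint32{{1, 2}, {3, 4}, {3, 4}, {1, 2}} {
		ts, _ := c.getOrCreateTimeSeries(sig, refs)
		ts.Samples = append(ts.Samples, sample{Value: 1})
	}
	fmt.Println(len(c.unique), len(c.conflicts[sig]))                         // 1 1
	fmt.Println(len(c.unique[sig].Samples), len(c.conflicts[sig][0].Samples)) // 2 2
}
```

Forcing the signature to a constant is also one way to exercise the "different metrics with same hash" test case without having to find a real collision.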


perebaj · 2025-05-13T12:35:13Z

I'm putting this PR on hold while I focus on the receiver... I plan to come back to it soon.

github-actions · 2025-06-10T05:21:12Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

jmichalek132 · 2025-06-21T09:06:00Z

Hey @perebaj I was wondering if you have an idea when you will come back to this pr :) ?

perebaj · 2025-06-24T18:59:34Z

> Hey @perebaj I was wondering if you have an idea when you will come back to this pr :) ?

Hey, I'm having some trouble finishing this one. Before I get back to it, my entire focus will be on histograms...

Would you like to continue this one?

jmichalek132 · 2025-06-25T09:33:08Z

> > Hey @perebaj I was wondering if you have an idea when you will come back to this pr :) ?
>
> Hey, I'm having some trouble finishing this one. Before I get back to it, my entire focus will be on histograms...
>
> Would you like to continue this one?

If you don't get to it within the next two weeks (I have some travel planned), I can pick it up when I am back :).

handle conflits in prw v2

412915e

github-actions bot added the pkg/translator/prometheusremotewrite label Apr 22, 2025

github-actions bot requested review from Aneurysm9 and dashpole April 22, 2025 22:43

jmichalek132 approved these changes Apr 23, 2025


jmichalek132 mentioned this pull request Apr 23, 2025

[exporter/prometheusremotewrite] Support Prometheus Remote-Write v2 #33661

Open

29 tasks

perebaj marked this pull request as ready for review April 23, 2025 22:33

perebaj requested a review from a team as a code owner April 23, 2025 22:33

github-actions bot assigned atoulme Apr 23, 2025

atoulme added the waiting-for-code-owners label Apr 24, 2025

krajorama reviewed Apr 24, 2025


perebaj added 2 commits April 24, 2025 13:39

conflict count

3135279

TestConflictHandling

efd1714

perebaj commented Apr 25, 2025


aknuds1 suggested changes Apr 27, 2025


perebaj added 2 commits April 27, 2025 16:13

remove superfluous comments

0960504

existingTS != nil

303dcc8

github-actions bot mentioned this pull request Apr 29, 2025

Weekly Report: 2025-04-22 - 2025-04-29 #39708

Closed

ywwg reviewed May 2, 2025


github-actions bot mentioned this pull request May 6, 2025

Weekly Report: 2025-04-29 - 2025-05-06 #39865

Closed

ArthurSens reviewed May 7, 2025


github-actions bot mentioned this pull request May 13, 2025

Weekly Report: 2025-05-06 - 2025-05-13 #40023

Closed

github-actions bot mentioned this pull request May 20, 2025

Weekly Report: 2025-05-13 - 2025-05-20 #40138

Closed

ArthurSens marked this pull request as draft May 26, 2025 21:37

ArthurSens removed the waiting-for-code-owners label May 26, 2025

ordered

fd94cbf

github-actions bot added the Stale label Jun 10, 2025

github-actions bot removed the Stale label Jun 22, 2025

perebaj added 4 commits June 28, 2025 13:38

merge conflicts

45af4ed

Merge branch 'main' into handle-conflits

dbd8967

getOrCreateTimeSeriesV2

2afc6dc

new Metric

41738e6

perebaj marked this pull request as ready for review June 30, 2025 13:34

github-actions bot assigned songy23 Jun 30, 2025
