Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streaming for ingester metadata APIs #4725

Merged

Conversation

harry671003
Copy link
Contributor

@harry671003 harry671003 commented Apr 26, 2022

Signed-off-by: 🌲 Harry 🌊 John 🏔 [email protected]

What this PR does:
Currently, the MetricsForLabelMatchers, LabelNames, LabelValues APIs doesn't support streaming. Because of this queriers have to load all the time-series into memory before the max_fetched_series_per_query limit can be applied. This causes queriers to OOM when a sufficiently large query is made. This PR introduces streaming for this API reducing the amount of data that must be fetched before the limit should be applied.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch 2 times, most recently from b186d2d to 774cba9 Compare April 27, 2022 15:46
@@ -871,6 +871,10 @@ The `querier_config` configures the Cortex querier.
# CLI flag: -querier.ingester-streaming
[ingester_streaming: <boolean> | default = true]

# Use streaming RPCs for metadata APIs from ingester.
# CLI flag: -querier.ingester-metadata-streaming
[ingester_metadata_streaming: <boolean> | default = true]
Copy link
Contributor

@alvinlin123 alvinlin123 Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pondering on if we should have a new flag or just use the ingester_streaming flag. What's your thought on this?

I also worry about are we going to create new configuration flag as we enable streaming for other APIs like get labels or get examplars.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pondering on if we should have a new flag or just use the ingester_streaming flag. What's your thought on this?

We need a way to migrate... so i think a new flag is needed indeed...

But i agree if we have more APIs that will be converted to streaming, we should do all at once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added streaming in LabelNames and LabelValues as well in this PR. This should reduce the number of new flags we need for live migration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value of ingester_metadata_streaming is true, won't a lot of customers gets auto migrate to metadata stream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I can change that. :)

Copy link
Contributor

@alvinlin123 alvinlin123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are first batch of comments. I haven't 100% review everything.

metrics := map[model.Fingerprint]model.Metric{}

_, err = d.ForReplicationSet(ctx, replicationSet, func(ctx context.Context, client ingester_client.IngesterClient) (interface{}, error) {
stream, err := client.MetricsForLabelMatchersStream(ctx, req)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we were using queryIngesterStream as example for this functino. In queryIngesterStream I see there is defer stream.CLoseSend() how come we don't need it here?

@@ -991,6 +992,64 @@ func (d *Distributor) MetricsForLabelMatchers(ctx context.Context, from, through
return result, nil
}

func (d *Distributor) MetricsForLabelMatchersStream(ctx context.Context, from, through model.Time, matchers ...*labels.Matcher) ([]metric.Metric, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is quite a bit of duplicate code to MetricsForLabelMatchers, I wonder if we can refactor out some common code.

@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch 2 times, most recently from 5556106 to 2445896 Compare May 16, 2022 23:18
@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch 2 times, most recently from 10a7c2b to 56a4373 Compare May 18, 2022 21:46
@harry671003 harry671003 changed the title Add streaming for MetricsForLabelMatchers API Add streaming for ingester metadata APIs May 18, 2022
@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch 5 times, most recently from 06bc0d4 to f9d19a4 Compare May 24, 2022 19:30
@alanprot
Copy link
Member

LGTM

@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch from f9d19a4 to 0b8bc2e Compare May 26, 2022 16:45
Signed-off-by: 🌲 Harry 🌊 John 🏔 <[email protected]>
@harry671003 harry671003 force-pushed the streaming-metrics-for-label branch from 0b8bc2e to 329e726 Compare May 26, 2022 17:17
Copy link
Contributor

@alvinlin123 alvinlin123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTMT

@alvinlin123 alvinlin123 merged commit 4326984 into cortexproject:master May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants