metric PeriodicReader ignores interval at Shutdown #6677

Open
nktks opened this issue Apr 20, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@nktks
nktks commented Apr 20, 2025

Description

We use go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader) indirectly through cloud.google.com/go/spanner, which sets it up in the following code.
https://github.com/googleapis/google-cloud-go/blob/spanner/v1.78.0/spanner/metrics.go#L270-L275

We get an error during shutdown because the export backend is Google Cloud Monitoring, which limits the rate at which data can be written to a single time series to one point every 5 seconds.
https://cloud.google.com/monitoring/quotas#custom_metrics_quotas
This happens because Shutdown forces an export regardless of the interval, so the Shutdown flush can be sent immediately after the previous periodic export from the run goroutine.

I understand that the interval can be many seconds long, so making Shutdown wait for a full interval would slow shutdown down.
It would be nice if users could stop the periodic export and wait a few seconds before triggering the final flush themselves, but this currently seems difficult because the export happens inside Shutdown.
To solve this while keeping backward compatibility, I think we could add an option for a minimum number of seconds to wait in Shutdown.

Example Error message.

This is the error message from the export.

*status.Error: rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: timeSeries[0-3]: write for resource=spanner_instance_client{client_hash:xxxx,instance_config:unknown,location:xxxxx,instance_id:xxxx} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric.

Part of stacktrace.

cloud.google.com/go/monitoring/apiv3/v2.(*metricGRPCClient).CreateServiceTimeSeries
	cloud.google.com/go/[email protected]/apiv3/v2/metric_client.go:576
cloud.google.com/go/monitoring/apiv3/v2.(*MetricClient).CreateServiceTimeSeries
	cloud.google.com/go/[email protected]/apiv3/v2/metric_client.go:269
cloud.google.com/go/spanner.(*monitoringExporter).exportTimeSeries
	cloud.google.com/go/[email protected]/metrics_monitoring_exporter.go:157
cloud.google.com/go/spanner.(*monitoringExporter).Export
	cloud.google.com/go/[email protected]/metrics_monitoring_exporter.go:121
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).export
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:269
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).Shutdown.func1
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:330
sync.(*Once).doSlow
	sync/once.go:78
sync.(*Once).Do
	sync/once.go:69
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).Shutdown
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:308
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.unify.func3
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:49
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.func2.1
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:64
sync.(*Once).doSlow
	sync/once.go:78
sync.(*Once).Do
	sync/once.go:69
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.func2
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:64
go.opentelemetry.io/otel/sdk/metric.(*MeterProvider).Shutdown
	go.opentelemetry.io/otel/sdk/[email protected]/provider.go:142
cloud.google.com/go/spanner.newBuiltinMetricsTracerFactory.func2
	cloud.google.com/go/[email protected]/metrics.go:246
cloud.google.com/go/spanner.(*Client).Close
	cloud.google.com/go/[email protected]/client.go:781

Environment

Steps To Reproduce

Sorry, it is difficult to provide complete code, because reproducing this requires a Google Cloud project with Cloud Monitoring and Cloud Spanner enabled.

Instead, I wrote sample code that reproduces a similar case using a local exporter with a rate limit.
https://gist.github.com/nktks/f7373c39c4f3731f671c09cb4502b286

Expected behavior

  • No error happens when we call Shutdown.
  • Pending telemetry is exported even during Shutdown.
@MrAlias
Contributor

MrAlias commented Apr 20, 2025

Why not write your own exporter to support your desired behavior?

@nktks
Author

nktks commented Apr 21, 2025

I thought that if the backends of other exporters have similar rate limits, it would be nice for the otel side to handle this.
But I'll try raising this issue with the exporter side.
googleapis/google-cloud-go#12017

Thanks for your confirmation.
