metric PeriodicReader ignores interval at Shutdown #6677

Open
@nktks

Description

We use go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader) indirectly through cloud.google.com/go/spanner, which creates it in the following code:
https://github.com/googleapis/google-cloud-go/blob/spanner/v1.78.0/spanner/metrics.go#L270-L275

We get an error during shutdown because the export backend is Google Cloud Monitoring, which has a quota on the rate at which data can be written to a single time series: one point every 5 seconds.
https://cloud.google.com/monitoring/quotas#custom_metrics_quotas
The cause is that Shutdown forces an export and ignores the interval, so in some cases Shutdown sends metrics immediately after the previous export made by the run goroutine.

I understand that the interval can be many seconds long, so if Shutdown were changed to wait for the full interval, the shutdown process would take longer.
It would be nice if the user could stop the periodic export and wait a certain number of seconds before doing the final flush, but this currently seems difficult because the export happens inside Shutdown.
To solve this while keeping backward compatibility, I think we could add an option for a minimum number of seconds to wait in Shutdown.
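As a rough illustration of the "minimum wait" idea, here is a minimal, standard-library-only sketch. Everything in it is hypothetical: `spacedExporter`, `exportFunc`, `remaining`, and `minGap` are made-up names, not part of the OpenTelemetry SDK, and the simplified `Export` signature only stands in for a real exporter's `Export` method. The wrapper delays an export until at least `minGap` has elapsed since the previous one, and gives up if the caller's context is cancelled first.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// exportFunc stands in for an exporter's Export method.
// Hypothetical, simplified signature (the real one also takes metric data).
type exportFunc func(ctx context.Context) error

// remaining returns how much longer to wait so that at least minGap
// separates two consecutive exports.
func remaining(sinceLast, minGap time.Duration) time.Duration {
	if wait := minGap - sinceLast; wait > 0 {
		return wait
	}
	return 0
}

// spacedExporter is a hypothetical wrapper that enforces a minimum gap
// between exports, including the final export triggered by a shutdown.
type spacedExporter struct {
	mu      sync.Mutex
	minGap  time.Duration
	last    time.Time
	hasLast bool
	next    exportFunc
}

// Export waits out the remaining gap (or returns early if ctx is done),
// then forwards the call to the wrapped export function.
func (e *spacedExporter) Export(ctx context.Context) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.hasLast {
		if wait := remaining(time.Since(e.last), e.minGap); wait > 0 {
			timer := time.NewTimer(wait)
			defer timer.Stop()
			select {
			case <-timer.C:
			case <-ctx.Done():
				return ctx.Err()
			}
		}
	}
	err := e.next(ctx)
	e.last, e.hasLast = time.Now(), true
	return err
}

func main() {
	calls := 0
	exp := &spacedExporter{
		minGap: 50 * time.Millisecond,
		next:   func(context.Context) error { calls++; return nil },
	}
	ctx := context.Background()
	start := time.Now()
	_ = exp.Export(ctx) // stands in for a periodic export
	_ = exp.Export(ctx) // stands in for the forced export at shutdown
	fmt.Println(calls, time.Since(start) >= 50*time.Millisecond)
}
```

The trade-off is the one described above: the final export is no longer rejected for arriving too soon, but shutdown can take up to `minGap` longer, and a short shutdown deadline on the context would drop the final export instead. In our case the exporter is constructed inside cloud.google.com/go/spanner, so a wrapper like this is not something we can plug in from the application side, which is why an option on the reader itself would help.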

Example error message

This is the error returned by the export:

*status.Error: rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: timeSeries[0-3]: write for resource=spanner_instance_client{client_hash:xxxx,instance_config:unknown,location:xxxxx,instance_id:xxxx} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric.

Part of the stack trace:

cloud.google.com/go/monitoring/apiv3/v2.(*metricGRPCClient).CreateServiceTimeSeries
	cloud.google.com/go/[email protected]/apiv3/v2/metric_client.go:576
cloud.google.com/go/monitoring/apiv3/v2.(*MetricClient).CreateServiceTimeSeries
	cloud.google.com/go/[email protected]/apiv3/v2/metric_client.go:269
cloud.google.com/go/spanner.(*monitoringExporter).exportTimeSeries
	cloud.google.com/go/[email protected]/metrics_monitoring_exporter.go:157
cloud.google.com/go/spanner.(*monitoringExporter).Export
	cloud.google.com/go/[email protected]/metrics_monitoring_exporter.go:121
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).export
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:269
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).Shutdown.func1
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:330
sync.(*Once).doSlow
	sync/once.go:78
sync.(*Once).Do
	sync/once.go:69
go.opentelemetry.io/otel/sdk/metric.(*PeriodicReader).Shutdown
	go.opentelemetry.io/otel/sdk/[email protected]/periodic_reader.go:308
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.unify.func3
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:49
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.func2.1
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:64
sync.(*Once).doSlow
	sync/once.go:78
sync.(*Once).Do
	sync/once.go:69
go.opentelemetry.io/otel/sdk/metric.config.readerSignals.unifyShutdown.func2
	go.opentelemetry.io/otel/sdk/[email protected]/config.go:64
go.opentelemetry.io/otel/sdk/metric.(*MeterProvider).Shutdown
	go.opentelemetry.io/otel/sdk/[email protected]/provider.go:142
cloud.google.com/go/spanner.newBuiltinMetricsTracerFactory.func2
	cloud.google.com/go/[email protected]/metrics.go:246
cloud.google.com/go/spanner.(*Client).Close
	cloud.google.com/go/[email protected]/client.go:781

Environment

Steps To Reproduce

Sorry, it is difficult to provide complete code, because reproducing this requires a Google Cloud project with Cloud Monitoring and Cloud Spanner enabled.

Instead, I wrote sample code that reproduces a similar case using a local exporter with a rate limit:
https://gist.github.com/nktks/f7373c39c4f3731f671c09cb4502b286
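For readers who do not want to open the gist, the failure mode can also be shown without the SDK at all. The sketch below is not the gist's code; it is a deterministic, standard-library-only simulation with hypothetical names (`rateLimitedBackend`, `simulate`). It replays periodic exports on a fixed interval followed by one forced export at shutdown, against a backend that accepts at most one write per `minPeriod`. The 60-second interval is an assumed value chosen for illustration; the 5-second period matches the quota mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTooFrequent mimics a backend rejecting points that arrive too close together.
var errTooFrequent = errors.New("points written more frequently than the minimum sampling period")

// rateLimitedBackend is a hypothetical stand-in for a backend that accepts
// at most one write per minPeriod. Times are offsets from process start.
type rateLimitedBackend struct {
	minPeriod time.Duration
	lastWrite time.Duration
	written   bool
}

// write accepts a write at the given offset, or rejects it if it comes
// less than minPeriod after the previous accepted write.
func (b *rateLimitedBackend) write(at time.Duration) error {
	if b.written && at-b.lastWrite < b.minPeriod {
		return errTooFrequent
	}
	b.lastWrite, b.written = at, true
	return nil
}

// simulate replays periodic exports every interval up to shutdownAt,
// then one forced export at shutdownAt that ignores the interval.
// interval must be positive.
func simulate(b *rateLimitedBackend, interval, shutdownAt time.Duration) error {
	for at := interval; at <= shutdownAt; at += interval {
		if err := b.write(at); err != nil {
			return fmt.Errorf("periodic export at %v: %w", at, err)
		}
	}
	if err := b.write(shutdownAt); err != nil {
		return fmt.Errorf("shutdown export at %v: %w", shutdownAt, err)
	}
	return nil
}

func main() {
	// Shutdown 1s after the periodic export at 60s: the forced export is rejected.
	fmt.Println(simulate(&rateLimitedBackend{minPeriod: 5 * time.Second}, 60*time.Second, 61*time.Second))
	// Shutdown 10s after the periodic export: the forced export is accepted.
	fmt.Println(simulate(&rateLimitedBackend{minPeriod: 5 * time.Second}, 60*time.Second, 70*time.Second))
}
```

Whether the shutdown export fails depends only on how soon after the last periodic export Shutdown is called, which is why the error appears intermittently.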

Expected behavior

  • No error happens when we call Shutdown.
  • Pending telemetry is exported even during Shutdown.
