extension/encoding/awscloudwatchmetricstreamsencodingextension: errors should be returned, not logged #38596

Open
@axw

Component(s)

extension/encoding/awscloudwatchmetricstreamsencoding

What happened?

Description

The JSON unmarshaler currently logs errors and continues unmarshaling. If nothing could be unmarshaled successfully, it returns a fairly generic and unhelpful error message.

We have also carried across the latter behaviour to the OTLP unmarshaler for consistency, even though there's technically nothing wrong with having an empty pmetric.Metrics: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/38516/files#r1992449794

As I mentioned in #38445 (comment):

IMO the receiver should not be logging and swallowing these errors, it should return them to the client. Such errors most likely imply that either the collector has been misconfigured with the wrong encoding, or the client has been misconfigured to send the wrong type of data.

While it's possible to determine the causes of these by looking at the collector logs, that's not necessarily practical. The person who configures things on the client side (i.e. CloudWatch/Firehose) may not be the same person who is running the collector, and may not have access to the logs. In this case it would be more helpful to respond with an error to the client, so they can understand what they've done wrong.

In either of those cases, none of the data will ever be decoded successfully. I suspect the log-and-continue thing was added because of issues like #38433. The solution to that should be to fix the bug.
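
To illustrate the proposed behaviour, here is a minimal sketch of an unmarshaler that returns the first decoding error, with the offending line number, instead of logging it and carrying on. It is not the extension's actual code: `metricRecord` and `unmarshalRecords` are hypothetical names, the struct is a trimmed stand-in for a CloudWatch Metric Stream JSON record, and the `metric_name` check is only an assumed example of validation.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// metricRecord is a hypothetical, heavily trimmed stand-in for one line of a
// CloudWatch Metric Stream JSON record; the real record has more fields.
type metricRecord struct {
	MetricStreamName string `json:"metric_stream_name"`
	Namespace        string `json:"namespace"`
	MetricName       string `json:"metric_name"`
}

// unmarshalRecords decodes newline-delimited JSON. It returns on the first
// invalid line rather than logging the failure and continuing, so the caller
// (and ultimately the client) sees what went wrong and where.
func unmarshalRecords(data []byte) ([]metricRecord, error) {
	var records []metricRecord
	// Note: bufio.Scanner has a default per-line size limit; a real
	// implementation would need to size the buffer for large records.
	scanner := bufio.NewScanner(bytes.NewReader(data))
	for line := 1; scanner.Scan(); line++ {
		raw := bytes.TrimSpace(scanner.Bytes())
		if len(raw) == 0 {
			continue // tolerate blank lines
		}
		var rec metricRecord
		if err := json.Unmarshal(raw, &rec); err != nil {
			return nil, fmt.Errorf("line %d: invalid JSON metric record: %w", line, err)
		}
		// Assumed validation: treat a record without a metric name as invalid.
		if rec.MetricName == "" {
			return nil, fmt.Errorf("line %d: missing metric_name", line)
		}
		records = append(records, rec)
	}
	if err := scanner.Err(); err != nil {
		return nil, fmt.Errorf("reading records: %w", err)
	}
	return records, nil
}

func main() {
	// Simulate a misconfiguration: non-JSON bytes sent to the JSON unmarshaler.
	_, err := unmarshalRecords([]byte("\x0a\x07not-json"))
	fmt.Println("error returned to caller:", err)
}
```

With this shape, the receiver can pass the error back in its response, so it would show up in the Firehose delivery stream logs rather than only in the collector logs.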

Steps to Reproduce

  1. Configure the Firehose receiver with the extension using JSON format
  2. Create a CloudWatch Metric Stream with OpenTelemetry 1.0 format pointed at the collector

Expected Result

We should see helpful error messages in the Firehose delivery stream logs, giving a hint as to what has been misconfigured.

Actual Result

A fairly generic "0 metrics were extracted from the record" error message.

Collector version

v0.121.0

Environment information

No response

OpenTelemetry Collector configuration

Log output

Additional context

No response
