Missing attributes in internal Collector logs #12870
Comments
Because of this problematic "off" behavior of the feature gate, @djaglowski and I plan to remove it, and the changes to internal telemetry will be enabled permanently; see PR #12856.
Please rephrase this to: attributes are missing all the time.
Upgrading to what?
I tried to make it clearer that the attributes are missing no matter how they are emitted.
I removed the mention of upgrading, since it seems to have been decided that we will move forward with enabling the feature by default, which will not help with the "export through otlp but no scope attribute support" case.
…behind feature gate (#12933)

#### Context

PR #12617 introduced logic to inject new instrumentation scope attributes in all internal telemetry to identify which Collector component it came from. These attributes had already been added to internal logs as regular log attributes, and this PR switched them to scope attributes for consistency. The new logic was placed behind an Alpha stage feature gate, `telemetry.newPipelineTelemetry`.

Unfortunately, the default "off" state of the feature gate disabled the injection of component-identifying attributes entirely, which was a regression since they had been present in internal logs in previous releases. See issue #12870 for an in-depth discussion of this issue.

To correct this, PR #12856 was filed, which stabilized the feature gate, making it on by default, with no way to disable it, and removed the logic that the feature gate used to toggle. This was thought to be the simplest way to mitigate the regression in the "off" state, since we planned to stabilize the feature eventually anyways.

Unfortunately, it was found that the "on" state of the feature gate causes a different issue: [the Prometheus exporter](https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus) is the default way of exporting the Collector's internal metrics, accessible at `collector:8888/metrics`. This exporter does not currently have any support for instrumentation scope attributes, meaning that metric streams differentiated by said attributes but not by any other identifying property will appear as aliases to Prometheus, which causes an error. This completely breaks the export of Collector metrics through Prometheus under some simple configurations, which is a release blocker.

#### Description

To fix this issue, this PR sets the `telemetry.newPipelineTelemetry` feature gate back to "Alpha" (off by default), and reintroduces logic to disable the injection of the new instrumentation scope attributes when the gate is off, but only in internal metrics. Note that the new logic is still used unconditionally for logs and traces, to avoid reintroducing the logs issue (#12870).

This should avoid breaking the Collector in its default configuration while we try to get a fix in the Prometheus exporter.

#### Link to tracking issue

No tracking issue currently, will probably file one later.

#### Testing

I performed some simple manual testing with a config file like the following:

```yaml
receivers:
  otlp: [...]
processors:
  batch:
exporters:
  debug: [...]
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
  telemetry:
    metrics:
      level: detailed
    traces: [...]
    logs: [...]
```

The two batch processors create aliased metric streams, which are only differentiated by the new component attributes. I checked that:

1. this config causes an error in the Prometheus exporter on main;
2. the error is resolved by default after applying this PR;
3. the error reappears when enabling the feature gate (this is expected);
4. scope attributes are added on the traces and logs no matter the state of the gate.
Description
In versions 0.123.0 and 0.124.0 of the Collector, there has been a regression in the attributes included in internal Collector logs. Specifically, the `otelcol.` attributes defined in the Pipeline Component Telemetry RFC, which have been included in said logs since version 0.120.0 of the Collector, are now missing by default.
Reproduction
This can be reproduced by starting a Collector at version 0.122.1 with a `debug` exporter in one of the pipelines.
You should see a log like this:
By contrast, in versions 0.123.0 or 0.124.0, with no feature gates set, you will see the following:
In this case, information is missing to identify which component is responsible for the log.
The attributes are missing no matter how the logs are emitted: both in standard error output as above, and when exporting the Collector's logs using `service::telemetry::logs::processors`, as described in the documentation.
Cause
This regression was introduced by PR #12617, whose primary goal was to extend these component attributes to internal metrics and traces.
In cases where logs are exported through `service::telemetry::logs::processors`, it also switched the component attributes in internal logs from datapoint attributes to instrumentation scope attributes, to fit with the other two signals and reduce data redundancy. This is a breaking change, but note that at this time, we make no stability guarantees on the format of the Collector's internal logs, so please avoid relying on them.
The issue is that the changes in this PR were hastily put behind an alpha (off by default) feature gate, `telemetry.newPipelineTelemetry`. The result was that the "off" behavior was to omit the component attributes altogether in all circumstances.
Workarounds
If you are gathering the Collector's logs from terminal output / standard error:

- Enabling the `telemetry.newPipelineTelemetry` feature gate should restore the missing attributes to the log output. This can be done by running the Collector with the `--feature-gates telemetry.newPipelineTelemetry` command-line argument.

If you are exporting the Collector's logs through OTLP using `service::telemetry::logs::processors`:

- If you export to an endpoint which supports instrumentation scope attributes:
  - Enabling the feature gate will restore the missing attributes, but as instrumentation scope attributes instead of standard log attributes.
  - Please check with your observability vendor on whether they support ingestion of instrumentation scope attributes.
- Otherwise:
  - We unfortunately do not have a good workaround for this case. Here are some options:
    - The `code.filepath`, `code.function`, and `code.lineno` attributes, which are still present on the logs, may be sufficient to identify which component a log originates from.

Note that the behavior behind the feature gate, i.e. using instrumentation scope attributes in internal logs to identify components, will eventually become the default, so exporting to an endpoint with no support for scope attributes will become problematic.
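
For reference, the OTLP export case discussed in the workarounds corresponds to a configuration along the lines of the sketch below. This is only an illustration, not taken from an affected deployment: the endpoint is a placeholder, and the field layout under `processors` is an assumption based on the internal telemetry documentation, so please check it against the documentation for your Collector version.

```yaml
service:
  telemetry:
    logs:
      # Export the Collector's own logs over OTLP, in addition to stderr.
      processors:
        - batch:
            exporter:
              otlp:
                protocol: http/protobuf
                # Placeholder endpoint (assumption); replace with your backend.
                endpoint: https://backend.example.com:4318
```

With a setup like this, running the Collector with `--feature-gates telemetry.newPipelineTelemetry` should restore the component attributes on exported logs, but as instrumentation scope attributes, so the receiving backend needs to support them.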