Surface failed metric collections to the collector #97

isker · 2025-01-29T22:55:04Z

We swallow errors when making ECS task metadata API requests here and here. There is actually a way to surface those errors to the caller: NewInvalidMetric.

I think we should use it so that errors are surfaced at write time, as in promhttp. The ECS APIs are supposed to be infallible, so the default behavior of serving an HTTP 500 on the /metrics request probably makes sense for everyone in all situations, though we could add a flag to control the behavior.

We occasionally see the ECS APIs called by ecs_exporter return an HTTP 500 on the first request or two, so the metrics served by the exporter are nonsense. By using NewInvalidMetric in response to Collect, we can ensure that HTTP 500s are served on /metrics whenever such errors occur.

Closes prometheus-community#97.

Signed-off-by: Ian Kerins <[email protected]>

isker mentioned this issue Jan 29, 2025: Fix duplicate metrics for network stats and panics on invalid HTTP status codes #98 (Merged)

isker linked a pull request Apr 12, 2025 that will close this issue: Surface failed ECS API requests to the collector #111 (Open)
