[exporter/prometheusremotewrite] Add Detailed Export Failure Metrics #39799

Open
@mt-hasan

Component(s)

exporter/prometheusremotewrite

Is your feature request related to a problem? Please describe.

The prometheusremotewrite exporter currently lacks detailed metrics and logs for export failures. When issues like timeouts or authorization errors occur, users often encounter generic error messages, making it challenging to diagnose and address the root causes effectively.

Problem:

  1. No clear metrics when the queue isn't full but sends still fail
  2. Limited visibility into failure types
Example of the current log output:

{
    "level":"error",
    "timestamp":"2025-04-25T19:38:22.027Z",
    "caller":"exporterhelper/queue_sender.go:90",
    "message":"Exporting failed. Dropping data.",
    "kind":"exporter",
    "data_type":"metrics",
    "name":"prometheusremotewrite",
    "error":"Permanent error: Permanent error: context deadline exceeded",
    "dropped_items":5,
    "stack":"go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender ...
}

Describe the solution you'd like

We propose enhancing the prometheusremotewrite (PRW) exporter to provide more granular metrics and logs for export failures.

  • Granular Failure Metrics:
    • Introduce a metric prw_export_failures_total with a reason label to categorize failure types:
      • HTTP status code families (4xx, 5xx)
      • Specific error types (e.g., "out of order sample", "timeout", "authorization")
    • Add prw_export_retries_total to count retry attempts.
  • Improved Logging:
    • Structured error messages with clear failure categorization
    • Include relevant debugging details (status codes, error messages)

These enhancements would significantly improve observability and help users quickly identify and resolve issues in their data export pipeline, regardless of the specific remote write endpoint they're using (e.g., Prometheus, Cortex, Thanos, or cloud-based solutions).

Describe alternatives you've considered

No response

Additional context

No response
