
Commit 6e1abbf

vijaysgbhatnitish authored and committed

Default ServiceMonitor HonorLabels to true & document pod label conflict (#632) (#633)

Co-authored-by: Nitish Bhat <[email protected]>
(cherry picked from commit 5a4fe675365227a818b8e2deb54fb8db3f93407d)

1 parent e2adf87

2 files changed (+68 −1 lines)

api/v1alpha1/deviceconfig_types.go (2 additions, 1 deletion)

```diff
@@ -489,9 +489,10 @@ type ServiceMonitorConfig struct {
 	// +optional
 	AttachMetadata *monitoringv1.AttachMetadata `json:"attachMetadata,omitempty"`

-	// HonorLabels chooses the metric's labels on collisions with target labels (default false)
+	// HonorLabels chooses the metric's labels on collisions with target labels (default true)
 	//+operator-sdk:csv:customresourcedefinitions:type=spec,displayName="HonorLabels",xDescriptors={"urn:alm:descriptor:com.amd.deviceconfigs:honorLabels"}
 	// +optional
+	// +kubebuilder:default=true
 	HonorLabels *bool `json:"honorLabels,omitempty"`

 	// HonorTimestamps controls whether the scrape endpoints honor timestamps (default false)
```
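With `+kubebuilder:default=true` in place, the API server fills in `honorLabels` whenever it is omitted. As a minimal sketch of the effect (the resource name, namespace, and the `amd.com/v1alpha1` group/version are illustrative assumptions, not taken from this commit), a `DeviceConfig` created without the field reads back with the default applied:

```yaml
# Sketch: a DeviceConfig created *without*
# spec.metricsExporter.prometheus.serviceMonitor.honorLabels, as read back
# from the API server. Names and apiVersion are illustrative.
apiVersion: amd.com/v1alpha1
kind: DeviceConfig
metadata:
  name: example-deviceconfig
  namespace: kube-amd-gpu
spec:
  metricsExporter:
    prometheus:
      serviceMonitor:
        enable: true
        honorLabels: true   # injected by the +kubebuilder:default=true marker
```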

docs/metrics/prometheus.md (66 additions, 0 deletions)

@@ -94,6 +94,72 @@ After the **ServiceMonitor** is deployed, Prometheus automatically begins scrapi

Unchanged context before the insertion:

These selectors help Prometheus identify the correct ServiceMonitor to use in the AMD GPU Operator namespace and begin metrics scraping.

Added content:
## Using with device-metrics-exporter Grafana Dashboards

The [ROCm/device-metrics-exporter](https://github.com/ROCm/device-metrics-exporter) repository includes Grafana dashboards designed to visualize the exported metrics, particularly focusing on job-level or pod-level GPU usage. These dashboards rely on specific labels exported by the metrics exporter, such as:

* `pod`: The name of the workload Pod currently utilizing the GPU.
* `job_id`: An identifier for the job associated with the workload Pod.
### The `pod` Label Conflict

When Prometheus scrapes targets defined by a `ServiceMonitor`, it automatically attaches labels to the metrics based on the target's metadata. One such label is `pod`, which identifies the Pod being scraped (in this case, the metrics exporter Pod itself).

This creates a conflict, illustrated in the sketch after this list:

1. **Exporter Metric Label:** `pod="<workload-pod-name>"` (indicates the actual GPU user)
2. **Prometheus Target Label:** `pod="<metrics-exporter-pod-name>"` (indicates the source of the metric)
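To make the collision concrete, consider a hypothetical sample (the metric name, pod names, and value below are illustrative, not taken from the exporter):

```
# Sample as exposed by the metrics exporter on its /metrics endpoint:
gpu_gfx_activity{pod="llm-training-worker-0", job_id="job-42"} 87

# Target label Prometheus wants to attach to that same sample at scrape time:
pod="amd-metrics-exporter-x7k2p"
```

Only one `pod` label can survive on the ingested series; which one wins is what the two solutions below control.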
### Solution 1: `honorLabels: true` (Default)

To ensure the Grafana dashboards function correctly by using the workload pod name, the `ServiceMonitor` created by the GPU Operator needs to prioritize the labels coming directly from the metrics exporter over the labels added by Prometheus during the scrape.

This is achieved by setting `honorLabels: true` in the `ServiceMonitor` configuration within the `DeviceConfig`. **This is the default setting in the GPU Operator.**

```yaml
# Example DeviceConfig snippet
spec:
  metricsExporter:
    prometheus:
      serviceMonitor:
        enable: true
        # honorLabels defaults to true, ensuring the exporter's 'pod' label is kept
        # honorLabels: true
        # ... other ServiceMonitor settings
```

**Important:** For this to work, the `device-metrics-exporter` must actually be exporting the `pod` label, which typically happens only when a workload is actively using the GPU on that node. If no workload is present, the `pod` label might be missing from the metric, and the dashboards might not display data as expected for that specific GPU/node.
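For reference, the ServiceMonitor's `honorLabels` field translates to the `honor_labels` option of the scrape job that the Prometheus Operator generates. A trimmed sketch of that generated configuration (the job name is illustrative and most generated fields are omitted):

```yaml
# Sketch of the generated Prometheus scrape configuration (heavily trimmed)
scrape_configs:
  - job_name: serviceMonitor/kube-amd-gpu/amd-gpu-operator-metrics-exporter/0  # illustrative
    honor_labels: true            # exporter-supplied labels win on collision
    kubernetes_sd_configs:
      - role: endpoints           # targets discovered via the exporter Service
```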
### Solution 2: Relabeling

An alternative approach is to use Prometheus relabeling rules within the `ServiceMonitor` definition. This allows you to explicitly handle the conflicting `pod` label added by Prometheus.

You can rename the Prometheus-added `pod` label (identifying the exporter pod) to something else (e.g., `exporter_pod`) and then drop the original `pod` label added by Prometheus. This prevents the conflict and ensures the `pod` label from the exporter (identifying the workload) is the only one present on the final ingested metric.

Add the following `relabelings` to your `ServiceMonitor` configuration in the `DeviceConfig`:

```yaml
# Example DeviceConfig snippet
spec:
  metricsExporter:
    prometheus:
      serviceMonitor:
        enable: true
        honorLabels: false # Must be false if using relabeling to preserve exporter_pod
        relabelings:
          # Rename the Prometheus-added 'pod' label to 'exporter_pod'
          - sourceLabels: [pod]
            targetLabel: exporter_pod
            action: replace
            regex: (.*)
            replacement: $1
          # Drop the Prometheus-added 'pod' label to avoid conflict
          - action: labeldrop
            regex: pod
        # ... other ServiceMonitor settings
```

This method explicitly resolves the conflict by manipulating the labels before ingestion, ensuring the `pod` label always refers to the workload pod as intended by the `device-metrics-exporter`.
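Under the hood, these `relabelings` are appended, in snake_case form, to the `relabel_configs` of the generated scrape job, after the operator-generated rules that set the target `pod` label from pod metadata. Approximately (a sketch; the generated discovery rules are elided):

```yaml
# Sketch of the resulting relabel_configs (operator-generated rules elided)
relabel_configs:
  # ... generated rules, including the one that sets 'pod' from
  # __meta_kubernetes_pod_name ...
  - source_labels: [pod]
    target_label: exporter_pod
    action: replace
    regex: (.*)
    replacement: $1
  - action: labeldrop
    regex: pod
```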
Unchanged context after the insertion:

## Conclusion

The AMD GPU Operator provides native support for Prometheus integration, simplifying GPU monitoring and alerting within Kubernetes clusters. By configuring the DeviceConfig CR, you can manage GPU metrics collection tailored to your requirements and preferences.
