Commit 2171770

spraveenio authored and sajmera-pensando committed
update support matrix and exporter version for each driver version
1 parent cb76dbc commit 2171770

1 file changed: +28 -0 lines changed

docs/metrics/exporter.md

Lines changed: 28 additions & 0 deletions
@@ -1,5 +1,33 @@
 # Metrics Exporter
 
+## Features
+
+- Prometheus-compatible metrics endpoint
+- Rich GPU telemetry data including:
+  - Temperature monitoring
+  - Utilization metrics
+  - Memory usage statistics
+  - Power consumption data
+  - PCIe bandwidth metrics
+  - Performance metrics
+- Kubernetes integration via Helm chart
+- Slurm integration support
+- Configurable service ports
+- Container-based deployment
+
+## Requirements
+
+- Ubuntu 22.04, 24.04
+- Docker (or compatible container runtime)
+
+| ROCm Version | Driver Version | Exporter Image Version | Platform     |
+|--------------|----------------|------------------------|--------------|
+| 6.2.x        | 6.8.5          | v1.0.0                 | MI2xx, MI3xx |
+| 6.3.x        | 6.10.5         | v1.1.0, v1.2.0         | MI2xx, MI3xx |
+| 6.4.x        | 6.12.12        | v1.3.0                 | MI3xx        |
+| 6.4.x        | 6.12.12        | v1.3.0.1               | MI2xx, MI3xx |
+
+
 ## Configure metrics exporter
 
 To start the Device Metrics Exporter along with the GPU Operator, configure the `spec/metricsExporter/enable` field in the DeviceConfig Custom Resource (CR) to enable or disable the metrics exporter.
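
The support matrix added in this diff can be encoded as data and queried. The following is a minimal sketch, not part of the commit: the `SupportRow` type and the `exporter_images_for` helper are hypothetical names introduced here for illustration, and the rows are copied from the table as committed.

```python
from typing import NamedTuple


class SupportRow(NamedTuple):
    rocm: str                # ROCm release series, e.g. "6.4.x"
    driver: str              # driver version paired with that ROCm series
    exporter_images: tuple   # exporter image versions listed for the row
    platforms: tuple         # platform families listed for the row


# Rows copied from the table in docs/metrics/exporter.md.
SUPPORT_MATRIX = (
    SupportRow("6.2.x", "6.8.5", ("v1.0.0",), ("MI2xx", "MI3xx")),
    SupportRow("6.3.x", "6.10.5", ("v1.1.0", "v1.2.0"), ("MI2xx", "MI3xx")),
    SupportRow("6.4.x", "6.12.12", ("v1.3.0",), ("MI3xx",)),
    SupportRow("6.4.x", "6.12.12", ("v1.3.0.1",), ("MI2xx", "MI3xx")),
)


def exporter_images_for(driver: str, platform: str) -> list:
    """Return the exporter image versions listed for a driver version and platform family."""
    images = []
    for row in SUPPORT_MATRIX:
        if row.driver == driver and platform in row.platforms:
            images.extend(row.exporter_images)
    return images


if __name__ == "__main__":
    print(exporter_images_for("6.12.12", "MI2xx"))  # ['v1.3.0.1']
    print(exporter_images_for("6.12.12", "MI3xx"))  # ['v1.3.0', 'v1.3.0.1']
```

The lookup makes the one asymmetry in the table visible: for driver 6.12.12, MI2xx platforms are listed only with v1.3.0.1, while MI3xx platforms are listed with both v1.3.0 and v1.3.0.1.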
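
The `spec/metricsExporter/enable` field named in the last context line of the diff can be toggled with a small patch document. This sketch only builds the JSON body. It assumes the field is a boolean nested under `spec.metricsExporter`, as the path suggests, and the `metrics_exporter_patch` helper is a hypothetical name, not part of the GPU Operator.

```python
import json


def metrics_exporter_patch(enable: bool) -> str:
    """Build a JSON merge-patch body that sets spec.metricsExporter.enable."""
    # Assumption: the field path from the docs maps to nested keys in the CR.
    patch = {"spec": {"metricsExporter": {"enable": enable}}}
    return json.dumps(patch)


if __name__ == "__main__":
    print(metrics_exporter_patch(True))
    # {"spec": {"metricsExporter": {"enable": true}}}
```

A body like this could be applied to the deviceconfig CR with a merge patch (for example through `kubectl patch` with `--type merge`). The resource name and namespace depend on the deployment and are not given in this commit.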
