Commit 6ab73d6

Merge pull request #1533 from ingvagabund/node-utilization-util-snapshot
[lownodeutilization]: Actual utilization: integration with Prometheus
2 parents be4abe1 + e283c31 commit 6ab73d6

47 files changed: +6600 -85 lines changed

README.md

+27-10
Original file line number | Diff line number | Diff line change
@@ -129,14 +129,18 @@ These are top level keys in the Descheduler Policy that you can use to configure
129129
| `metricsProviders` | `[]object` | `nil` | Enables various metrics providers like Kubernetes [Metrics Server](https://kubernetes-sigs.github.io/metrics-server/) |
130130
| `evictionFailureEventNotification` | `bool` | `false` | Enables eviction failure event notification. |
131131
| `gracePeriodSeconds` | `int` | `0` | The duration in seconds before the object should be deleted. The value zero indicates delete immediately. |
132+
| `prometheus` |`object`| `nil` | Configures collection of Prometheus metrics for actual resource utilization |
133+
| `prometheus.url` |`string`| `nil` | URL of the Prometheus server |
134+
| `prometheus.authToken` |`object`| `nil` | Sets the Prometheus server authentication token. If not specified, the in-cluster authentication token is read from the container's file system. |
135+
| `prometheus.authToken.secretReference` |`object`| `nil` | Reads the authentication token from a Kubernetes secret (the secret is expected to contain the token under the `prometheusAuthToken` data key) |
136+
| `prometheus.authToken.secretReference.namespace` |`string`| `nil` | Namespace of the Kubernetes secret holding the authentication token. Currently, the RBAC configuration permits retrieving secrets from the `kube-system` namespace. If the secret needs to be accessed from a different namespace, the existing RBAC rules must be explicitly extended. |
137+
| `prometheus.authToken.secretReference.name` |`string`| `nil` | Name of the Kubernetes secret holding the authentication token |
132138

133139
The descheduler currently allows configuring the collection of Kubernetes metrics through the `metricsProviders` field.
134-
The previous way of setting `metricsCollector` field is deprecated. There is currently one source to configure:
135-
```
136-
metricsProviders:
137-
- source: KubernetesMetrics
138-
```
139-
The list can be extended with other metrics providers in the future.
140+
The previous way of setting `metricsCollector` field is deprecated. There are currently two sources to configure:
141+
- `KubernetesMetrics`: enables metrics collection from Kubernetes Metrics server
142+
- `Prometheus`: enables metrics collection from Prometheus server
143+
140144
In general, each plugin can consume metrics from a different provider so multiple distinct providers can be configured in parallel.
141145

142146

@@ -174,9 +178,15 @@ maxNoOfPodsToEvictPerNode: 5000 # you don't need to set this, unlimited if not s
174178
maxNoOfPodsToEvictPerNamespace: 5000 # you don't need to set this, unlimited if not set
175179
maxNoOfPodsToEvictTotal: 5000 # you don't need to set this, unlimited if not set
176180
gracePeriodSeconds: 60 # you don't need to set this, 0 if not set
177-
# you don't need to set this, Kubernetes metrics are not collected if not set
181+
# you don't need to set this, metrics are not collected if not set
178182
metricsProviders:
179-
- source: KubernetesMetrics
183+
- source: Prometheus
184+
prometheus:
185+
url: http://prometheus-kube-prometheus-prometheus.prom.svc.cluster.local
186+
authToken:
187+
secretReference:
188+
namespace: "kube-system"
189+
name: "authtoken"
180190
profiles:
181191
- name: ProfileName
182192
pluginConfig:
@@ -303,6 +313,10 @@ like `kubectl top`) may differ from the calculated consumption, due to these com
303313
actual usage metrics. Metrics-based descheduling can be enabled by setting `metricsUtilization.metricsServer` field (deprecated)
304314
or `metricsUtilization.source` field to `KubernetesMetrics`.
305315
In order to have the plugin consume the metrics the metric provider needs to be configured as well.
316+
Alternatively, it is possible to create a Prometheus client and configure a Prometheus query to consume
317+
metrics from outside the Kubernetes metrics server. The query is expected to return a vector with a value for
318+
each node. Each value is expected to be a real number within the [0, 1] interval. During eviction, at most
319+
a single pod is evicted from each overutilized node. There is currently no support for evicting more.
306320
See `metricsProviders` field at [Top Level configuration](#top-level-configuration) for available options.
307321

308322
**Parameters:**
@@ -318,6 +332,7 @@ See `metricsProviders` field at [Top Level configuration](#top-level-configurati
318332
|`metricsUtilization`|object|
319333
|`metricsUtilization.metricsServer` (deprecated)|bool|
320334
|`metricsUtilization.source`|string|
335+
|`metricsUtilization.prometheus.query`|string|
321336

322337

323338
**Example:**
@@ -338,8 +353,10 @@ profiles:
338353
"cpu" : 50
339354
"memory": 50
340355
"pods": 50
341-
metricsUtilization:
342-
source: KubernetesMetrics
356+
# metricsUtilization:
357+
# source: Prometheus
358+
# prometheus:
359+
# query: instance:node_cpu:rate:sum
343360
evictionLimits:
344361
node: 5
345362
plugins:
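
The README change above says the Prometheus query should return a vector with a value for each node, each within the [0, 1] interval. As an illustration only, the following Go sketch parses a Prometheus HTTP API instant-query response and enforces that range. It is not the code from this commit (which depends on `client_golang`); the function name `nodeUtilization`, the use of the `instance` label as the node identifier, and the sample payload are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// queryResponse mirrors only the subset of the Prometheus HTTP API
// instant-query response that this sketch needs.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"` // [unix timestamp, "value as string"]
		} `json:"result"`
	} `json:"data"`
}

// nodeUtilization (hypothetical helper) returns the utilization per node.
// nodeLabel names the label that carries the node name; which label that is
// depends on the query and is an assumption of the caller.
func nodeUtilization(body []byte, nodeLabel string) (map[string]float64, error) {
	var resp queryResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding response: %w", err)
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("query failed with status %q", resp.Status)
	}
	if resp.Data.ResultType != "vector" {
		return nil, fmt.Errorf("expected a vector, got %q", resp.Data.ResultType)
	}
	usage := make(map[string]float64, len(resp.Data.Result))
	for _, sample := range resp.Data.Result {
		node, ok := sample.Metric[nodeLabel]
		if !ok {
			return nil, fmt.Errorf("sample is missing the %q label", nodeLabel)
		}
		raw, ok := sample.Value[1].(string)
		if !ok {
			return nil, fmt.Errorf("node %q: sample value is not a string", node)
		}
		value, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			return nil, fmt.Errorf("node %q: %w", node, err)
		}
		// Written as a negated range check so that NaN is rejected too.
		if !(value >= 0 && value <= 1) {
			return nil, fmt.Errorf("node %q: value %v is outside [0, 1]", node, value)
		}
		usage[node] = value
	}
	return usage, nil
}

func main() {
	// Hypothetical response body for a query such as instance:node_cpu:rate:sum.
	body := []byte(`{"status":"success","data":{"resultType":"vector","result":[
		{"metric":{"instance":"node-a"},"value":[1700000000,"0.42"]},
		{"metric":{"instance":"node-b"},"value":[1700000000,"0.87"]}]}}`)
	usage, err := nodeUtilization(body, "instance")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(usage)
}
```

Rejecting out-of-range values up front, rather than clamping them, makes a misconfigured query visible before any eviction decision is based on it.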

cmd/descheduler/app/options/options.go

+2
Original file line number | Diff line number | Diff line change
@@ -21,6 +21,7 @@ import (
2121
"strings"
2222
"time"
2323

24+
promapi "github.com/prometheus/client_golang/api"
2425
"github.com/spf13/pflag"
2526

2627
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -54,6 +55,7 @@ type DeschedulerServer struct {
5455
Client clientset.Interface
5556
EventClient clientset.Interface
5657
MetricsClient metricsclient.Interface
58+
PrometheusClient promapi.Client
5759
SecureServing *apiserveroptions.SecureServingOptionsWithLoopback
5860
SecureServingInfo *apiserver.SecureServingInfo
5961
DisableMetrics bool

go.mod

+4-2
Original file line number | Diff line number | Diff line change
@@ -5,6 +5,8 @@ go 1.23.3
55
require (
66
github.com/client9/misspell v0.3.4
77
github.com/google/go-cmp v0.6.0
8+
github.com/prometheus/client_golang v1.19.1
9+
github.com/prometheus/common v0.55.0
810
github.com/spf13/cobra v1.8.1
911
github.com/spf13/pflag v1.0.5
1012
go.opentelemetry.io/otel v1.28.0
@@ -71,17 +73,17 @@ require (
7173
github.com/grpc-ecosystem/grpc-gateway/v2 v2.20.0 // indirect
7274
github.com/inconshreveable/mousetrap v1.1.0 // indirect
7375
github.com/josharian/intern v1.0.0 // indirect
76+
github.com/jpillora/backoff v1.0.0 // indirect
7477
github.com/json-iterator/go v1.1.12 // indirect
7578
github.com/mailru/easyjson v0.7.7 // indirect
7679
github.com/mmarkdown/mmark v2.0.40+incompatible // indirect
7780
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
7881
github.com/modern-go/reflect2 v1.0.2 // indirect
7982
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
83+
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f // indirect
8084
github.com/openshift/custom-resource-status v1.1.2 // indirect
8185
github.com/pkg/errors v0.9.1 // indirect
82-
github.com/prometheus/client_golang v1.19.1 // indirect
8386
github.com/prometheus/client_model v0.6.1 // indirect
84-
github.com/prometheus/common v0.55.0 // indirect
8587
github.com/prometheus/procfs v0.15.1 // indirect
8688
github.com/russross/blackfriday/v2 v2.1.0 // indirect
8789
github.com/stoewer/go-strcase v1.3.0 // indirect

go.sum

+4
Original file line number | Diff line number | Diff line change
@@ -177,6 +177,8 @@ github.com/jonboulle/clockwork v0.4.0 h1:p4Cf1aMWXnXAUh8lVfewRBx1zaTSYKrKMF2g3ST
177177
github.com/jonboulle/clockwork v0.4.0/go.mod h1:xgRqUGwRcjKCO1vbZUEtSLrqKoPSsUpK7fnezOII0kc=
178178
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
179179
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
180+
github.com/jpillora/backoff v1.0.0 h1:uvFg412JmmHBHw7iwprIxkPMI+sGQ4kzOWsMeHnm2EA=
181+
github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4=
180182
github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU=
181183
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
182184
github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
@@ -209,6 +211,8 @@ github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjY
209211
github.com/munnerz/goautoneg v0.0.0-20120707110453-a547fc61f48d/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
210212
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
211213
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
214+
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+SVif2QVs3tOP0zanoHgBEVAwHxUSIzRqU=
215+
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
212216
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw=
213217
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
214218
github.com/nxadm/tail v1.4.4/go.mod h1:kenIhsEOeOJmVchQTgglprH7qJGnHDVpk1VPCcaMI8A=

kubernetes/base/rbac.yaml

+22
Original file line number | Diff line number | Diff line change
@@ -36,6 +36,15 @@ rules:
3636
resources: ["nodes", "pods"]
3737
verbs: ["get", "list"]
3838
---
39+
kind: Role
40+
apiVersion: rbac.authorization.k8s.io/v1
41+
metadata:
42+
name: descheduler-role
43+
rules:
44+
- apiGroups: [""]
45+
resources: ["secrets"]
46+
verbs: ["get", "list", "watch"]
47+
---
3948
apiVersion: v1
4049
kind: ServiceAccount
4150
metadata:
@@ -54,3 +63,16 @@ subjects:
5463
- name: descheduler-sa
5564
kind: ServiceAccount
5665
namespace: kube-system
66+
---
67+
apiVersion: rbac.authorization.k8s.io/v1
68+
kind: RoleBinding
69+
metadata:
70+
name: descheduler-role-binding
71+
roleRef:
72+
apiGroup: rbac.authorization.k8s.io
73+
kind: Role
74+
name: descheduler-role
75+
subjects:
76+
- name: descheduler-sa
77+
kind: ServiceAccount
78+
namespace: kube-system

pkg/api/types.go

+28
Original file line number | Diff line number | Diff line change
@@ -115,6 +115,9 @@ const (
115115
// KubernetesMetrics enables metrics from a Kubernetes metrics server.
116116
// Please see https://kubernetes-sigs.github.io/metrics-server/ for more.
117117
KubernetesMetrics MetricsSource = "KubernetesMetrics"
118+
119+
// PrometheusMetrics enables metrics from a Prometheus metrics server.
120+
PrometheusMetrics MetricsSource = "Prometheus"
118121
)
119122

120123
// MetricsCollector configures collection of metrics about actual resource utilization
@@ -128,7 +131,32 @@ type MetricsCollector struct {
128131
type MetricsProvider struct {
129132
// Source enables metrics from Kubernetes metrics server.
130133
Source MetricsSource
134+
135+
// Prometheus enables metrics collection through Prometheus
136+
Prometheus *Prometheus
131137
}
132138

133139
// ReferencedResourceList is an adaption of v1.ResourceList with resources as references
134140
type ReferencedResourceList = map[v1.ResourceName]*resource.Quantity
141+
142+
type Prometheus struct {
143+
URL string
144+
// authToken used for authentication with the prometheus server.
145+
// If not set the in cluster authentication token for the descheduler service
146+
// account is read from the container's file system.
147+
AuthToken *AuthToken
148+
}
149+
150+
type AuthToken struct {
151+
// secretReference references an authentication token.
152+
// secrets are expected to be created under the descheduler's namespace.
153+
SecretReference *SecretReference
154+
}
155+
156+
// SecretReference holds a reference to a Secret
157+
type SecretReference struct {
158+
// namespace is the namespace of the secret.
159+
Namespace string
160+
// name is the name of the secret.
161+
Name string
162+
}
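
The comments on `Prometheus.AuthToken` describe two token sources: a referenced secret, or the in-cluster service account token read from the container's file system. The Go sketch below shows one way that choice could be expressed. It is not the commit's implementation: the `resolveToken` function, the `secretGetter` callback, and the decision to treat an `AuthToken` without a `SecretReference` as "not set" are assumptions. The `prometheusAuthToken` data key comes from the README, and the token path is the standard Kubernetes service account mount.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// Local mirrors of the types added in pkg/api/types.go.
type SecretReference struct {
	Namespace string
	Name      string
}

type AuthToken struct {
	SecretReference *SecretReference
}

type Prometheus struct {
	URL       string
	AuthToken *AuthToken
}

const (
	// Data key the README expects the secret to use.
	tokenDataKey = "prometheusAuthToken"
	// Standard location of the service account token inside a pod.
	inClusterTokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"
)

// secretGetter is a hypothetical stand-in for a Kubernetes client lookup.
type secretGetter func(namespace, name string) (map[string][]byte, error)

// resolveToken (hypothetical) returns the token from the referenced secret,
// or falls back to the in-cluster token file when no secret is referenced.
func resolveToken(cfg *Prometheus, getSecret secretGetter, readFile func(string) ([]byte, error)) (string, error) {
	if cfg == nil {
		return "", errors.New("prometheus configuration is not set")
	}
	// Assumption: an AuthToken without a SecretReference counts as "not set".
	if cfg.AuthToken == nil || cfg.AuthToken.SecretReference == nil {
		raw, err := readFile(inClusterTokenPath)
		if err != nil {
			return "", fmt.Errorf("reading in-cluster token: %w", err)
		}
		return strings.TrimSpace(string(raw)), nil
	}
	ref := cfg.AuthToken.SecretReference
	data, err := getSecret(ref.Namespace, ref.Name)
	if err != nil {
		return "", fmt.Errorf("getting secret %s/%s: %w", ref.Namespace, ref.Name, err)
	}
	token, ok := data[tokenDataKey]
	if !ok || len(token) == 0 {
		return "", fmt.Errorf("secret %s/%s has no %q data key", ref.Namespace, ref.Name, tokenDataKey)
	}
	return string(token), nil
}

func main() {
	// In-memory fake used instead of a real API server lookup.
	fake := func(namespace, name string) (map[string][]byte, error) {
		return map[string][]byte{tokenDataKey: []byte("example-token")}, nil
	}
	cfg := &Prometheus{
		URL:       "http://prometheus.example",
		AuthToken: &AuthToken{SecretReference: &SecretReference{Namespace: "kube-system", Name: "authtoken"}},
	}
	token, err := resolveToken(cfg, fake, os.ReadFile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("token length:", len(token))
}
```

Passing the secret lookup and the file reader as parameters keeps the sketch testable without a cluster.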

pkg/api/v1alpha2/types.go

+28
Original file line number | Diff line number | Diff line change
@@ -90,6 +90,9 @@ const (
9090
// KubernetesMetrics enables metrics from a Kubernetes metrics server.
9191
// Please see https://kubernetes-sigs.github.io/metrics-server/ for more.
9292
KubernetesMetrics MetricsSource = "KubernetesMetrics"
93+
94+
// PrometheusMetrics enables metrics from a Prometheus metrics server.
95+
PrometheusMetrics MetricsSource = "Prometheus"
9396
)
9497

9598
// MetricsCollector configures collection of metrics about actual resource utilization
@@ -103,4 +106,29 @@ type MetricsCollector struct {
103106
type MetricsProvider struct {
104107
// Source enables metrics from Kubernetes metrics server.
105108
Source MetricsSource `json:"source,omitempty"`
109+
110+
// Prometheus enables metrics collection through Prometheus
111+
Prometheus *Prometheus `json:"prometheus,omitempty"`
112+
}
113+
114+
type Prometheus struct {
115+
URL string `json:"url,omitempty"`
116+
// authToken used for authentication with the prometheus server.
117+
// If not set the in cluster authentication token for the descheduler service
118+
// account is read from the container's file system.
119+
AuthToken *AuthToken `json:"authToken,omitempty"`
120+
}
121+
122+
type AuthToken struct {
123+
// secretReference references an authentication token.
124+
// secrets are expected to be created under the descheduler's namespace.
125+
SecretReference *SecretReference `json:"secretReference,omitempty"`
126+
}
127+
128+
// SecretReference holds a reference to a Secret
129+
type SecretReference struct {
130+
// namespace is the namespace of the secret.
131+
Namespace string `json:"namespace,omitempty"`
132+
// name is the name of the secret.
133+
Name string `json:"name,omitempty"`
106134
}
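
To make the JSON tags on the v1alpha2 types concrete, the following Go sketch decodes a `metricsProviders` fragment equivalent to the README example into local copies of those types. It uses JSON rather than YAML only to stay within the standard library. The `parseProviders` function and its rule that a `Prometheus` source requires a non-empty `url` are assumptions for illustration, not validation taken from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type MetricsSource string

const (
	KubernetesMetrics MetricsSource = "KubernetesMetrics"
	PrometheusMetrics MetricsSource = "Prometheus"
)

// Local copies of the v1alpha2 types, with the same JSON tags.
type SecretReference struct {
	Namespace string `json:"namespace,omitempty"`
	Name      string `json:"name,omitempty"`
}

type AuthToken struct {
	SecretReference *SecretReference `json:"secretReference,omitempty"`
}

type Prometheus struct {
	URL       string     `json:"url,omitempty"`
	AuthToken *AuthToken `json:"authToken,omitempty"`
}

type MetricsProvider struct {
	Source     MetricsSource `json:"source,omitempty"`
	Prometheus *Prometheus   `json:"prometheus,omitempty"`
}

// parseProviders (hypothetical) decodes the providers and applies a simple,
// assumed sanity check per source.
func parseProviders(raw []byte) ([]MetricsProvider, error) {
	var providers []MetricsProvider
	if err := json.Unmarshal(raw, &providers); err != nil {
		return nil, fmt.Errorf("decoding metricsProviders: %w", err)
	}
	for i, p := range providers {
		switch p.Source {
		case KubernetesMetrics:
			// No extra settings are needed in this sketch.
		case PrometheusMetrics:
			if p.Prometheus == nil || p.Prometheus.URL == "" {
				return nil, fmt.Errorf("metricsProviders[%d]: prometheus.url is required for source %q", i, p.Source)
			}
		default:
			return nil, fmt.Errorf("metricsProviders[%d]: unknown source %q", i, p.Source)
		}
	}
	return providers, nil
}

func main() {
	raw := []byte(`[{
		"source": "Prometheus",
		"prometheus": {
			"url": "http://prometheus-kube-prometheus-prometheus.prom.svc.cluster.local",
			"authToken": {"secretReference": {"namespace": "kube-system", "name": "authtoken"}}
		}
	}]`)
	providers, err := parseProviders(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ref := providers[0].Prometheus.AuthToken.SecretReference
	fmt.Println(providers[0].Source, providers[0].Prometheus.URL, ref.Namespace+"/"+ref.Name)
}
```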

pkg/api/v1alpha2/zz_generated.conversion.go

+96
Some generated files are not rendered by default.
