Commit be4abe1

Merge pull request #1614 from ingvagabund/nodeutilization-metrics-source
[nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration
2 parents a4d6119 + e14b86e commit be4abe1

17 files changed, +226 -49 lines changed

README.md

+22 -8
@@ -124,11 +124,22 @@ These are top level keys in the Descheduler Policy that you can use to configure
 | `maxNoOfPodsToEvictPerNode` | `int` | `nil` | Maximum number of pods evicted from each node (summed through all strategies). |
 | `maxNoOfPodsToEvictPerNamespace` | `int` | `nil` | Maximum number of pods evicted from each namespace (summed through all strategies). |
 | `maxNoOfPodsToEvictTotal` | `int` | `nil` | Maximum number of pods evicted per rescheduling cycle (summed through all strategies). |
-| `metricsCollector` | `object` | `nil` | Configures collection of metrics for actual resource utilization. |
+| `metricsCollector` (deprecated) | `object` | `nil` | Configures collection of metrics for actual resource utilization. |
 | `metricsCollector.enabled` | `bool` | `false` | Enables Kubernetes [Metrics Server](https://kubernetes-sigs.github.io/metrics-server/) collection. |
+| `metricsProviders` | `[]object` | `nil` | Enables various metrics providers like Kubernetes [Metrics Server](https://kubernetes-sigs.github.io/metrics-server/) |
 | `evictionFailureEventNotification` | `bool` | `false` | Enables eviction failure event notification. |
 | `gracePeriodSeconds` | `int` | `0` | The duration in seconds before the object should be deleted. The value zero indicates delete immediately. |

+The descheduler currently allows to configure a metric collection of Kubernetes Metrics through `metricsProviders` field.
+The previous way of setting `metricsCollector` field is deprecated. There is currently one source to configure:
+```
+metricsProviders:
+- source: KubernetesMetrics
+```
+The list can be extended with other metrics providers in the future.
+In general, each plugin can consume metrics from a different provider so multiple distinct providers can be configured in parallel.
+
+
 ### Evictor Plugin configuration (Default Evictor)

 The Default Evictor Plugin is used by default for filtering pods before processing them in an strategy plugin, or for applying a PreEvictionFilter of pods before eviction. You can also create your own Evictor Plugin or use the Default one provided by Descheduler. Other uses for the Evictor plugin can be to sort, filter, validate or group pods by different criteria, and that's why this is handled by a plugin and not configured in the top level config.
@@ -163,8 +174,9 @@ maxNoOfPodsToEvictPerNode: 5000 # you don't need to set this, unlimited if not s
 maxNoOfPodsToEvictPerNamespace: 5000 # you don't need to set this, unlimited if not set
 maxNoOfPodsToEvictTotal: 5000 # you don't need to set this, unlimited if not set
 gracePeriodSeconds: 60 # you don't need to set this, 0 if not set
-metricsCollector:
-  enabled: true # you don't need to set this, metrics are not collected if not set
+# you don't need to set this, Kubernetes metrics are not collected if not set
+metricsProviders:
+- source: KubernetesMetrics
 profiles:
   - name: ProfileName
     pluginConfig:
@@ -288,9 +300,10 @@ A resource consumption above (resp. below) this window is considered as overutil
 This approach is chosen in order to maintain consistency with the kube-scheduler, which follows the same
 design for scheduling pods onto nodes. This means that resource usage as reported by Kubelet (or commands
 like `kubectl top`) may differ from the calculated consumption, due to these components reporting
-actual usage metrics. Metrics-based descheduling can be enabled by setting `metricsUtilization.metricsServer` field.
-In order to have the plugin consume the metrics the metric collector needs to be configured as well.
-See `metricsCollector` field at [Top Level configuration](#top-level-configuration) for available options.
+actual usage metrics. Metrics-based descheduling can be enabled by setting `metricsUtilization.metricsServer` field (deprecated)
+or `metricsUtilization.source` field to `KubernetesMetrics`.
+In order to have the plugin consume the metrics the metric provider needs to be configured as well.
+See `metricsProviders` field at [Top Level configuration](#top-level-configuration) for available options.

 **Parameters:**

@@ -303,7 +316,8 @@ See `metricsCollector` field at [Top Level configuration](#top-level-configurati
 |`evictionLimits`|object|
 |`evictableNamespaces`|(see [namespace filtering](#namespace-filtering))|
 |`metricsUtilization`|object|
-|`metricsUtilization.metricsServer`|bool|
+|`metricsUtilization.metricsServer` (deprecated)|bool|
+|`metricsUtilization.source`|string|


 **Example:**
@@ -325,7 +339,7 @@ profiles:
           "memory": 50
           "pods": 50
         metricsUtilization:
-          metricsServer: true
+          source: KubernetesMetrics
         evictionLimits:
           node: 5
     plugins:

charts/descheduler/templates/clusterrole.yaml

+5 -1
@@ -36,9 +36,13 @@ rules:
     resourceNames: ["{{ .Values.leaderElection.resourceName | default "descheduler" }}"]
     verbs: ["get", "patch", "delete"]
   {{- end }}
-  {{- if and .Values.deschedulerPolicy .Values.deschedulerPolicy.metricsCollector .Values.deschedulerPolicy.metricsCollector.enabled }}
+  {{- if and .Values.deschedulerPolicy }}
+  {{- range .Values.deschedulerPolicy.metricsProviders }}
+  {{- if and (hasKey . "source") (eq .source "KubernetesMetrics") }}
   - apiGroups: ["metrics.k8s.io"]
     resources: ["pods", "nodes"]
     verbs: ["get", "list"]
   {{- end }}
+  {{- end }}
+  {{- end }}
 {{- end -}}
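
The nested `if`/`range`/`if` in the template above can be read as a single predicate over the chart values: grant access to `metrics.k8s.io` when `deschedulerPolicy.metricsProviders` contains an entry whose `source` is `KubernetesMetrics`. The sketch below restates that predicate in plain Go over an untyped values map. The function name `needsMetricsRBAC` is hypothetical, and the code is not part of the chart.

```go
package main

import "fmt"

// needsMetricsRBAC (hypothetical) mirrors the template's checks: the policy must
// exist, and at least one metricsProviders entry must have source == "KubernetesMetrics".
func needsMetricsRBAC(values map[string]any) bool {
	policy, ok := values["deschedulerPolicy"].(map[string]any)
	if !ok {
		return false
	}
	providers, ok := policy["metricsProviders"].([]any)
	if !ok {
		return false
	}
	for _, item := range providers {
		provider, ok := item.(map[string]any)
		if !ok {
			continue
		}
		// Equivalent of `and (hasKey . "source") (eq .source "KubernetesMetrics")`.
		if source, ok := provider["source"].(string); ok && source == "KubernetesMetrics" {
			return true
		}
	}
	return false
}

func main() {
	values := map[string]any{
		"deschedulerPolicy": map[string]any{
			"metricsProviders": []any{
				map[string]any{"source": "KubernetesMetrics"},
			},
		},
	}
	fmt.Println(needsMetricsRBAC(values))
}
```

Two differences are worth keeping in mind. The template emits the rule once per matching list entry, whereas the sketch only answers yes or no. And the new condition no longer looks at the deprecated `metricsCollector.enabled` value, so a values file that still uses only the old key would not get the rule from this template.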

charts/descheduler/values.yaml

+2 -2
@@ -96,8 +96,8 @@ deschedulerPolicy:
   # nodeSelector: "key1=value1,key2=value2"
   # maxNoOfPodsToEvictPerNode: 10
   # maxNoOfPodsToEvictPerNamespace: 10
-  # metricsCollector:
-  #   enabled: true
+  # metricsProviders:
+  #   - source: KubernetesMetrics
   # ignorePvcPods: true
   # evictLocalStoragePods: true
   # evictDaemonSetPods: true

pkg/api/types.go

+21 -3
@@ -48,7 +48,11 @@ type DeschedulerPolicy struct {
 	EvictionFailureEventNotification *bool

 	// MetricsCollector configures collection of metrics about actual resource utilization
-	MetricsCollector MetricsCollector
+	// Deprecated. Use MetricsProviders field instead.
+	MetricsCollector *MetricsCollector
+
+	// MetricsProviders configure collection of metrics about actual resource utilization from various sources
+	MetricsProviders []MetricsProvider

 	// GracePeriodSeconds The duration in seconds before the object should be deleted. Value must be non-negative integer.
 	// The value zero indicates delete immediately. If this value is nil, the default grace period for the
@@ -105,12 +109,26 @@ type PluginSet struct {
 	Disabled []string
 }

+type MetricsSource string
+
+const (
+	// KubernetesMetrics enables metrics from a Kubernetes metrics server.
+	// Please see https://kubernetes-sigs.github.io/metrics-server/ for more.
+	KubernetesMetrics MetricsSource = "KubernetesMetrics"
+)
+
 // MetricsCollector configures collection of metrics about actual resource utilization
 type MetricsCollector struct {
-	// Enabled metrics collection from kubernetes metrics.
-	// Later, the collection can be extended to other providers.
+	// Enabled metrics collection from Kubernetes metrics.
+	// Deprecated. Use MetricsProvider.Source field instead.
 	Enabled bool
 }

+// MetricsProvider configures collection of metrics about actual resource utilization from a given source
+type MetricsProvider struct {
+	// Source enables metrics from Kubernetes metrics server.
+	Source MetricsSource
+}
+
 // ReferencedResourceList is an adaption of v1.ResourceList with resources as references
 type ReferencedResourceList = map[v1.ResourceName]*resource.Quantity
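
Because `MetricsSource` is a plain string type, any value can appear in a policy, including ones no provider implements. The following sketch shows one way a provider list could be checked against the known constants. `validateProviders` is a hypothetical function written for illustration; the commit's rendered hunks do not show how, or whether, the descheduler validates this list.

```go
package main

import "fmt"

// MetricsSource and KubernetesMetrics mirror the declarations added in pkg/api/types.go.
type MetricsSource string

const KubernetesMetrics MetricsSource = "KubernetesMetrics"

// MetricsProvider mirrors the internal API type from the commit.
type MetricsProvider struct {
	Source MetricsSource
}

// validateProviders (hypothetical) rejects unknown sources and repeated entries.
func validateProviders(providers []MetricsProvider) error {
	seen := map[MetricsSource]bool{}
	for i, p := range providers {
		switch p.Source {
		case KubernetesMetrics:
			// Known source; new constants would be added as further cases.
		default:
			return fmt.Errorf("metricsProviders[%d]: unsupported source %q", i, p.Source)
		}
		if seen[p.Source] {
			return fmt.Errorf("metricsProviders[%d]: duplicate source %q", i, p.Source)
		}
		seen[p.Source] = true
	}
	return nil
}

func main() {
	ok := []MetricsProvider{{Source: KubernetesMetrics}}
	bad := []MetricsProvider{{Source: "SomethingElse"}}

	fmt.Println(validateProviders(ok))
	fmt.Println(validateProviders(bad))
}
```

Rejecting duplicates is a design choice made for this sketch only; the README text says multiple distinct providers can be configured in parallel but does not say how repeated entries are treated.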

pkg/api/v1alpha2/types.go

+21 -3
@@ -46,7 +46,11 @@ type DeschedulerPolicy struct {
 	EvictionFailureEventNotification *bool `json:"evictionFailureEventNotification,omitempty"`

 	// MetricsCollector configures collection of metrics for actual resource utilization
-	MetricsCollector MetricsCollector `json:"metricsCollector,omitempty"`
+	// Deprecated. Use MetricsProviders field instead.
+	MetricsCollector *MetricsCollector `json:"metricsCollector,omitempty"`
+
+	// MetricsProviders configure collection of metrics about actual resource utilization from various sources
+	MetricsProviders []MetricsProvider `json:"metricsProviders,omitempty"`

 	// GracePeriodSeconds The duration in seconds before the object should be deleted. Value must be non-negative integer.
 	// The value zero indicates delete immediately. If this value is nil, the default grace period for the
@@ -80,9 +84,23 @@ type PluginSet struct {
 	Disabled []string `json:"disabled"`
 }

+type MetricsSource string
+
+const (
+	// KubernetesMetrics enables metrics from a Kubernetes metrics server.
+	// Please see https://kubernetes-sigs.github.io/metrics-server/ for more.
+	KubernetesMetrics MetricsSource = "KubernetesMetrics"
+)
+
 // MetricsCollector configures collection of metrics about actual resource utilization
 type MetricsCollector struct {
-	// Enabled metrics collection from kubernetes metrics.
-	// Later, the collection can be extended to other providers.
+	// Enabled metrics collection from Kubernetes metrics server.
+	// Deprecated. Use MetricsProvider.Source field instead.
 	Enabled bool `json:"enabled,omitempty"`
 }
+
+// MetricsProvider configures collection of metrics about actual resource utilization from a given source
+type MetricsProvider struct {
+	// Source enables metrics from Kubernetes metrics server.
+	Source MetricsSource `json:"source,omitempty"`
+}

pkg/api/v1alpha2/zz_generated.conversion.go

+34 -6
Generated file; its diff is not rendered by default.

pkg/api/v1alpha2/zz_generated.deepcopy.go

+26 -1
Generated file; its diff is not rendered by default.

pkg/api/zz_generated.deepcopy.go

+26 -1
Generated file; its diff is not rendered by default.

pkg/descheduler/descheduler.go

+3 -3
@@ -166,7 +166,7 @@ func newDescheduler(ctx context.Context, rs *options.DeschedulerServer, deschedu
 	}

 	var metricsCollector *metricscollector.MetricsCollector
-	if deschedulerPolicy.MetricsCollector.Enabled {
+	if (deschedulerPolicy.MetricsCollector != nil && deschedulerPolicy.MetricsCollector.Enabled) || (len(deschedulerPolicy.MetricsProviders) > 0 && deschedulerPolicy.MetricsProviders[0].Source == api.KubernetesMetrics) {
 		nodeSelector := labels.Everything()
 		if deschedulerPolicy.NodeSelector != nil {
 			sel, err := labels.Parse(*deschedulerPolicy.NodeSelector)
@@ -332,7 +332,7 @@ func Run(ctx context.Context, rs *options.DeschedulerServer) error {
 		return err
 	}

-	if deschedulerPolicy.MetricsCollector.Enabled {
+	if (deschedulerPolicy.MetricsCollector != nil && deschedulerPolicy.MetricsCollector.Enabled) || (len(deschedulerPolicy.MetricsProviders) > 0 && deschedulerPolicy.MetricsProviders[0].Source == api.KubernetesMetrics) {
 		metricsClient, err := client.CreateMetricsClient(clientConnection, "descheduler")
 		if err != nil {
 			return err
@@ -448,7 +448,7 @@ func RunDeschedulerStrategies(ctx context.Context, rs *options.DeschedulerServer
 	sharedInformerFactory.WaitForCacheSync(ctx.Done())
 	descheduler.podEvictor.WaitForEventHandlersSync(ctx)

-	if deschedulerPolicy.MetricsCollector.Enabled {
+	if (deschedulerPolicy.MetricsCollector != nil && deschedulerPolicy.MetricsCollector.Enabled) || (len(deschedulerPolicy.MetricsProviders) > 0 && deschedulerPolicy.MetricsProviders[0].Source == api.KubernetesMetrics) {
 		go func() {
 			klog.V(2).Infof("Starting metrics collector")
 			descheduler.metricsCollector.Run(ctx)
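
The same compound condition appears at all three call sites in `pkg/descheduler/descheduler.go`. The sketch below shows how it could be expressed once as a helper. The name `kubernetesMetricsEnabled` is hypothetical and the types are local copies of the ones in `pkg/api/types.go`; the logic is copied from the hunks above and, like them, inspects only the first entry of `MetricsProviders`.

```go
package main

import "fmt"

// Local copies of the API types touched by the commit.
type MetricsSource string

const KubernetesMetrics MetricsSource = "KubernetesMetrics"

type MetricsCollector struct {
	Enabled bool
}

type MetricsProvider struct {
	Source MetricsSource
}

type DeschedulerPolicy struct {
	MetricsCollector *MetricsCollector // deprecated, now a pointer, so it may be nil
	MetricsProviders []MetricsProvider
}

// kubernetesMetricsEnabled (hypothetical name) reproduces the condition used at
// the three call sites: the deprecated flag is honored when set, otherwise the
// first configured provider decides.
func kubernetesMetricsEnabled(p *DeschedulerPolicy) bool {
	if p.MetricsCollector != nil && p.MetricsCollector.Enabled {
		return true
	}
	return len(p.MetricsProviders) > 0 && p.MetricsProviders[0].Source == KubernetesMetrics
}

func main() {
	deprecated := &DeschedulerPolicy{MetricsCollector: &MetricsCollector{Enabled: true}}
	current := &DeschedulerPolicy{MetricsProviders: []MetricsProvider{{Source: KubernetesMetrics}}}
	empty := &DeschedulerPolicy{}

	fmt.Println(kubernetesMetricsEnabled(deprecated), kubernetesMetricsEnabled(current), kubernetesMetricsEnabled(empty))
}
```

The nil check matters because `MetricsCollector` changed from a value to a pointer in this commit; dereferencing it without the check would panic for policies that omit the deprecated field. If more sources are added later, a helper like this would be a natural place to replace the index-0 lookup with a scan over the whole list.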

pkg/descheduler/descheduler_test.go

+7 -3
@@ -136,6 +136,10 @@ func removeDuplicatesPolicy() *api.DeschedulerPolicy {
 }

 func lowNodeUtilizationPolicy(thresholds, targetThresholds api.ResourceThresholds, metricsEnabled bool) *api.DeschedulerPolicy {
+	var metricsSource api.MetricsSource = ""
+	if metricsEnabled {
+		metricsSource = api.KubernetesMetrics
+	}
 	return &api.DeschedulerPolicy{
 		Profiles: []api.DeschedulerProfile{
 			{
@@ -146,8 +150,8 @@ func lowNodeUtilizationPolicy(thresholds, targetThresholds api.ResourceThreshold
 						Args: &nodeutilization.LowNodeUtilizationArgs{
 							Thresholds: thresholds,
 							TargetThresholds: targetThresholds,
-							MetricsUtilization: nodeutilization.MetricsUtilization{
-								MetricsServer: metricsEnabled,
+							MetricsUtilization: &nodeutilization.MetricsUtilization{
+								Source: metricsSource,
 							},
 						},
 					},
@@ -837,7 +841,7 @@ func TestLoadAwareDescheduling(t *testing.T) {
 		},
 		true, // enabled metrics utilization
 	)
-	policy.MetricsCollector.Enabled = true
+	policy.MetricsProviders = []api.MetricsProvider{{Source: api.KubernetesMetrics}}

 	ctxCancel, cancel := context.WithCancel(ctx)
 	_, descheduler, _ := initDescheduler(
