Commit 4440c7d

Tim Bannister and Chris Negus committed
Improve docs for HorizontalPodAutoscaler
Co-authored-by: Chris Negus <[email protected]>
1 parent 6f7f981 commit 4440c7d

File tree

2 files changed: +261 −187 lines changed


content/en/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough.md

+105 −55
@@ -4,50 +4,68 @@ reviewers:
 - jszczepkowski
 - justinsb
 - directxman12
-title: Horizontal Pod Autoscaler Walkthrough
+title: HorizontalPodAutoscaler Walkthrough
 content_type: task
 weight: 100
+min-kubernetes-server-version: 1.23
 ---
 
 <!-- overview -->
 
-Horizontal Pod Autoscaler automatically scales the number of Pods
-in a replication controller, deployment, replica set or stateful set based on observed CPU utilization
-(or, with beta support, on some other, application-provided metrics).
+A [HorizontalPodAutoscaler](/docs/tasks/run-application/horizontal-pod-autoscale/)
+(HPA for short)
+automatically updates a workload resource (such as
+a {{< glossary_tooltip text="Deployment" term_id="deployment" >}} or
+{{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}), with the
+aim of automatically scaling the workload to match demand.
 
-This document walks you through an example of enabling Horizontal Pod Autoscaler for the php-apache server.
-For more information on how Horizontal Pod Autoscaler behaves, see the
-[Horizontal Pod Autoscaler user guide](/docs/tasks/run-application/horizontal-pod-autoscale/).
+Horizontal scaling means that the response to increased load is to deploy more
+{{< glossary_tooltip text="Pods" term_id="pod" >}}.
+This is different from _vertical_ scaling, which for Kubernetes would mean
+assigning more resources (for example: memory or CPU) to the Pods that are already
+running for the workload.
+
+If the load decreases, and the number of Pods is above the configured minimum,
+the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet,
+or other similar resource) to scale back down.
+
+This document walks you through an example of enabling HorizontalPodAutoscaler to
+automatically manage scale for an example web app. This example workload is Apache
+httpd running some PHP code.
 
 ## {{% heading "prerequisites" %}}
 
-This example requires a running Kubernetes cluster and kubectl, version 1.2 or later.
-[Metrics server](https://github.com/kubernetes-sigs/metrics-server) monitoring needs to be deployed
-in the cluster to provide metrics through the [Metrics API](https://github.com/kubernetes/metrics).
-Horizontal Pod Autoscaler uses this API to collect metrics. To learn how to deploy the metrics-server,
-see the [metrics-server documentation](https://github.com/kubernetes-sigs/metrics-server#deployment).
+{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} If you're running an older
+release of Kubernetes, refer to the version of the documentation for that release (see
+[available documentation versions](/docs/home/supported-doc-versions/)).
+
+To follow this walkthrough, you also need to use a cluster that has a
+[Metrics Server](https://github.com/kubernetes-sigs/metrics-server#readme) deployed and configured.
+The Kubernetes Metrics Server collects resource metrics from
+the {{< glossary_tooltip term_id="kubelet" text="kubelets" >}} in your cluster, and exposes those metrics
+through the [Kubernetes API](/docs/concepts/overview/kubernetes-api/),
+using an [APIService](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) to add
+new kinds of resource that represent metric readings.
 
-To specify multiple resource metrics for a Horizontal Pod Autoscaler, you must have a
-Kubernetes cluster and kubectl at version 1.6 or later. To make use of custom metrics, your cluster
-must be able to communicate with the API server providing the custom Metrics API.
-Finally, to use metrics not related to any Kubernetes object you must have a
-Kubernetes cluster at version 1.10 or later, and you must be able to communicate
-with the API server that provides the external Metrics API.
-See the [Horizontal Pod Autoscaler user guide](/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-custom-metrics) for more details.
+To learn how to deploy the Metrics Server, see the
+[metrics-server documentation](https://github.com/kubernetes-sigs/metrics-server#deployment).
 
 <!-- steps -->
 
 ## Run and expose php-apache server
 
-To demonstrate Horizontal Pod Autoscaler we will use a custom docker image based on the php-apache image. The Dockerfile has the following content:
+To demonstrate a HorizontalPodAutoscaler, you will first make a custom container image that uses
+the `php-apache` image from Docker Hub as its starting point. The `Dockerfile` is ready-made for you,
+and has the following content:
 
 ```dockerfile
 FROM php:5-apache
 COPY index.php /var/www/html/index.php
 RUN chmod a+rx index.php
 ```
 
-It defines an index.php page which performs some CPU intensive computations:
+This code defines a simple `index.php` page that performs some CPU intensive computations,
+in order to simulate load in your cluster.
 
 ```php
 <?php
@@ -59,12 +77,13 @@ It defines an index.php page which performs some CPU intensive computations:
 ?>
 ```
 
-First, we will start a deployment running the image and expose it as a service
-using the following configuration:
+Once you have made that container image, start a Deployment that runs a container using the
+image you made, and expose it as a {{< glossary_tooltip term_id="service" >}}
+using the following manifest:
 
 {{< codenew file="application/php-apache.yaml" >}}
 
-Run the following command:
+To do so, run the following command:
 
 ```shell
 kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
@@ -75,16 +94,27 @@ deployment.apps/php-apache created
 service/php-apache created
 ```
 
-## Create Horizontal Pod Autoscaler
+## Create the HorizontalPodAutoscaler {#create-horizontal-pod-autoscaler}
+
+Now that the server is running, create the autoscaler using `kubectl`. There is a
+[`kubectl autoscale`](/docs/reference/generated/kubectl/kubectl-commands#autoscale) subcommand,
+part of `kubectl`, that helps you do this.
+
+You will shortly run a command that creates a HorizontalPodAutoscaler that maintains
+between 1 and 10 replicas of the Pods controlled by the php-apache Deployment that
+you created in the first step of these instructions.
+
+Roughly speaking, the HPA {{< glossary_tooltip text="controller" term_id="controller" >}} will increase and decrease
+the number of replicas (by updating the Deployment) to maintain an average CPU utilization across all Pods of 50%.
+The Deployment then updates the ReplicaSet - this is part of how all Deployments work in Kubernetes -
+and then the ReplicaSet either adds or removes Pods based on the change to its `.spec`.
 
-Now that the server is running, we will create the autoscaler using
-[kubectl autoscale](/docs/reference/generated/kubectl/kubectl-commands#autoscale).
-The following command will create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods
-controlled by the php-apache deployment we created in the first step of these instructions.
-Roughly speaking, HPA will increase and decrease the number of replicas
-(via the deployment) to maintain an average CPU utilization across all Pods of 50%.
 Since each pod requests 200 milli-cores by `kubectl run`, this means an average CPU usage of 100 milli-cores.
-See [here](/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) for more details on the algorithm.
+See [Algorithm details](/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) for more details
+on the algorithm.
+
+Create the HorizontalPodAutoscaler:
 
 ```shell
 kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
@@ -94,47 +124,64 @@ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
 horizontalpodautoscaler.autoscaling/php-apache autoscaled
 ```
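
Editor's note: the scaling behaviour the revised text describes follows the formula documented under the linked "Algorithm details" page: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The following is a minimal sketch of that arithmetic using the numbers that appear later in this walkthrough; the variable names are illustrative and not part of any Kubernetes API:

```shell
# HPA core formula: desiredReplicas = ceil(currentReplicas * current / target)
current_replicas=1
current_cpu=305   # observed average CPU utilization, as a percent of the Pod's request
target_cpu=50     # target utilization, set via --cpu-percent=50
# Integer ceiling division: ceil(a / b) == (a + b - 1) / b for positive integers
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "$desired"   # 7
```

Since each Pod requests 200 milli-cores, the 50% target corresponds to an average usage of 100 milli-cores per Pod, which is how the percentage targets relate to absolute CPU amounts.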
 
-We may check the current status of autoscaler by running:
+You can check the current status of the newly-made HorizontalPodAutoscaler by running:
 
 ```shell
+# You can use "hpa" or "horizontalpodautoscaler"; either name works OK.
 kubectl get hpa
 ```
 
+The output is similar to:
 ```
 NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
 php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          18s
 ```
 
-Please note that the current CPU consumption is 0% as we are not sending any requests to the server
-(the ``TARGET`` column shows the average across all the pods controlled by the corresponding deployment).
+(If you see other HorizontalPodAutoscalers with different names, they already existed;
+that isn't usually a problem.)
+
+Please note that the current CPU consumption is 0% as there are no clients sending requests to the server
+(the ``TARGET`` column shows the average across all the Pods controlled by the corresponding deployment).
 
-## Increase load
+## Increase the load {#increase-load}
 
-Now, we will see how the autoscaler reacts to increased load.
-We will start a container, and send an infinite loop of queries to the php-apache service (please run it in a different terminal):
+Next, see how the autoscaler reacts to increased load.
+To do this, you'll start a different Pod to act as a client. The container within the client Pod
+runs in an infinite loop, sending queries to the php-apache service.
 
 ```shell
+# Run this in a separate terminal
+# so that the load generation continues and you can carry on with the rest of the steps
 kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
 ```
 
-Within a minute or so, we should see the higher CPU load by executing:
-
+Now run:
 ```shell
-kubectl get hpa
+# type Ctrl+C to end the watch when you're ready
+kubectl get hpa php-apache --watch
 ```
 
+Within a minute or so, you should see the higher CPU load; for example:
+
 ```
 NAME         REFERENCE                     TARGET      MINPODS   MAXPODS   REPLICAS   AGE
 php-apache   Deployment/php-apache/scale   305% / 50%  1         10        1          3m
 ```
 
+and then, more replicas. For example:
+```
+NAME         REFERENCE                     TARGET      MINPODS   MAXPODS   REPLICAS   AGE
+php-apache   Deployment/php-apache/scale   305% / 50%  1         10        7          3m
+```
+
 Here, CPU consumption has increased to 305% of the request.
-As a result, the deployment was resized to 7 replicas:
+As a result, the Deployment was resized to 7 replicas:
 
 ```shell
 kubectl get deployment php-apache
 ```
 
+You should see the replica count matching the figure from the HorizontalPodAutoscaler:
 ```
 NAME         READY   UP-TO-DATE   AVAILABLE   AGE
 php-apache   7/7     7            7           19m
@@ -146,24 +193,29 @@ of load is not controlled in any way it may happen that the final number of repl
 will differ from this example.
 {{< /note >}}
 
-## Stop load
+## Stop generating load {#stop-load}
 
-We will finish our example by stopping the user load.
+To finish the example, stop sending the load.
 
-In the terminal where we created the container with `busybox` image, terminate
+In the terminal where you created the Pod that runs a `busybox` image, terminate
 the load generation by typing `<Ctrl> + C`.
 
-Then we will verify the result state (after a minute or so):
+Then verify the result state (after a minute or so):
 
 ```shell
-kubectl get hpa
+# type Ctrl+C to end the watch when you're ready
+kubectl get hpa php-apache --watch
 ```
 
+The output is similar to:
+
 ```
 NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
 php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          11m
 ```
 
+and the Deployment also shows that it has scaled down:
+
 ```shell
 kubectl get deployment php-apache
 ```
@@ -173,11 +225,9 @@ NAME READY UP-TO-DATE AVAILABLE AGE
 php-apache   1/1     1            1           27m
 ```
 
-Here CPU utilization dropped to 0, and so HPA autoscaled the number of replicas back down to 1.
+Once CPU utilization dropped to 0, the HPA automatically scaled the number of replicas back down to 1.
 
-{{< note >}}
 Autoscaling the replicas may take a few minutes.
-{{< /note >}}
 
 <!-- discussion -->
 
@@ -444,7 +494,7 @@ Conditions:
 Events:
 ```
 
-For this HorizontalPodAutoscaler, we can see several conditions in a healthy state. The first,
+For this HorizontalPodAutoscaler, you can see several conditions in a healthy state. The first,
 `AbleToScale`, indicates whether or not the HPA is able to fetch and update scales, as well as
 whether or not any backoff-related conditions would prevent scaling. The second, `ScalingActive`,
 indicates whether or not the HPA is enabled (i.e. the replica count of the target is not zero) and
@@ -454,7 +504,7 @@ was capped by the maximum or minimum of the HorizontalPodAutoscaler. This is an
 you may wish to raise or lower the minimum or maximum replica count constraints on your
 HorizontalPodAutoscaler.
 
-## Appendix: Quantities
+## Quantities
 
 All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using
 a special whole-number notation known in Kubernetes as a
@@ -464,16 +514,16 @@ will return whole numbers without a suffix when possible, and will generally ret
 quantities in milli-units otherwise. This means you might see your metric value fluctuate
 between `1` and `1500m`, or `1` and `1.5` when written in decimal notation.
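
Editor's note: as a small illustration of the milli-unit notation described above (the `m` suffix simply means thousandths):

```shell
# A Kubernetes quantity of 1500m means 1500 thousandths,
# i.e. the same value as 1.5 written in decimal notation.
millis=1500
decimal=$(awk -v m="$millis" 'BEGIN { printf "%.1f", m / 1000 }')
echo "${millis}m == ${decimal}"   # 1500m == 1.5
```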
 
-## Appendix: Other possible scenarios
+## Other possible scenarios
 
 ### Creating the autoscaler declaratively
 
 Instead of using `kubectl autoscale` command to create a HorizontalPodAutoscaler imperatively we
-can use the following file to create it declaratively:
+can use the following manifest to create it declaratively:
 
 {{< codenew file="application/hpa/php-apache.yaml" >}}
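
Editor's note: the referenced `application/hpa/php-apache.yaml` file is not shown in this diff. A manifest along the following lines expresses the same autoscaler as the earlier `kubectl autoscale` command; this is a sketch using the stable `autoscaling/v1` API, not necessarily the exact content of the linked example file:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    # The workload resource the HPA scales; matches the Deployment created earlier
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  # Equivalent to --cpu-percent=50
  targetCPUUtilizationPercentage: 50
```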
 
-We will create the autoscaler by executing the following command:
+Then, create the autoscaler by executing the following command:
 
 ```shell
 kubectl create -f https://k8s.io/examples/application/hpa/php-apache.yaml
