Improve docs for HorizontalPodAutoscaler #30711
@@ -4,50 +4,68 @@ reviewers:
- jszczepkowski
- justinsb
- directxman12
title: HorizontalPodAutoscaler Walkthrough
content_type: task
weight: 100
min-kubernetes-server-version: 1.23
---

<!-- overview -->

A [HorizontalPodAutoscaler](/docs/tasks/run-application/horizontal-pod-autoscale/)
(HPA for short)
automatically updates a workload resource (such as
a {{< glossary_tooltip text="Deployment" term_id="deployment" >}} or
{{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}), with the
aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more
{{< glossary_tooltip text="Pods" term_id="pod" >}}.
This is different from _vertical_ scaling, which for Kubernetes would mean
assigning more resources (for example: memory or CPU) to the Pods that are already
running for the workload.

If the load decreases, and the number of Pods is above the configured minimum,
the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet,
or other similar resource) to scale back down.

This document walks you through an example of enabling HorizontalPodAutoscaler to
automatically manage scale for an example web app. This example workload is Apache
httpd running some PHP code.

## {{% heading "prerequisites" %}}

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} If you're running an older
release of Kubernetes, refer to the version of the documentation for that release (see
[available documentation versions](/docs/home/supported-doc-versions/)).

To follow this walkthrough, you also need to use a cluster that has a
[Metrics Server](https://github.com/kubernetes-sigs/metrics-server#readme) deployed and configured.
The Kubernetes Metrics Server collects resource metrics from
the {{<glossary_tooltip term_id="kubelet" text="kubelets">}} in your cluster, and exposes those metrics
through the [Kubernetes API](/docs/concepts/overview/kubernetes-api/),
using an [APIService](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) to add
new kinds of resource that represent metric readings.

To learn how to deploy the Metrics Server, see the
[metrics-server documentation](https://github.com/kubernetes-sigs/metrics-server#deployment).

<!-- steps -->

## Run and expose php-apache server

To demonstrate a HorizontalPodAutoscaler, you will first make a custom container image that uses
the `php-apache` image from Docker Hub as its starting point. The `Dockerfile` is ready-made for you,
and has the following content:

```dockerfile
FROM php:5-apache
COPY index.php /var/www/html/index.php
RUN chmod a+rx index.php
```

This code defines a simple `index.php` page that performs some CPU intensive computations,
in order to simulate load in your cluster.

```php
<?php

@@ -59,12 +77,13 @@ It defines an index.php page which performs some CPU intensive computations:
?>
```
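The body of `index.php` is elided in this diff view. As a rough, hypothetical analogue (in Python rather than PHP, with an invented function name), the kind of hot loop that keeps a CPU core busy looks something like this:

```python
from math import sqrt

def busy_work(iterations: int = 1_000_000) -> float:
    """Hypothetical analogue of the index.php hot loop: repeated
    square-root arithmetic keeps a CPU core busy so the autoscaler
    has real load to measure."""
    x = 0.0001
    for _ in range(iterations):
        x += sqrt(x)
    return x
```

Any tight arithmetic loop works for this purpose; the point is only that each request consumes measurable CPU.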

Once you have made that container image, start a Deployment that runs a container using the
image you made, and expose it as a {{< glossary_tooltip term_id="service">}}
using the following manifest:

{{< codenew file="application/php-apache.yaml" >}}

To do so, run the following command:

```shell
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
```

@@ -75,16 +94,27 @@
```
deployment.apps/php-apache created
service/php-apache created
```

## Create the HorizontalPodAutoscaler {#create-horizontal-pod-autoscaler}

Now that the server is running, create the autoscaler using `kubectl`. There is a
[`kubectl autoscale`](/docs/reference/generated/kubectl/kubectl-commands#autoscale) subcommand,
part of `kubectl`, that helps you do this.

You will shortly run a command that creates a HorizontalPodAutoscaler that maintains
between 1 and 10 replicas of the Pods controlled by the php-apache Deployment that
you created in the first step of these instructions.

Roughly speaking, the HPA {{<glossary_tooltip text="controller" term_id="controller">}} will increase and decrease
the number of replicas (by updating the Deployment) to maintain an average CPU utilization across all Pods of 50%.
The Deployment then updates the ReplicaSet (this is part of how all Deployments work in Kubernetes),
and then the ReplicaSet either adds or removes Pods based on the change to its `.spec`.

Since each pod requests 200 milli-cores (as set in the Deployment's manifest), an average
CPU utilization of 50% means an average CPU usage of 100 milli-cores.

See [Algorithm details](/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) for more details
on the algorithm.
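As a sketch of the proportional rule described in Algorithm details (the function name here is illustrative, not part of any Kubernetes API):

```python
import math

def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float) -> int:
    # The HPA scales in proportion to how far the observed metric
    # is from the target, rounding up.
    return math.ceil(current_replicas * (current_value / target_value))

# 4 replicas averaging 100% CPU against a 50% target: scale to 8.
print(desired_replicas(4, 100, 50))  # → 8
```

When the observed value equals the target, the ratio is 1 and the replica count is left unchanged.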
Create the HorizontalPodAutoscaler:

```shell
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
```

@@ -94,47 +124,64 @@
```
horizontalpodautoscaler.autoscaling/php-apache autoscaled
```

You can check the current status of the newly-made HorizontalPodAutoscaler, by running:

```shell
# You can use "hpa" or "horizontalpodautoscaler"; either name works OK.
kubectl get hpa
```

The output is similar to:
```
NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          18s
```

(If you see other HorizontalPodAutoscalers with different names, that means they already existed,
and this isn't usually a problem.)

Please note that the current CPU consumption is 0% as there are no clients sending requests to the server
(the ``TARGET`` column shows the average across all the Pods controlled by the corresponding deployment).
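To make the `TARGET` column concrete, here is a small hypothetical sketch (not the controller's actual code) of how an average-utilization percentage is derived from per-pod CPU usage against the 200 milli-core request each pod makes:

```python
def average_utilization(usage_milli_per_pod, request_milli=200):
    # Average each pod's usage over its CPU request, as a percentage.
    ratios = [usage / request_milli for usage in usage_milli_per_pod]
    return round(100 * sum(ratios) / len(ratios))

# No client traffic: every pod sits at ~0m used, so TARGET reads 0%.
print(average_utilization([0]))  # → 0
```

With two pods each using 100m against a 200m request, the same calculation yields 50%, exactly on target.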

## Increase the load {#increase-load}

Next, see how the autoscaler reacts to increased load.
To do this, you'll start a different Pod to act as a client. The container within the client Pod
runs in an infinite loop, sending queries to the php-apache service.

```shell
# Run this in a separate terminal
# so that the load generation continues and you can carry on with the rest of the steps
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
```

Now run:

```shell
# type Ctrl+C to end the watch when you're ready
kubectl get hpa php-apache --watch
```

Within a minute or so, you should see the higher CPU load; for example:

```
NAME         REFERENCE                     TARGET      MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   305% / 50%  1         10        1          3m
```

and then, more replicas. For example:
```
NAME         REFERENCE                     TARGET      MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   305% / 50%  1         10        7          3m
```

Here, CPU consumption has increased to 305% of the request.
As a result, the Deployment was resized to 7 replicas:

```shell
kubectl get deployment php-apache
```

You should see the replica count matching the figure from the HorizontalPodAutoscaler:
```
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   7/7     7            7           19m
```

@@ -146,24 +193,29 @@ of load is not controlled in any way it may happen that the final number of replicas
will differ from this example.
{{< /note >}}

## Stop generating load {#stop-load}

To finish the example, stop sending the load.

In the terminal where you created the Pod that runs a `busybox` image, terminate
the load generation by typing `<Ctrl> + C`.

Then verify the result state (after a minute or so):

```shell
# type Ctrl+C to end the watch when you're ready
kubectl get hpa php-apache --watch
```

The output is similar to:

```
NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          11m
```
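Note that the replica count returns to 1 rather than 0: the result of the proportional rule is clamped to the `--min`/`--max` bounds you configured. A sketch of that clamping (illustrative function name, assuming the `--min=1 --max=10` bounds from earlier):

```python
import math

def clamped_replicas(current: int, current_value: float,
                     target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    # Raw proportional result, then clamped into the configured range.
    desired = math.ceil(current * (current_value / target_value))
    return max(min_replicas, min(max_replicas, desired))

# 0% utilization would suggest 0 replicas, but --min=1 keeps one Pod.
print(clamped_replicas(7, 0, 50))  # → 1
```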

and the Deployment also shows that it has scaled down:

```shell
kubectl get deployment php-apache
```

@@ -173,11 +225,9 @@
```
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   1/1     1            1           27m
```

Once CPU utilization dropped to 0, the HPA automatically scaled the number of replicas back down to 1.

{{< note >}}
Autoscaling the replicas may take a few minutes.
{{< /note >}}

<!-- discussion -->

@@ -444,7 +494,7 @@ Conditions:
Events:
```

For this HorizontalPodAutoscaler, you can see several conditions in a healthy state. The first,
`AbleToScale`, indicates whether or not the HPA is able to fetch and update scales, as well as
whether or not any backoff-related conditions would prevent scaling. The second, `ScalingActive`,
indicates whether or not the HPA is enabled (i.e. the replica count of the target is not zero) and

@@ -454,7 +504,7 @@ was capped by the maximum or minimum of the HorizontalPodAutoscaler. This is an
indication that you may wish to raise or lower the minimum or maximum replica count constraints on your
HorizontalPodAutoscaler.

## Quantities

All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using
a special whole-number notation known in Kubernetes as a quantity.

@@ -464,16 +514,16 @@ will return whole numbers without a suffix when possible, and will generally return
quantities in milli-units otherwise. This means you might see your metric value fluctuate
between `1` and `1500m`, or `1` and `1.5` when written in decimal notation.
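The `m` suffix denotes thousandths, so `1500m` and `1.5` name the same value. A small illustrative parser (not the canonical Kubernetes implementation, which handles more suffixes) makes the equivalence concrete:

```python
def parse_quantity(q: str) -> float:
    # Handle only the plain and milli ("m") forms discussed above.
    if q.endswith("m"):
        return int(q[:-1]) / 1000
    return float(q)

print(parse_quantity("1500m") == parse_quantity("1.5"))  # → True
```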

## Other possible scenarios

### Creating the autoscaler declaratively

Instead of using the `kubectl autoscale` command to create a HorizontalPodAutoscaler imperatively, you
can use the following manifest to create it declaratively:

{{< codenew file="application/hpa/php-apache.yaml" >}}

Then, create the autoscaler by executing the following command:

```shell
kubectl create -f https://k8s.io/examples/application/hpa/php-apache.yaml
```