Improve docs for HorizontalPodAutoscaler #30711

Merged 1 commit on Dec 4, 2021
---
reviewers:
- jszczepkowski
- justinsb
- directxman12
title: HorizontalPodAutoscaler Walkthrough
content_type: task
weight: 100
min-kubernetes-server-version: 1.23
---

<!-- overview -->

A [HorizontalPodAutoscaler](/docs/tasks/run-application/horizontal-pod-autoscale/)
(HPA for short)
automatically updates a workload resource (such as
a {{< glossary_tooltip text="Deployment" term_id="deployment" >}} or
{{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}), with the
aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more
{{< glossary_tooltip text="Pods" term_id="pod" >}}.
This is different from _vertical_ scaling, which for Kubernetes would mean
assigning more resources (for example: memory or CPU) to the Pods that are already
running for the workload.

If the load decreases, and the number of Pods is above the configured minimum,
the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet,
or other similar resource) to scale back down.
This document walks you through an example of enabling HorizontalPodAutoscaler to
automatically manage scale for an example web app. This example workload is Apache
httpd running some PHP code.

## {{% heading "prerequisites" %}}

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} If you're running an older
release of Kubernetes, refer to the version of the documentation for that release (see
[available documentation versions](/docs/home/supported-doc-versions/)).

To follow this walkthrough, you also need to use a cluster that has a
[Metrics Server](https://github.com/kubernetes-sigs/metrics-server#readme) deployed and configured.
The Kubernetes Metrics Server collects resource metrics from
the {{<glossary_tooltip term_id="kubelet" text="kubelets">}} in your cluster, and exposes those metrics
through the [Kubernetes API](/docs/concepts/overview/kubernetes-api/),
using an [APIService](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) to add
new kinds of resource that represent metric readings.

To learn how to deploy the Metrics Server, see the
[metrics-server documentation](https://github.com/kubernetes-sigs/metrics-server#deployment).

<!-- steps -->

## Run and expose php-apache server

To demonstrate a HorizontalPodAutoscaler, you will first make a custom container image that uses
the `php-apache` image from Docker Hub as its starting point. The `Dockerfile` is ready-made for you,
and has the following content:

```dockerfile
FROM php:5-apache
COPY index.php /var/www/html/index.php
RUN chmod a+rx index.php
```

It defines an index.php page which performs some CPU intensive computations:
This code defines a simple `index.php` page that performs some CPU intensive computations,
in order to simulate load in your cluster.

```php
<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>
```

Once you have made that container image, start a Deployment that runs a container using the
image you made, and expose it as a {{< glossary_tooltip term_id="service">}}
using the following manifest:

{{< codenew file="application/php-apache.yaml" >}}

To do so, run the following command:

```shell
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
```

The output is similar to:

```
deployment.apps/php-apache created
service/php-apache created
```

## Create the HorizontalPodAutoscaler {#create-horizontal-pod-autoscaler}

Now that the server is running, create the autoscaler using `kubectl`. The
[`kubectl autoscale`](/docs/reference/generated/kubectl/kubectl-commands#autoscale) subcommand,
part of `kubectl`, helps you do this.
You will shortly run a command that creates a HorizontalPodAutoscaler that maintains
between 1 and 10 replicas of the Pods controlled by the php-apache Deployment that
you created in the first step of these instructions.

Roughly speaking, the HPA {{<glossary_tooltip text="controller" term_id="controller">}} will increase and decrease
the number of replicas (by updating the Deployment) to maintain an average CPU utilization across all Pods of 50%.
The Deployment then updates the ReplicaSet - this is part of how all Deployments work in Kubernetes -
and then the ReplicaSet either adds or removes Pods based on the change to its `.spec`.
See [Algorithm details](/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) for more details
on the algorithm.
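
As a quick sketch of the core arithmetic behind that rule (not the controller itself, which
also handles tolerances, Pod readiness, and missing metrics), using figures that appear later
in this walkthrough:

```shell
# Sketch of the HPA ratio calculation from "Algorithm details":
#   desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )
# With 1 replica at 305% average CPU against a 50% target:
awk -v current=1 -v usage=305 -v target=50 'BEGIN {
  d = current * usage / target    # 6.1
  if (d > int(d)) d = int(d) + 1  # take the ceiling
  print d                         # prints 7
}'
```

This matches the scale-up to 7 replicas you will observe below.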


Create the HorizontalPodAutoscaler:

```shell
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
```

The output is similar to:

```
horizontalpodautoscaler.autoscaling/php-apache autoscaled
```

You can check the current status of the newly-made HorizontalPodAutoscaler, by running:

```shell
# You can use "hpa" or "horizontalpodautoscaler"; either name will work.
kubectl get hpa
```

The output is similar to:
```
NAME REFERENCE TARGET MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache/scale 0% / 50% 1 10 1 18s
```

(if you see other HorizontalPodAutoscalers with different names, that means they already existed,
and that isn't usually a problem).

Please note that the current CPU consumption is 0% as there are no clients sending requests to the server
(the ``TARGET`` column shows the average across all the Pods controlled by the corresponding deployment).
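
If you want to script against that output, one sketch (assuming the column layout shown
above; for real automation, structured output such as `-o jsonpath` is more robust) is:

```shell
# Extract the current utilization from a captured `kubectl get hpa` data line.
# The TARGET column reads like "0% / 50%": current value first, then the target.
line='php-apache   Deployment/php-apache/scale   0% / 50%   1         10        1          18s'
current=$(printf '%s\n' "$line" | awk '{ gsub(/%/, "", $3); print $3 }')
echo "$current"   # prints 0
```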

## Increase the load {#increase-load}

Next, see how the autoscaler reacts to increased load.
To do this, you'll start a different Pod to act as a client. The container within the client Pod
runs in an infinite loop, sending queries to the php-apache service.

```shell
# Run this in a separate terminal
# so that the load generation continues and you can carry on with the rest of the steps
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
```


Now run:
```shell
# type Ctrl+C to end the watch when you're ready
kubectl get hpa php-apache --watch
```

Within a minute or so, you should see the higher CPU load; for example:

```
NAME REFERENCE TARGET MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache/scale 305% / 50% 1 10 1 3m
```

and then, more replicas. For example:
```
NAME REFERENCE TARGET MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache/scale 305% / 50% 1 10 7 3m
```

Here, CPU consumption has increased to 305% of the request.
As a result, the Deployment was resized to 7 replicas:
```shell
kubectl get deployment php-apache
```

You should see the replica count matching the figure from the HorizontalPodAutoscaler:
```
NAME READY UP-TO-DATE AVAILABLE AGE
php-apache 7/7 7 7 19m
```

{{< note >}}
It may take a few minutes to stabilize the number of replicas. Since the amount
of load is not controlled in any way it may happen that the final number of replicas
will differ from this example.
{{< /note >}}

## Stop generating load {#stop-load}

To finish the example, stop sending the load.

In the terminal where you created the Pod that runs a `busybox` image, terminate
the load generation by typing `<Ctrl> + C`.

Then verify the result state (after a minute or so):

```shell
# type Ctrl+C to end the watch when you're ready
kubectl get hpa php-apache --watch
```

The output is similar to:

```
NAME REFERENCE TARGET MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache/scale 0% / 50% 1 10 1 11m
```

and the Deployment also shows that it has scaled down:

```shell
kubectl get deployment php-apache
```
The output is similar to:

```
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache 1/1 1 1 27m
```

Once CPU utilization dropped to 0, the HPA automatically scaled the number of replicas back down to 1.
{{< note >}}
Autoscaling the replicas may take a few minutes.
{{< /note >}}
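
Part of why scale-down is gradual is the HPA controller's downscale stabilization window
(5 minutes by default). With the `autoscaling/v2` API you can tune this through the
`spec.behavior` field. The values below are examples for experimentation only; check the
HorizontalPodAutoscaler API reference for your cluster version:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      # Wait only 60 seconds (rather than the default 300) before acting
      # on a lower replica recommendation.
      stabilizationWindowSeconds: 60
```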

<!-- discussion -->

```
Conditions:
Events:
```

For this HorizontalPodAutoscaler, you can see several conditions in a healthy state. The first,
`AbleToScale`, indicates whether or not the HPA is able to fetch and update scales, as well as
whether or not any backoff-related conditions would prevent scaling. The second, `ScalingActive`,
indicates whether or not the HPA is enabled (i.e. the replica count of the target is not zero) and
is able to calculate desired scales. When it is `False`, it generally indicates problems with
fetching metrics. Finally, the last condition, `ScalingLimited`, indicates that the desired scale
was capped by the maximum or minimum of the HorizontalPodAutoscaler. This is an indication that
you may wish to raise or lower the minimum or maximum replica count constraints on your
HorizontalPodAutoscaler.
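
If you want to script a health check against these conditions, one sketch is below. It assumes
you have already captured the conditions as JSON (for example via
`kubectl get hpa php-apache -o jsonpath='{.status.conditions}'`); the sample here is simplified,
since real condition objects carry additional fields such as `reason` and `message`:

```shell
# Simplified sample of HPA status conditions (illustrative values only):
conditions='[{"type":"AbleToScale","status":"True"},{"type":"ScalingActive","status":"True"},{"type":"ScalingLimited","status":"False"}]'

# List any condition whose status is not "True".
printf '%s\n' "$conditions" \
  | tr '}' '\n' \
  | grep -o '"type":"[^"]*","status":"[^"]*"' \
  | grep -v '"status":"True"'
```

With the sample above, this prints the single unhealthy-looking entry,
`"type":"ScalingLimited","status":"False"`.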

## Quantities

All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using
a special whole-number notation known in Kubernetes as a
{{< glossary_tooltip term_id="quantity" text="quantity" >}}. For example, `10500m` would be
written as `10.5` in decimal notation. The metrics APIs
will return whole numbers without a suffix when possible, and will generally return
quantities in milli-units otherwise. This means you might see your metric value fluctuate
between `1` and `1500m`, or `1` and `1.5` when written in decimal notation.
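
For illustration, here is a small sketch of the conversion (this helper is hypothetical, not
part of Kubernetes or `kubectl`, and handles only plain integers and the `m` milli-suffix,
which is what these metrics use):

```shell
# Convert a quantity that may use the "m" (milli) suffix to decimal notation.
to_decimal() {
  case "$1" in
    *m) awk -v v="${1%m}" 'BEGIN { printf "%g\n", v / 1000 }' ;;
    *)  printf '%s\n' "$1" ;;
  esac
}

to_decimal 1500m    # prints 1.5
to_decimal 10500m   # prints 10.5
to_decimal 1        # prints 1
```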

## Other possible scenarios

### Creating the autoscaler declaratively

Instead of using the `kubectl autoscale` command to create a HorizontalPodAutoscaler imperatively, you
can use the following manifest to create it declaratively:

{{< codenew file="application/hpa/php-apache.yaml" >}}

Then, create the autoscaler by executing the following command:

```shell
kubectl create -f https://k8s.io/examples/application/hpa/php-apache.yaml
```