Install Metrics Server by running the following command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Installation output:
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Check the Metrics Server deployment in YAML output:
kubectl -n kube-system get deployment metrics-server -o yaml

We will patch the deployment to have the following settings:
- Add the --kubelet-insecure-tls argument to the container's args. This skips verification of the CA of the kubelet serving certificates and is intended for test clusters only.
- Change the container port from 10250 to 4443.
- Add hostNetwork: true.
kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/hostNetwork",
    "value": true
  },
  {
    "op": "replace",
    "path": "/spec/template/spec/containers/0/args",
    "value": [
      "--cert-dir=/tmp",
      "--secure-port=4443",
      "--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname",
      "--kubelet-use-node-status-port",
      "--metric-resolution=15s",
      "--kubelet-insecure-tls"
    ]
  },
  {
    "op": "replace",
    "path": "/spec/template/spec/containers/0/ports/0/containerPort",
    "value": 4443
  }
]'

Check update output:
deployment.apps/metrics-server patched

After a few seconds the pod status should be running and active:
kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-5478cc86f5-2h247   1/1     Running   0          2m25s

Check the metrics API status:
kubectl get apiservices -l k8s-app=metrics-server
NAME                     SERVICE                      AVAILABLE   AGE
v1beta1.metrics.k8s.io   kube-system/metrics-server   True        79m

Test that Metrics Server is working by checking the utilization of the Kubernetes nodes:
kubectl top nodes
NAME                    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
desktop-control-plane   192m         0%       584Mi           3%
desktop-worker          26m          0%       130Mi           0%
desktop-worker2         34m          0%       159Mi           1%

Run and expose php-apache server
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
deployment.apps/php-apache created
service/php-apache created

Create the HorizontalPodAutoscaler
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
Flag --cpu-percent has been deprecated, Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).
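horizontalpodautoscaler.autoscaling/php-apache autoscaled

The deprecation warning does not prevent the autoscaler from being created. As the warning suggests, newer kubectl versions expect --cpu=50% in place of --cpu-percent=50.

Check the current status of the newly created HorizontalPodAutoscaler by running the following command (you can use "hpa" or "horizontalpodautoscaler"; either name works):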
kubectl get hpa
NAME         REFERENCE               TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 0%/50%   1         10        1          48s

If you see other HorizontalPodAutoscalers with different names, it means they already existed and this is usually not an issue.
Note that the current CPU utilization is 0% because no clients are sending requests to the server. The TARGETS column shows the average CPU utilization across all Pods managed by the corresponding Deployment, followed by the configured target (50%).
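
The kubectl autoscale command above can also be written as a manifest, which is easier to review and keep in version control. The following is a minimal sketch, assuming the cluster serves the autoscaling/v2 API; the file name php-apache-hpa.yaml is a placeholder chosen for this example and does not come from the walkthrough.

```shell
# Write an HPA manifest equivalent to:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# The file name is a placeholder for this example.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

# Apply it against a reachable cluster when ready:
#   kubectl apply -f php-apache-hpa.yaml
```

Use the manifest instead of the kubectl autoscale command rather than in addition to it, since both create an HPA named php-apache.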
Let’s increase the load
Next, observe how the autoscaler responds to increased load. To do this, you will start a separate Pod that acts as a client. The container inside this client Pod runs an infinite loop, continuously sending requests to the php-apache service.
Note: Run this in a separate terminal so that load generation continues while you carry on with the rest of the steps.
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
All commands and output from this session will be recorded in container logs, including credentials and sensitive information passed through the command prompt.
If you don't see a command prompt, try pressing enter.
OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OKNow run: (type Ctrl+C to end the watch when you’re ready)
kubectl get hpa php-apache --watch
NAME         REFERENCE               TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 0%/50%   1         10        1          113s

Once the load builds up, you should see the higher CPU load; for example:
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 129%/50%   1         10        5          18m

and then more replicas; for example:
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 129%/50%   1         10        5          18m
php-apache   Deployment/php-apache   cpu: 59%/50%    1         10        5          18m
php-apache   Deployment/php-apache   cpu: 66%/50%    1         10        5          18m
php-apache   Deployment/php-apache   cpu: 59%/50%    1         10        7          19m

Here, CPU consumption rose to 129% of the request, well above the 50% target. As a result, the Deployment was scaled up, first to 5 and then to 7 replicas.

The Deployment's replica count should match the figure reported by the HorizontalPodAutoscaler:
kubectl get deployment php-apache
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   7/7     7            7           23m

Note: It may take a few minutes for the number of replicas to stabilize. Because the load is not explicitly controlled, the final replica count may differ from the example shown.
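
The replica counts in the watch output follow from the scaling rule described in the Kubernetes documentation: the desired count is the current replica count multiplied by the ratio of current to target utilization, rounded up and clamped to the configured minimum and maximum. The sketch below reproduces only that arithmetic. The function name desired_replicas is hypothetical, and the sketch deliberately ignores the controller's tolerance band, stabilization windows, and handling of Pods that are not yet ready.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
# Hypothetical helper: ceil(current * util / target), clamped to [MIN, MAX].
# The body runs in a subshell so its variables do not leak into the caller.
desired_replicas() (
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling division.
  want=$(( (current * util + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
)

# 5 replicas at 66% against a 50% target, limits 1..10
desired_replicas 5 66 50 1 10   # prints 7
```

This matches the step in the example output where 5 replicas at 66% utilization are followed by 7 replicas.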
Stop generating load
To complete the example, stop sending traffic to the service.
In the terminal where you started the Pod running the BusyBox image, stop the load generation by pressing Ctrl + C.
Then, after about a minute, verify the resulting state:
kubectl get hpa php-apache --watch
NAME         REFERENCE               TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 0%/50%   1         10        1          45m

and the Deployment also shows that it has scaled down:
kubectl get deployment php-apache
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   1/1     1            1           50m

Once CPU utilization dropped to 0%, the HPA automatically scaled the Deployment back down to 1 replica. Note that autoscaling adjustments may take a few minutes to complete.
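
Instead of watching by hand, the scale-down can be polled from a script. The following is a minimal sketch: the function name wait_for_replicas and its default timeout and interval are hypothetical choices for this example. It reads the replica count the HPA reports in its status via a jsonpath query. The generous default timeout reflects the fact that the HPA controller, by default, waits through a 5-minute stabilization window before scaling down.

```shell
# wait_for_replicas HPA_NAME WANT [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: polls the HPA until it reports WANT replicas.
# Returns 0 on success and 1 on timeout. Runs in a subshell to keep
# its variables out of the caller's environment.
wait_for_replicas() (
  name=$1; want=$2; timeout=${3:-600}; interval=${4:-15}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # Replica count as last observed by the HPA controller.
    current=$(kubectl get hpa "$name" -o jsonpath='{.status.currentReplicas}')
    if [ "$current" = "$want" ]; then
      echo "$name is at $current replica(s) after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $name to reach $want replica(s)" >&2
  return 1
)

# Example (requires a reachable cluster):
#   wait_for_replicas php-apache 1
```

The elapsed time counts only the sleep intervals, so it is an approximation rather than a wall-clock measurement.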
Clean Up Resources
Once you have finished the demo, delete the resources:
kubectl delete deployment php-apache
kubectl delete service php-apache
kubectl delete hpa php-apache