
tests/k8s: fix kbs installation on Azure AKS #11164


Merged
merged 4 commits into kata-containers:main on
Apr 29, 2025

Conversation

wainersm
Contributor

The Azure AKS addon-http-application-routing add-on is deprecated and cannot be enabled on new clusters, which has caused some CI jobs to fail.

Migrated our code to use approuting instead. Unlike addon-http-application-routing, this add-on doesn't configure a managed cluster DNS zone, but the created ingress has a public IP. To avoid having to deal with DNS setup, we will be using that address from now on. Consequently, some functions that are no longer used have been deleted.

Fixes #11156
Signed-off-by: Wainer dos Santos Moschetta [email protected]
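For reference, enabling the new add-on on an existing cluster boils down to something like the following (a hedged sketch, not the exact CI change; the resource group and cluster names are placeholders):

$ az aks approuting enable --resource-group "${RESOURCE_GROUP}" --name "${CLUSTER_NAME}"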


This is how I tested it locally:

$ export KBS_INGRESS=aks
$ ./tests/integration/kubernetes/gha-run.sh create-cluster
<output omitted>
$ ./tests/integration/kubernetes/gha-run.sh get-cluster-credentials
<output omitted>
$ ./tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
<output omitted>
::endgroup::
/tmp/trustee/kbs/config/kubernetes/overlays /tmp/trustee/kbs/config/kubernetes ~/src/github.com/kata-containers/kata-containers
::group::/tmp/trustee/kbs/config/kubernetes/overlays/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kbs
  namespace: coco-tenant
  annotations:
    kubernetes.io/ingress.class: webapprouting.kubernetes.azure.com
spec:
  rules:
  - host: ""
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kbs
            port:
              number: 8080
::endgroup::
/tmp/trustee/kbs/config/kubernetes ~/src/github.com/kata-containers/kata-containers
::group::Deploy the KBS
namespace/coco-tenant created
configmap/kbs-config-d655m9k8gk created
configmap/policy-config-tf9h6d9bb2 created
secret/kbs-auth-public-key-t2f5fd6t89 created
secret/keys-gmf2g75547 created
service/kbs created
deployment.apps/kbs created
Warning: annotation "kubernetes.io/ingress.class" is deprecated, please use 'spec.ingressClassName' instead
ingress.networking.k8s.io/kbs created
~/src/github.com/kata-containers/kata-containers
::endgroup::
::group::Check the service healthy
pod/kbs-checker-767919 created
pod "kbs-checker-767919" deleted
KBS service respond to requests
::endgroup::
::group::Check the kbs service is exposed
Trying to connect at http://51.8.197.252:80. Timeout=350
KBS service respond to requests at http://51.8.197.252:80
::endgroup::

Regarding the warning about the deprecated "kubernetes.io/ingress.class" annotation ("please use 'spec.ingressClassName' instead"): I have a fix that I will send to trustee. It doesn't break the installation, so we don't need to bump the kbs version used here.
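For the record, the trustee-side fix presumably amounts to setting the class on the Ingress spec instead of the deprecated annotation. Something roughly equivalent can be tried against the manifest above with a patch (object names taken from that manifest; this is only an illustration):

$ kubectl patch ingress kbs -n coco-tenant --type=merge \
    -p '{"spec": {"ingressClassName": "webapprouting.kubernetes.azure.com"}}'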

@wainersm wainersm added the area/ci (Issues affecting the continuous integration) and area/testing (Issues related to testing) labels Apr 17, 2025
@wainersm wainersm requested review from sprt and stevenhorsman April 17, 2025 18:30
@katacontainersbot katacontainersbot added the size/medium (Average sized task) label Apr 17, 2025
@wainersm
Contributor Author

Hi @Xynnn007, let's see if this fixes things in the CI.

@wainersm
Contributor Author

ok, it still failed... I will change the code to wait for the IP address to show up.

@sprt
Contributor

sprt commented Apr 22, 2025

JFYI I merged #11157

@wainersm
Contributor Author

The latest CI execution showed that this fixed the kbs installation; however, the job got "red" because tests failed. So the code is ready for review.

In the meantime I will re-trigger the job to see whether the test failures were caused by infra or not.

@wainersm
Contributor Author

Rebased to main and squashed commits. Ready to review.

@wainersm
Contributor Author

KBS is installed correctly but tests are still failing. I cannot think of a reason why my changes would break the tests, so it seems to be some instability in the CI.

I suggest merging this PR and then keeping an eye on the next nightly jobs.

@sprt
Contributor

sprt commented Apr 23, 2025

I wouldn't be so quick to dismiss this as infra instability - I'm seeing tests failing consistently that I haven't seen before.

It seems like this is the common failure mode now:

https://github.com/kata-containers/kata-containers/actions/runs/14625338669/job/41036764485?pr=11164#step:17:5498

#   Warning  FailedScheduling  90s   default-scheduler  0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

Also this seems suspicious:

https://github.com/kata-containers/kata-containers/actions/runs/14625338669/job/41036764485?pr=11164#step:17:5498

app-routing-system   nginx-6564f49b47-c6xcs                          1/1     Running   0          23m
app-routing-system   nginx-6564f49b47-tdbd7                          0/1     Pending   0          23m
  • Two nginx pods in this app-routing-system namespace - are we creating these?
  • One of them is in the pending state

Is it possible that this PR is introducing new control plane containers that eat into our CPU limits?

@wainersm
Contributor Author

Hi @sprt !

I wouldn't be so quick to dismiss this as infra instability - I'm seeing tests failing consistently that I haven't seen before.

You are right! Thanks for the heads up!

It seems like this is the common failure mode now:

https://github.com/kata-containers/kata-containers/actions/runs/14625338669/job/41036764485?pr=11164#step:17:5498

#   Warning  FailedScheduling  90s   default-scheduler  0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

Also this seems suspicious:

https://github.com/kata-containers/kata-containers/actions/runs/14625338669/job/41036764485?pr=11164#step:17:5498

app-routing-system   nginx-6564f49b47-c6xcs                          1/1     Running   0          23m
app-routing-system   nginx-6564f49b47-tdbd7                          0/1     Pending   0          23m
* Two nginx pods in this app-routing-system namespace - are we creating these?

I followed the instructions in https://learn.microsoft.com/en-us/azure/aks/app-routing, and essentially I'm creating the ingress resource with ingressClassName set to webapprouting.kubernetes.azure.com, just as the docs show.

It creates nginx pods as a result, under the hood. I don't know why one of them is pending.

* One of them is in the pending state

Is it possible that this PR is introducing new control plane containers that eat into our CPU limits?

It is a sound hypothesis. I tested with the Azure subscription granted to me by Red Hat, and I didn't notice that problem. Well, I also didn't bother to run the tests... let me put this PR in draft and see if I can reproduce it locally. In the meantime, if you have any tips about the problem and solution, please talk to your little brother here ;)

@wainersm wainersm added the do-not-merge (PR has problems or depends on another) label Apr 23, 2025
@wainersm
Contributor Author

Ok, after further investigation my findings are:

  • the approuting add-on creates a managed nginx ingress controller in the app-routing-system namespace
  • it is deployed with an auto-scaler configured for a minimum of 2 replicas. That's why in tests/k8s: fix kbs installation on Azure AKS #11164 (comment) we see two nginx pods
    • I managed to set 1 replica by patching the controller:
    kubectl patch nginxingresscontroller/default -n app-routing-system --type=json -p='[{"op": "add", "path": "/spec/scaling/minReplicas", "value": 1}]'

    • More info about the controller can be found here
  • Each nginx pod requests 500m of CPU, which is too much for our cluster (single node). Reducing the replicas to one helps, but it's still too much. Here is the problem: I could not find a way to lower the CPU requirements via the controller. I can edit/patch the deployment/nginx object, however the controller re-creates it with the default settings (i.e. new pods still get 500m of CPU). A quick way to double-check these numbers is sketched at the end of this comment.

I'm trying to avoid deploying our own nginx ingress, but honestly I'm not seeing many alternatives. Ideas?
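If anyone wants to double-check the numbers, something like this should do it (a rough sketch; the deployment and namespace names are the ones the add-on creates, as seen above):

kubectl get deployment nginx -n app-routing-system -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}'
kubectl describe node | grep -A 7 'Allocated resources'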

sprt added a commit that referenced this pull request Apr 24, 2025
The AKS CLI recently introduced a regression that prevents using
aks-preview extensions (Azure/azure-cli#31345), and hence creating
CI clusters.

To address this, we temporarily hardcode the last known good version of
aks-preview.

Note that I removed the comment about this being a Mariner requirement,
as aks-preview is also a requirement of AKS App Routing, which will
be introduced soon in #11164.

Signed-off-by: Aurélien Bombo <[email protected]>
@sprt
Contributor

sprt commented Apr 24, 2025

(Heads up: changing any label will trigger a rerun of the CI, so we're seeing different results right now.)

  • Each nginx pod requests 500m of CPU, which is too much for our cluster (single node). Reducing the replicas to one helps, but it's still too much. Here is the problem: I could not find a way to lower the CPU requirements via the controller. I can edit/patch the deployment/nginx object, however the controller re-creates it with the default settings (i.e. new pods still get 500m of CPU)

I'm trying to avoid deploying our own nginx ingress, but honestly I'm not seeing many alternatives. Ideas?

If I'm understanding correctly, the added nginx pod maxes out the node CPU capacity. How about moving the affected tests to the normal VM SKU, instead of small?

@wainersm
Contributor Author

(Heads up: changing any label will trigger a rerun of the CI, so we're seeing different results right now.)

  • Each nginx pod requests 500m of CPU, which is too much for our cluster (single node). Reducing the replicas to one helps, but it's still too much. Here is the problem: I could not find a way to lower the CPU requirements via the controller. I can edit/patch the deployment/nginx object, however the controller re-creates it with the default settings (i.e. new pods still get 500m of CPU)

I'm trying to avoid deploying our own nginx ingress, but honestly I'm not seeing many alternatives. Ideas?

If I'm understanding correctly, the added nginx pod maxes out the node CPU capacity. How about moving the affected tests to the normal VM SKU, instead of small?

You got it correctly. Either we (somehow) lower the CPU requested by each nginx pod, or we increase the worker node size. I don't know how to lower the CPU requests of the Azure managed nginx ingress, so we can try to increase the VM SKU as you suggested, @sprt. How? Is it configured on the subscription?

@sprt
Contributor

sprt commented Apr 25, 2025

This is actually defined in the CI scripts below. The "normal" VM SKU has 4 vCPUs, the "small" one has 2.

I would:

  1. Decrease the number of nginx replicas to 1. (Assuming that, as you said, we can't reliably decrease the CPU requests)
  2. Move tests that are failing with Insufficient cpu from K8S_TEST_SMALL_HOST_UNION to K8S_TEST_NORMAL_HOST_UNION (a sketch of this follows the script excerpt below).
  3. Get a few passing runs to convince ourselves that this is a good solution.

K8S_TEST_SMALL_HOST_ATTESTATION_REQUIRED_UNION=( \
"k8s-guest-pull-image-encrypted.bats" \
"k8s-guest-pull-image-authenticated.bats" \
"k8s-guest-pull-image-signature.bats" \
"k8s-initdata.bats" \
"k8s-confidential-attestation.bats" \
)
K8S_TEST_SMALL_HOST_UNION=( \
"k8s-guest-pull-image.bats" \
"k8s-confidential.bats" \
"k8s-sealed-secret.bats" \
"k8s-attach-handlers.bats" \
"k8s-block-volume.bats" \
"k8s-caps.bats" \
"k8s-configmap.bats" \
"k8s-copy-file.bats" \
"k8s-cpu-ns.bats" \
"k8s-credentials-secrets.bats" \
"k8s-cron-job.bats" \
"k8s-custom-dns.bats" \
"k8s-empty-dirs.bats" \
"k8s-env.bats" \
"k8s-exec.bats" \
"k8s-file-volume.bats" \
"k8s-hostname.bats" \
"k8s-inotify.bats" \
"k8s-job.bats" \
"k8s-kill-all-process-in-container.bats" \
"k8s-limit-range.bats" \
"k8s-liveness-probes.bats" \
"k8s-measured-rootfs.bats" \
"k8s-memory.bats" \
"k8s-nested-configmap-secret.bats" \
"k8s-oom.bats" \
"k8s-optional-empty-configmap.bats" \
"k8s-optional-empty-secret.bats" \
"k8s-pid-ns.bats" \
"k8s-pod-quota.bats" \
"k8s-policy-hard-coded.bats" \
"k8s-policy-deployment.bats" \
"k8s-policy-job.bats" \
"k8s-policy-logs.bats" \
"k8s-policy-pod.bats" \
"k8s-policy-pvc.bats" \
"k8s-policy-rc.bats" \
"k8s-port-forward.bats" \
"k8s-projected-volume.bats" \
"k8s-qos-pods.bats" \
"k8s-replication.bats" \
"k8s-seccomp.bats" \
"k8s-sysctls.bats" \
"k8s-security-context.bats" \
"k8s-shared-volume.bats" \
"k8s-volume.bats" \
"k8s-nginx-connectivity.bats" \
)
K8S_TEST_NORMAL_HOST_UNION=( \
"k8s-number-cpus.bats" \
"k8s-parallel.bats" \
"k8s-sandbox-vcpus-allocation.bats" \
"k8s-scale-nginx.bats" \
)

function _print_instance_type() {
    case "${K8S_TEST_HOST_TYPE}" in
        small)
            echo "Standard_D2s_v5"
            ;;
        normal)
            echo "Standard_D4s_v5"
            ;;
        *)
            echo "Unknown instance type '${K8S_TEST_HOST_TYPE}'" >&2
            exit 1
    esac
}
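
As a purely hypothetical illustration of step 2 above, moving one of the affected suites (say, k8s-cpu-ns.bats) into the normal set would just mean relocating its entry:

K8S_TEST_NORMAL_HOST_UNION=( \
"k8s-number-cpus.bats" \
"k8s-parallel.bats" \
"k8s-sandbox-vcpus-allocation.bats" \
"k8s-scale-nginx.bats" \
"k8s-cpu-ns.bats" \
)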

The Azure AKS addon-http-application-routing add-on is deprecated and
cannot be enabled on new clusters, which has caused some CI jobs to fail.

Migrated our code to use approuting instead. Unlike
addon-http-application-routing, this add-on doesn't
configure a managed cluster DNS zone, but the created ingress has a
public IP. To avoid having to deal with DNS setup, we will be using that
address from now on. Thus, some functions that are no longer used have been deleted.

Fixes kata-containers#11156
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
An AKS managed ingress controller is used, which keeps two nginx pod
replicas, each requesting 500m of CPU. On small VMs like the ones we've
used in CI for running the CoCo non-TEE tests, that leaves only a small
amount of CPU for the tests. Actually, one of these pod replicas won't
even get started. So let's patch the ingress controller to keep only
one nginx replica.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
_print_instance_type() returns the instance type of the AKS nodes, based
on the host type. Tests are grouped per host type in "small" and "normal"
sets based on the CPU requirements: "small" tests require few CPUs and
"normal" more.

There is a 3rd case: the "all" host type maps to the union of "small"
and "normal" tests, which should be handled by _print_instance_type()
properly. In this case, it should return the largest instance type
possible because the "normal" tests will be executed too.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
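A minimal sketch of that handling, assuming we simply extend the case statement quoted earlier (this is not the literal diff):

function _print_instance_type() {
    case "${K8S_TEST_HOST_TYPE}" in
        small)
            echo "Standard_D2s_v5"
            ;;
        normal|all)
            # "all" also runs the "normal" tests, so return the larger SKU
            echo "Standard_D4s_v5"
            ;;
        *)
            echo "Unknown instance type '${K8S_TEST_HOST_TYPE}'" >&2
            exit 1
    esac
}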
By running on the "all" host type there are two consequences:

1) run the "normal" tests too (until now, only the "small" tests run),
   thus increasing the coverage
2) create the AKS cluster with larger VMs. This is a new requirement due
   to the current ingress controller for the KBS service eating too many
   vCPUs and leaving only a few for the tests (resulting in failures)

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
local ingress="${1:-}"

if [[ "${ingress}" = "aks" ]]; then
echo "Patch the ingress controller to have only one replica of nginx"
Contributor

I'd add a one-line comment to quickly explain why this is needed 🙂
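
Something along these lines, for instance (the comment wording is only a suggestion; the patch command itself is the one discussed earlier in the thread):

if [[ "${ingress}" = "aks" ]]; then
    # The AKS managed nginx ingress controller defaults to a minimum of two
    # replicas, each requesting 500m of CPU, which starves the tests on a
    # single small node; keep a single replica instead.
    echo "Patch the ingress controller to have only one replica of nginx"
    kubectl patch nginxingresscontroller/default -n app-routing-system \
        --type=json \
        -p='[{"op": "add", "path": "/spec/scaling/minReplicas", "value": 1}]'
fi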

@wainersm
Contributor Author

I've just added a few commits... @sprt's comment sparked some ideas. I'm now exporting K8S_TEST_HOST_TYPE=all and I tweaked the script to create a cluster with larger ("normal") VMs. One side effect is that the coco non-TEE job will begin to run a few tests from the "normal" subset, which increases coverage. I also added the code to lower the number of nginx replicas to one, although with a bigger cluster it would be able to launch the default two replicas.

As I changed the workflow, I pushed the changes to a topic branch to run the tests... as soon as I get the results I will post them here.
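
Roughly, the local equivalent of the workflow tweak is the following (a sketch reusing the commands from the test procedure in the description; with K8S_TEST_HOST_TYPE=all the cluster should be created with the larger VM SKU):

$ export KBS_INGRESS=aks
$ export K8S_TEST_HOST_TYPE=all
$ ./tests/integration/kubernetes/gha-run.sh create-cluster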

@wainersm
Contributor Author

In https://github.com/kata-containers/kata-containers/actions/runs/14713221646/job/41292948604 KBS got installed and all tests except for pid-ns passed. I'm going to spin the job again and see if the same test fails.

@sprt
Contributor

sprt commented Apr 28, 2025

@wainersm Seeing a bunch of failures here:

https://github.com/kata-containers/kata-containers/actions/runs/14713397922/job/41292694318?pr=11164#step:17:11250

[run_kubernetes_tests.sh:147] ERROR: Tests FAILED from suites: k8s-cpu-ns.bats k8s-credentials-secrets.bats k8s-limit-range.bats k8s-pod-quota.bats k8s-qos-pods.bats k8s-nginx-connectivity.bats

@wainersm
Contributor Author

@wainersm Seeing a bunch of failures here:

https://github.com/kata-containers/kata-containers/actions/runs/14713397922/job/41292694318?pr=11164#step:17:11250

[run_kubernetes_tests.sh:147] ERROR: Tests FAILED from suites: k8s-cpu-ns.bats k8s-credentials-secrets.bats k8s-limit-range.bats k8s-pod-quota.bats k8s-qos-pods.bats k8s-nginx-connectivity.bats

Please disregard the jobs triggered on this PR. The workflow change matters for those tests to stop failing, so I've tested using the topic/wainersm-fix_kbs_on_aks topic branch. The latest and greatest job is https://github.com/kata-containers/kata-containers/actions/runs/14713221646/job/41305482015

@wainersm
Contributor Author

ok, the last job succeeded: https://github.com/kata-containers/kata-containers/actions/runs/14713221646/job/41305482015

It's ready to be merged if there are no other comments. In the meantime I will re-trigger the job one more time just in case...

@wainersm wainersm removed the do-not-merge (PR has problems or depends on another) label Apr 29, 2025
@sprt sprt self-requested a review April 29, 2025 01:15
@@ -419,13 +419,20 @@ function kbs_k8s_deploy() {
fi
}

# Return the kbs service host name in case ingress is configured
# Return the kbs service public IP in case ingress is configured
Member

nit: Should we make this "echos the..." as (unfortunately) we can't return a string value here?

while true; do
    host=$(kubectl get ingress "${KBS_INGRESS_NAME}" -n "${KBS_NS}" -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    [[ -z "${host}" && ${SECONDS} -lt 30 ]] || break
    sleep 5
Member

Do you need to do SECONDS += 5 here? At the moment I think it will loop forever if there is an issue with ingress coming up?

Contributor Author

Ha! This is a trick that I learned from @ldoktor a while ago: SECONDS is a bash variable which is updated each time it's referenced. Quoting bash's manpage:

SECONDS
       Each time this parameter is referenced, it expands to the number
       of seconds since shell invocation. If a value is assigned to
       SECONDS, the value returned upon subsequent references is the
       number of seconds since the assignment plus the value assigned.
       The number of seconds at shell invocation and the current time
       are always determined by querying the system clock. If SECONDS
       is unset, it loses its special properties, even if it is
       subsequently reset.
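
A tiny demo of that behaviour (hypothetical shell session; the timing is approximate):

$ SECONDS=0; sleep 3; echo "elapsed: ${SECONDS}s"
elapsed: 3s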

Member

Nice - so it's just a magic variable. TIL!

@wainersm
Contributor Author

ok, the last job succeeded: https://github.com/kata-containers/kata-containers/actions/runs/14713221646/job/41305482015

It's ready to be merged if there are no other comments. In the meantime I will re-trigger the job one more time just in case...

The re-trigger from yesterday passed too: https://github.com/kata-containers/kata-containers/actions/runs/14713221646/job/41314671718

Now comes the fun part: merge it, only to find that it fails in production :D

Contributor

@sprt sprt left a comment

Thank you @wainersm!

@sprt sprt merged commit 19371e2 into kata-containers:main Apr 29, 2025
951 of 984 checks passed
@sprt
Contributor

sprt commented Apr 29, 2025

Force-merged as a workflow change tested in a feature branch.

Labels
area/ci (Issues affecting the continuous integration), area/testing (Issues related to testing), ok-to-test, size/medium (Average sized task)
Development

Successfully merging this pull request may close these issues.

CI: failing to deploy KBS on AKS due recent change on Azure routing API
4 participants