Closed
Description
Environment details:
EKS cluster with 4 worker nodes, AWS Linux AMI. All of the nodes were labeled with the invoker role. Kubernetes: 1.17
OpenWhisk deployed using Helm, with invoker.containerFactory.impl: "docker".
After the invoker pods are created and started, they are registered as healthy by the controller, even though the health check did not execute successfully. Invoking any action then returns a 500.
Steps to reproduce the issue:
- Deploy OW to a cluster that uses containerd as its container runtime.
- Wait for the invokers to start up and execute the health checks.
Expected result:
The invoker pods should be in an invalid state when they are unable to run containers.
Actual result:
The invoker is registered as healthy by the controller.
invoker:
[2020-10-02T13:06:53.081Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerReactive] [marker:invoker_activation_start:23452]
[2020-10-02T13:06:53.088Z] [WARN] [#tid_sid_invokerHealth] [] [InvokerReactive] revision was not provided for whisk.system/invokerHealthTestAction0
[2020-10-02T13:06:53.222Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [GET] 'test_whisks' finding document: 'id: whisk.system/invokerHealthTestAction0' [marker:database_getDocument_start:23593]
[2020-10-02T13:06:53.237Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [marker:database_getDocument_finish:23608:14]
[2020-10-02T13:06:53.342Z] [INFO] [#tid_sid_invokerHealth] [] [ContainerPool] containerStart containerState: cold container: None activations: 1 of max 1 action: invokerHealthTestAction0 namespace: whisk.system activationId: 7ead86477c5b40c1ad86477c5bb0c1b2 [marker:invoker_containerStart.cold_counter:23712]
[2020-10-02T13:06:53.392Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 11 --memory 128m --memory-swap 128m --network bridge -e __OW_API_HOST=http://openwhisk-nginx.*.svc.cluster.local -e __OW_ALLOW_CONCURRENT=false --dns 172.20.0.10 --dns-search *.svc.cluster.local --dns-search svc.cluster.local --dns-search cluster.local --dns-search ec2.internal --dns-option options --dns-option ndots:5 --name wskip-*.ec2.internal0_1_whisksystem_invokerHealthTestAction0 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v10:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:23763]
[2020-10-02T13:06:54.081Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:24452:689]
[2020-10-02T13:06:54.099Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker inspect --format {{.NetworkSettings.Networks.bridge.IPAddress}} 2b949d5d5653840c4b90d41b3795b275a4c409a073f98f00da54a367376e476e (timeout: 1 minute) [marker:invoker_docker.inspect_start:24470]
[2020-10-02T13:06:54.161Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.inspect_finish:24532:62]
[2020-10-02T13:06:54.162Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker rm -f 2b949d5d5653840c4b90d41b3795b275a4c409a073f98f00da54a367376e476e (timeout: 1 minute) [marker:invoker_docker.rm_start:24533]
[2020-10-02T13:06:54.222Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [PUT] 'test_activations' saving document: 'id: whisk.system/7ead86477c5b40c1ad86477c5bb0c1b2, rev: null' [marker:database_saveDocument_start:24593]
[2020-10-02T13:06:54.225Z] [INFO] [#tid_sid_dbBatcher] [] [CouchDbRestStore] 'test_activations' saving 1 documents [marker:database_saveDocumentBulk_start:4672]
[2020-10-02T13:06:54.245Z] [INFO] [#tid_sid_invokerHealth] [] [MessagingActiveAck] posted combined of activation 7ead86477c5b40c1ad86477c5bb0c1b2
[2020-10-02T13:06:54.270Z] [INFO] [#tid_sid_dbBatcher] [] [CouchDbRestStore] [marker:database_saveDocumentBulk_finish:4718:46]
[2020-10-02T13:06:54.280Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [marker:database_saveDocument_finish:24651:58]
[2020-10-02T13:06:54.420Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.rm_finish:24790:257]
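For anyone triaging this, the docker calls in the invoker log above can be pulled out of the [marker:...] fields to see the order and timing of the health-test container's lifecycle. The following is only a rough sketch: the marker layout (name, then milliseconds since the start of the transaction, then an optional duration) is inferred from the lines above, the helper names docker_steps and gap_ms are made up for illustration, and SAMPLE is a trimmed copy of the relevant log lines rather than the full log.

```python
import re

# Marker layout inferred from the log above (an assumption, not a spec):
#   [marker:<name>_<start|finish>:<ms since transaction start>(:<duration ms>)]
MARKER = re.compile(
    r"\[marker:(?P<name>[\w.]+)_(?P<phase>start|finish)"
    r":(?P<at>\d+)(?::(?P<took>\d+))?\]"
)


def docker_steps(log_lines):
    """Return (command, phase, at_ms, took_ms) for every invoker_docker.* marker."""
    steps = []
    for line in log_lines:
        m = MARKER.search(line)
        if m and m.group("name").startswith("invoker_docker."):
            command = m.group("name").split(".", 1)[1]
            took = int(m.group("took")) if m.group("took") else None
            steps.append((command, m.group("phase"), int(m.group("at")), took))
    return steps


def gap_ms(steps, finished, started):
    """Milliseconds between one command finishing and another one starting."""
    end = next(at for cmd, phase, at, _ in steps
               if cmd == finished and phase == "finish")
    begin = next(at for cmd, phase, at, _ in steps
                 if cmd == started and phase == "start")
    return begin - end


# Trimmed copies of the invoker log lines above; only the markers matter here.
SAMPLE = [
    "running /usr/bin/docker run -d ... [marker:invoker_docker.run_start:23763]",
    "[DockerClientWithFileAccess] [marker:invoker_docker.run_finish:24452:689]",
    "running /usr/bin/docker inspect ... [marker:invoker_docker.inspect_start:24470]",
    "[DockerClientWithFileAccess] [marker:invoker_docker.inspect_finish:24532:62]",
    "running /usr/bin/docker rm -f ... [marker:invoker_docker.rm_start:24533]",
    "[DockerClientWithFileAccess] [marker:invoker_docker.rm_finish:24790:257]",
]

if __name__ == "__main__":
    steps = docker_steps(SAMPLE)
    for command, phase, at, took in steps:
        print(command, phase, at, took)
    print("inspect -> rm gap (ms):", gap_ms(steps, "inspect", "rm"))
```

On the excerpt above this gives the sequence run, inspect, rm, with docker rm -f starting 1 ms after docker inspect returns. The same functions can be pointed at the attached invoker.log by passing an open file handle to docker_steps.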
controller:
[2020-10-02T13:06:58.833Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerActor] invoker3 is up
[2020-10-02T13:06:58.833Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerPool] invoker status changed to 0 -> Healthy, 1 -> Healthy, 2 -> Healthy, 3 -> Healthy
Logs attached:
controller.log
invoker.log