Sunday, 6 June 2021

Why won't the Kubernetes kubelet come up?

After an unscheduled reboot of the VMs that host my K8s cluster, I was struggling to work out why the kubelet wasn't starting properly.

I ran systemctl start kubelet.service to start it and then checked the status with systemctl status kubelet.service which showed: -

● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Sun 2021-06-06 00:35:01 PDT; 3s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 82478 (kubelet)
      Tasks: 7 (limit: 2279)
     Memory: 14.6M
     CGroup: /system.slice/kubelet.service
             └─82478 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/conf>
Jun 06 00:35:01 garble1.domain.com systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.836881   82478 server.go:197] "Warning: For remote container runtime, --pod-infra-container-image is i>
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.866762   82478 server.go:440] "Kubelet version" kubeletVersion="v1.21.0"
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.867455   82478 server.go:851] "Client rotation is on, will bootstrap in background"
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.870367   82478 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-clie>
Jun 06 00:35:01 garble1.domain.com kubelet[82478]: I0606 00:35:01.873004   82478 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt

which looked OK.

I checked again: -

systemctl status kubelet.service

and saw: -

● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Sun 2021-06-06 00:35:22 PDT; 8s ago
       Docs: https://kubernetes.io/docs/home/
    Process: 82505 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
   Main PID: 82505 (code=exited, status=1/FAILURE)
Jun 06 00:35:22 garble1.domain.com systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Jun 06 00:35:22 garble1.domain.com systemd[1]: kubelet.service: Failed with result 'exit-code'.

which looked not so good.
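
A single status check is misleading here: systemd restarts the unit automatically, so it reports active (running) for a few seconds after every attempt. Here's a rough sketch of a check that watches the restart counter instead. The function names and the 15-second default are my own inventions for illustration; NRestarts is a systemd service property and needs systemd 235 or later.

```shell
#!/bin/sh
# restart_count: print how many times systemd has automatically restarted a unit.
# NRestarts is a systemd service property (systemd 235 or later).
restart_count() {
    systemctl show --property=NRestarts --value "${1:-kubelet.service}"
}

# is_crash_looping: succeed if the restart counter rises within a short window.
# The 15-second default window is an arbitrary choice for illustration.
is_crash_looping() {
    unit="${1:-kubelet.service}"
    window="${2:-15}"
    before=$(restart_count "$unit")
    sleep "$window"
    after=$(restart_count "$unit")
    [ "$after" -gt "$before" ]
}
```

Something like is_crash_looping kubelet.service && echo "kubelet keeps restarting" would have flagged the problem even while the status output looked healthy.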

I then checked the syslog with: -

tail -f /var/log/syslog

and saw, amongst many other things, this: -

Jun  6 00:40:27 garble1 kubelet[83211]: E0606 00:40:27.104582   83211 server.go:292] "Failed to run kubelet" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename\t\t\t\tType\t\tSize\tUsed\tPriority /swap.img                               file\t\t4194300\t0\t-2]"
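
Picking that one line out of a busy syslog is easier with a filter. The kubelet's log lines carry a severity letter followed by the month and day (E0606 above, I0606 for the informational lines), so a sketch like this keeps only the error and fatal messages. The function name kubelet_errors is hypothetical, and the default path is simply the one I tailed above.

```shell
#!/bin/sh
# kubelet_errors: print only error- and fatal-level kubelet messages
# from a syslog-format file.
# Severity letters in these log lines: I = info, W = warning, E = error, F = fatal.
kubelet_errors() {
    log_file="${1:-/var/log/syslog}"
    # Match "kubelet[<pid>]: " followed by E or F and a four-digit MMDD stamp.
    grep -E 'kubelet\[[0-9]+\]: [EF][0-9]{4} ' "$log_file"
}
```

Running kubelet_errors | tail -n 5 shows the most recent failures. On a host that logs to the journal instead, journalctl -u kubelet.service should surface the same messages.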


Of course, the VMs had been rebooted ... so swap was back on ....
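
The error message quotes /proc/swaps, so the same condition can be checked by hand before starting the kubelet. A minimal sketch, assuming a Linux host; the function name swap_is_active and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# swap_is_active: succeed (exit 0) if the given swaps file lists any swap area.
# /proc/swaps has one header line, then one line per active swap area.
# The optional argument exists only so the check can be tried on a saved copy.
swap_is_active() {
    swaps_file="${1:-/proc/swaps}"
    # Exit 0 from awk only if at least one non-empty line follows the header.
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0 ? 0 : 1) }' \
        "$swaps_file" 2>/dev/null
}

if swap_is_active; then
    echo "swap is on: the kubelet will refuse to start"
else
    echo "swap is off"
fi
```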

A quick trip to swapoff with: -

swapoff -a

and we're back in the game.

kubectl get nodes

NAME                 STATUS   ROLES                  AGE     VERSION
garble1.domain.com   Ready    control-plane,master   3d13h   v1.21.0
garble2.domain.com   Ready    <none>                 3d13h   v1.21.0

crictl pods

POD ID              CREATED             STATE               NAME                                         NAMESPACE           ATTEMPT
c3969548182d6       17 seconds ago      Ready               calico-node-nl2g2                            kube-system         0
bd06ccb126620       18 seconds ago      Ready               kube-proxy-ht4mq                             kube-system         0
5a31b04c1d01a       18 seconds ago      Ready               kube-scheduler-garble1.domain.com            kube-system         0
ac6e59ccb87f1       25 seconds ago      Ready               kube-controller-manager-garble1.domain.com   kube-system         0
d2ece5d26441e       35 seconds ago      Ready               kube-apiserver-garble1.domain.com            kube-system         0
10019ac4de96d       45 seconds ago      Ready               etcd-garble1.domain.com                      kube-system         0
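
One caveat: swapoff -a only lasts until the next boot, so the same failure will return after another reboot unless the swap entry is also disabled wherever it is configured. Here's a rough sketch, assuming the swap file is listed in /etc/fstab (I haven't shown my fstab above, so treat that as an assumption). The function name disable_swap_in_fstab and the .bak backup suffix are my own choices.

```shell
#!/bin/sh
# disable_swap_in_fstab: comment out every active swap entry in an fstab-style
# file so that swap is not re-enabled at the next boot.
# A backup of the original is written next to the file with a .bak suffix.
disable_swap_in_fstab() {
    fstab="${1:-/etc/fstab}"
    cp "$fstab" "$fstab.bak" || return 1
    # In fstab the third field is the filesystem type; "swap" marks swap areas.
    # Lines that are already comments are left untouched.
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}
```

Run as root, swapoff -a followed by disable_swap_in_fstab covers both the running system and the next boot. It is worth trying the function on a copy of the file first, and checking the result with cat /proc/swaps after the next reboot.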
