Liveness Probe
What happens if the application in the pod is running but, for whatever reason, can no longer serve its main purpose? Long-running applications can also drift into broken states, and the last thing you want is a call reporting a problem that could have been fixed simply by restarting the application. Liveness probes are the Kubernetes feature made for exactly this situation. A liveness probe runs a predefined check against the container at a regular interval, and if the check keeps failing, the container is killed and restarted. The most commonly used liveness probe is an HTTP GET request, but a probe can also open a TCP socket or execute a command inside the container.
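The restart decision described above can be modeled in a few lines of Python. This is only an illustrative sketch, not kubelet code: the function name `first_restart_index` is hypothetical, and the default of three consecutive failures is an assumption based on the Kubernetes default `failureThreshold` (the same value appears as `#failure=3` in the `kubectl describe` output later in this section).

```python
from typing import Iterable, Optional


def first_restart_index(results: Iterable[bool], failure_threshold: int = 3) -> Optional[int]:
    """Return the index of the probe run that triggers a restart, or None.

    results: one boolean per probe run, True meaning the probe succeeded.
    A restart is triggered once `failure_threshold` failures occur in a row;
    any success resets the counter.
    """
    consecutive_failures = 0
    for index, succeeded in enumerate(results):
        if succeeded:
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            return index
    return None


# Two isolated failures do not trigger a restart; three in a row do.
print(first_restart_index([True, False, True, False, True]))   # None
print(first_restart_index([True, False, False, False, True]))  # 3
```

The point of counting consecutive failures is that a single slow or dropped probe does not cause an unnecessary restart.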
Next is an HTTP GET probe example. initialDelaySeconds is how long to wait after the container starts before the first HTTP GET request is sent to port 80; after that, the probe runs every 20 seconds, as specified in periodSeconds. If the probe keeps failing (three consecutive failures by default), the container is restarted automatically. You can optionally specify the path, which here is just the root of the website, and you can also send the probe with a customized header. Take a quick look:
apiVersion: v1
kind: Pod
metadata:
  name: liveness-pod
  labels:
    app: tcpsocket-test
spec:
  containers:
  - name: liveness-pod
    image: contrailk8sdayone/ubuntu
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
      capabilities:
        add:
        - NET_ADMIN
    livenessProbe:
      httpGet:
        path: /
        port: 80
        httpHeaders:
        - name: some-header
          value: Running
      initialDelaySeconds: 15
      periodSeconds: 20
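To make the HTTP check concrete, here is a rough Python approximation of what a single HTTP GET probe does. It is a sketch under stated assumptions rather than the kubelet's implementation: the function names are hypothetical, the one-second timeout mirrors the default `timeoutSeconds`, and the success rule (any status from 200 up to but not including 400) follows the Kubernetes documentation.

```python
import http.client
import urllib.error
import urllib.request
from typing import Mapping, Optional


def is_success_status(status: int) -> bool:
    """A probe response counts as healthy for any status in [200, 400)."""
    return 200 <= status < 400


def http_get_probe(url: str, headers: Optional[Mapping[str, str]] = None,
                   timeout: float = 1.0) -> bool:
    """Send one GET request and report whether the probe would pass."""
    request = urllib.request.Request(url, headers=dict(headers or {}), method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx; the status code is still available.
        status = error.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, malformed response, and so on.
        return False
    return is_success_status(status)
```

With the manifest above, the equivalent check would be something like `http_get_probe("http://<pod-ip>:80/", {"some-header": "Running"})`, where `<pod-ip>` stands for the pod's address. A refused connection, which is what happens once the web server is stopped, simply returns `False`.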
Now let’s launch this pod then log in to it to terminate the process that handles the HTTP GET request:
[root@cent11 ~]# kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
liveness-pod   1/1     Running   0          114s

[root@cent11 ~]# kubectl exec -it liveness-pod bash
root@liveness-pod:/# sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 0.0.0.0:80        0.0.0.0:*         LISTEN   111/apache2
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   45/sshd
tcp6       0      0 :::22             :::*              LISTEN   45/sshd
root@liveness-pod:/# service apache2 stop
 * Stopping web server apache2
 *
root@liveness-pod:/# sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   45/sshd
tcp6       0      0 :::22             :::*              LISTEN   45/sshd

[root@cent11 ~]# kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
liveness-pod   1/1     Running   1          5m33s
You can see that the pod was automatically restarted (the RESTARTS counter went from 0 to 1), and the reason for the restart is recorded in the pod's events:
Killing container with id docker://liveness-pod:Container failed liveness probe. Container will be killed and recreated.

[root@cent11 ~]# kubectl describe pod liveness-pod
Name:               liveness-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               cent22/10.85.188.17
Start Time:         Fri, 05 Jul 2019 16:39:12 -0400
Labels:             app=tcpsocket-test
Annotations:        k8s.v1.cni.cncf.io/network-status:
                      [
                          {
                              "ips": "10.47.255.249",
                              "mac": "02:c2:59:4a:82:9f",
                              "name": "cluster-wide-default"
                          }
                      ]
Status:             Running
IP:                 10.47.255.249
Containers:
  liveness-pod:
    Container ID:   docker://01969f51d32f38a15baab18487b85c54cee4125f55c8c7667236722084e4df06
    Image:          virtualhops/ato-ubuntu:latest
    Image ID:       docker-pullable://virtualhops/ato-ubuntu@sha256:fa2930cb8f4b766e5b335dfa42de510ecd30af6433ceada14cdaae8de9065d2a
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 05 Jul 2019 16:41:35 -0400
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 05 Jul 2019 16:39:20 -0400
      Finished:     Fri, 05 Jul 2019 16:41:34 -0400
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:80/ delay=15s timeout=1s period=20s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m75c5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-m75c5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-m75c5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  Normal   Scheduled  7m19s                  default-scheduler  Successfully assigned default/liveness-pod to cent22
  Warning  Unhealthy  4m6s (x3 over 4m46s)   kubelet, cent22    Liveness probe failed: Get http://10.47.255.249:80/: dial tcp 10.47.255.249:80: connect: connection refused
  Normal   Pulling    3m36s (x2 over 5m53s)  kubelet, cent22    pulling image "virtualhops/ato-ubuntu:latest"
  Normal   Killing    3m36s                  kubelet, cent22    Killing container with id docker://liveness-pod:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Pulled     3m35s (x2 over 5m50s)  kubelet, cent22    Successfully pulled image "virtualhops/ato-ubuntu:latest"
  Normal   Created    3m35s (x2 over 5m50s)  kubelet, cent22    Created container
  Normal   Started    3m35s (x2 over 5m50s)  kubelet, cent22    Started container
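Two numbers in this output can be cross-checked with a little arithmetic. The sketch below assumes the values reported on the Liveness line (period=20s, #failure=3) and the common convention that a container exit code above 128 means "terminated by signal number (code - 128)"; the helper names are illustrative only.

```python
import signal
from typing import Optional


def failure_window_seconds(period_seconds: int, failure_threshold: int) -> int:
    """Seconds between the first and last of `failure_threshold` consecutive failed probes."""
    return period_seconds * (failure_threshold - 1)


def terminating_signal(exit_code: int) -> Optional[str]:
    """Map an exit code above 128 to a signal name (128 + N convention), else None."""
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


print(failure_window_seconds(period_seconds=20, failure_threshold=3))  # 40
print(terminating_signal(137))  # SIGKILL on Linux
```

The 40-second window matches the Unhealthy event, which was seen three times between 4m46s and 4m6s ago. Exit code 137 maps to SIGKILL, which suggests the old container was forcibly killed rather than exiting on its own.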
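This is a TCP socket probe example. A TCP socket probe is similar to the HTTP GET probe, but instead of sending an HTTP request it only tries to open a TCP connection to the specified port; the probe succeeds if the connection can be established: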
apiVersion: v1
kind: Pod
metadata:
  name: liveness-pod
  labels:
    app: tcpsocket-test
spec:
  containers:
  - name: liveness-pod
    image: contrailk8sdayone/ubuntu
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
      capabilities:
        add:
        - NET_ADMIN
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
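The TCP check itself is simple enough to sketch in Python. The following is an approximation, not kubelet code: `tcp_socket_probe` is a hypothetical name, the one-second timeout is an assumption matching the default `timeoutSeconds`, and the local listener only stands in for a service such as the web server on port 80.

```python
import socket


def tcp_socket_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


# Stand-in for a listening service: bind to any free local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

print(tcp_socket_probe("127.0.0.1", port))  # True: something is listening
listener.close()                            # comparable to stopping the service
print(tcp_socket_probe("127.0.0.1", port))  # False: connection refused
```

Note that this check only proves that something accepts connections on the port; it says nothing about whether the application behind it answers requests correctly.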
The command (exec) probe works like the HTTP GET and TCP socket probes, except that it executes a command inside the container. The probe succeeds if the command exits with status 0 and fails otherwise:
apiVersion: v1
kind: Pod
metadata:
  name: liveness-pod
  labels:
    app: command-test
spec:
  containers:
  - name: liveness-pod
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; while true; do sleep 600;done;
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
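The pass/fail rule of the command probe comes down to the exit status of the command. A minimal Python sketch of that rule follows, assuming a system where `cat` is available; `exec_probe` is a hypothetical helper, and a temporary file stands in for /tmp/healthy.

```python
import os
import subprocess
import tempfile
from typing import Sequence


def exec_probe(command: Sequence[str], timeout: float = 1.0) -> bool:
    """Run the command once; the probe passes only if it exits with status 0."""
    try:
        completed = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Command not found, not executable, or still running at the timeout.
        return False
    return completed.returncode == 0


with tempfile.TemporaryDirectory() as workdir:
    marker = os.path.join(workdir, "healthy")
    open(marker, "w").close()            # like: touch /tmp/healthy
    print(exec_probe(["cat", marker]))   # True: cat exits with 0
    os.remove(marker)                    # like: rm /tmp/healthy
    print(exec_probe(["cat", marker]))   # False: cat exits non-zero
```

In the pod above, the same failure could presumably be provoked by deleting the file, for example with `kubectl exec liveness-pod -- rm /tmp/healthy`, after which the probe should start failing and the container should be restarted.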