Kubernetes (k8s) Probes
  qw1dHH2kI2RK, November 2, 2023

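I. Basic Concepts

A probe is a periodic diagnostic that the kubelet performs on a container. To run the diagnostic, the kubelet calls a Handler implemented by the container.

For a running container, the kubelet can optionally run the following three kinds of probes and react to their results: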

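  • livenessProbe: checks whether the container is alive, that is, still running properly. If the livenessProbe finds the container unhealthy, the kubelet kills the container, and the container's restart policy decides whether it is restarted. If a container defines no livenessProbe, the kubelet treats the probe result as always successful. In the output of kubectl get pods, the STATUS column shows whether the Pod is Running, and restarts triggered by failed liveness probes are counted in the RESTARTS column.
  • readinessProbe: checks whether the container is ready to serve requests. If the readinessProbe fails, the container's Ready condition becomes False and the controller removes the Pod's endpoint from the Endpoints list of the matching Service, so no requests are routed to the Pod until a later probe succeeds. Unlike livenessProbe, a failed readinessProbe never causes the kubelet to restart the container. In the output of kubectl get pods, the READY column shows whether the Pod's containers are ready.
  • startupProbe: checks whether the application inside the container has started. If a startupProbe is configured, all other probes are disabled until it succeeds. If it fails, the kubelet kills the container, and the container is then subject to its restart policy.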
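
To make the three fields concrete, here is a minimal sketch of a single container that declares all of them. The Pod name, image, port, and probe path are assumptions chosen for illustration (a stock nginx image serving / on port 80), not values taken from the examples later in this article.

```yaml
# Hypothetical Pod for illustration; name, image, port, and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probes-demo
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    startupProbe:            # the other two probes stay disabled until this succeeds
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 12   # on failure: container is killed, restart policy applies
    livenessProbe:           # on failure: container is killed, restart policy applies
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
    readinessProbe:          # on failure: Pod leaves the Service endpoints, no restart
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
```

All three probes use the same handler here only to keep the sketch short; each probe can use a different check.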

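II. When to Use Each Probe

Use a startupProbe for applications that take a long time to start.

When a Pod has started successfully but the application in its container is still starting, requests sent to it will fail. Use a readinessProbe in this case to keep the service highly available.

Use a livenessProbe for programs that cannot crash on their own when they run into a problem.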

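III. Similarities and Differences Among the Three Probes

startupProbe is configured the same way as readinessProbe and livenessProbe, and it handles the Pod the same way livenessProbe does: the container is restarted on failure. It runs only during container startup; once it has succeeded, it does not run again.

readinessProbe: when the check fails, the Pod's IP:Port is removed from the corresponding Endpoints list. It runs for the entire lifetime of the container.

livenessProbe: when the check fails, the container is killed and the Pod's restart policy decides what happens next. It runs for the entire lifetime of the container.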
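
The Endpoints behaviour of readinessProbe is only observable when a Service selects the Pod. A minimal sketch of such a Service follows, assuming the app: nginx-error2 label and container port 8080 used by the manifests later in this article; the Service name and port 80 are hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-error2      # hypothetical Service name
spec:
  selector:
    app: nginx-error2     # must match the labels in the Pod template
  ports:
  - port: 80              # assumed Service port
    targetPort: 8080      # containerPort declared in the Pod template
```

With this in place, kubectl get endpoints nginx-error2 should list the Pod's IP:Port only while its readinessProbe is passing.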

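IV. The Three Probe Mechanisms

[Figure: the three probe mechanisms: exec (ExecAction), httpGet (HTTP request), and tcpSocket (TCP connection)]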
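
As a rough side-by-side sketch, the three mechanisms look like this inside a probe definition. The top-level keys (execExample and so on) are hypothetical labels used only to group the alternatives in one YAML document; in a real manifest exactly one of exec, httpGet, or tcpSocket sits directly under livenessProbe, readinessProbe, or startupProbe.

```yaml
# Three alternative handlers; a single probe uses exactly one of them.
execExample:
  exec:                  # succeeds when the command exits with status 0
    command:
    - cat
    - /tmp/healthy
httpGetExample:
  httpGet:               # succeeds when the HTTP status is >= 200 and < 400
    path: /
    port: 8080
tcpSocketExample:
  tcpSocket:             # succeeds when a TCP connection to the port can be opened
    port: 8080
```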

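V. Hands-On

Liveness probe

1. Liveness probe with ExecAction

Five seconds after the container starts, the kubelet begins checking every 5 seconds whether the file /tmp/healthy exists. The container deletes the file after 30 seconds, so the probe starts to fail. After 6 consecutive failures (failureThreshold), the kubelet kills the container and restarts it according to the restart policy.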

[root@k8s-master1 ~]# cat exec-liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
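      initialDelaySeconds: 5   # delay, in seconds, between container start and the first probe
      periodSeconds: 5         # how often to probe; default 10 seconds, minimum 1 second
      timeoutSeconds: 1          # probe timeout
      successThreshold: 1        # one success marks the probe as successful
      failureThreshold: 6        # 6 consecutive failures mark the probe as failed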



initialDelaySeconds + periodSeconds * failureThreshold = the time the container is actually given to start (approximately, before the probe is marked as failed)
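
As a worked example, plugging the values from the manifest above into this formula gives the following. This is an approximation: it ignores timeoutSeconds and small scheduling delays in the kubelet.

```latex
\begin{align*}
T &= \text{initialDelaySeconds} + \text{periodSeconds} \times \text{failureThreshold} \\
  &= 5\,\text{s} + 5\,\text{s} \times 6 \\
  &= 35\,\text{s}
\end{align*}
```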



[root@k8s-master1 ~]# kubectl describe pod liveness-exec
Events:
  Type     Reason     Age   From                Message
  ----     ------     ----  ----                -------
  Normal   Scheduled  51s   default-scheduler   Successfully assigned default/liveness-exec to k8s-node2
  Normal   Pulling    50s   kubelet, k8s-node2  Pulling image "busybox"
  Normal   Pulled     35s   kubelet, k8s-node2  Successfully pulled image "busybox"
  Normal   Created    35s   kubelet, k8s-node2  Created container liveness
  Normal   Started    35s   kubelet, k8s-node2  Started container liveness
  Warning  Unhealthy  4s    kubelet, k8s-node2  Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory


[root@k8s-master1 ~]# kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
busybox-5d7b4b65d6-6gfbb   1/1     Running   0          129m
liveness-exec              1/1     Running   5          8m37s
myweb-7ccb985444-cqbft     1/1     Running   0          129m

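2. Liveness probe with an HTTP request

Probing starts 15 seconds after the container starts. The timeout is only 1 second, so the container must respond within 1 second or the probe attempt counts as a failure. The container is probed every 10 seconds (period=10s) and is restarted after three consecutive failures (#failure=3). Only initialDelaySeconds is set in the manifest; the other values are the defaults, as the kubectl describe output shows.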

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-error
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-error
  template:
    metadata:
      name: nginx-error
      labels:
        app: nginx-error
    spec:
      containers:
      - name: nginx-error
        image: nginx-error
        imagePullPolicy: Never
        ports:
        - containerPort: 8080
        livenessProbe:               
          httpGet:
            path: /
            port: 8080
            httpHeaders:
            - name: X-Custom-Header
              value: Awesome
          initialDelaySeconds: 15
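
For readers who prefer not to rely on defaults, the same probe can be written with every timing field spelled out. This is a sketch of the livenessProbe fragment only; the explicit values are the ones reported in the kubectl describe output for this Deployment (timeout=1s, period=10s, #success=1, #failure=3).

```yaml
# Fragment of the container spec: the probe above with its defaults made explicit.
livenessProbe:
  httpGet:
    path: /
    port: 8080
    httpHeaders:
    - name: X-Custom-Header
      value: Awesome
  initialDelaySeconds: 15
  periodSeconds: 10      # default
  timeoutSeconds: 1      # default
  successThreshold: 1    # default
  failureThreshold: 3    # default
```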



[root@k8s-master1 ~]# kubectl describe deployment/nginx-error
Name:                   nginx-error
Namespace:              default
CreationTimestamp:      Mon, 03 Jan 2022 03:35:00 -0500
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"nginx-error","namespace":"default"},"spec":{"replicas":1,...
Selector:               app=nginx-error
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=nginx-error
  Containers:
   nginx-error:
    Image:        nginx
    Port:         8080/TCP
    Host Port:    0/TCP
    Liveness:     http-get http://:8080/ delay=15s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-error-54c58ddb79 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  88s   deployment-controller  Scaled up replica set nginx-error-54c58ddb79 to 1

[root@k8s-master1 ~]# kubectl get pods -o wide
NAME                           READY   STATUS             RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
busybox-5d7b4b65d6-6gfbb       1/1     Running            0          148m    10.244.1.2    k8s-node1   <none>           <none>
liveness-exec                  1/1     Running            10         27m     10.244.2.3    k8s-node2   <none>           <none>
myweb-7ccb985444-cqbft         1/1     Running            0          148m    10.244.2.10   k8s-node2   <none>           <none>
nginx-error-54c58ddb79-6h9mx   0/1     CrashLoopBackOff   5          5m36s   10.244.2.4    k8s-node2   <none>           <none>

3. Liveness probe with a tcpSocket request

[root@k8s-master1 ~]# cat nginx-error2_deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-error2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-error2
  template:
    metadata:
      name: nginx-error2
      labels:
        app: nginx-error2
    spec:
      containers:
      - name: nginx-error2
        image: nginx
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        #readinessProbe:
        #  tcpSocket:
        #    port: 8080
        #  initialDelaySeconds: 5
        #  periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
        #livenessProbe:               
        #  httpGet:
        #    path: /
        #    port: 8080
        #  initialDelaySeconds: 15



[root@k8s-master1 ~]# kubectl describe pod/nginx-error2-65bb47c758-d9gkb
Name:         nginx-error2-65bb47c758-d9gkb
Namespace:    default
Priority:     0
Node:         k8s-node2/192.168.255.144
Start Time:   Mon, 03 Jan 2022 03:46:10 -0500
Labels:       app=nginx-error2
              pod-template-hash=65bb47c758
Annotations:  <none>
Status:       Running
IP:           10.244.2.5
IPs:
  IP:           10.244.2.5
Controlled By:  ReplicaSet/nginx-error2-65bb47c758
Containers:
  nginx-error2:
    Container ID:   docker://96e949a0c05ad384da75a33130341a1cea3f10bf784493e57347cd15d809a5b7
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:0d17b565c37bcbd895e9d92315a05c1c3c9a29f762b011a10c54a66cd53c9b31
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 03 Jan 2022 03:47:43 -0500
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 03 Jan 2022 03:46:27 -0500
      Finished:     Mon, 03 Jan 2022 03:47:27 -0500
    Ready:          True
    Restart Count:  1
    Liveness:       tcp-socket :8080 delay=15s timeout=1s period=20s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vf9dn (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-vf9dn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vf9dn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                 From                Message
  ----     ------     ----                ----                -------
  Normal   Scheduled  106s                default-scheduler   Successfully assigned default/nginx-error2-65bb47c758-d9gkb to k8s-node2
  Normal   Pulling    29s (x2 over 105s)  kubelet, k8s-node2  Pulling image "nginx"
  Warning  Unhealthy  29s (x3 over 69s)   kubelet, k8s-node2  Liveness probe failed: dial tcp 10.244.2.5:8080: connect: connection refused
  Normal   Killing    29s                 kubelet, k8s-node2  Container nginx-error2 failed liveness probe, will be restarted
  Normal   Pulled     13s (x2 over 90s)   kubelet, k8s-node2  Successfully pulled image "nginx"
  Normal   Created    13s (x2 over 89s)   kubelet, k8s-node2  Created container nginx-error2
  Normal   Started    13s (x2 over 89s)   kubelet, k8s-node2  Started container nginx-error2

Readiness probe (readinessProbe)

[root@k8s-master1 ~]# cat nginx-error2_deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-error2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-error2
  template:
    metadata:
      name: nginx-error2
      labels:
        app: nginx-error2
    spec:
      containers:
      - name: nginx-error2
        image: nginx
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
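        # Option 1: TCP socket check (active)
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        # Option 2: exec check; enable only one readinessProbe per container
        #readinessProbe:
        #  exec:
        #    command:
        #    - cat
        #    - /tmp/healthy
        #  initialDelaySeconds: 5
        #  periodSeconds: 5
        # Option 3: HTTP check
        #readinessProbe:
        #  httpGet:
        #    path: /
        #    port: 8080
        #  initialDelaySeconds: 15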

Startup probe (startupProbe)

startupProbe:
  httpGet:
    path: /
    port: 8080
  failureThreshold: 10  # number of consecutive failures tolerated
  initialDelaySeconds: 10      # wait 10 seconds before the first probe
  periodSeconds: 10     # probe every 10 seconds
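
A common way to use this probe is to size it for the slowest expected startup and then let a tighter livenessProbe take over. The fragment below is a sketch under assumed numbers: the 300-second startup budget is hypothetical, and the path and port are simply reused from the fragment above.

```yaml
# Hypothetical container fragment; the 300-second startup budget is an assumption.
startupProbe:
  httpGet:
    path: /
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # 30 attempts * 10 s = up to 300 s to finish starting
livenessProbe:           # takes over only after the startupProbe has succeeded
  httpGet:
    path: /
    port: 8080
  periodSeconds: 10
  failureThreshold: 3    # once started, a hang is detected within about 30 s
```

Splitting the budget this way keeps a slow start from being mistaken for a hang, without loosening the liveness check for the rest of the container's life.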

VI. Priority Order

startupProbe > livenessProbe > readinessProbe

Last edited on November 8, 2023
