Summary:
csi-vol <-> PVC mapping; node NotReady;
*: The container itself carries no configuration; at runtime it fetches what it needs on demand from a discovery mechanism (e.g. a key/value database, a filesystem volume, or a service-discovery mechanism). (Recommended approach.)
1. Mapping the RBD image name shown under Block -> Images in the ceph-dashboard ('csi-vol-d4790152-5c50-11ed-bac2-4a739a0d9a39') to its PVC.
csi-vol-d4790152-5c50-11ed-bac2-4a739a0d9a39 -> rbd_data.11c97643f663a6 -> pvc-96d54e35-3fb7-4973-a760-2f2d2e42c58b
Fields shown by the ceph-dashboard for an image:
Name                csi-vol-e428283c-5d1a-11ed-bac2-4a739a0d9a39
Block name prefix   rbd_data.11c9765069d59f
[root@rook-ceph-tools-897d6797f-bgmwk /]# rbd info replicapool/csi-vol-d4790152-5c50-11ed-bac2-4a739a0d9a39
rbd image 'csi-vol-d4790152-5c50-11ed-bac2-4a739a0d9a39':
size 8 GiB in 2048 objects
id: 11c97643f663a6
block_name_prefix: rbd_data.11c97643f663a6
# k -n rook-ceph get pv pvc-96d54e35-3fb7-4973-a760-2f2d2e42c58b -oyaml |grep imageName
imageName: csi-vol-d44cc06f-5c50-11ed-bac2-4a739a0d9a39
# k -n rook-ceph get pv
NAME CAPACITY CLAIM
pvc-96d54e35-3fb7-4973-a760-2f2d2e42c58b 8Gi public-service/data-kafka-0
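Two invariants tie the chain above together: `rbd info` shows that `block_name_prefix` is simply `rbd_data.` plus the image `id`, and the PV's `imageName` field carries the `csi-vol-` name. A minimal sanity check of the first relationship, using the values captured in this section:

```shell
# In rbd info output, block_name_prefix is always "rbd_data." + the image id.
# The id and prefix below are the ones shown for csi-vol-d4790152-... above.
id="11c97643f663a6"
prefix="rbd_data.11c97643f663a6"
[ "$prefix" = "rbd_data.$id" ] && echo "prefix matches id"
```

To go the other direction (dashboard image name -> PVC), grep the `imageName` field of the PVs, as done with `k get pv ... |grep imageName` above.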
2. Mapping chain: csi-rbdplugin-8prn2 -> k8s-node05 (/dev/rbd0) -> elasticsearch-logging -> PVC: elasticsearch-logging-elasticsearch-logging-0 (storageClassName: rook-ceph-block) -> PV: pvc-09cf7469-be54-4ad2-a5a2-0128378d3cd2 (driver: rook-ceph.rbd.csi.ceph.com) (see "K8s CSI & ceph 构架图解" on zhihu.com)
[root@k8s-master01 /]# k -n rook-ceph get pod -owide
csi-rbdplugin-8prn2 3/3 Running 18 (4d21h ago) 292d 172.17.54.162 k8s-node05
[root@k8s-master01 ~]# k -n rook-ceph exec -ti csi-rbdplugin-8prn2 -c csi-rbdplugin -- /bin/bash
[root@csi-rbdplugin-8prn2 /]# rbd showmapped
id pool namespace image snap device
0 replicapool csi-vol-682b9db3-5c4b-11ed-bac2-4a739a0d9a39 - /dev/rbd0
1 replicapool csi-vol-d44cc06f-5c50-11ed-bac2-4a739a0d9a39 - /dev/rbd1
2 replicapool csi-vol-e428283c-5d1a-11ed-bac2-4a739a0d9a39 - /dev/rbd2
[root@k8s-node05 ~]# mount |grep rbd0
/dev/rbd0 on /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/e935f1166bd83d05ea2dca0950ac255080f53cd97967aaebe2e43db6d396fe74/globalmount/0001-0009-rook-ceph-0000000000000002-682b9db3-5c4b-11ed-bac2-4a739a0d9a39 type ext4 (rw,relatime,stripe=1024,_netdev)
/dev/rbd0 on /var/lib/kubelet/pods/e080648c-7c0f-4132-a7e7-b262fa9bfe99/volumes/kubernetes.io~csi/pvc-61db7137-9925-4d2f-bc84-48dc629e63ed/mount type ext4 (rw,relatime,stripe=1024,_netdev)
[root@k8s-node05 containers]# ll /var/lib/kubelet/pods/7d4e27d1-9264-4ad2-ae4b-c555bf67fed9/containers
drwxr-x--- 2 root root 22 7月 26 10:00 elasticsearch-logging
drwxr-x--- 2 root root 22 7月 26 10:00 elasticsearch-logging-init
[root@k8s-node05 ~]# more /var/lib/kubelet/pods/e080648c-7c0f-4132-a7e7-b262fa9bfe99/volumes/kubernetes.io~csi/pvc-61db7137-9925-4d2f-bc84-48dc629e63ed/vol_data.json
{"attachmentID":"csi-a6f88a1fd14fd9537727283dcd04fd68f5c91333d1c558eb618232ec664bee23","driverName":"rook-ceph.rbd.csi.ceph.com","nodeName":"k8s-node05","specVolID":"pvc-61db7137-9925-4d2f-bc84-48dc629e63ed","volumeHandle":"0001-0009-rook-ceph-0000000000000002-682b9db3-5c4b-11ed-bac2-4a739a0d9a39","volumeLifecycleMode":"Persistent"}
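The `volumeHandle` can be pulled out of vol_data.json with plain sed (jq works too, if installed). A minimal sketch using the exact file contents captured above (written to /tmp only for the demo):

```shell
# Recreate the vol_data.json shown above, then extract its volumeHandle field.
cat > /tmp/vol_data.json <<'EOF'
{"attachmentID":"csi-a6f88a1fd14fd9537727283dcd04fd68f5c91333d1c558eb618232ec664bee23","driverName":"rook-ceph.rbd.csi.ceph.com","nodeName":"k8s-node05","specVolID":"pvc-61db7137-9925-4d2f-bc84-48dc629e63ed","volumeHandle":"0001-0009-rook-ceph-0000000000000002-682b9db3-5c4b-11ed-bac2-4a739a0d9a39","volumeLifecycleMode":"Persistent"}
EOF
handle=$(sed -n 's/.*"volumeHandle":"\([^"]*\)".*/\1/p' /tmp/vol_data.json)
echo "$handle"
```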
[root@k8s-master01 /]# k -n logging get pod -owide
NAME IP NODE
elasticsearch-logging-0 172.17.54.175 k8s-node05
[root@k8s-master01 /]# k -n logging get pod elasticsearch-logging-0 -oyaml
volumes:
- name: elasticsearch-logging
persistentVolumeClaim:
claimName: elasticsearch-logging-elasticsearch-logging-0
[root@k8s-master01 ~]# k -n logging get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
elasticsearch-logging-elasticsearch-logging-0 Bound pvc-09cf7469-be54-4ad2-a5a2-0128378d3cd2 250Gi RWO rook-ceph-block 268d
elasticsearch-logging-elasticsearch-logging-1 Bound pvc-69beae6f-6a86-4b72-ac10-df6d182b28dd 250Gi RWO rook-ceph-block 268d
[root@k8s-master01 ~]# k -n logging get pv |grep pvc-09cf7469-be54-4ad2-a5a2-0128378d3cd2
pvc-09cf7469-be54-4ad2-a5a2-0128378d3cd2 250Gi RWO Delete Bound logging/elasticsearch-logging-elasticsearch-logging-0 rook-ceph-block 268d
[root@k8s-master01 ~]# k -n rook-ceph get pv pvc-61db7137-9925-4d2f-bc84-48dc629e63ed -oyaml |grep imageName
imageName: csi-vol-682b9db3-5c4b-11ed-bac2-4a739a0d9a39
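The `volumeHandle` in vol_data.json embeds the same UUID as the RBD image name: its last five dash-separated fields form the UUID, and prefixing `csi-vol-` reconstructs the image name. A sketch of that observed relationship (counting fields from the end avoids the dash inside the `rook-ceph` cluster-ID segment):

```shell
# Rebuild the RBD image name from the CSI volumeHandle captured above.
handle="0001-0009-rook-ceph-0000000000000002-682b9db3-5c4b-11ed-bac2-4a739a0d9a39"
uuid=$(echo "$handle" | awk -F- '{print $(NF-4)"-"$(NF-3)"-"$(NF-2)"-"$(NF-1)"-"$NF}')
echo "csi-vol-$uuid"   # → csi-vol-682b9db3-5c4b-11ed-bac2-4a739a0d9a39
```

This matches the `imageName` grepped from the PV above.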
3. A Keepalived problem (actually an IP conflict) made K8s nodes NotReady: pinging the VIP (virtual_ipaddress 192.168.31.235) was slow, while pinging its mcast_src_ip 192.168.31.211 was normal. Error: kubelet.go:2419] "Error getting node" err="node \"k8s-master03\" not found"
[root@k8s-master02 ~]# k get node
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready <none> 424d v1.24.0
k8s-master02 NotReady <none> 424d v1.24.0
...... k8s-master03 and k8s-node01 through k8s-node08 are likewise NotReady; pinging the VIP from them is slow
[root@k8s-node03 ~]# ping 192.168.31.235
64 bytes from 192.168.31.235: icmp_seq=14 ttl=64 time=134 ms
[root@k8s-node02 ~]# ping 192.168.31.235
64 bytes from 192.168.31.235: icmp_seq=1 ttl=64 time=0.437 ms
Test: shut down k8s-master01 and watch the VIP:
[root@k8s-master03 ~]# ping 192.168.31.235
PING 192.168.31.235 (192.168.31.235) 56(84) bytes of data.
64 bytes from 192.168.31.235: icmp_seq=3 ttl=64 time=109 ms
...... after shutting down k8s-master01, the VIP automatically fails over to k8s-master02
64 bytes from 192.168.31.235: icmp_seq=13 ttl=64 time=0.066 ms
...... pinging the VIP is slow, while pinging the VIP holder's real IP is normal -------------
[root@k8s-master02 ~]# ping 192.168.31.235
64 bytes from 192.168.31.235: icmp_seq=1 ttl=64 time=67.4 ms
[root@k8s-master02 ~]# ping 192.168.31.211
64 bytes from 192.168.31.211: icmp_seq=2 ttl=64 time=0.220 ms
- If a single node is NotReady, running systemctl restart network on that node brings it back to Ready.
- With multiple nodes NotReady, running systemctl restart network on the master node brings them all back to Ready.
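For context, a minimal keepalived.conf fragment consistent with the addresses mentioned in this section; this is a hedged sketch, not the cluster's actual configuration, and `virtual_router_id`/`priority` are placeholders:

```
vrrp_instance VI_1 {
    state MASTER
    interface enp4s0f0            # NIC holding the VIP on k8s-master01
    virtual_router_id 51          # placeholder
    priority 100                  # placeholder
    mcast_src_ip 192.168.31.211   # real IP of k8s-master01
    virtual_ipaddress {
        192.168.31.235            # the VIP pinged above
    }
}
```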
4. LVS (Linux Virtual Server) works at the network layer and provides high-performance, highly available server clustering. Being a generic component, it cannot be tuned for a specific application; it cannot load-balance long-lived connections; and it has no health checking of its own, so it must be paired with scripts or with software such as Keepalived. The biggest difference between LVS and SLB (Server Load Balancing) is that LVS operates at the network layer while SLB operates at the application layer.
Besides monitoring and failing over LVS resources, Keepalived can configure LVS directly, with no need to run ipvsadm by hand, because it invokes it itself; in the LVS+Keepalived model, all the work is done in the Keepalived configuration, which also health-checks the backend real servers. In short, Keepalived is an implementation of VRRP (the Virtual Router Redundancy Protocol). At heart it serves IPVS: IPVS is just a set of rules, and Keepalived's main job is to call ipvsadm to generate them.
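The "Keepalived drives IPVS" point can be made concrete with a virtual_server block: Keepalived translates it into ipvsadm rules and health-checks the real server itself. A hedged sketch with placeholder ports (this cluster's actual LVS configuration, if any, was not shown):

```
virtual_server 192.168.31.235 6443 {
    delay_loop 6          # health-check interval, seconds
    lb_algo rr            # round-robin scheduling
    lb_kind DR            # direct-routing forwarding
    protocol TCP
    real_server 192.168.31.211 6443 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```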
How VRRP fails over: when the virtual router's state switches and the Master role moves from one device to another, the new Master simply sends a gratuitous ARP packet carrying the virtual MAC address and virtual IP. That updates the ARP caches of the other devices, so hosts on the network never notice that the Master router is now a different box.
- MAC of NIC enp4s0f0 on k8s-master01, the host holding the VIP: e4:11:5b:0c:82:ae
- The ARP table on k8s-node03, however, maps the VIP to a different MAC: 192.168.31.235 <---> 02:f1:6c:96:36:21
[root@k8s-master01 ~]# ip a
4: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether e4:11:5b:0c:82:ae brd ff:ff:ff:ff:ff:ff
inet 192.168.31.211/24 brd 192.168.31.255 scope global enp4s0f0
inet 192.168.31.235/32 scope global enp4s0f0
[root@k8s-node03 ~]# arp -n
Address HWtype HWaddress Flags Mask Iface
192.168.31.211 ether e4:11:5b:0c:82:ae C enp4s0f0
192.168.31.235 ether 02:f1:6c:96:36:21 C enp4s0f0
[root@k8s-node03 ~]# ip neigh
192.168.31.235 dev enp4s0f0 lladdr e4:11:5b:0c:82:ae REACHABLE
192.168.31.211 dev enp4s0f0 lladdr e4:11:5b:0c:82:ae REACHABLE
When a node is NotReady, 192.168.31.235 resolves to MAC 02:f1:6c:96:36:21; after arping it (or running # arp -d 192.168.31.235), 192.168.31.235 resolves to e4:11:5b:0c:82:ae again and the node returns to Ready.
------------------k8s-node03 NotReady------------------
[root@k8s-master01 ~]# k get node
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready <none> 424d v1.24.0
k8s-node03 NotReady <none> 423d v1.24.0
[root@k8s-node03 ~]# ip neigh show
192.168.31.235 dev enp4s0f0 lladdr 02:f1:6c:96:36:21 REACHABLE
192.168.31.211 dev enp4s0f0 lladdr e4:11:5b:0c:82:ae REACHABLE
[root@k8s-node03 ~]# arp -n
Address HWtype HWaddress Flags Mask Iface
192.168.31.235 ether 02:f1:6c:96:36:21 C enp4s0f0
192.168.31.211 ether e4:11:5b:0c:82:ae C enp4s0f0
[root@k8s-node03 ~]# arping -I enp4s0f0 192.168.31.235
ARPING 192.168.31.235 from 192.168.31.216 enp4s0f0
Unicast reply from 192.168.31.235 [E4:11:5B:0C:82:AE] 0.745ms
[root@k8s-node03 ~]# ip neigh show
192.168.31.235 dev enp4s0f0 lladdr e4:11:5b:0c:82:ae REACHABLE
192.168.31.211 dev enp4s0f0 lladdr e4:11:5b:0c:82:ae REACHABLE
[root@k8s-node03 ~]# arp -n
Address HWtype HWaddress Flags Mask Iface
192.168.31.235 ether e4:11:5b:0c:82:ae C enp4s0f0
192.168.31.211 ether e4:11:5b:0c:82:ae C enp4s0f0
The output below shows why: the bogus pairing 192.168.31.235 [02:F1:6C:96:36:21] is the slow one (822.584ms). When one IP always answers with the same MAC, there is no conflict; when it answers with several MACs at once, the address is in conflict.
[root@k8s-master02 ~]# arping -I enp4s0f0 192.168.31.235
ARPING 192.168.31.235 from 192.168.31.212 enp4s0f0
Unicast reply from 192.168.31.235 [E4:11:5B:0C:82:AE] 0.798ms
Unicast reply from 192.168.31.235 [02:F1:6C:96:36:21] 822.584ms
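The "same IP, multiple MACs" rule can be scripted as a quick conflict check: count the distinct MACs that answer an arping. A hedged sketch; the sample replies below are the two lines captured above, and in live use you would pipe `arping -I enp4s0f0 -c 3 192.168.31.235` in instead:

```shell
# Decide "conflict" vs "ok" by counting distinct bracketed MACs in arping output.
replies='Unicast reply from 192.168.31.235 [E4:11:5B:0C:82:AE]  0.798ms
Unicast reply from 192.168.31.235 [02:F1:6C:96:36:21]  822.584ms'
macs=$(printf '%s\n' "$replies" | grep -o '\[[0-9A-F:]*\]' | sort -u | wc -l)
if [ "$macs" -gt 1 ]; then echo conflict; else echo ok; fi
```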
Test: with a single cable plugged into a LAN port of a TL-WR886N router (WAN cable unplugged, no other cable in any LAN port), 192.168.31.235 [02:F1:6C:96:36:21] shows up after a while; switching LAN ports makes no difference. The router's DHCP server was off. Root cause: the TL-WR886N had Wi-Fi enabled, and a phone on it had been given the static IP 192.168.31.235 (the phone was using a randomized MAC at the time).
*: Had the phone used its real hardware MAC, the device could have been identified on a MAC-address lookup site.
*: All of the symptoms seen during troubleshooting are now explained:
- ping 192.168.31.235 took 67.4 ms during the fault (vs. 0.745 ms normally) because the replies traveled over Wi-Fi;
- the MAC changed on arping because the phone coming into Wi-Fi range created the IP conflict;
- the MAC lookup site had no entry for 02:F1:6C:96:36:21 because the phone used a randomized MAC;
- the IP conflict was possible because the TL-WR886N's LAN ports and its Wi-Fi network share one subnet;
- 192.168.31.211 and 192.168.31.235 showed different MACs because at that moment 192.168.31.235 belonged to the phone, which had joined the same subnet over Wi-Fi;
- rebooting a machine or restarting its network service brought it back to Ready because either refreshes the ARP cache with the correct MAC.
5. arping is mainly used to find an IP's MAC address and to detect IP-address conflicts. There are two arping implementations: the one by Thomas Habets accepts a MAC address (arping <MAC>) and can resolve a MAC to an IP; the one in the Linux iputils suite cannot resolve an IP from a MAC. (See 浅谈arping.)
# arping
ARPing 2.11, by Thomas Habets <thomas@habets.se>
# arping -i enp4s0f0 -c 1 -t e8:39:35:20:8b:68 192.168.31.212 //confirm the given IP is bound to the NIC with the given MAC
ARPING 192.168.31.212
60 bytes from e8:39:35:20:8b:68 (192.168.31.212): index=0 time=11.179 msec
# arping -i enp4s0f0 -c 1 -T 192.168.31.212 e8:39:35:20:8b:68 //confirm the NIC with the given MAC holds the given IP
ARPING e8:39:35:20:8b:68
60 bytes from 192.168.31.212 (e8:39:35:20:8b:68): icmp_seq=0 time=3.206 msec
# tcpdump -i enp4s0f0 -nec 2 icmp
18:08:04.001870 e4:11:5b:0c:82:ae > e8:39:35:20:8b:68, ethertype IPv4 (0x0800), length 42: 192.168.31.211 > 192.168.31.212: ICMP echo request, id 17767, seq 0, length 8
18:08:04.002106 e8:39:35:20:8b:68 > e4:11:5b:0c:82:ae, ethertype IPv4 (0x0800), length 60: 192.168.31.212 > 192.168.31.211: ICMP echo reply, id 17767, seq 0, length 8
The following fails; the cause is still to be investigated:
# arping -i enp4s0f0 -c 1 e8:39:35:20:8b:68 //look up the IP for a given MAC
ARPING e8:39:35:20:8b:68
--- e8:39:35:20:8b:68 statistics ---
1 packets transmitted, 0 packets received, 100% unanswered (0 extra)
---- the broadcast query gets no reply -------------
# tcpdump -i enp4s0f0 -nec 2 icmp
18:10:18.899135 e4:11:5b:0c:82:ae > e8:39:35:20:8b:68, ethertype IPv4 (0x0800), length 42: 192.168.31.211 > 255.255.255.255: ICMP echo request, id 17767, seq 0, length 8