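High-availability clusters with keepalived
What is the core technology behind keepalived?
VRRP: the Virtual Router Redundancy Protocol, which removes the single point of failure of a statically configured gateway
Physical layer: routers (a master device and backup devices) and a layer-3 switch (used to connect the master and backup devices)
Software layer: keepalived
Related concepts:
Connection between master and backup devices: the heartbeat line
Master and backup device priority
Master and backup operating behavior: preemptive or non-preemptive
Authentication between master and backup devices
Working modes: master/backup, master/master, backup/master (virtual router 2)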
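The priority rule listed above can be illustrated with a small shell sketch. It is not keepalived code: `elect_master` is a hypothetical helper that reads "address priority" pairs and applies the VRRP rule that the highest priority wins, with the higher IPv4 address breaking a tie. The addresses and priorities in the usage line are example values.

```shell
# elect_master: hypothetical helper, not part of keepalived.
# Reads "address priority" pairs on stdin and prints the address that would
# become master: highest priority first, higher IPv4 address on a tie.
elect_master() {
    awk '
    function ip2num(ip,    p) {
        split(ip, p, ".")
        return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
    }
    NF == 2 {
        prio = $2 + 0
        num = ip2num($1)
        if (best == "" || prio > bestprio || (prio == bestprio && num > bestnum)) {
            best = $1; bestprio = prio; bestnum = num
        }
    }
    END { if (best != "") print best }'
}

# Example: the node with priority 100 wins over the node with priority 80.
printf '10.0.0.101 100\n10.0.0.102 80\n' | elect_master    # prints 10.0.0.101
```

In preemptive mode this election is effectively repeated when a higher-priority node comes back, so it takes the VIP again; in non-preemptive mode the current master would keep it.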
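What kinds of clusters can be configured on Linux?
LB: load-balancing clusters. Typical examples: LVS, HAProxy, nginx
HA: high-availability clusters. Typical example: keepalived, which provides high availability for stateless applications. Stateful applications such as MySQL and Redis rely on their own high-availability solutions, for example MySQL Group Replication or PXC.
HPC: high-performance computing clusters
Keepalived architecture and installation
Package installation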
#CentOS
[root@centos ~]#yum -y install keepalived
#ubuntu
[root@ubuntu2004 ~]#apt -y install keepalived
Installing keepalived on Ubuntu
[root@ubuntu2004 ~]#apt -y install keepalived
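#There is no configuration file by default, so the service cannot start
#Generate a configuration file from the bundled sample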
[root@ubuntu2004 ~]#cp /usr/share/doc/keepalived/samples/keepalived.conf.sample /etc/keepalived/keepalived.conf
[root@ubuntu2004 ~]#systemctl start keepalived.service
Check the processes
[root@ubuntu2004 ~]#ps auxf |grep keepalived
Compiling and installing with a script
[root@ubuntu2004 ~]#cat install_keepalived.sh
#!/bin/bash
KEEPALIVED_VERSION=2.2.7
#KEEPALIVED_VERSION=2.2.2
#KEEPALIVED_VERSION=2.0.20
KEEPALIVED_FILE=keepalived-${KEEPALIVED_VERSION}.tar.gz
KEEPALIVED_INSTALL_DIR=/apps/keepalived
SRC_DIR=/usr/local/src
KEEPALIVED_URL=https://keepalived.org/software/
CPUS=`grep -c processor /proc/cpuinfo`
. /etc/os-release
color () {
RES_COL=60
MOVE_TO_COL="echo -en \\033[${RES_COL}G"
SETCOLOR_SUCCESS="echo -en \\033[1;32m"
SETCOLOR_FAILURE="echo -en \\033[1;31m"
SETCOLOR_WARNING="echo -en \\033[1;33m"
SETCOLOR_NORMAL="echo -en \E[0m"
echo -n "$1" && $MOVE_TO_COL
echo -n "["
if [ $2 = "success" -o $2 = "0" ] ;then
${SETCOLOR_SUCCESS}
echo -n $" OK "
elif [ $2 = "failure" -o $2 = "1" ] ;then
${SETCOLOR_FAILURE}
echo -n $"FAILED"
else
${SETCOLOR_WARNING}
echo -n $"WARNING"
fi
${SETCOLOR_NORMAL}
echo -n "]"
echo
}
download_file (){
cd ${SRC_DIR}
if [ $ID = 'centos' -o $ID = 'rocky' ];then
rpm -q wget &> /dev/null || yum -y install wget
elif [ $ID = 'ubuntu' ];then
dpkg -l |grep wget || { apt update; apt install -y wget; }
else
color "Unsupported operating system, exiting!" 1
exit 1
fi
if [ ! -e ${KEEPALIVED_FILE} ];then
wget --no-check-certificate ${KEEPALIVED_URL}${KEEPALIVED_FILE}
[ $? -ne 0 ] && { color "Failed to download the keepalived source tarball" 1 ; exit 1; }
fi
}
install_keepalived () {
if [ $ID = 'centos' -o $ID = 'rocky' ];then
yum -y install make gcc ipvsadm autoconf automake openssl-devel libnl3-devel iptables-devel net-snmp-devel glib2-devel pcre2-devel libmnl-devel systemd-devel &> /dev/null
elif [ $ID = 'ubuntu' ];then
apt update
apt -y install make gcc ipvsadm build-essential pkg-config automake autoconf libipset-dev libnl-3-dev libnl-genl-3-dev libssl-dev libxtables-dev libip4tc-dev libip6tc-dev libipset-dev libmagic-dev libsnmp-dev libglib2.0-dev libpcre2-dev libnftnl-dev libmnl-dev libsystemd-dev
else
color "Unsupported operating system, exiting!" 1
exit 1
fi
tar xf ${KEEPALIVED_FILE}
cd keepalived-${KEEPALIVED_VERSION}
./configure --prefix=${KEEPALIVED_INSTALL_DIR} --disable-fwmark
make -j $CPUS && make install
if [ $? -eq 0 ];then
color "keepalived compiled and installed successfully" 0
else
color "keepalived compilation or installation failed, exiting!" 1
exit 1
fi
[ -d /etc/keepalived ] || mkdir -p /etc/keepalived
cp ${KEEPALIVED_INSTALL_DIR}/etc/keepalived/keepalived.conf.sample /etc/keepalived/keepalived.conf
cp ./keepalived/keepalived.service /lib/systemd/system/
}
start_keepalived () {
systemctl daemon-reload
systemctl enable --now keepalived &> /dev/null
systemctl is-active keepalived
if [ $? -eq 0 ] ;then
color "Keepalived service installed successfully!" 0
else
color "Keepalived service installation failed!" 1
exit 1
fi
}
download_file
install_keepalived
start_keepalived
Check the status after installation
[root@ubuntu2004 ~]#systemctl status keepalived.service
Check the IP addresses
inet 192.168.200.16/32 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.200.17/32 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.200.18/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe69:a082/64 scope link
The generated VIPs cannot be pinged; vrrp_strict has to be commented out in the configuration file
[root@ubuntu2004 ~]#ping 192.168.200.16
PING 192.168.200.16 (192.168.200.16) 56(84) bytes of data.
ping: sendmsg: Operation not permitted
[root@ubuntu2004 ~]#vim /etc/keepalived/keepalived.conf
#vrrp_strict
[root@ubuntu2004 ~]#systemctl restart keepalived.service
[root@ubuntu2004 ~]#ping 192.168.200.16
PING 192.168.200.16 (192.168.200.16) 56(84) bytes of data.
64 bytes from 192.168.200.16: icmp_seq=1 ttl=64 time=0.027 ms
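Since an active vrrp_strict line is what made the VIPs unreachable here, checking for it before a restart can save a round of debugging. The following is a minimal sketch; `has_vrrp_strict` is a hypothetical helper name, and it only inspects the single file it is given (it does not follow include directives).

```shell
# has_vrrp_strict FILE: hypothetical helper.
# Succeeds (exit status 0) when FILE contains an active, uncommented
# vrrp_strict line, and fails otherwise.
has_vrrp_strict() {
    grep -Eq '^[[:space:]]*vrrp_strict([[:space:]]|#|$)' "$1"
}

# Usage (path as used in this document):
#   has_vrrp_strict /etc/keepalived/keepalived.conf && echo "vrrp_strict is active"
```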
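keepalived configuration overview
The configuration file has three parts; the first two matter most
GLOBAL CONFIGURATION #global configuration (shared by the cluster services)
Global definitions: mail settings, router_id, VRRP settings, multicast address, etc.
VRRP CONFIGURATION #VRRP configuration (defines the VIPs)
VRRP instance(s): defines each VRRP virtual router
LVS CONFIGURATION #LVS configuration (built into keepalived; can be deleted when keepalived is used with HAProxy or nginx)
Virtual server group(s)
Virtual server(s): the VS and RS of the LVS cluster
Global configuration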
[root@ubuntu2004 ~]#vim /etc/keepalived/keepalived.conf
global_defs {
notification_email { #mail notification: can be deleted if mail cannot be sent
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
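router_id LVS_DEVEL #Key setting. Unique identifier of each keepalived host; using the host name is recommended. Duplicate names across nodes can affect execution of the switchover scripts
vrrp_skip_check_adv_addr #By default every advertisement is fully checked, which costs performance. With this option enabled, the check is skipped when an advertisement comes from the same router as the previous one
#vrrp_strict #Strict compliance with the VRRP protocol. With this enabled the service will not start if: 1. no VIP is configured 2. unicast peers are configured 3. IPv6 addresses are used with VRRP version 2. If it is enabled without vrrp_iptables, iptables firewall rules are added automatically and the VIP becomes unreachable by default; leaving this option out is recommended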
vrrp_garp_interval 0
vrrp_gna_interval 0
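vrrp_mcast_group4 224.0.0.18 #IPv4 multicast group address, default 224.0.0.18; capture test: tcpdump -i eth0 host 224.0.0.18
vrrp_iptables #When enabled together with vrrp_strict, no firewall rules are added; not needed if vrrp_strict is not set. Note: newer versions still add iptables rules even with this option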
}
Simplified global configuration (keeping a single setting is enough)
[root@ubuntu2004 ~]#vim /etc/keepalived/keepalived.conf
global_defs { #simplified global configuration
router_id ka1
}
vrrp_instance VI_1 { #virtual router
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.200.16
192.168.200.17
192.168.200.18
}
}
Configuring a virtual router on two nodes (one service)
[root@ubuntu2004 ~]#vim /etc/keepalived/keepalived.conf
global_defs {
router_id ka1 #different on each node: ka1 and ka2
}
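vrrp_instance VI_1 { #instance name, used to tell services apart; with several services, each gets its own name
state MASTER #MASTER on the master node, BACKUP on the backup node (only the initial state; the actual master is decided by priority)
interface eth0 #VRRP advertisements are multicast through eth0; this need not be the NIC that carries the VIP
virtual_router_id 51 #virtual router ID; must be the same on every node of one virtual router; valid range 1-255
priority 100 #priority; use different values, e.g. master 100, backup 80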
advert_int 1
authentication {
auth_type PASS
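auth_pass 1111 #shared password; keeps other machines that happen to use the same virtual router ID from joining the election and taking the VIP
} #tcpdump -i eth0 -nn host 224.0.0.18 -vvv
virtual_ipaddress { #the VIPs; bound to eth0 by default, can be bound to another NIC if one exists
10.0.0.100/24 dev eth0 label eth0:1 #one service, one VIP
#10.0.0.200/24 dev eth0 label eth0:2 #with several services, several VIPs can be bound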
}
}
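Because the two nodes differ only in state and priority, the instance block can be generated rather than edited by hand. The sketch below assumes a hypothetical helper, `render_vrrp_instance`, that prints a block in the same layout as above; the argument order is an arbitrary choice for illustration, and the only validation is the VRRP rule that the virtual router ID is an integer from 1 to 255.

```shell
# render_vrrp_instance NAME STATE IFACE VRID PRIORITY PASS VIP
# Hypothetical helper: prints a vrrp_instance block to stdout.
render_vrrp_instance() {
    # Reject a virtual router ID that is not an integer from 1 to 255.
    case $4 in
        ''|*[!0-9]*) echo "invalid virtual_router_id: $4" >&2; return 1 ;;
    esac
    if [ "$4" -lt 1 ] || [ "$4" -gt 255 ]; then
        echo "virtual_router_id out of range: $4" >&2
        return 1
    fi
    cat <<EOF
vrrp_instance $1 {
    state $2
    interface $3
    virtual_router_id $4
    priority $5
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass $6
    }
    virtual_ipaddress {
        $7
    }
}
EOF
}

# Usage, with the values from the example above:
#   render_vrrp_instance VI_1 MASTER eth0 51 100 1111 "10.0.0.100/24 dev eth0 label eth0:1"   # on ka1
#   render_vrrp_instance VI_1 BACKUP eth0 51 80 1111 "10.0.0.100/24 dev eth0 label eth0:1"    # on ka2
```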
Example: keepalived with one master and one backup
First node (MASTER): global configuration and virtual router configuration
[root@ka1 ~]#vim /etc/keepalived/keepalived.conf
global_defs {
router_id ka1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 66
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
10.0.0.100/24 dev eth0 label eth0:1
}
}
[root@ka1 ~]#systemctl restart keepalived.service
Second node (BACKUP): global configuration and virtual router configuration
[root@ka2 ~]#vim /etc/keepalived/keepalived.conf
global_defs {
router_id ka2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 66
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
10.0.0.100/24 dev eth0 label eth0:1
}
}
[root@ka2 ~]#systemctl restart keepalived.service
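The two files above must agree on virtual_router_id, advert_int and auth_pass, and should differ in router_id and priority. A rough consistency check is sketched below, assuming copies of both nodes' files are available on one machine. `conf_value` and `check_peers` are hypothetical helper names; the sketch reads only the first occurrence of each keyword, so it assumes one vrrp_instance per file.

```shell
# conf_value FILE KEY: print the first value configured for KEY in FILE,
# ignoring anything after a '#' comment marker.
conf_value() {
    awk -v key="$2" '{ sub(/#.*/, "") } $1 == key { print $2; exit }' "$1"
}

# check_peers FILE_A FILE_B: return 0 when the two node configurations are
# consistent with each other, 1 otherwise (with a message per problem).
check_peers() {
    rc=0
    # These settings must be identical on both nodes.
    for key in virtual_router_id advert_int auth_pass; do
        if [ "$(conf_value "$1" "$key")" != "$(conf_value "$2" "$key")" ]; then
            echo "MISMATCH: $key differs between $1 and $2"
            rc=1
        fi
    done
    # These settings should be different on each node.
    for key in router_id priority; do
        if [ "$(conf_value "$1" "$key")" = "$(conf_value "$2" "$key")" ]; then
            echo "CONFLICT: $key is identical in $1 and $2"
            rc=1
        fi
    done
    return $rc
}

# Usage (file names are placeholders for copies of each node's configuration):
#   check_peers ka1.keepalived.conf ka2.keepalived.conf && echo "peers look consistent"
```

Equal priorities are not an error for VRRP itself; the check flags them only because these notes recommend distinct values.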
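After startup, run tcpdump -i eth0 -nn host 224.0.0.18 -vvv on another machine to observe the multicast advertisements sent from eth0 of 10.0.0.101
Ping 10.0.0.100 from another machine; the MAC address it resolves belongs to the first node, which has the higher priority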
[root@ubuntu2004 ~]#ping 10.0.0.100
PING 10.0.0.100 (10.0.0.100) 56(84) bytes of data.
64 bytes from 10.0.0.100: icmp_seq=1 ttl=64 time=1.34 ms
[root@ubuntu2004 ~]#arp -n
10.0.0.100 ether 00:0c:29:69:a0:82 C eth0
Check which machine the MAC address belongs to
[root@ka1 ~]#ip a
First node: link/ether 00:0c:29:69:a0:82 brd ff:ff:ff:ff:ff:ff
[root@ka2 ~]#ip a
Second node: link/ether 00:0c:29:51:27:c8 brd ff:ff:ff:ff:ff:ff
Stop keepalived on the first node; the VIP floats to the second node. Check the MAC address again
[root@ubuntu2004 ~]#systemctl stop keepalived.service
[root@ubuntu2004 ~]#arp -n
10.0.0.100 ether 00:0c:29:51:27:c8 C eth0
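The MAC address is now that of the second node
Start the first node again and repeat the check; the VIP resolves to the first node's MAC again, because that node has the higher priority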
[root@ubuntu2004 ~]#systemctl start keepalived.service
[root@ubuntu2004 ~]#arp -n
10.0.0.100 ether 00:0c:29:69:a0:82 C eth0
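The same behavior can be observed with a packet capture. The current master keeps multicasting its state so that the other nodes know it is alive. When the first node is stopped, it sends an advertisement with priority 0; the second node then has the highest priority and takes over from the first.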
[root@ubuntu2004 ~]#tcpdump -i eth0 -nn host 224.0.0.18
Enable keepalived logging (on all nodes)
[root@ubuntu2004 ~]#vim /apps/keepalived/etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -S 6"
[root@ubuntu2004 ~]#vim /etc/rsyslog.d/keepalived.conf
local6.* /var/log/keepalived.log
Restart the services
[root@ubuntu2004 ~]#systemctl restart keepalived.service
[root@ubuntu2004 ~]#systemctl restart rsyslog.service
Check the log
[root@ubuntu2004 ~]#ls /var/log/keepalived.log
/var/log/keepalived.log
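Keepalived sub-configuration files
In production there is often more than one service, so the configuration file would have to hold many addresses.
In a complex production environment, keeping the configuration of every cluster in /etc/keepalived/keepalived.conf makes the file long and hard to manage.
The configuration of each cluster, for example its VIPs, can be placed in a separate sub-configuration file and pulled in with the include directive.
Example (repeat these steps on every keepalived node)
Step 1: move the VIP-related settings out of the main configuration file and add an include directive pointing to where they are stored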
[root@ubuntu2004 ~]#vim /etc/keepalived/keepalived.conf
global_defs {
router_id ka2
}
include /etc/keepalived/conf.d/*.conf
Step 2: store the settings that were moved out in /etc/keepalived/conf.d/*.conf
[root@ubuntu2004 ~]#mkdir /etc/keepalived/conf.d/
[root@ubuntu2004 ~]#vim /etc/keepalived/conf.d/www.meng.org.conf
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 66
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
10.0.0.100/24 dev eth0 label eth0:1
}
}
[root@ubuntu2004 ~]#systemctl restart keepalived.service
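Create one sub-configuration file per service to hold its VIP settings; the files do not interfere with each other and each is configured on its own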
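Once the settings are spread over several files, it is easy to reuse a virtual_router_id by accident. The sketch below lists IDs that appear more than once under a conf.d directory. `duplicate_vrids` is a hypothetical helper name, and the check assumes all instances run on the same interface, where each virtual router needs its own ID.

```shell
# duplicate_vrids DIR: hypothetical helper.
# Prints every virtual_router_id that occurs more than once in DIR/*.conf;
# prints nothing when all IDs are unique.
duplicate_vrids() {
    cat "$1"/*.conf 2>/dev/null |
        awk '{ sub(/#.*/, "") } $1 == "virtual_router_id" { print $2 }' |
        sort -n | uniq -d
}

# Usage (directory as used in this document):
#   duplicate_vrids /etc/keepalived/conf.d
```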
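In keepalived, the node with the higher priority takes the VIP. If the connection between the nodes breaks, however, the lower-priority node stops receiving the multicast advertisements of the higher-priority node and brings up the VIP itself. Both nodes then hold the VIP and both multicast advertisements. This is split brain (a broken heartbeat line causes it as well).
Example: on the lower-priority node, use iptables to drop the advertisements coming from the higher-priority node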
[root@ubuntu2004 ~]#iptables -A INPUT -s 10.0.0.101 -j DROP
Both nodes can now be seen claiming the VIP
[root@ubuntu2004 ~]#tcpdump -i eth0 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:00:41.654551 IP 10.0.0.101 > 224.0.0.18: VRRPv2, Advertisement, vrid 66, prio 80, authtype simple, intvl 1s, length 20
21:00:41.683218 IP 10.0.0.102 > 224.0.0.18: VRRPv2, Advertisement, vrid 66, prio 100, authtype simple, intvl 1s, length 20
Check the IP addresses on each node; both hold the VIP
[root@ubuntu2004 ~]#hostname -I
10.0.0.102 10.0.0.100
[root@ubuntu2004 ~]#hostname -I
10.0.0.101 10.0.0.100
Flush the iptables rules after the test
[root@ubuntu2004 ~]#iptables -F
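The capture above shows two different sources advertising for vrid 66, which is the signature of split brain. A minimal sketch for spotting this automatically follows. `split_brain_vrids` is a hypothetical helper that reads tcpdump text output in the format shown above. Two advertisers can also appear briefly during a normal failover, so the capture should be taken in steady state.

```shell
# split_brain_vrids: hypothetical helper.
# Reads tcpdump output on stdin and reports every vrid for which more than
# one source address sent VRRP advertisements.
split_brain_vrids() {
    awk '
    /VRRPv[23], Advertisement/ {
        src = $3
        vrid = ""
        # The vrid value is the token after the word "vrid", with a trailing comma.
        for (i = 1; i < NF; i++)
            if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
        if (vrid != "" && !((vrid, src) in seen)) {
            seen[vrid, src] = 1
            count[vrid]++
        }
    }
    END {
        for (v in count)
            if (count[v] > 1) print "vrid " v ": " count[v] " nodes advertising"
    }'
}

# Usage: capture ten packets and analyse them.
#   tcpdump -i eth0 -nn -c 10 host 224.0.0.18 2>/dev/null | split_brain_vrids
```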
Add a NIC, eth1, to the keepalived nodes and change the configuration
Add the eth1 NIC
Node 1: (192.168.10.100)
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:69:a0:8c brd ff:ff:ff:ff:ff:ff
inet 192.168.10.100/24 brd 192.168.10.255 scope global eth1
Node 2: (192.168.10.101)
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:51:27:d2 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.101/24 brd 192.168.10.255 scope global eth1
Change the configuration
Node 1 (master)
[root@ubuntu2004 ~]#vim /etc/keepalived/conf.d/www.meng.org.conf
vrrp_instance VI_1 {
state MASTER
interface eth1
virtual_router_id 66
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
10.0.0.100/24 dev eth0 label eth0:1
}
}
[root@ubuntu2004 ~]#systemctl restart keepalived.service
Check the IP addresses
[root@ubuntu2004 ~]#hostname -I
10.0.0.101 10.0.0.100 192.168.10.100
Node 2 (backup)
[root@ubuntu2004 ~]#vim /etc/keepalived/conf.d/www.meng.org.conf
vrrp_instance VI_1 {
state BACKUP
interface eth1
virtual_router_id 66
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
10.0.0.100/24 dev eth0 label eth0:1
}
}
[root@ubuntu2004 ~]#systemctl restart keepalived.service
[root@ubuntu2004 ~]#hostname -I
10.0.0.102 192.168.10.101
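After the interface is changed, the advertisements no longer appear on the 10.0.0.0 network; they are sent and received through the NICs on the 192.168.10 network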
[root@ubuntu2004 ~]#tcpdump -i eth1 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
21:41:19.605659 IP 192.168.10.100 > 224.0.0.18: VRRPv2, Advertisement, vrid 66, prio 100, authtype simple, intvl 1s, length 20
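In this configuration eth1 is the NIC that multicasts the VRRP advertisements, and eth0 is the service NIC. If the switch fails, or the eth1 NICs end up on different networks (for example one in host-only mode and the other in NAT or bridged mode, which simulates an unplugged cable), the nodes can no longer exchange advertisements. The lower-priority node assumes the higher-priority node is down, multicasts its own advertisements and brings up the VIP. Both nodes then hold the VIP, which is split brain.
Split brain
The master and backup nodes hold the VIP at the same time; this is split brain
Note: causes of split brain
Heartbeat line failure
Firewall misconfiguration
Keepalived misconfiguration (for example, each node configured with a different virtual router ID)
Interview question: how do you avoid split brain in production?
Monitor for split brain and report it promptly when it occurs
Use arping -I eth1 10.0.0.100: if two MAC addresses reply, there is split brain
Example (the VIP in this test is 10.0.0.150): normally a single MAC address replies
[root@ubuntu2004 ~]#arping -I eth1 10.0.0.150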
ARPING 10.0.0.150
60 bytes from 00:0c:29:69:a0:8c (10.0.0.150): index=0 time=498.002 usec
60 bytes from 00:0c:29:69:a0:8c (10.0.0.150): index=1 time=534.680 usec
Use an iptables rule to block communication between the nodes
[root@ubuntu2004 ~]#iptables -A INPUT -s 192.168.10.100 -j DROP
[root@ubuntu2004 ~]#arping -I eth1 -c1 10.0.0.150
ARPING 10.0.0.150
60 bytes from 00:0c:29:69:a0:8c (10.0.0.150): index=0 time=625.363 usec
60 bytes from 00:0c:29:51:27:d2 (10.0.0.150): index=1 time=1.215 msec
--- 10.0.0.150 statistics ---
1 packets transmitted, 2 packets received, 0% unanswered (1 extra)
rtt min/avg/max/std-dev = 0.625/0.920/1.215/0.295 ms
One request was sent and two replies came back: both nodes hold the address, so there is split brain.
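The arping check lends itself to monitoring. The sketch below counts the distinct MAC addresses found in arping output; more than one means that several hosts answer for the VIP. `count_reply_macs` is a hypothetical helper name. It simply extracts anything shaped like a MAC address, so it assumes no other MAC addresses appear in the output.

```shell
# count_reply_macs: hypothetical helper.
# Reads arping output on stdin and prints the number of distinct MAC
# addresses it contains (compared case-insensitively).
count_reply_macs() {
    grep -Eio '([0-9a-f]{2}:){5}[0-9a-f]{2}' |
        tr 'A-F' 'a-f' | sort -u | wc -l | tr -d ' '
}

# Usage (interface and VIP as in the test above):
#   n=$(arping -I eth1 -c1 10.0.0.150 | count_reply_macs)
#   [ "$n" -gt 1 ] && echo "split brain: $n hosts answer for the VIP"
```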