pg_auto_failover的一主两备部署
  IE5LYMWlmdvL 2023年11月22日 20 0

适用范围

本文档适用于在 CentOS/RHEL 7.X 环境下安装部署 PostgreSQL 14.X 的 pg_auto_failover 高可用环境。

方案概述

一、总体概述

1.1、pg_auto_failover 的简单介绍

1.1.1、pg_auto_failover 的参考资料
1.1.2、pg_auto_failover 的三种高可用架构:
  • Single Standby Architecture
  • Multiple Standby Architecture,本文使用的架构
  • Multiple Standbys Architecture with 3 standby nodes, one async
1.1.3、流复制同步模式:
  • 异步:性能比较好,Standby上的数据落后于主库上的数据,如果使用Hot Standby做读写分离,就会存在数据不一致的问题。极端情况下,主库如果宕机,被库被激活成主库,部分 WAL 没有发送到备库,可能造成数据丢失。
  • 同步:备库保证数据完全一致,但是可用性比较差,数据写入备库后,事务的commit才返回,所以备库出现问题时,会导致主库被Hang住。经常会受到网络波动的影响。
  • quorum:基于规定数量的同步模式,事务提交将会等待后备节点中任意规定数量的节点返回回复信息就认为同步完成。

1.2、节点基本信息

主库(读写)

备节点1(只读)

备节点2(只读)

监控节点

操作系统

RHEL 7.9

RHEL 7.9

RHEL 7.9

RHEL 7.9

主机名

pgtest1

pgtest2

pgtest3

pg_monitor

ip地址

192.168.0.31

192.168.0.32

192.168.0.33

192.168.0.29

1.3、软件基本信息

软件

版本

端口

服务名

PostgreSQL

14.6

5432

pg_auto_failover

2.0

pg-auto-failover.service

multissh(选用)

0.4.0

1.4、目录规划信息

用途

目录

软连接

PostgreSQL的软件目录

/data/app/pg14/14.6

/data/app/pgsql

PostgreSQL的数据目录

/data/pgdata/pgdata_14_5432

/data/pgdata/pgdata

PostgreSQL的WAL目录

/data/pgdata/pgwal_14_5432

/data/pgdata/pgwal

PostgreSQL的归档目录

/data/pgdata/pgarch

PostgreSQL的日志目录

/data/app_log/pglog

PostgreSQL的源码目录

/data/soft/postgresql-14.6

PostgreSQL的插件目录

/data/soft/postgresql-14.6/contrib

pg_auto_failover的源码目录

/data/soft/pg_auto_failover-2.0

实施步骤

二、操作系统配置(所有节点,包括监控节点)

2.1、配置系统服务



# 关闭防火墙,关闭 SELINUX
systemctl stop firewalld.service
systemctl disable firewalld.service
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config 
setenforce 0



2.2、修改主机名



hostnamectl set-hostname pgtest1
hostnamectl set-hostname pgtest2
hostnamectl set-hostname pgtest3
hostnamectl set-hostname pg_monitor



2.3、修改操作系统语言



localectl status
localectl set-locale LANG=en_US.UTF-8



2.4、安装软件包



# 配置yum的repo文件
cat > /etc/yum.repos.d/enmo.repo << EOF
[Server]
name=Server
baseurl=file:///media/cdrom
enabled=yes
gpgcheck=0
EOF

# 挂载操作系统镜像
mkdir /media/cdrom
mount /dev/cdrom /media/cdrom
mount /root/rhel-server-7.9-x86_64-dvd.iso /media/cdrom

# 安装软件包
yum -y install libevent libevent-devel sudo vim acl less curl wget flex libselinux-devel readline-devel zlib zlib-devel gettext-devel openssl openssl-devel pam pam-devel libxml2 libxml2-devel libxslt libxslt-devel perl perl-devel perl-ExtUtils* tcl-devel systemd-devel pcre-devel gcc gcc-c++ make tree psmisc net-tools libnl* ntp uuid-devel



2.5、修改主机时间



# 确保节点间时间和时区同步,有条件的同步时间服务器
timedatectl
timedatectl list-timezones
timedatectl set-timezone Asia/Shanghai
date -s "20211001 00:00:00"
ntpdate time.windows.com && hwclock -w



2.6、配置/etc/hosts解析文件



cat >> /etc/hosts <<EOF
192.168.0.31   pgtest1
192.168.0.32   pgtest2
192.168.0.33   pgtest3
192.168.0.29   pg_monitor
EOF



2.7、创建用户和组



groupadd -g 5432 postgres
useradd -u 5432 -g postgres postgres
echo postgres |passwd --stdin postgres



2.8、创建目录和软连接



mkdir -p /data/app/pg14/14.6
mkdir -p /data/pgdata/pgdata_14_5432
mkdir -p /data/pgdata/pgwal_14_5432
mkdir -p /data/pgdata/pgarch
mkdir -p /data/app_log/pglog
mkdir -p /data/soft
chown -R postgres:postgres /data
chmod 0700 /data/pgdata/pgdata_14_5432 /data/pgdata/pgwal_14_5432 /data/pgdata/pgarch
ln -s /data/app/pg14/14.6 /data/app/pgsql
ln -s /data/pgdata/pgdata_14_5432 /data/pgdata/pgdata
ln -s /data/pgdata/pgwal_14_5432 /data/pgdata/pgwal



2.9、配置系统内核参数



v_shmmax=`free |grep Mem|awk '{print $2 *1024}'`
v_shmall=$(expr `free |grep Mem|awk '{print $2 *1024}'` / `getconf PAGE_SIZE`)
cat >> /etc/sysctl.conf <<EOF
# for PostgreSQL
kernel.shmmax = $v_shmmax   # free |grep Mem|awk '{print $2 *1024}'
kernel.shmall = $v_shmall   # expr `free |grep Mem|awk '{print $2 *1024}'` / `getconf PAGE_SIZE`
kernel.shmmni = 4096
kernel.sem = 50100 64128000 50100 1280
kernel.numa_balancing=0
kernel.sched_migration_cost_ns=5000000
kernel.sched_autogroup_enabled=0
fs.file-max = 76724200
fs.aio-max-nr = 40960000
net.ipv4.ip_local_port_range = 9000 65535
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 8192   # 4096
net.ipv4.tcp_timestamps = 0
net.ipv4.ip_nonlocal_bind=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_reuse=1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.core.rmem_default = 1048576
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 4194304
net.core.netdev_max_backlog = 10000
net.core.somaxconn=65535
vm.overcommit_memory = 2  # 0,1,2
vm.dirty_ratio=20
vm.dirty_background_ratio=3
vm.dirty_writeback_centisecs=100
vm.dirty_expire_centisecs=1000
vm.dirty_background_bytes=67108864
vm.dirty_bytes=536870912
vm.swappiness=1
vm.min_free_kbytes=524288
vm.zone_reclaim_mode=0
EOF
sysctl -p



2.10、配置系统资源限制



cat >> /etc/security/limits.conf << EOF
# for PostgreSQL
postgres    soft    nofile   1048576
postgres    hard    nofile   1048576
postgres    soft    nproc    unlimited
postgres    hard    nproc    unlimited
postgres    soft    stack    unlimited
postgres    hard    stack    unlimited
postgres    soft    core    unlimited
postgres    hard    core    unlimited
postgres    soft    memlock   unlimited
postgres    hard    memlock   unlimited
EOF

cat >> /etc/pam.d/login << EOF
# for PostgreSQL
session    required     pam_limits.so
EOF



2.11、禁用IPV6网络



cat >> /etc/sysconfig/network << EOF
# for PostgreSQL
NETWORKING_IPV6=NO
EOF



2.12、配置环境变量



cat >> /etc/profile << EOF
# for PostgreSQL
export LANG=en_US.UTF-8
export PGHOME=/data/app/pgsql
export PGUSER=postgres
export PGPORT=5432
export PGDATA=/data/pgdata/pgdata
export PATH=\$PGHOME/bin:\$PATH:\$HOME/bin
export LD_LIBRARY_PATH=\$PGHOME/lib:/usr/local/libevent/lib:\$LD_LIBRARY_PATH
EOF
source /etc/profile



2.13、禁用透明大页内存



[root@pgtest1 ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS
UEFI

[root@pgtest1 ~]# vi /etc/grub2-efi.cfg
linuxefi ... quiet LANG=en_US.UTF-8 transparent_hugepage=never
...
linuxefi ... quiet transparent_hugepage=never



三、源码编译安装软件(所有节点,包括监控节点)

3.1、安装 PostgreSQL 14.6



# 解压缩软件包
tar -xzvf /data/soft/postgresql-14.6.tar.gz -C /data/soft
# 编译安装
cd /data/soft/postgresql-14.6
./configure --prefix=/data/app/pg14/14.6 --with-pgport=5432 --with-openssl
make -j 8 && make install
# 安装工具集
cd /data/soft/postgresql-14.6/contrib
make -j 8 && make install
# 查询版本,确认安装成功
postgres --version       # postgres (PostgreSQL) 14.6



注意: pg_auto_failover 插件 pg_autoctl create monitor/postgres 命令会自动检测指定的 --pgdata 命令下是否存在数据库,如果不存在数据库就会自动 initdb 出来一个,如果存在数据库就会接管现有的数据库,但是 pg_autoctl 命令 initdb 出来的数据库不能定制化修改属性,比如修改 --wal-segsize。所以下面对数据库实例的初始化方式存在两种,第一种是使用 pg_autoctl create monitor/postgres 命令自动创建,使用这种方式的忽略下面 3.2 到 3.5 步骤。第二种是自己初始化数据库,只需要在主节点上初始化数据库即可,后面部署 pg_auto_failover 的时候会自动建立流复制关系,这种情况也可以模拟对现有数据库搭建 pg_auto_failover 高可用的情况。

3.2、主节点初始化数据库(使用pg_autoctl自动建库的可忽略此步骤)



echo "postgres" > /home/postgres/pwfile
su - postgres -c "initdb --pgdata=/data/pgdata/pgdata --waldir=/data/pgdata/pgwal --encoding=UTF8 --lc-collate=C --lc-ctype=C --allow-group-access --data-checksums --username=postgres --pwfile=/home/postgres/pwfile --wal-segsize=32"



3.3、主节点配置数据库参数(使用pg_autoctl自动建库的可忽略此步骤)



echo "listen_addresses = '*'" >> $PGDATA/postgresql.conf
echo "password_encryption = 'scram-sha-256'" >> $PGDATA/postgresql.conf



3.4、主节点配置PostgreSQL的访问策略文件(使用pg_autoctl自动建库的可忽略此步骤)



vi $PGDATA/pg_hba.conf
host    all             all             192.168.0.0/24          trust



3.5、主节点启动数据库(使用pg_autoctl自动建库的可忽略此步骤)



[root@pgtest1 ~]# su - postgres
Last login: Fri Nov 18 20:03:42 CST 2022 on pts/1
[postgres@pgtest1 ~]$ pg_ctl start
waiting for server to start....2022-11-18 20:19:22.176 CST [2876] LOG:  starting PostgreSQL 14.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2022-11-18 20:19:22.176 CST [2876] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-11-18 20:19:22.176 CST [2876] LOG:  listening on IPv6 address "::", port 5432
2022-11-18 20:19:22.177 CST [2876] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-11-18 20:19:22.201 CST [2877] LOG:  database system was shut down at 2022-11-18 20:03:43 CST
2022-11-18 20:19:22.242 CST [2876] LOG:  database system is ready to accept connections
 done
server started
[postgres@pgtest1 ~]$ psql
psql (14.6)
Type "help" for help.

postgres=# \l
                             List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
-----------+----------+----------+---------+-------+-----------------------
 postgres  | postgres | UTF8     | C       | C     |
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
(3 rows)

postgres=# create table tt (id int);
CREATE TABLE
postgres=# insert into tt values (1);
INSERT 0 1



3.6、安装pg_auto_failover 2.0



tar -xvf /data/soft/pg_auto_failover-2.0.tar.gz -C /data/soft/
cd /data/soft/pg_auto_failover-2.0
make -s clean && make -s -j12 all
make -s install

# pg_autoctl --version
pg_autoctl version 2.0
pg_autoctl extension version 2.0
compiled with PostgreSQL 14.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
compatible with Postgres 10, 11, 12, 13, and 14



3.7、安装配置multissh(选做)

本次实验使用了一个开源工具 multissh ,multissh 是一个简单的并行 SSH 工具,可以批量的对主机通过 SSH 执行命令组合。网址https://github.com/shanghai-edu/multissh



# 安装multissh
mkdir -p /data/app/multissh
tar -xvf /data/soft/multissh-linux64-release-0.4.0.tar.gz -C /data/app/multissh/

# 配置环境变量
echo "# for multissh" >> /etc/profile
echo "export PATH=/data/app/multissh:\$PATH" >> /etc/profile
echo "alias multissh='multissh -l -u root -p root'" >> /etc/profile
source /etc/profile

# 测试
[root@pgtest1 ~]# multissh -hosts 192.168.0.31,192.168.0.32,192.168.0.33 -cmds "uname -a"
2022/11/17 15:57:22 Multissh start
2022/11/17 15:57:22 Multissh finished. Process time 359.713658ms. Number of active ip is 3
host:  192.168.0.31
========= Result =========
Linux pgtest1 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

host:  192.168.0.32
========= Result =========
Linux pgtest2 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

host:  192.168.0.33
========= Result =========
Linux pgtest3 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux



3.8、安装配置jq(选做)



cp /data/soft/jq-linux64 /usr/bin/jq
chmod +x /usr/bin/jq



3.9、安装配置java 1.8(选做)



tar -xvf /data/soft/jdk-8u261-linux-x64.tar.gz -C /data/app
cat >> /etc/profile << EOF
# for Java
export JAVA_HOME=/data/app/jdk1.8.0_261
export JRE_HOME=/data/app/jdk1.8.0_261/jre
export CLASS_PATH=\$JAVA_HOME/lib:\$JRE_HOME/lib:\$CLASS_PATH
export PATH=\$JAVA_HOME/bin:\$JRE_HOME/bin:\$PATH
EOF
source /etc/profile

[root@pgtest3 jdk1.8.0_261]# java -version
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)



四、配置 pg_auto_failover 插件(pg_autoctl create)



[postgres@pg_monitor ~]$ pg_autoctl create --help
pg_autoctl create: Create a pg_auto_failover node, or formation

Available commands:
  pg_autoctl create
    monitor      Initialize a pg_auto_failover monitor node
    postgres     Initialize a pg_auto_failover standalone postgres node
    coordinator  Initialize a pg_auto_failover citus coordinator node
    worker       Initialize a pg_auto_failover citus worker node
    formation    Create a new formation on the pg_auto_failover monitor



4.1、初始化监控节点(在监控节点上操作)



[root@pg_monitor ~]# su - postgres
[postgres@pg_monitor ~]$ pg_autoctl create monitor --help
pg_autoctl create monitor: Initialize a pg_auto_failover monitor node
usage: pg_autoctl create monitor  [ --pgdata --pgport --pgctl --hostname ]

  --pgctl           path to pg_ctl
  --pgdata          path to data directory
  --pgport          PostgreSQL's port number
  --hostname        hostname by which postgres is reachable
  --auth            authentication method for connections from data nodes
  --skip-pg-hba     skip editing pg_hba.conf rules
  --run             create node then run pg_autoctl service
  --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
  --ssl-mode        use that sslmode in connection strings
  --ssl-ca-file     set the Postgres ssl_ca_file to that file path
  --ssl-crl-file    set the Postgres ssl_crl_file to that file path
  --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
  --server-key      set the Postgres ssl_key_file to that file path
  --server-cert     set the Postgres ssl_cert_file to that file path

$ pg_autoctl create monitor --pgdata=/data/pgdata/pgdata --pgport 5432 --hostname pg_monitor --auth trust --ssl-self-signed --run



4.2、创建Primary节点(在主节点上操作)



[root@pgtest1 ~]# su - postgres
[postgres@pg_monitor ~]$ pg_autoctl create postgres --help
pg_autoctl create postgres: Initialize a pg_auto_failover standalone postgres node
usage: pg_autoctl create postgres

  --pgctl           path to pg_ctl
  --pgdata          path to data directory
  --pghost          PostgreSQL's hostname
  --pgport          PostgreSQL's port number
  --listen          PostgreSQL's listen_addresses
  --username        PostgreSQL's username
  --dbname          PostgreSQL's database name
  --name            pg_auto_failover node name
  --hostname        hostname used to connect from the other nodes
  --formation       pg_auto_failover formation
  --monitor         pg_auto_failover Monitor Postgres URL
  --auth            authentication method for connections from monitor
  --skip-pg-hba     skip editing pg_hba.conf rules
  --pg-hba-lan      edit pg_hba.conf rules for --dbname in detected LAN
  --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
  --ssl-mode        use that sslmode in connection strings
  --ssl-ca-file     set the Postgres ssl_ca_file to that file path
  --ssl-crl-file    set the Postgres ssl_crl_file to that file path
  --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
  --server-key      set the Postgres ssl_key_file to that file path
  --server-cert     set the Postgres ssl_cert_file to that file path
  --candidate-priority    priority of the node to be promoted to become primary
  --replication-quorum    true if node participates in write quorum
  --maximum-backup-rate   maximum transfer rate of data transferred from the server during initial sync

$ pg_autoctl create postgres --hostname pgtest1 --auth trust --ssl-self-signed --monitor 'postgres://autoctl_node@pg_monitor:5432/pg_auto_failover?sslmode=require' --run

# 监控节点的连接uri的获取方式
[postgres@pg_monitor ~]$ pg_autoctl show uri



4.3、创建Secondary节点(在所有备节点上操作)



[root@pgtest2 ~]# su - postgres
$ pg_autoctl create postgres --hostname pgtest2 --auth trust --ssl-self-signed --monitor 'postgres://autoctl_node@pg_monitor:5432/pg_auto_failover?sslmode=require' --run
[root@pgtest3 ~]# su - postgres
$ pg_autoctl create postgres --hostname pgtest3 --auth trust --ssl-self-signed --monitor 'postgres://autoctl_node@pg_monitor:5432/pg_auto_failover?sslmode=require' --run



4.4、查看集群状态(在监控节点上操作)



[postgres@pg_monitor ~]$ pg_autoctl show state
  Name |  Node |    Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
-------+-------+--------------+----------------+--------------+---------------------+--------------------
node_1 |     1 | pgtest1:5432 |   1: 0/5000060 |   read-write |             primary |             primary
node_2 |     2 | pgtest2:5432 |   1: 0/5000060 |    read-only |           secondary |           secondary
node_3 |     3 | pgtest3:5432 |   1: 0/5000060 |    read-only |           secondary |           secondary



4.5、创建服务(在所有节点上操作)



[root@pgtest1 ~]# su - postgres -c 'pg_autoctl -q show systemd --pgdata /data/pgdata/pgdata > /home/postgres/pg-auto-failover.service'
# 修改生成的文件/home/postgres/pg-auto-failover.service,把路径改成软连接的路径
[root@pgtest1 ~]# cat /home/postgres/pg-auto-failover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/data/pgdata/pgdata'
User = postgres
ExecStart = /data/app/pgsql/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /data/app/pgsql/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
# 将文件复制到/usr/lib/systemd/system/
[root@pgtest1 ~]# mv /home/postgres/pg-auto-failover.service /usr/lib/systemd/system/
[root@pgtest1 ~]# systemctl daemon-reload
[root@pgtest1 ~]# systemctl enable pg-auto-failover.service



4.6、使用服务启动pg_auto_failover(在所有节点上操作)



# 停止监控节点的pg_auto_failover
su - postgres -c 'pg_autoctl stop monitor'
# 停止所有数据库节点的pg_auto_failover
su - postgres -c 'pg_autoctl stop'
# 在监控节点和所有数据库节点启动pg_auto_failover服务
systemctl start pg-auto-failover.service



到此,pg_auto_failover的一主两备高可用环境部署完成。

参考文档

https://pg-auto-failover.readthedocs.io/en/main/

【版权声明】本文内容来自摩杜云社区用户原创、第三方投稿、转载,内容版权归原作者所有。本网站的目的在于传递更多信息,不拥有版权,亦不承担相应法律责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@moduyun.com

  1. 分享:
最后一次编辑于 2023年11月22日 0

暂无评论

推荐阅读
IE5LYMWlmdvL