Patroni and Stolon are two of the best-known and most advanced tools for PostgreSQL orchestration and high availability (automatic failover) in leader-follower clusters. However, engineers migrating from older proven solutions (Corosync & Pacemaker) or from the built-in mechanisms of other DBMSs often struggle to install these tools and to understand the role of each component. This master class walks through a typical installation of Patroni and Stolon clusters on virtual machines (not in containers) and examines how these clusters behave under various infrastructure failures. The whole process is demonstrated on three virtual machines managed by Vagrant using pre-built images. Attendees who wish to follow along can prepare their environment in advance.
! . Ozon . . Postgres Pro Patroni Stolon. .
-. , Stolon, Patroni . .
, Ansible , Postgres Pro , .
Patroni , , — https://github.com/vitabaks/postgresql_cluster. .
, .
- PostgreSQL – shared-nothing, .
- . , .
- hot standby, . . .
- :
- pg_basebackup , , .
- . standby .
- pg_rewind, standby.
- , .
https://eng.uber.com/mysql-migration/
https://github.com/sorintlab/stolon/issues/519
https://github.com/zalando/patroni/issues/538
10- PostgreSQL . , , , . , , , . Write amplification, - , , WAL full page images, checkpoint. hit beat . . WAL. « PostgreSQL MySQL» .
.
, , DDL, sequence, , , . WAL. WAL -. GTID MySQL, CSN MS SQL Server.
pg_rewind.
Stolon Patroni , , , rolling upgrade Postgres .
, ? , . . - , health checks - .
, , – promote . .
, , , .
, ? , promote . .
, split brain . - , .
, , , .
, . .
? Postgres , . , , , , .
? , , , - .
– . , read only. .
fail. , . , .
https://github.com/citusdata/pg_auto_failover
https://github.com/citusdata/pg_auto_failover/issues/12#issuecomment-490551255
. . pg_auto_failover Citus Data.
. , . pg_stat_replication.
, . . , , . primary ( ) , .
, , . , , .
fail. , .
, , .
, . .
, . , , .
.
, . DCS (Distributed Configuration System – ). IP , .
DCS – Consul, Etcd, Raft Zookeeper, Zab. Zab – Paxos.
, DCS.
Patroni/ Stolon.
Postgres Postgres .
, Patroni/ Stolon.
- -, autofailover. - .
- . PostgreSQL.
- , Kubernetes.
- DBaaS (database as a service).
- – . , - . , - .
(DCS) Etcd
. DCS. . , «» . DCS, , .
? . , Postgres, , DCS , , split , split brain. , fail DCS .
, DCS 3-5-7 , , 3- . ? . net split, , DCS.
Etcd RAFT . .
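The advice to run the DCS on 3, 5 or 7 nodes follows from majority-quorum arithmetic in Raft. A minimal sketch of that arithmetic; the function names quorum_size and tolerated_failures are hypothetical helpers, not part of Etcd:

```bash
# quorum_size N: smallest majority of an N-member cluster.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: members that can be lost while a majority survives.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5 7; do
  echo "members=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A 4-member cluster tolerates one failure, exactly like a 3-member one, which is why odd sizes are the usual choice.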
DCS , follower PostgreSQL. RAFT.
. . .
, . follower, . . - RTT fsync.
, follower, . , , . . .
, - .
14 42 .
vagrant status
Current machine states:
node1 running (virtualbox)
node2 running (virtualbox)
node3 running (virtualbox)
This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
. vagrant.
: , . , , . . .
. . , .
Etcd . Etcd , Etcd.
config Etcd. , Etcd, , IP , . . ETCD_LISTEN_CLIENT_URLS . ETCD_LISTEN_PEER_URLS .
ETCD_ADVERTISE_CLIENT_URLS ETCD_INITIAL_ADVERTISE_PEER_URLS. . discovery, .
: ETCD_HEARTBEAT_INTERVAL ETCD_ELECTION_TIMEOUT.
. . . Ansible. . , .
. Etcd .
, term 2. Term – timeline PostgreSQL. term .
etcdctl member list. , () , followers.
sudo pkill -STOP etcd
. , fail , . Etcd , . . .
. . , term.
, , .
«etcdctl cluster-health». , . .
Etcd. , . term follower’.
- . . ? – . Etcd . «comcast». API tables Etcd. , .
? «comcast --device eth1 --packet-loss 100%».
. , . time line. , -, . , term 4.
. , heartbeat_interval election_timeout. , followers , heartbeat , followers , . follower heartbeat - - -, . .
, , - . , . heartbeat_interval – 100 . , -, . election_timeout – .
. . , , RTT , election_timeout. Election_timeout . Ansible. .
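The relationship between RTT, heartbeat_interval and election_timeout can be sketched as a small helper. It assumes the rule of thumb from the Etcd tuning guide (heartbeat close to the round-trip time between members, election timeout at least 10 times the RTT) and never goes below the Etcd defaults of 100 ms and 1000 ms. The function name etcd_timeouts is hypothetical, and the values it prints are a starting point rather than a recommendation for any particular network:

```bash
# etcd_timeouts RTT_MS: print candidate settings for a measured peer RTT (ms).
etcd_timeouts() {
  rtt_ms=$1
  heartbeat_ms=$rtt_ms
  election_ms=$(( rtt_ms * 10 ))
  # Assumption: never go below the Etcd defaults (100 ms / 1000 ms).
  if [ "$heartbeat_ms" -lt 100 ]; then heartbeat_ms=100; fi
  if [ "$election_ms" -lt 1000 ]; then election_ms=1000; fi
  echo "ETCD_HEARTBEAT_INTERVAL=$heartbeat_ms"
  echo "ETCD_ELECTION_TIMEOUT=$election_ms"
}

etcd_timeouts 200
```

For a 200 ms RTT this prints a 200 ms heartbeat and a 2000 ms election timeout; for a local network with an RTT of about 1 ms the defaults are kept.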
comcast --device eth1 --stop
: comcast --device eth1 --latency 600
. .
latency
600 . 600 – . RTT 200 .
ping . RTT 1 .
. , term . . , - , term. .
, heartbeat_interval election_timeout. , heartbeat , election_timeout 10 . Ansible. . Etcd-config. , . , . . , -. Etcd .
. . follower’.
member list, , , followers .
, , , , - 10 .
- Etcd, . bar. Deadline exceeded – , , . Etcd. timeout . 5 . total_timeout , 10 .
«get», . -. .
. , .
. Election_timeout , heartbeat 100 .
, RAFT - . , : , , . .
. Etcdctl member list. . – follower.
. bash – comcast --device, . . . - sleep . comcast --device eth1 --stop, sleep 1.5, done . , , . .
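A flapping link of this kind can be reproduced with a short loop around comcast. This is a sketch under assumptions: the function name flap_link and the cycle count are hypothetical, and the 1.5 second pause is applied to both phases, although only the pause after the stop command is certain from the demo:

```bash
# flap_link DEVICE CYCLES: alternately break and restore the network link.
flap_link() {
  device=$1
  cycles=$2
  i=0
  while [ "$i" -lt "$cycles" ]; do
    comcast --device "$device" --packet-loss 100%   # drop all traffic
    sleep 1.5                                       # assumed outage length
    comcast --device "$device" --stop               # restore the link
    sleep 1.5
    i=$(( i + 1 ))
  done
}

# Usage (needs root and the comcast tool installed):
#   flap_link eth1 20
```

Bounding the loop by a cycle count instead of running it forever makes it easier to leave the interface in a clean state, since the last command executed is always the stop.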
Etcd. , term , , - , term, . Term . . . .
, , Etcd, . 1 . , . . . , , Etcd fsync . , .
. comcast --device eth1 --stop.
https://github.com/etcd-io/etcd/blob/master/Documentation/tuning.md#time-parameters
https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#example-hardware-configurations
. Etcd , .
, , .
, Etcd , , , , .
Patroni Stolon. , .
. netsplit, , DCS. , , Postgres , DCS . , Postgres, .
DCS. , . . . , DCS Patroni Stolon.
Stolon.
. DCS stolon-sentinel, . DCS : election, , statefull .
Postgres’ Stolon-keeper. – stolon-proxy, .
https://github.com/sorintlab/stolon/issues/313
3- , . , , 2- sentinel, . , 2- stolon-proxy. , , 2- Stolon-keeper, postgres-.
41 20
, . . , . , stolon’ . -, . Etcd . . . . – superuser, . . , .
Stolon. stolon.d/test-cluster.conf. , «test.cluster» . , , . Postgres, -. ,
- . , . Superuser, Stolon-keeper . . . , .
«test.cluster»? system/system/Stolon-keeper@.service. template-, . , - . ? Stolon, … . , , - , -, .
Ansible. . . , . . . Stolon-keeper. name=stolon-keeper@test-cluster state=started enabled=on. .
. Test-cluster. , . lock - . , : Stolon-keeper, sentinel proxy . .
sentinel. . , , DCS. . . . sentinel , sentinel . . State=started enabled=on. - . , . , test-cluster. . , - . .
.
https://postgrespro.ru/docs/postgrespro/12/server-shutdown
https://github.com/sorintlab/stolon/issues/707
workflow Stolon:
- «stolonctl init».
- PostgreSQL pg_hba update.
- , PostgreSQL , , , . . , Keeper, post-master. Stolon-keeper PostgreSQL.
- «automaticPgRestart», postgres- .
- , . , max_connections, max_locks_per_transaction postgres-. . , , «max_connections» «max_locks_per_transaction». , , , . .
- – Stolon-keeper. – Stolon-keeper. . , .
, pg_hba. , hba. /opt/stolon/test-cluster
. . . Stolon-test-cluster-spec.json. , . . , .
.
https://github.com/sorintlab/stolon/blob/master/doc/initialization.md
https://github.com/sorintlab/stolon/blob/master/doc/standbycluster.md
Stolon :
- – .
- – PITR, . standby cluster.
- – existing. , DCS. DCS , , . , «existing».
. initdb, checksums, , pg_rewind . . stolonctl. . .
Keeper, . . Keeper , sentinel, . , initdb, . standby.
. «status». , Keepers, health checks Keepers Postgres, . , , . sentinel.
, . wantedgeneration currentgeneration. Stolon-keeper . sentinel , , , . Keeper . .
. json, . . . Keepers . , , . , . . . , . Etcd .
. : Etcd . , Etcd. , . , , Consul. Consul , . , , , , Stolon-keeper . Postgres, Stolon . , Stolon-keeper. systemd, on abort, kill -9
.
Postgres. kill -9
, . . – . . Stolon-keeper, Ok.
. . - . Postgres . Stolon-keeper . Postgres. .
. fail. Postgres-. , . pgbench.
- , Postgres, ? select , , select.
, checksums, , checksums , . Postgres , . , , checksums , - . Postgres. Patroni/Stolon .
pgbench. . , . 25432. . . Stolon/test-cluster/postgres/pg_hba.conf.
, Stolon superuser, , . , .
. «default», . «pg_hba». «update». json- pgHBA . local all postgres. postgres trust. – host all postgres 172.20.20.0/24 trust.
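The pgHBA update described here can be sketched as a JSON patch passed to stolonctl. The two rules are the ones named in the demo; the function name pg_hba_patch is hypothetical, and the etcdv3 store backend in the commented command is an assumption about this setup:

```bash
# pg_hba_patch: print a JSON patch for the pgHBA field of the cluster specification.
pg_hba_patch() {
  cat <<'EOF'
{ "pgHBA": [ "local all postgres trust", "host all postgres 172.20.20.0/24 trust" ] }
EOF
}

# Apply it to the demo cluster (store backend is an assumption):
#   stolonctl --cluster-name test-cluster --store-backend etcdv3 \
#     update --patch "$(pg_hba_patch)"
```

Trust authentication is acceptable on an isolated demo network only; a real cluster would use password or certificate rules in the same list.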
, . . , Postgres. . Create user postgres superuser. , Postgres . pgbench . HBA user test. Patroni. .
while. 20 , . , . .
Stolon . :
- SleepInterval – .
- RequestTimeout – deadline PostgreSQL. Deadline DCS – 5 .
- FailInterval – , sentinel , . Sentinel failInterval, , . , , , . . - , . . failInterval .
autofailover Stolon?
1 – fail . Stolon-keeper Postgres . sentinel. , sentinel. . sleepInterval. 10 .
2 – - , , sentinel. , Keeper .
3 – sentinel. Keepers. sleepInterval.
: (λ1 + λ2) * sleepInterval. . .
4 – . DCS. sentinel , .
, , DCS sentinel , failover 25 50 .
fail sentinel’, failover sentinel. sentinel. failover .
, Stolon-proxy Keeper , Keeper read only . Postgres. Postgres Stolon-proxy.
. DCS, , , , .
Stolon. Stolon . , DCS . , «deadKeeperRemovalInterval». 48 . , DCS. , . , , WAL. 48 , .
, Stolon . . , -, deadlines - Postgres. , dbWaitReadyTimeout deadline . – 60 . checkpoints, deadline .
syncTimeout – deadline . 30 . , . .
InitTimeout – deadline , initdb .
-. conversion timeout. , Keeper . -. Stolon . - -, Stolon .
Patroni.
Patroni, , , . ? Stolon. Patroni . DCS, , Patroni.
Patroni, . . , DCS time to live . , , . . , - . s… . Patroni , WAL-, REST API, , . WAL . Proxy – .
. . . 3- Etcd. Postgres Pro HAProxy confd, Etcd .
2- Patroni. Patroni Postgres.
https://patroni.readthedocs.io/en/latest/existing_data.html
Patroni , . basebackup’ . Patroni , , .
basebackup. , , tablespace.
https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-ADMIN
workflow Patroni. , bootstrap. , , . . Stolon, , , . bootstrap. .
Patroni? postgresql.conf pg_hba.conf, recovery.conf DCS , Stolon. . .
Patroni postgres-. , , .
, – , Patroni.
https://github.com/zalando/patroni/blob/master/haproxy.cfg
https://github.com/zalando/patroni/tree/master/extras/confd
.
– . . . Patroni- REST API endpoints, , , .
HAProxy, healthchecks Patroni.
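The same health check that HAProxy performs can be run by hand with curl. A minimal sketch, assuming the Patroni REST API listens on port 8008 (the port used in the Patroni sample configuration) and relying on the documented behaviour that `/` answers 200 only on the leader and `/replica` answers 200 only on a healthy replica. The function names http_status and patroni_role are hypothetical:

```bash
# http_status URL: print only the HTTP status code ("000" if unreachable).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$1"
}

# patroni_role HOST [PORT]: classify a node by its REST API answers.
patroni_role() {
  host=$1
  port=${2:-8008}   # assumption: default REST API port
  if [ "$(http_status "http://$host:$port/")" = "200" ]; then
    echo leader
  elif [ "$(http_status "http://$host:$port/replica")" = "200" ]; then
    echo replica
  else
    echo unavailable
  fi
}

# Usage: for n in node1 node2 node3; do echo "$n: $(patroni_role "$n")"; done
```

Because the answer comes from the Patroni agent rather than from Postgres itself, a node whose agent has lost the leader key stops reporting itself as leader even if Postgres is still running.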
Patroni callbacks. , . .
HAProxy , DCS HAProxy. HAProxy + confd. consul-template. . .
10- Postgres libpq , , «target_session_attrs», . ? – , target_session_attrs.
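For clients built on libpq from PostgreSQL 10 or later, target_session_attrs can be combined with a multi-host connection string so that the client itself looks for the writable node. A minimal sketch; the helper name build_conninfo is hypothetical, and port 5432 is an assumption (a Stolon or Patroni setup may expose a different port):

```bash
# build_conninfo DBNAME HOST...: multi-host libpq URI that accepts only a writable server.
build_conninfo() {
  dbname=$1
  shift
  hosts=""
  for h in "$@"; do
    hosts="${hosts:+$hosts,}$h:5432"   # assumption: default port on every node
  done
  echo "postgresql://$hosts/$dbname?target_session_attrs=read-write"
}

# Usage:
#   psql "$(build_conninfo postgres node1 node2 node3)" -c 'select pg_is_in_recovery()'
```

libpq tries the hosts in order and skips any server that reports itself as read-only, so no proxy is needed for this path. The trade-off is that every client must know the full node list.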
, , watchdog Postgres, , , Patroni-. ? Postgres , . .
Stolon Patroni , . - , .
https://www.consul.io/docs/guides/forwarding.html
https://learn.hashicorp.com/consul/day-2-operations/advanced-operations/dns-caching
https://pgconf.ru/2019/242817 https://pgconf.ru/2019/242821
https://github.com/cybertec-postgresql/vip-manager
, DNS. Consul . DNS . .
IP-. HAProxy + keepalived. vip-manager, DCS, IP- , . , Postgres Pro , , IP-. , kill stop keepalived’, VRRP IP- HAProxy, IP- . , , . vip-manager. vip-manager , switchover, IP . , .
, , . Stolon :
- ttl – .
- loop_wait – Patroni-.
- retry_timeout – DCS PostgreSQL.
- master_start_timeout – PostgreSQL ( Patroni-).
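These parameters are not independent. The Patroni documentation states the rule loop_wait + 2 * retry_timeout <= ttl, which the defaults (10, 10 and 30 seconds) satisfy exactly. A minimal sketch of a check to run before changing them; the function name check_patroni_timings is hypothetical:

```bash
# check_patroni_timings TTL LOOP_WAIT RETRY_TIMEOUT (all in seconds).
check_patroni_timings() {
  ttl=$1
  loop_wait=$2
  retry_timeout=$3
  needed=$(( loop_wait + 2 * retry_timeout ))
  if [ "$needed" -le "$ttl" ]; then
    echo "ok"
  else
    echo "ttl too small: need at least $needed"
  fi
}

check_patroni_timings 30 10 10
```

Lowering ttl to speed up failover therefore also requires lowering loop_wait or retry_timeout, otherwise the leader key can expire while a healthy leader is still retrying a DCS request.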
, , . , Patroni- Postgres, DCS. - loop_wait. , .
failover Patroni?
- , DCS . Patroni- . . – 20-30 .
- Patroni- REST API, endpoint Patroni WAL-. - 2 . 2 , , . , , WAL-.
- DCS. - .
- .
DCS , - , , 5 .
, , .
, , Patroni-. - - .
https://www.postgresql.org/message-id/C1F7905E-5DB2-497D-ABCC-E14D4DEE506C@yandex-team.ru
https://github.com/zalando/patroni/blob/master/docs/watchdog.rst
.
- . , , , Postgres. , Postgres WAL-commit .
- Zalando – watchdog. Patroni- - , : , .
- HAProxy Confd, . . , .
- Corosync & Pacemaker — ( ) , . . . , , , .
, HAProxy Confd .
, netsplit? HAProxy Patroni . . health check’ Patroni-.
Confd. Confd , DCS.
, HAProxy PgBouncer. PgBouncer DCS. , , Patroni .
- , Patroni . . , , - DCS . downtime , . wal_keep_segments, .
- , , . . . , , , . .
- Patroni? Patroni Stolon , enterprise . :
- . .
- . , , . , , failover , - . Max Availability Oracle Data Guard.
- PostgreSQL Stolon.
!
.
Etcd, . , - ?
-, . , , Etcd, Consul mail , . fsync , .
-? , . , , , , ?
, -.
, Postgres , . DCS , .
, .
? ? . , , Consul . Etcd . - ?
Consul Etcd. RAFT. fsync . Postgres DCS , , . , . . , .
, !
Zookeeper? , ? ?
Zookeeper , . Etcd . Stolon , Patroni – .
- Patroni? . - ?
. wal_keep_segments, . . WAL- , . , issue Patroni. , Stolon, , , - .
! -! , . Patroni , . , , .
, . . .
. . . , . WAL-, . , WALs . . , . !
. , - -. , . . . switchover failover, . promote checkpoint, WALs . . , , .
! , - - . , ?
enterprise, Patroni. Stolon. , . . -- Kubernetes, , . Keeper, Sentinel. , , .
Patroni . WAL-, , DCS. DCS . (, , ), DCS, . . issue, Consul . Patroni. Stolon . Kubernetes.
. , Stolon ?
.
– master-slave Stolon.
, . , standby . – Stolon, . , , standby .
. . .
, ?
, . . , , , .
, .
. . ?
, .
Patroni . , , , . , . .
, Patroni , . Stolon , Postgres keeper data, .
, Stolon ?
open source, .
, , ?
, issue. .
- , . - . , .
issue. , , , .
-, . .
! . . , HAProxy, . . . . HAProxy "on-marked-down shutdown-sessions", , .
, ? health checks?
, http check REST API.
, -, HAProxy, IP- . – PgBouncer, health checks. HAProxy – , health checks , . , , – Patroni, - .
Patroni Etcd REST API.
, Etcd , Etcd.
Etcd? , , . watchdog, Patroni , , , watchdog reboot.
, watchdog – . watchdog. Patroni PostgreSQL, Patroni. watchdog – , , . .
, .
watchdog -, .. , , Patroni- , reboot. .
watchdog , , , , failover , . .
, . ? .
, …
, Patroni, . . - . watchdog – .
Patroni Etcd , , standby. , watchdog .
. , Patroni , , , . . : watchdog, HAProxy.
.
Etcd. ?
-.
- ?
-.
? , ?
-.
. . , , ?
Yes. I mentioned this in one of my points about an unstable configuration. Timeouts are the only way to handle it, heartbeat_interval and election_timeout in particular.
Thanks!