Hi, I am Alexander Nikitin, chief system administrator at BARS Group. In this article, I want to introduce you to the pg_probackup tool.
pg_probackup is a tool developed by Postgres Professional that helps create backup copies of a PostgreSQL DBMS. Unlike the standard pg_basebackup utility, it can create incremental backups at the data block level (8 kB by default), validate both backups and the DBMS itself, set retention policies, and much more.
In this article I do not aim to describe every aspect of working with pg_probackup; I just want to give you an idea of how the tool can be used in practice.
The following use cases will be considered:
- storing backups on a dedicated backup server;
- creating a WAL file archive and archive backups (covered in the next part).

Task 1
Given: we have two servers running CentOS 7. The first hosts our database (hostname srv_db1, user postgres, PostgreSQL 12.4 installed); the second one (hostname srv_bkp, user backup_user) will store the backups.
You must configure the backup so that copies of the PostgreSQL cluster are stored on the srv_bkp server.
Solution:
pg_probackup can work over ssh by launching agents on the remote host that use the ssh connection as a communication and transport channel. Accordingly, we need to make sure that backup_user on the srv_bkp host can connect as the postgres user on the srv_db1 host.
1) Connect to srv_bkp as backup_user and execute the following commands:
ssh-keygen -t rsa
ssh-copy-id postgres@srv_db1
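At this point it is worth checking that key-based login works (a quick sanity check before we restrict the key; srv_db1 is our example hostname):
ssh postgres@srv_db1 hostname
If the command prints the hostname without asking for a password, the key is in place.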
Turn on paranoid mode and edit the ~/.ssh/authorized_keys file on the srv_db1 server.
Before the line with the keys, insert the following:
command="pg_probackup-12 agent"
So you should end up with something like this:
command="pg_probackup-12 agent" ssh-rsa AAAA....
This way we say that nothing but "pg_probackup-12 agent" can be launched via SSH with this key (you can read more about this here).
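If desired, the same line can be hardened further with standard OpenSSH key options; a possible variant (these are generic OpenSSH restrictions, not something pg_probackup requires):
command="pg_probackup-12 agent",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA....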
2) Install pg_probackup on both machines (the example uses CentOS; for other distributions the installation process is described in the documentation):
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
rpm -ivh https://repo.postgrespro.ru/pg_probackup/keys/pg_probackup-repo-centos.noarch.rpm
yum install pg_probackup-12
yum install pg_probackup-12-debuginfo
The latest available version of pg_probackup will be installed; at the time of writing it was 2.4.2.
3) Create an alias on srv_bkp (not required, but more convenient) and define a variable containing the path to the backup directory, creating that directory first:
mkdir /home/backup_user/backup_db
echo "BACKUP_PATH=/home/backup_user/backup_db">>~/.bash_profile
echo "export BACKUP_PATH">>~/.bash_profile
echo "alias pg_probackup='pg_probackup-12'">>~/.bash_profile
Load the profile:
. ~/.bash_profile
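A quick check that the alias and the variable have been picked up:
pg_probackup version
echo $BACKUP_PATH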
4) On srv_bkp, initialize the backup directory and add a new instance:
pg_probackup init
Add the PostgreSQL instance to the catalog under the name db1, specifying the ssh connection parameters: the host, the username, and the path to PGDATA.
pg_probackup add-instance --instance=db1 --remote-host=srv_db1 --remote-user=postgres --pgdata=/var/lib/pgsql/12/data
If the following line appears:
INFO: Instance 'db1' successfully inited
then the initialization was successful.
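You can list the registered instances at any time with:
pg_probackup show
At this point it will show the db1 instance with an empty list of backups.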
Let's define a policy for retention of backups:
pg_probackup set-config --instance db1 --retention-window=7 --retention-redundancy=2
The resulting policy boils down to the following:
- all backups made within the last 7 days must be kept;
- at the same time, at least two full backups must be retained.
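Expired copies are purged when we pass --delete-expired to the backup command (as we will do below); they can also be removed manually at any time:
pg_probackup delete --instance=db1 --delete-expired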
5) To create backups, pg_probackup needs to connect to the DBMS, so on srv_db1, under the postgres user, we create a database that will be used to manage the backup process (from a security point of view this is better than connecting to a production database), and we also change a few instance parameters:
createdb backupdb
Connect to the new database:
psql -d backupdb
and create a database role on behalf of which the backup process will be managed:
BEGIN;
CREATE ROLE backup WITH LOGIN REPLICATION password 'Strong_PWD';
GRANT USAGE ON SCHEMA pg_catalog TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.current_setting(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_is_in_recovery() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_start_backup(text, boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_stop_backup(boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_create_restore_point(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_switch_wal() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_last_wal_replay_lsn() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current_snapshot() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_snapshot_xmax(txid_snapshot) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_control_checkpoint() TO backup;
COMMIT;
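While still in psql, you can check that the role was created with the expected attributes:
\du backup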
Let's pay attention to the listen_addresses parameter:
show listen_addresses;
If it is set to localhost, change it to *:
alter system set listen_addresses='*';
If you changed listen_addresses, then PostgreSQL must be restarted.
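With the stock PGDG packages on CentOS 7, the restart looks like this (the unit name is assumed from the standard postgresql-12 package):
sudo systemctl restart postgresql-12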
Let's find out where the pg_hba.conf file is located:
psql -c 'show hba_file'
Add the following rules to pg_hba.conf:
# pg_probackup access permission
host backupdb backup srv_bkp md5
host replication backup srv_bkp md5
We re-read the configuration:
psql -c 'select pg_reload_conf()'
We check if there were any typos:
psql -c 'select * from pg_hba_file_rules'
On srv_bkp, in the home directory of backup_user, create a .pgpass file containing the server name (or its IP address), port, database name, username, and password. This file lets the password be supplied automatically when a backup is created.
echo "srv_db1:5432:replication:backup:Strong_PWD">>~/.pgpass
echo "srv_db1:5432:backupdb:backup:Strong_PWD">>~/.pgpass
Set the necessary access rights to this file
chmod 600 ~/.pgpass
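Now the whole connection path can be tested from srv_bkp; if .pgpass and pg_hba.conf are correct, psql connects without prompting for a password:
psql -h srv_db1 -U backup -d backupdb -c 'select 1;'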
And create the first backup:
pg_probackup backup --instance=db1 -j2 --backup-mode=FULL --compress --stream --delete-expired --pguser=backup --pgdatabase=backupdb --remote-host=srv_db1 --remote-user=postgres
If we did everything correctly, then something like this will appear on the screen:
INFO: Backup start, pg_probackup version: 2.4.2, instance: db1, backup ID: QFERC9, backup mode: FULL, wal mode: STREAM, remote: true, compress-algorithm: zlib, compress-level: 1
WARNING: This PostgreSQL instance was initialized without data block checksums. pg_probackup have no way to detect data block corruption without them. Reinitialize PGDATA with option '--data-checksums'.
INFO: PGDATA size: 3373MB
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 1m:0s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
INFO: Syncing backup files to disk
INFO: Backup files are synced, time elapsed: 1s
INFO: Validating backup QFERC9
INFO: Backup QFERC9 data files are valid
INFO: Backup QFERC9 resident size: 975MB
INFO: Backup QFERC9 completed
INFO: Evaluate backups by retention
INFO: Backup QFERC9, mode: FULL, status: OK. Redundancy: 1/2, Time Window: 0d/7d. Active
INFO: There are no backups to merge by retention policy
INFO: There are no backups to delete by retention policy
INFO: There is no WAL to purge by retention policy
Pay attention to the warning issued by pg_probackup: data checksums are not enabled in our cluster, so it has no way to verify the correctness of data blocks. In other words, nothing protects you from corrupted data blocks ending up in the backup.
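If a short downtime is acceptable, checksums can be enabled on an existing PostgreSQL 12 cluster with the pg_checksums utility; a sketch for srv_db1, assuming the standard PGDG layout (enabling checksums rewrites every data block and can take a long time on large clusters):
sudo systemctl stop postgresql-12
sudo -u postgres /usr/pgsql-12/bin/pg_checksums --enable -D /var/lib/pgsql/12/data
sudo systemctl start postgresql-12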
Let's see the current settings:
pg_probackup show-config --instance db1
Now let's shorten the command for creating backups by writing some of the parameters into the configuration of the db1 instance:
pg_probackup set-config --instance=db1 --remote-host=srv_db1 --remote-user=postgres --pguser=backup --pgdatabase=backupdb --log-filename=backup_cron.log --log-level-file=log --log-directory=/home/backup_user/backup_db/log
Let's compare the configuration before and after:
pg_probackup show-config --instance db1
It was:

# Backup instance information
pgdata = /var/lib/pgsql/12/data
system-identifier = 6863327140287059463
xlog-seg-size = 16777216
# Connection parameters
pgdatabase = backup_user
# Replica parameters
replica-timeout = 5min
# Archive parameters
archive-timeout = 5min
# Logging parameters
log-level-console = INFO
log-level-file = OFF
log-filename = pg_probackup.log
log-rotation-size = 0TB
log-rotation-age = 0d
# Retention parameters
retention-redundancy = 2
retention-window = 7
wal-depth = 0
# Compression parameters
compress-algorithm = none
compress-level = 1
# Remote access parameters
remote-proto = ssh

It has become:

# Backup instance information
pgdata = /var/lib/pgsql/12/data
system-identifier = 6863327140287059463
xlog-seg-size = 16777216
# Connection parameters
pgdatabase = backupdb
pghost = srv_db1
pguser = backup
# Replica parameters
replica-timeout = 5min
# Archive parameters
archive-timeout = 5min
# Logging parameters
log-level-console = INFO
log-level-file = LOG
log-filename = backup_cron.log
log-directory = /home/backup_user/backup_db/log
log-rotation-size = 0TB
log-rotation-age = 0d
# Retention parameters
retention-redundancy = 2
retention-window = 7
wal-depth = 0
# Compression parameters
compress-algorithm = none
compress-level = 1
# Remote access parameters
remote-proto = ssh
remote-host = srv_db1
remote-user = postgres
Let's make an incremental copy in DELTA mode without specifying the parameters set in the config:
pg_probackup backup --instance=db1 -j2 --progress -b DELTA --compress --stream --delete-expired
Let's see what we got:
pg_probackup show --instance=db1
BACKUP INSTANCE 'db1'
=====================================================================================================================================
Instance Version ID Recovery Time Mode WAL Mode TLI Time Data WAL Zratio Start LSN Stop LSN Status
=====================================================================================================================================
db1 12 QFERKZ 2020-08-21 05:55:52-04 DELTA STREAM 1/1 9s 136kB 32MB 1.00 0/B0000028 0/B0000168 OK
db1 12 QFERC9 2020-08-21 05:51:34-04 FULL STREAM 1/0 1m:3s 959MB 16MB 3.52 0/AE000028 0/AE0001A0 OK
There is a lot of information here; a detailed description of each column can be found in the documentation. Nevertheless, let's dwell on a few points:
- Recovery Time - the earliest point in time to which you can recover using this backup;
- Mode - the mode in which the copy was made; four modes are currently available: one full (FULL) and three incremental (PAGE, DELTA, and PTRACK);
- WAL Mode - the possible values are STREAM and ARCHIVE. Copies created in STREAM mode contain all the WAL files needed to restore the backup to a consistent state; it is precisely for this mode that we had to grant REPLICATION to our backup role. ARCHIVE mode assumes that WAL archiving is configured, in which case the necessary WAL files will be found in a path known to pg_probackup;
- Status - there are many statuses, all described in the documentation; if we see anything other than OK, it makes sense to consult the documentation and find out what went wrong (a validation example follows this list).
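The integrity of the copies can also be rechecked explicitly at any time; without a backup ID, all backups of the instance are validated:
pg_probackup validate --instance=db1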
As you may have guessed, we can parse the output of the pg_probackup show command, extract the status of the most recent backup, and send it to a monitoring system, where we can then configure the rules for notifying the employees responsible for the DBMS.
Let's create a bkp_base.sh script that will start the backup and send the result to the monitoring system:
#!/bin/sh
# Load the profile to pick up BACKUP_PATH and the pg_probackup alias
. /home/backup_user/.bash_profile
# The backup mode (FULL, DELTA, etc.) is passed as the first argument
pg_probackup backup --instance=db1 -j 2 --progress -b "$1" --compress --stream --delete-expired
# Determine the status of the most recent backup so it can be sent to zabbix
if [ "$(pg_probackup show --instance=db1 --format=json | jq -c '.[].backups[0].status')" = '"OK"' ]; then
result=0;
else
result=1;
fi
# Send the result to zabbix as the trapper item pg.db_backup;
# the agent config path and the item key are specific to our environment
/opt/zabbix/bin/zabbix_sender -c /opt/zabbix/etc/pg.zabbix_agentd.conf -k pg.db_backup -o "$result"
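Make the script executable and do a trial run before scheduling it:
chmod +x /home/backup_user/scr/bkp_base.sh
/home/backup_user/scr/bkp_base.sh DELTA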
We add a call to the resulting script to crontab; for example, you can set the following schedule:
00 01 * * 7 /home/backup_user/scr/bkp_base.sh FULL
00 01 * * 1,2,3,4,5,6 /home/backup_user/scr/bkp_base.sh DELTA
This completes the first task: we have configured backups and defined a retention policy, and backup statuses are sent to the monitoring system.
In the next part, we will look at creating a WAL file archive and archive backups.