Getting to know pg_probackup. Part one




Hi, I am Alexander Nikitin, chief system administrator at BARS Group. In this article, I want to introduce you to the pg_probackup tool.



pg_probackup is a tool developed by Postgres Professional for backing up PostgreSQL databases. Unlike the standard pg_basebackup utility, it can create incremental backups at the data block level (8 kB by default), validate backups and the DBMS itself, enforce retention policies, and much more.



This article does not aim to describe every aspect of working with pg_probackup; I just want to give you an idea of how the tool can be used in practice.



The following use cases will be considered:



  • setting up backups to a dedicated server without a WAL archive (this part);

  • creating an archive of WAL files and backups that use this archive (the next part).


Task 1



Given: we have two servers running CentOS 7. The first one hosts our database (hostname srv_db1, user postgres, PostgreSQL 12.4), and the second one will store the backups (hostname srv_bkp, user backup_user).

We need to configure backups so that copies of the PostgreSQL cluster are stored on the srv_bkp server.



Solution:

pg_probackup can work over SSH: it launches agents on the remote host and uses the SSH connection as its communication and transport channel. Accordingly, we need to make sure that backup_user on srv_bkp can connect as the postgres user on srv_db1.



1) Connect to srv_bkp as backup_user and execute the following commands:



ssh-keygen -t rsa
ssh-copy-id postgres@srv_db1
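
Before locking the key down in the next step, it's worth a quick check that passwordless login works (hostname here is just an arbitrary harmless command):

ssh postgres@srv_db1 hostname

If it prints the remote hostname without asking for a password, the key exchange works.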


Turn on paranoid mode and edit the ~/.ssh/authorized_keys file on the srv_db1 server.

Before the line with the keys, insert the following:



command="pg_probackup-12 agent"


So you should end up with something like this:

command="pg_probackup-12 agent" ssh-rsa AAAA....


By this we say that nothing but "pg_probackup-12 agent" can be launched over this SSH key (the command= option is described in the sshd(8) man page).
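
If you want to tighten this further, OpenSSH supports additional per-key restrictions in authorized_keys; an optional variant might look like this:

command="pg_probackup-12 agent",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-rsa AAAA....

These options disable port and agent forwarding, X11, and pseudo-terminal allocation, leaving the key usable only for running the agent.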



2) We install pg_probackup on both machines (the example uses CentOS; the installation process for other distributions is described in the documentation):



yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
rpm -ivh https://repo.postgrespro.ru/pg_probackup/keys/pg_probackup-repo-centos.noarch.rpm
yum install pg_probackup-12
yum install pg_probackup-12-debuginfo


The latest available version of pg_probackup will be installed, which at the time of writing was 2.4.2.



3) Create an alias on srv_bkp (not strictly necessary, but more convenient) and define a variable with the path to the backup directory, creating that directory first (pg_probackup picks up the catalog location from the BACKUP_PATH environment variable):



mkdir /home/backup_user/backup_db
echo "BACKUP_PATH=/home/backup_user/backup_db">>~/.bash_profile
echo "export BACKUP_PATH">>~/.bash_profile
echo "alias pg_probackup='pg_probackup-12'">>~/.bash_profile


Load the profile:



. ~/.bash_profile
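
A quick check that the alias and the variable were picked up:

pg_probackup --version
echo $BACKUP_PATH

The first command should print the pg_probackup-12 version, the second the path to the backup directory.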


4) On srv_bkp, initialize the backup catalog and add a new instance:



pg_probackup init


Add the PostgreSQL instance to the backup catalog under the name db1, specifying the SSH connection parameters: host, username, and the path to PGDATA.



pg_probackup add-instance --instance=db1 --remote-host=srv_db1 --remote-user=postgres --pgdata=/var/lib/pgsql/12/data


If the following line appears:



INFO: Instance 'db1' successfully inited


then the instance was added successfully.
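
You can also confirm the instance is registered by listing the backup catalog (it will show an empty backup list for now):

pg_probackup show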

Let's define a backup retention policy:



pg_probackup set-config --instance db1 --retention-window=7 --retention-redundancy=2


The resulting policy boils down to the following:



  • keep all backups that are younger than 7 days;

  • keep at least two full backups at all times.

A sketch of how to trigger this policy manually is shown right after this list.
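
Retention is normally enforced when a backup runs with --delete-expired, as we do below, but it can also be triggered by hand; a hedged sketch:

pg_probackup delete --instance=db1 --delete-expired --delete-wal

Here --delete-expired removes backups that fall outside the policy, and --delete-wal additionally purges WAL segments that no remaining backup needs.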



5) To create backups, pg_probackup needs to connect to the DBMS. For this we create a separate database in the PostgreSQL cluster that will be used to manage the backup process (from a security standpoint, this is better than connecting to a production database), and we also adjust a few instance parameters:



createdb backupdb


Connect to the new database:



psql -d backupdb


Create the database role that will manage the backup process:



BEGIN;
CREATE ROLE backup WITH LOGIN REPLICATION password 'Strong_PWD';
GRANT USAGE ON SCHEMA pg_catalog TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.current_setting(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_is_in_recovery() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_start_backup(text, boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_stop_backup(boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_create_restore_point(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_switch_wal() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_last_wal_replay_lsn() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current_snapshot() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_snapshot_xmax(txid_snapshot) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_control_checkpoint() TO backup;
COMMIT;


Let's pay attention to the listen_addresses parameter:



show listen_addresses;


If it is set to localhost, change it to *:



alter system set listen_addresses='*';


If you changed listen_addresses, then PostgreSQL must be restarted.
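
On CentOS 7 with the PGDG packages, the service is typically named postgresql-12, so the restart would look something like this (the service name is an assumption; check yours with systemctl list-unit-files):

systemctl restart postgresql-12

Run it as root (or via sudo) on srv_db1.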



Let's find out where the pg_hba.conf file is located:



psql -c 'show hba_file'


Add the following rules to pg_hba.conf:



# pg_probackup access permission
host    backupdb        backup          srv_bkp         md5
host    replication     backup          srv_bkp         md5


We re-read the configuration:

psql -c 'select pg_reload_conf()'


We check if there were any typos:



psql -c 'select * from pg_hba_file_rules'
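
Only the error column really matters here; a more targeted form of the same check (an optional variant) is:

psql -c "select line_number, error from pg_hba_file_rules where error is not null"

An empty result means every rule in the file parsed cleanly.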


On srv_bkp, in the home directory of the backup_user user, create a .pgpass file listing the server name (or its IP address), port, database name, username, and password. This file lets the password be supplied automatically when a backup is created.



echo "srv_db1:5432:replication:backup:Strong_PWD">>~/.pgpass
echo "srv_db1:5432:backupdb:backup:Strong_PWD">>~/.pgpass


Set the required permissions on this file:



chmod 600 ~/.pgpass
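
Before the first backup, it may be worth verifying that srv_bkp can reach the cluster with this role and the stored password (this assumes the psql client is installed on srv_bkp):

psql -h srv_db1 -p 5432 -U backup -d backupdb -c 'select 1'

If pg_hba.conf and ~/.pgpass are correct, the query runs without a password prompt.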


And create the first backup:



pg_probackup backup --instance=db1 -j2 --backup-mode=FULL --compress --stream --delete-expired --pguser=backup --pgdatabase=backupdb --remote-host=srv_db1 --remote-user=postgres


If we did everything correctly, then something like this will appear on the screen:



INFO: Backup start, pg_probackup version: 2.4.2, instance: db1, backup ID: QFERC9, backup mode: FULL, wal mode: STREAM, remote: true, compress-algorithm: zlib, compress-level: 1
WARNING: This PostgreSQL instance was initialized without data block checksums. pg_probackup have no way to detect data block corruption without them. Reinitialize PGDATA with option '--data-checksums'.
INFO: PGDATA size: 3373MB
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 1m:0s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
INFO: Syncing backup files to disk
INFO: Backup files are synced, time elapsed: 1s
INFO: Validating backup QFERC9
INFO: Backup QFERC9 data files are valid
INFO: Backup QFERC9 resident size: 975MB
INFO: Backup QFERC9 completed
INFO: Evaluate backups by retention
INFO: Backup QFERC9, mode: FULL, status: OK. Redundancy: 1/2, Time Window: 0d/7d. Active
INFO: There are no backups to merge by retention policy
INFO: There are no backups to delete by retention policy
INFO: There is no WAL to purge by retention policy


Pay attention to the warning issued by pg_probackup: data block checksums are not enabled in our cluster, so it cannot verify the data blocks. In other words, you are not protected from corrupted data blocks making their way into the backup.
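
To see whether checksums are enabled, and to enable them, you can use something like the following (a hedged sketch: since PostgreSQL 12 the pg_checksums utility can enable checksums offline, but the cluster must be cleanly shut down first, and on large databases the rewrite takes a while):

psql -c 'show data_checksums'
# with the cluster stopped, as the postgres user on srv_db1:
/usr/pgsql-12/bin/pg_checksums --enable -D /var/lib/pgsql/12/data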



Let's see the current settings:



pg_probackup show-config --instance db1


Now let's shorten the backup command by writing some of its parameters into the configuration of the db1 instance:



pg_probackup set-config --instance=db1 --remote-host=srv_db1 --remote-user=postgres --pguser=backup --pgdatabase=backupdb --log-filename=backup_cron.log --log-level-file=log --log-directory=/home/backup_user/backup_db/log


Let's compare the configuration before and after:



pg_probackup show-config --instance db1


It was:

# Backup instance information
pgdata = /var/lib/pgsql/12/data
system-identifier = 6863327140287059463
xlog-seg-size = 16777216
# Connection parameters
pgdatabase = backup_user
# Replica parameters
replica-timeout = 5min
# Archive parameters
archive-timeout = 5min
# Logging parameters
log-level-console = INFO
log-level-file = OFF
log-filename = pg_probackup.log
log-rotation-size = 0TB
log-rotation-age = 0d
# Retention parameters
retention-redundancy = 2
retention-window = 7
wal-depth = 0
# Compression parameters
compress-algorithm = none
compress-level = 1
# Remote access parameters
remote-proto = ssh

Has become:

# Backup instance information
pgdata = /var/lib/pgsql/12/data
system-identifier = 6863327140287059463
xlog-seg-size = 16777216
# Connection parameters
pgdatabase = backupdb
pghost = srv_db1
pguser = backup
# Replica parameters
replica-timeout = 5min
# Archive parameters
archive-timeout = 5min
# Logging parameters
log-level-console = INFO
log-level-file = LOG
log-filename = backup_cron.log
log-directory = /home/backup_user/backup_db/log
log-rotation-size = 0TB
log-rotation-age = 0d
# Retention parameters
retention-redundancy = 2
retention-window = 7
wal-depth = 0
# Compression parameters
compress-algorithm = none
compress-level = 1
# Remote access parameters
remote-proto = ssh
remote-host = srv_db1
remote-user = postgres






Let's make an incremental copy in DELTA mode, without specifying the parameters that are now stored in the config:



pg_probackup backup --instance=db1 -j2 --progress -b DELTA --compress --stream --delete-expired


Let's see what we got:

pg_probackup show --instance=db1

BACKUP INSTANCE 'db1'
=====================================================================================================================================
 Instance  Version  ID      Recovery Time           Mode   WAL Mode  TLI   Time   Data   WAL  Zratio  Start LSN   Stop LSN    Status
=====================================================================================================================================
 db1       12       QFERKZ  2020-08-21 05:55:52-04  DELTA  STREAM    1/1     9s  136kB  32MB    1.00  0/B0000028  0/B0000168  OK
 db1       12       QFERC9  2020-08-21 05:51:34-04  FULL   STREAM    1/0  1m:3s  959MB  16MB    3.52  0/AE000028  0/AE0001A0  OK 


There is a lot of information here; a detailed description of each column can be found in the documentation. Nevertheless, let's dwell on a few points:



Recovery Time is the earliest point in time to which you can recover using this backup.



Mode is the mode in which the copy was made; four modes are currently available: one full (FULL) and three incremental (PAGE, DELTA, and PTRACK).



WAL Mode can be either STREAM or ARCHIVE. Copies created in STREAM mode contain all the WAL files needed to restore the backup to a consistent state; it is precisely for this mode that we had to grant the REPLICATION attribute to our backup role. ARCHIVE mode assumes that WAL archiving is configured, in which case the necessary WAL files will be found at a path known to pg_probackup.



Status: there are many possible statuses, all described in the documentation; if we see anything other than OK, it makes sense to consult the documentation and find out what went wrong.



As you may have guessed, we can parse the output of the pg_probackup show command to get the status of the last backup, process it, and send it to a monitoring system, where we can then configure rules for notifying the staff responsible for the DBMS.



Let's create a bkp_base.sh script that starts a backup and sends the result to the monitoring system:



#!/bin/sh
# Load the profile to pick up BACKUP_PATH
. /home/backup_user/.bash_profile
# Run the backup; the backup mode (FULL, DELTA, etc.) is passed as the first argument.
# We call the pg_probackup-12 binary directly because the alias from .bash_profile
# is not expanded in a non-interactive shell.
pg_probackup-12 backup --instance=db1 -j 2 --progress -b $1 --compress --stream --delete-expired
# Check the status of the last backup (requires the jq utility) and convert it
# into a value for zabbix: 0 - the backup finished with status OK, 1 - something went wrong.
if [ "$(pg_probackup-12 show --instance=db1 --format=json | jq -c '.[].backups[0].status')" == '"OK"' ]; then
  result=0;
else
  result=1;
fi
# Send the result to zabbix as a zabbix_trapper item with the key pg.db_backup.
# The alerting rules themselves (whom to notify and how) are configured on the zabbix side.
/opt/zabbix/bin/zabbix_sender -c /opt/zabbix/etc/pg.zabbix_agentd.conf -k pg.db_backup -o $result


Add a call to this script to crontab; for example, you could use the following schedule:



00 01 * * 7 /home/backup_user/scr/bkp_base.sh FULL
00 01 * * 1,2,3,4,5,6 /home/backup_user/scr/bkp_base.sh DELTA


This completes the first task: we have configured backups, defined a retention policy, and set up sending backup statuses to the monitoring system.
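
Since pg_probackup can also validate existing backups (as mentioned at the beginning of the article), a periodic integrity check may be worth scheduling too; a minimal sketch:

pg_probackup validate --instance=db1

Run without further arguments, this validates all backups of the db1 instance against their checksums.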



In the next part, we will look at creating a WAL file archive and making archive backups.


