In the first part, we learned how to make incremental backups, upload them to a remote server (repository with backups) and roll back to the last backup.
In this article, we will learn how to encrypt backups, upload them to S3-compatible storage (instead of a second repository server), recover to a clean cluster, and finally, point in time recovery (PITR).
Moment The
author does not pretend to be a DBA, but sometimes he likes to set up and see everything himself.
Training
To reproduce this manual, we need:
- A server with a database (we will install pgbackrest on it);
- S3 storage (you can use Amazon or any S3 compatible, I will use Amazon S3);
- Third server. Optional. On it, we will practice deploying postgresql and pgbackrest from scratch, deploying an existing backup from the S3 storage (the server burned out, moving, etc.);
It is assumed that you already have postgresql installed, and therefore the postgres user also exists.
I described the installation process for pgbackrest in the previous article, but just in case I will duplicate it again (I remind you: before installing, create a sudo user for yourself - I will use a sudo user with the pgbackrest username).
So that you have fewer problems when reproducing the instructions, I write in italics where, by what user and with what rights I executed the command when writing and checking the article.
We will follow this path:
Configure postgresql and pgbackrest Configure
encryption of backups (two lines)
Learn how to backup and send to S3 storage (five lines)
Make a backup
Let's imagine that we have broken the cluster, deploy on a new server, connect the existing S3 repository and roll up the backup
Let's look at PITR (Point In Time Recovery): we will recover to a certain point in time (let's say pgbackrest has already backed up the drop table, but we need to roll back 4 hours when the sign still existed)
Let's go!
Installing pgBackRest
sudo user or root:
1. Download the archive from pgbackrest and transfer its contents to the / build folder:
sudo mkdir /build
sudo wget -q -O - \
https://github.com/pgbackrest/pgbackrest/archive/release/2.18.tar.gz | \
sudo tar zx -C /build
2. Install the dependencies necessary for building:
sudo apt-get update
sudo apt-get install build-essential libssl-dev libxml2-dev libperl-dev zlib1g-dev \
libpq-dev
3. Putting pgbackrest together:
cd /build/pgbackrest-release-2.18/src && sudo ./configure
sudo make -s -C /build/pgbackrest-release-2.18/src
4. Copy the executable file to the / usr / bin directory:
sudo cp /build/pgbackrest-release-2.18/src/pgbackrest /usr/bin
sudo chmod 755 /usr/bin/pgbackrest
5.pgBackRest requires perl. Install:
sudo apt-get install perl
6. Create directories for logs, give them certain rights:
sudo mkdir -p -m 770 /var/log/pgbackrest
sudo chown postgres:postgres /var/log/pgbackrest
sudo mkdir -p /etc/pgbackrest
sudo mkdir -p /etc/pgbackrest/conf.d
sudo touch /etc/pgbackrest/pgbackrest.conf
sudo chmod 640 /etc/pgbackrest/pgbackrest.conf
sudo chown postgres:postgres /etc/pgbackrest/pgbackrest.conf
7. Check:
pgbackrest version
Setting up postgresql and pgBackRest
sudo user or root:
1. Make the necessary settings in postgresql.conf (it is located in the / etc / postgresql / 11 / main folder) for pgBackRest to work:
archive_command = 'pgbackrest --stanza=main archive-push %p' # main - . postgres main.
archive_mode = on
max_wal_senders = 3
wal_level = replica
2. Let's make the necessary settings in the pgbackrest configuration file (/etc/pgbackrest/pgbackrest.conf):
[main]
pg1-path=/var/lib/postgresql/11/main
[global]
log-level-file=detail
repo1-cipher-pass=tr5+BXdfdoxeyUqfo6AzLTrW+c+Jfd/1QbQj2CDMMBwtB0YGH3EJajry4+Eeen6D
repo1-cipher-type=aes-256-cbc
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2 # , . .. , . " " - .
repo1-type=s3
repo1-s3-bucket=pgbackrest-part2-tutorial
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
repo1-s3-key=9wdS3G8U5wz7kNsFWVGck7DDZ7DtVDtbM
repo1-s3-key-secret=A9zRmW16zXKt2vVA8mmNsFWy2mUAPYHa
start-fast=y
[global:archive-push]
compress-level=3
As you understand, here we immediately configured encryption and configured support for S3 storage.
PS There should not be any comments in the config.
PS To generate a strong encryption key, you can use the command:
openssl rand -base64 48
Repository creation
sudo user or root:
sudo mkdir -m 770 /var/lib/pgbackrest
sudo chown -R postgres /var/lib/pgbackrest/
sudo -u postgres pgbackrest --stanza=main stanza-create
If everything worked out, then in the S3 bucket you will see the files that pgbackrest generated.
Making a backup:
sudo user or root:
sudo -u postgres pgbackrest --log-level-console=info --stanza=main backup
pgBackRest will create the first full backup. If you want, you can run the backup command again and make sure that the system will create an incremental backup.
If you want to re-make a full backup, then specify an additional flag:
sudo -u postgres pgbackrest --log-level-console=info --stanza=main --type=full backup
You can view the list of backups using the command:
sudo -u postgres pgbackrest --stanza=main info
Restoring the backup:
sudo user or root:
1. Stop the running cluster:
sudo pg_ctlcluster 11 main stop
2. Recovering from a backup:
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --recovery-option=recovery_target=immediate restore
To restore the database to the state of the last FULL backup, use the command without specifying recovery_target:
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta restore
Important! After recovery, it may turn out that the database hangs in recovery mode (there will be errors like ERROR: cannot execute DROP DATABASE in a read-only transaction). It is solved as follows (you will need to wait a little after executing the command):
sudo -u postgres psql -c "select pg_wal_replay_resume()"
UPD: As I understand it, this error occurs due to the fact that after recovery with recovery_target = immediate, pgbackrest pauses the database . To avoid this, you need to additionally specify some flags (you can read their meaning here and here ):
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --recovery-option=recovery_target=immediate --target-action=promote --type=immediate restore
In fact, it is possible to restore a specific backup by its name. Here I will only provide a link to the description of this feature in the documentation . The developers advise using this parameter with caution and explain why. On my own I can add that I used it. The main thing is to make sure that after the recovery, the database exited recovery mode (select pg_is_in_recovery () should show "f") and just in case, make a full backup after recovery.
3. Start the cluster:
sudo pg_ctlcluster 11 main start
After restoring the backup, we will perform a second backup:
sudo -u postgres pgbackrest --log-level-console=info --stanza=main backup
Restoring the backup to a clean cluster:
Let's imagine that something terrible happened - the server burned down (fell, flooded, lost access). You need to restore everything to a new, clean server. On the new server, we have already created a sudo user, installed pgbackrest, installed postgresql and we have a clean and fresh main cluster.
All we need to do is set up the postgresql and pgBackRest config in the same way as on the previous server, restart postgresql and execute the command order similar to the restore:
sudo user or root:
1. Stop the running cluster:
sudo pg_ctlcluster 11 main stop
2. Recovering from a backup:
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --recovery-option=recovery_target=immediate --target-action=promote --type=immediate restore
PS I described the nuances and subtleties of this command above.
3. Start the cluster:
sudo pg_ctlcluster 11 main start
After restoring the backup, we need to perform a second backup:
sudo -u postgres pgbackrest --log-level-console=info --stanza=main backup
That, in fact, is all. The fire is extinguished.
Point In Time Recovery (PITR)
Let's imagine such a situation that at 16:00 a backup was created, and at 18:05 we accidentally erased an important plate into which a lot of important data that we would not want to lose managed to get in 2 hours. PgBackRest at the expense of PostgreSQL provides us with such an opportunity: we can rollback to a specific point in time, provided that we have enough information to recover.
It works like this:
- We say we want to rollback to the time 18:04;
- pgBackRest searches for the last current backup (16:00);
- Restores it, but not everything rolls over, but only what we managed to do before 18.04;
PS this mechanism is built on WAL logs. You can read it here .
You can perform recovery to a specific point in time (after stopping the cluster) with this command:
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --type=time "--target=2020-09-06 18:27:24.561458+02" --target-action=promote restore
Important note - PostgreSQL can only play WAL logs forward, not backward. What does this mean in practice?
Let's say a backup was created at 16:00. At 18:05, we accidentally deleted the table, and at 18:10, a backup was created again with the already deleted table. When you try to make a PITR, pgBackRest will take the backup that was created at 16:00 and roll over everything that was before 18:05, and will not take the backup that was created at 18:10 and will roll back everything. This is important to understand. This mechanism is described in more detail here .
Suppose you made a backup, which already contains information about deleting a table. Using the --set flag, let's say which backup should be used for the base (the one in which the table was still). pgBackRest will take this backup and roll over everything that was before the table was deleted.
sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --type=time --set=20200905-183838F_20200906-182612I "--target=2020-09-06 18:27:24.561458+02" --target-action=promote restore
PS I showed the command for getting the list of backups above.
Let's start the cluster. After starting, let's examine the file /var/log/postgresql/postgresql-11-main.log. In it, we are interested in the following lines, showing that the recovery to the specified point in time was successful:
starting point-in-time recovery to 2020-09-07 11:26:52.493127+02
...
recovery stopping before commit of transaction 576, time 2020-09-07 11:27:14.584496+02
...
last completed transaction was at log time 2020-09-07 11:24:09.583761+02
That's all. I strongly advise you to experiment with this tool before using it in combat.