Good afternoon, Habr!
My name is Natalya, and I lead the application administrators group at NPO Krista. We are the Ops team for a group of our company's projects. Our situation is somewhat unusual: we install and maintain our software both on our own servers and on servers located at clients' premises. In that setting there is no need to back up the entire server. Only the "essential data" matters: the DBMS and individual file system directories. Clients may or may not have their own backup regulations, and they often provide external storage for backups. In that case, once a backup has been created, we send it to the external storage.
For a while we got by with a bash script for backups, but as the number of options grew, the script's complexity grew with it, and at some point we came to the need to "raze it to the ground, and then ....".
Ready-made solutions did not suit us for various reasons: the need to decentralize backups, the obligation to store backups locally at the client, complexity of configuration, import substitution requirements, and access restrictions.
It seemed easier to write something of our own. At the same time, we wanted something that would cover our needs for the next N years, with room to extend its scope later.
The requirements were as follows:
- the basic backup instance is autonomous and works locally
- backups and logs are always stored within the client's network
- โ ยซยป
- Linux, ,
- ssh,
- ( ) ,
You can see what we ended up with here: github.com/javister/krista-backup
The software is written in python3 and works on Debian, Ubuntu, CentOS, and AstraLinux 1.6.
The documentation is available in the docs directory of the repository.
Basic concepts used by the system:
action - a single atomic operation (database backup, directory backup, transfer from directory A to directory B, etc.). The existing actions are in the core/actions directory
task - a set of actions describing one logical "backup task"
schedule - a set of tasks, each with an optional execution time.
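The relationship between these three concepts can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual code: the class names, the `make_action` helper, the cron-style time string, and the stop-on-first-failure behavior are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    """One atomic operation, e.g. a database dump or a directory copy."""
    name: str
    run: Callable[[], bool]  # returns True on success


@dataclass
class Task:
    """An ordered set of actions forming one logical backup task."""
    name: str
    actions: List[Action]
    cron: Optional[str] = None  # optional execution time (cron syntax assumed)

    def execute(self) -> bool:
        # Assumption of this sketch: stop at the first failing action.
        for action in self.actions:
            if not action.run():
                return False
        return True


@dataclass
class Schedule:
    """A set of tasks, each with an optional execution time."""
    tasks: Dict[str, Task] = field(default_factory=dict)

    def add(self, task: Task) -> None:
        self.tasks[task.name] = task

    def timed_tasks(self) -> List[Task]:
        # Only tasks that have a time set would be handed to cron.
        return [t for t in self.tasks.values() if t.cron]


def make_action(name: str, log: List[str], ok: bool = True) -> Action:
    """Hypothetical helper: an action that only records that it ran."""
    def run() -> bool:
        log.append(name)
        return ok
    return Action(name, run)


if __name__ == "__main__":
    log: List[str] = []
    task = Task(
        "make_full_dump",
        [make_action("db_dump", log), make_action("copy_files", log)],
        cron="0 2 * * *",
    )
    schedule = Schedule()
    schedule.add(task)
    print(task.execute(), log)
```

Keeping the time on the task rather than on the action matches the description above: actions are reusable building blocks, and only a task is something that gets launched.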
The backup configuration is stored in a YAML file. The general structure of the config:
- general settings
- the actions section: a description of the actions used on this server
- the schedule section: a description of all tasks (sets of actions) and the schedule for launching them via cron, if such a launch is required
An example of a config can be found here
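To make the three-part structure more concrete, here is a rough sketch of what such a config might look like once loaded into Python, together with a simple consistency check. Apart from the `actions` and `schedule` section names and the `make_full_dump` task name, every key and value below is hypothetical and chosen for illustration only; the real key names are in the project documentation.

```python
from typing import Any, Dict, List

# Hypothetical config mirroring the three parts described above.
# All keys other than "actions" and "schedule" are assumptions of this sketch.
config: Dict[str, Any] = {
    "region": "example",  # a general setting (hypothetical)
    "actions": {
        "db_dump": {"type": "pgdump"},  # hypothetical action types
        "copy_files": {"type": "tar"},
        "cleanup": {"type": "rotation"},
    },
    "schedule": {
        "make_full_dump": {
            "actions": ["db_dump", "copy_files", "cleanup"],
            "cron": "0 2 * * *",  # optional: omit for manual-only tasks
        },
        "manual_copy": {"actions": ["copy_files"]},
    },
}


def find_config_errors(cfg: Dict[str, Any]) -> List[str]:
    """Return readable problems; an empty list means the config is consistent."""
    errors: List[str] = []
    for section in ("actions", "schedule"):
        if not isinstance(cfg.get(section), dict):
            errors.append(f"missing or invalid section: {section}")
    if errors:
        return errors
    # Every action a task refers to must be described in the actions section.
    for task_name, task in cfg["schedule"].items():
        for action_name in task.get("actions", []):
            if action_name not in cfg["actions"]:
                errors.append(f"task {task_name}: unknown action {action_name}")
    return errors


if __name__ == "__main__":
    print(find_config_errors(config))
```

A check like this is cheap to run before anything is written to the crontab, so a typo in an action name is caught at configuration time rather than during a night-time backup.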
What the application can do at the moment:
- the main operations for us are supported: PostgreSQL backup via pg_dump, file system directory backup via tar; operations with external storage; rsync between directories; rotation of backups (deleting old copies)
- calling an external script
- manual execution of a single task
/opt/KristaBackup/KristaBackup.py run make_full_dump
- adding (or removing) a separate task or the entire schedule in the crontab
/opt/KristaBackup/KristaBackup.py enable all
- generating a trigger file based on the backup results. This feature is useful in conjunction with Zabbix for monitoring backups
- running in the background in webapi or web mode
/opt/KristaBackup/KristaBackup.py web start [--api]
The difference between the modes: in webapi mode there is no web interface as such, but the application responds to requests from another instance. Web mode requires installing flask and several additional packages, which is not acceptable everywhere, for example on a certified AstraLinux SE.
Through the web interface you can view the status and logs of the backups on the connected servers: the "web instance" requests data from the "backup instances" via the API. Access to the web interface requires authorization; access to the webapi does not.
Logs of backups that completed with problems are highlighted in color: yellow for warnings, red for errors.
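Returning to the rotation of backups (deleting old copies) mentioned in the feature list: the idea can be sketched as "keep the N newest files matching a pattern and remove the rest". The function name, its parameters, and the choice of modification time as the ordering criterion are assumptions of this example, not a description of how the utility itself implements rotation.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def rotate_backups(directory: Path, pattern: str, keep: int) -> List[Path]:
    """Delete all but the `keep` newest files matching `pattern`.

    Returns the list of removed paths. "Newest" is decided by modification
    time, which is an assumption of this sketch.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    files = sorted(
        (p for p in directory.glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = files[keep:]
    for path in removed:
        path.unlink()
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for day in range(1, 6):
            dump = root / f"dump_{day}.tar.gz"
            dump.write_bytes(b"")
            # Fake modification times one day apart, oldest first.
            os.utime(dump, (day * 86400, day * 86400))
        rotate_backups(root, "dump_*.tar.gz", keep=2)
        print(sorted(p.name for p in root.iterdir()))
```

Rotating by count rather than by age has one practical advantage: if backups stop being created for a while, the existing copies are not deleted just because they have grown old.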
If the administrator does not need a cheat sheet on the parameters and the servers' operating systems are homogeneous, you can compile the utility and distribute it as a ready-made package.
We distribute this utility mainly through Ansible, rolling it out first to some of the least important servers and, after testing, to all the others.
As a result, we got a compact standalone backup utility that lends itself to automation and can be operated even by inexperienced administrators. It is convenient for us; maybe it will be useful for you too?