About version control systems

Hello! Next week OTUS will start a "Super-workshop on using and configuring GIT". I decided to devote today's publication to this topic.










Introduction



Let's discuss the purpose of version control systems and the various ways of organizing them.



Version control systems



A version control system is primarily a tool, and a tool is designed to solve a certain class of problems. A version control system records changes to a file or set of files over time and lets you return to a specific version later. The goal is to manage a set of files flexibly and roll back to earlier versions when necessary: you can undo particular changes to a file, restore a deleted file, and see who changed what. Version control systems are typically used to store source code, but that is not a requirement; they can store files of any type.



How can different versions of files be stored? People did not arrive at version control systems right away, and the systems themselves differ considerably. The problem can be solved with good old copy-paste, or with local, centralized, or distributed version control systems.



Copy-paste



Applied to this problem, this well-known method may look as follows: files are named by the pattern filename_{version}, possibly with the creation or modification time appended.



This method is very simple, but it is error-prone: you can accidentally modify the wrong file or copy from the wrong directory (after all, copying is how files are transferred in this model).
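The naming scheme described above can be sketched in a few lines of Python. This is only an illustration: the helper name `save_copy` and the exact pattern `<name>_<version><extension>` are assumptions made for the example, not an established convention.

```python
import shutil
from pathlib import Path


def save_copy(path: Path) -> Path:
    """Copy `path` next to itself under the first unused name <stem>_<version><suffix>.

    Hypothetical helper: it only automates the manual copy-paste routine.
    """
    version = 1
    while True:
        candidate = path.with_name(f"{path.stem}_{version}{path.suffix}")
        if not candidate.exists():
            break
        version += 1
    # copy2 also preserves the modification time of the original file
    shutil.copy2(path, candidate)
    return candidate
```

Calling `save_copy(Path("report.txt"))` twice would leave `report_1.txt` and `report_2.txt` beside the original. Nothing here records who made a copy or why, which is exactly the weakness of the approach.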



Local version control system



The next step was the creation of local version control systems. These were a simple database that kept a record of all changes to files.



One example is RCS, first released in 1982 (its latest patch at the time of writing dates from 2015). It stores the changes between file versions (patches), and the set of these patches makes it possible to restore any state of a file. RCS is available as a package in most Linux distributions.
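To make the idea of "a set of patches restores any state" concrete, here is a minimal sketch in Python. It is not RCS's actual file format or algorithm: the names `make_patch`, `apply_patch` and `PatchStore` are hypothetical, and the sketch keeps the first version in full and replays forward patches, chosen only for simplicity.

```python
import difflib


def make_patch(old, new):
    """Return the edits (start, end, replacement lines) that turn `old` into `new`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old, patch):
    """Apply a patch produced by make_patch to a list of lines."""
    result = list(old)
    # Apply edits from the end so that earlier indices remain valid
    for start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result


class PatchStore:
    """Keeps the first version in full and every later version as a patch."""

    def __init__(self, initial_lines):
        self.base = list(initial_lines)
        self.patches = []
        self._latest = list(initial_lines)

    def commit(self, new_lines):
        """Record a new version and return its 1-based number."""
        self.patches.append(make_patch(self._latest, new_lines))
        self._latest = list(new_lines)
        return len(self.patches) + 1

    def checkout(self, version):
        """Rebuild the requested version by replaying patches over the base."""
        lines = list(self.base)
        for patch in self.patches[: version - 1]:
            lines = apply_patch(lines, patch)
        return lines
```

After a few `commit` calls, `checkout(n)` rebuilds version `n` from the base and the first `n - 1` patches, so no full copy of any later version has to be stored.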



A local version control system solves the stated problem well, but its defining property, locality, is also its main weakness: it is not designed for collaborative work at all.



Centralized version control system



A centralized version control system is designed to solve the main problem of local version control systems.



Such a system is built around a single server that contains all versions of the files. Clients connect to this server and fetch copies of files from the central repository. Centralized version control systems were the standard for many years. Examples include CVS, Subversion, and Perforce.



These systems are easy to administer because there is only one server. At the same time, that server is a single point of failure. If it goes down, developers cannot download files. The worst case is the physical destruction of the server (or a hard disk crash), which leads to the loss of the code base.
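The single point of failure can be illustrated with a toy model in Python. The classes `CentralServer` and `Client` below are hypothetical and do not mirror the API of CVS, Subversion or Perforce; the sketch assumes that clients keep only the one revision they checked out.

```python
class CentralServer:
    """Toy central repository: the only place where history is stored."""

    def __init__(self):
        self.history = {}  # path -> list of file contents, one per revision
        self.online = True

    def _require_online(self):
        if not self.online:
            raise ConnectionError("central server is unavailable")

    def commit(self, path, content):
        """Store a new revision of `path` and return its 1-based number."""
        self._require_online()
        self.history.setdefault(path, []).append(content)
        return len(self.history[path])

    def checkout(self, path, revision=None):
        """Return the requested revision of `path` (the latest by default)."""
        self._require_online()
        revisions = self.history[path]
        return revisions[-1] if revision is None else revisions[revision - 1]


class Client:
    """Toy client: holds a working copy, but no history of its own."""

    def __init__(self, server):
        self.server = server
        self.working_copy = {}  # path -> content of the checked-out revision

    def checkout(self, path, revision=None):
        self.working_copy[path] = self.server.checkout(path, revision)
        return self.working_copy[path]
```

Once `server.online` is set to `False`, every `checkout` raises `ConnectionError`, and each client is left with only the revisions already in its working copy.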



Although SVN is no longer in fashion, teams occasionally move in the opposite direction, from Git to SVN. The reason is that SVN supports selective checkout, that is, downloading only some of the files from the server. This approach is gaining popularity with monorepositories, which are a topic for a separate discussion.



Distributed version control system



Distributed version control systems eliminate the single point of failure. The client downloads the entire repository instead of only the specific files it is interested in. If any copy of the repository is lost, the code base is not, because it can be restored from any developer's computer. Each copy is a complete backup of the data.



All copies are equal and can be synchronized with each other. This approach is very similar to (and indeed is) master-master replication.
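A minimal Python sketch of this model is given below. The `Repository` class is hypothetical and far simpler than Git or Mercurial: the history is a single chain of commits, and `pull` only fast-forwards, with no merging of diverged histories.

```python
import hashlib


class Repository:
    """Toy distributed repository: every clone carries the complete history."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, content)
        self.head = None

    def commit(self, content):
        """Record `content` on top of the current head and return the commit id."""
        parent = self.head
        commit_id = hashlib.sha1(f"{parent}:{content}".encode()).hexdigest()
        self.commits[commit_id] = (parent, content)
        self.head = commit_id
        return commit_id

    def clone(self):
        """Return an independent copy that includes every commit."""
        copy = Repository()
        copy.commits = dict(self.commits)
        copy.head = self.head
        return copy

    def _is_ancestor(self, ancestor, descendant):
        current = descendant
        while current is not None:
            if current == ancestor:
                return True
            current = self.commits[current][0]
        return False

    def pull(self, other):
        """Copy missing commits from `other`; move head only on a fast-forward."""
        self.commits.update(other.commits)
        if self.head is None or self._is_ancestor(self.head, other.head):
            self.head = other.head

    def log(self):
        """Return the contents from the first commit up to the current head."""
        contents = []
        current = self.head
        while current is not None:
            parent, content = self.commits[current]
            contents.append(content)
            current = parent
        return contents[::-1]
```

Any clone can serve as the source for restoring a lost repository, and any two clones can synchronize with `pull` in either direction, which is the sense in which all copies are equal.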



Systems of this type include Mercurial, Bazaar, Darcs, and Git. Git is discussed in more detail below.



Git history



In 2005, the company behind the BitKeeper version control system severed ties with the Linux kernel community. The community then decided to develop its own version control system. The main goals of the new system were complete decentralization, speed, a simple architecture, and good support for non-linear development.



Conclusion



We looked at the ways of organizing version control systems, discussed how each of them solves the tasks assigned to such systems, covered their advantages and disadvantages, and got acquainted with the history of Git.





