Why is immutability so important?

Hello, Habr!



Today we want to touch on immutability and see whether the topic deserves more serious consideration.



Immutable objects are an immensely powerful tool in programming. Immutability helps you avoid all sorts of concurrency issues and a host of bugs, but immutable constructs can be tricky to understand. Let's take a look at what they are and how to use them.



First, take a look at a simple object:



class Person {
    public String name;
    
    public Person(
        String name
    ) {
        this.name = name;
    }
}


As you can see, the Person object takes one parameter in its constructor and stores it in the public field name. Accordingly, we can do things like this:



Person p = new Person("John");
p.name = "Jane";


Simple, right? We can read or modify the data at any time, as we please. But this approach has a couple of problems. The first and most important is that we expose the field name directly, and thus irrevocably make the internal storage of the class part of its public API. In other words, we cannot change how the name is stored inside the class without rewriting a significant part of our application.



Some languages (for example, C#) let you insert a getter function to work around this problem, but in most object-oriented languages you have to do it explicitly:



class Person {
    private String name;
    
    public Person(
        String name
    ) {
        this.name = name;
    }
    
    public String getName() {
        return name;
    }
}


So far so good. If you now wanted to change the internal storage of the name, say, to the first and last name, you could do this:



class Person {
    private String firstName;
    private String lastName;
    
    public Person(
        String firstName,
        String lastName
    ) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
    
    public String getName() {
        return firstName + " " + lastName;
    }
}


Setting aside the serious problems with representing names this way, it is clear that the getName() API has not changed from the outside.



What about setting names? What would we need to add to be able not only to get the name but also to set it? Something like this:



class Person {
    private String name;
    
    //...
    
    public void setName(String name) {
        this.name = name;
    }
    
    //...
}


At first glance, it looks great, because now we can change the name again. But there is a fundamental flaw in this way of modifying data. It has two sides: philosophical and practical.



Let's start with the philosophical problem. The Person object is intended to represent a person. A person's surname can indeed change, but the function for that would be better called changeName, since that name implies we are changing the surname of the same person. It should also contain the business logic for changing a person's surname rather than act as a plain setter. The name setName suggests, quite logically, that we may overwrite the name stored in the person object at will and with no consequences.
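To make the philosophical point concrete, here is a minimal sketch of what changeName might look like. The validation rule and the formerNames history are assumptions made up for illustration; the real business logic depends on your domain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of Person: changeName carries a business rule
// instead of blindly overwriting the field the way a setter would.
class Person {
    private String name;
    // Assumption for illustration: the business wants a history of former names.
    private final List<String> formerNames = new ArrayList<>();

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public List<String> getFormerNames() {
        // Return a copy so callers cannot edit the history directly.
        return new ArrayList<>(formerNames);
    }

    public void changeName(String newName) {
        // Assumed rule: a person cannot end up without a name.
        if (newName == null || newName.trim().isEmpty()) {
            throw new IllegalArgumentException("A person must have a name");
        }
        formerNames.add(name);
        name = newName;
    }
}
```

Note that this object is still mutable, so the practical problem applies to it just as much as to the setter version.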



The second reason is practical: mutable state (stored data that can change) is prone to bugs. Let's take this Person object and define a PersonStorage interface:



interface PersonStorage {
    public void store(Person person);
    public Person getByName(String name);
}


Note that PersonStorage does not say where exactly the object is stored: in memory, on disk, or in a database. Nor does the interface require an implementation to create a copy of the object it stores. Therefore, an interesting bug may arise:



Person p = new Person("John");
myPersonStorage.store(p);
p.setName("Jane");
myPersonStorage.store(p);


How many people are in the person store now? One or two? And if you now call getByName, which person will it return?



As you can see, two outcomes are possible here: either PersonStorage copies the Person object, in which case two Person records are saved, or it does not and only keeps a reference to the passed object; in the second case, only one object is saved, with the name "Jane". An implementation of the second option might look like this:



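import java.util.HashSet;
import java.util.Set;

class InMemoryPersonStorage implements PersonStorage {
    private final Set<Person> persons = new HashSet<>();

    public void store(Person person) {
        this.persons.add(person); // keeps a reference, not a copy
    }

    public Person getByName(String name) {
        return persons.stream().filter(p -> p.getName().equals(name)).findFirst().orElse(null);
    }
}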


Even worse, the stored data can be changed without even calling the function store. Since the repository contains only a reference to the original object, changing the name will also change the saved version:



Person p = new Person("John");
myPersonStorage.store(p);
p.setName("Jane");


So, in essence, bugs creep into our program precisely because we are dealing with mutable state. This problem can certainly be worked around by explicitly making the repository create a copy of each object, but there is also a much simpler way: working with immutable objects. Let's consider an example:



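final class Person {
    // final: the name is assigned once in the constructor and never reassigned
    private final String name;
    
    public Person(
        String name
    ) {
        this.name = name;
    }
    
    public String getName() {
        return name;
    }
    
    // Returns a new Person; the current object is left untouched
    public Person withName(String name) {
        return new Person(name);
    }
}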


As you can see, instead of the setName method we now use a withName method that creates a new copy of the Person object. If we create a new copy each time, we do without mutable state and the problems that come with it. Of course, this comes with some overhead, but modern compilers can handle it, and if you do run into performance problems, you can fix them later.



Remember:

Premature optimization is the root of all evil (Donald Knuth)
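Returning to withName, the following sketch shows how it behaves next to a storage that only keeps references. The HashSet stands in for such a storage and WithNameDemo is a hypothetical name; Person is the immutable class with the field additionally marked final.

```java
import java.util.HashSet;
import java.util.Set;

// Immutable Person: no setter, only withName.
final class Person {
    private final String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public Person withName(String name) {
        return new Person(name);
    }
}

class WithNameDemo {
    public static void main(String[] args) {
        // Stand-in for a storage that keeps references instead of copies.
        Set<Person> stored = new HashSet<>();

        Person john = new Person("John");
        stored.add(john);

        Person jane = john.withName("Jane"); // john itself is not modified
        stored.add(jane);

        System.out.println(john.getName()); // John
        System.out.println(jane.getName()); // Jane
        System.out.println(stored.size());  // 2
    }
}
```

The stored "John" record can no longer change behind the storage's back, because nothing can modify a Person after it has been constructed.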


It could be argued that a persistence layer that holds a reference to a live object is a broken persistence layer, but such a scenario is realistic. Bad code does exist, and immutability is a valuable tool for preventing such breakages.
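For comparison, a storage that does not hold on to live objects has to copy them on the way in and on the way out. Below is a minimal sketch of that approach for the mutable Person; CopyingPersonStorage and its map keyed by name are assumptions for illustration, not a recommended design.

```java
import java.util.HashMap;
import java.util.Map;

// Mutable Person with a setter.
class Person {
    private String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

// Hypothetical storage that defends itself by copying.
class CopyingPersonStorage {
    private final Map<String, Person> personsByName = new HashMap<>();

    public void store(Person person) {
        // Copy on the way in: later changes to the caller's object
        // do not reach the stored record.
        personsByName.put(person.getName(), new Person(person.getName()));
    }

    public Person getByName(String name) {
        Person stored = personsByName.get(name);
        // Copy on the way out: callers cannot modify the stored record either.
        return stored == null ? null : new Person(stored.getName());
    }
}
```

This works, but every storage implementation has to remember to do it, and the copying code grows with every field added to Person. With immutable objects the same guarantee comes for free.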



In more complex scenarios, where objects are passed through multiple layers of the application, such bugs multiply easily, and immutability prevents state-related bugs from occurring in the first place. Typical examples are in-memory caching and out-of-order function calls.



How immutability helps with parallel processing



Another important area where immutability comes in handy is parallel processing, or more precisely, multithreading. In multithreaded applications, several threads execute code in parallel while accessing the same memory area. Consider a very simple listing:



if (p.getName().equals("John")) {
    p.setName(p.getName() + "Doe");
}


This code is not buggy by itself, but when it runs in parallel, threads can preempt each other and things get messy. Here is the same snippet with a comment:



if (p.getName().equals("John")) {

    // meanwhile, another thread changes the name, so it is no longer John
    
    p.setName(p.getName() + "Doe");
}


This is a race condition. The first thread checks whether the name equals "John", but then a second thread changes the name. The first thread carries on, still assuming that the name is "John".



Of course, one could use locking to ensure that only one thread enters the critical section at any given time, but the lock can become a bottleneck. If the objects are immutable, however, such a scenario cannot develop, since the object that p refers to never changes. If another thread wants to make a change, it creates a new copy, which the first thread does not see.
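One possible way to share an immutable Person between threads, sketched below, is to keep the current version in a java.util.concurrent.atomic.AtomicReference and swap in new copies. This is our own illustration rather than something the scenario above prescribes; RenameDemo and addSurnameIfJohn are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable Person: changes produce a new object.
final class Person {
    private final String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public Person withName(String name) {
        return new Person(name);
    }
}

class RenameDemo {
    // The check and the replacement form one atomic step: updateAndGet
    // re-applies the function if another thread swapped the reference
    // in the meantime, so the check always matches the object it replaces.
    static void addSurnameIfJohn(AtomicReference<Person> ref) {
        ref.updateAndGet(current ->
            current.getName().equals("John")
                ? current.withName(current.getName() + "Doe")
                : current);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicReference<Person> shared = new AtomicReference<>(new Person("John"));
        Person snapshot = shared.get(); // this object can never change

        Thread renamer = new Thread(() -> shared.set(new Person("Jane")));
        Thread appender = new Thread(() -> addSurnameIfJohn(shared));
        renamer.start();
        appender.start();
        renamer.join();
        appender.join();

        System.out.println(snapshot.getName());     // John
        System.out.println(shared.get().getName()); // Jane
    }
}
```

Whatever the thread scheduling, the result is never "JaneDoe": a thread that holds a Person sees a consistent object, and the conditional update is only applied to the version it was actually checked against.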



Conclusion



Basically, my advice is to always keep mutable state in your application to a minimum. If you do use it, confine it tightly behind well-designed APIs and do not let it leak into other areas of the application. The fewer pieces of code you have that contain state, the less likely state errors are to occur.



Of course, most programming problems cannot be solved without any state at all. But if we treat all data structures as immutable by default, there will be far fewer accidental bugs in the code. If you really are forced to introduce mutability, do it carefully and think through the consequences, rather than making it the starting point for all your code.


