Growing nested sets in .Net conditions





Hi, my name is Anton and I am a developer. Son I gave birth, the house built bought, it remains to grow a tree. Since I'm not very much an agronomist, I had to code a tree. Our project consists of several microservices on .Net Core, entities that form hierarchical relationships are stored in the PostgreSQL database. I'll tell you what structure is better to choose for storing such data, why exactly Nested sets, what rakes you can step on, and how to live with it later.



What are Nested sets



Everyone knows how a tree grows in a garden, and in the case of Nested sets, the tree grows like this: for each node, two fields Left and Right are stored, they are integers. The logic here is that Left is less than Right, and if a node has children, then all the Left and Right values ​​of the child nodes must be between the corresponding values ​​of the parent.





How they grow with us



We have dedicated a separate microservice for storing hierarchies. Fronted often has to draw a complete tree as well as subtrees of elements in the detail view, while inserting and moving elements is relatively rare. For such a case, Nested sets are perfect. Stored in a table like this:





Id is the identifier of the corresponding entity, using it you can get complete information from the appropriate microservice. TreeId is the identifier of the root element. The trees in the project are mostly small, but there are many of them. EntityFramework is used to read from the database, the class corresponds one to one to the table structure.



How to read



With this storage method, getting the elements of the subtree is simple - we request all nodes whose Left is greater than the Left of the parent, and the Right, respectively, is less. We also check that all nodes belong to the same tree - by the TreeId column. This is the most frequently needed operation and is performed quickly. For example, like this:



dataContext.Nodes.Where(_ => 
                        _.Left > node.Left && 
                        _.Right < node.Right && 
                        _.TreeId == node.TreeId); 


Another frequently performed operation is finding all the parent nodes of an object. Here, too, is not difficult - we request tree nodes whose Left is less than the Left of the parent element, and the Right, respectively, is larger. For example, in this way:



dataContext.Nodes.Where(_ => 
                        _.Left < node.Left && 
                        _.Right > node.Right && 
                        _.TreeId == node.TreeId);


How to grow new branches



Let's move on to the difficult part - transplant, i.e. adding nodes or moving from one subtree to another. Let's figure out how to make a transfer, because this operation essentially includes everything you need to add a child element.



The first thing to do for a successful insert is to allow only one insert or update operation. To do this, we will use blocking. It makes no sense to lock the entire table, since nodes can only navigate within one tree, so it is enough to lock only this tree. To do this, execute the following SQL query:



select * from "Nodes" where "TreeId" = <TreeId> for update; 


This will allow us to read the elements of the tree immediately, but if we need to add or change a node in the tree where such an operation has already begun, we will have to wait for the previous one to complete.



The next step is to prepare the place for the transplant - create a gap between Left and Right. Let's calculate how much space is needed - this is the difference between Right and Left of the root element of the subtree being moved. Add this difference to all the children of the node that will become the new parent. We can catch Exception here, and here's why. To speed up the search and reading in the table, two B-Tree indexes have been added, on the Left and Right fields, and if you change the values ​​of these fields at the same time, EntityFramework may generate a circular dependency error, since two indices can be recalculated simultaneously on one line. The error will be of type InvalidOperationException with the following message:



Unable to save changes because a circular dependency was detected in the data to be saved: 'Node [Modified] <- Index {' Right ',' TreeId '} Node [Modified] <- Index {' Left ',' TreeId '} Node [Modified] '.


To get around, we'll just make two separate loops - in one we'll change Left, in the other Right, and, of course, we'll save the changes after each of them.




            var nodesToMove = await dataContext.Nodes 
                .Where(n => 
                    n.Right >= parentNodeRight && 
                    n.TreeId == parentNode.TreeId) 
                .OrderByDescending(n => n.Right) 
                .ToListAsync(); 
 
            foreach (var n in nodesToMove) 
            { 
                n.Left += distance; 
            } 
 
            await dataContext.SaveChangesAsync(); 
 
            foreach (var n in nodesToMove) 
            { 
                n.Right += distance; 
            } 
 
            await dataContext.SaveChangesAsync(); 


Further, the transplant itself - the transfer distance will be equal to the difference between the Left of the new parent and the Left of the root of the subtree. Add this value to the Left and Right of all nodes in the subtree we are moving.




            var nodes = await dataContext.Nodes 
                .Where(n => 
                    n.Left >= node.Left && 
                    n.Right <= node.Right && 
                    n.TreeId == node.TreeId) 
                .ToListAsync(); 
 
            foreach (var n in nodes) 
            { 
                n.Left += distance; 
                n.Right += distance; 


And the last thing to do is to close the gap where the subtree was moved. Let's request all nodes to the right of this subtree - these will be elements whose Right is greater than or equal to the Left of the root of the subtree, and move them to the vacant space. To do this, subtract from the Left and Right of all these nodes the difference between the Right and Left of the root. Here, too, you have to do two separate loops:




            var nodesToMove = await dataContext.Nodes 
              .Where(n => n.Right >= gap.Left && n.TreeId == gap.TreeId) 
              .ToListAsync(); 
            nodesToMove = nodesToMove 
                .Where(n => n.Right >= gap.Right) 
                .OrderBy(n => n.Right) 
                .ToList(); 
 
            foreach (var n in nodesToMove) 
            { 
                if (n.Left >= gap.Right) 
                { 
                    n.Left -= distance; 
                } 
            } 
 
            await dataContext.SaveChangesAsync(); 
 
            foreach (var n in nodesToMove) 
            { 
                n.Right -= distance; 
            } 
 
            await dataContext.SaveChangesAsync();


Conclusion



Let's see what has grown. We got a tree with the ability to quickly read children and parents. If your project needs to frequently read data and retrieve subtrees, Nested sets are a great choice. We must be prepared for the fact that there may be problems with insert and update operations, but they are quite solvable. If you have to add and transfer frequently, it is better to think about using some other algorithm, or consider some hybrid options. For example, cross Nested sets and Adjacency List. To do this, in each node, in addition to Left and Right, you need to add a direct link to the parent element identifier. This will allow you to quickly navigate the hierarchy and find parent and child nodes, and, moreover, will increase the reliability of the algorithm.



All Articles