Essay on Implementing a Directed Graph with Unit Edges Using PL / pgSQL

This article provides general ideas and sketches for implementing a directed graph in PostgreSQL.



The graph was used to implement subordination between employees, instead of the previously used ancestor-child method in the department table.



The experience turned out to be successful, maybe it will be useful to someone and help save time. At one time I was looking for an implementation on pqSQL, but apparently I was looking badly. I had to implement it myself. Which, in general, is even for the best, the task is interesting, it is always nice to do something with your own hands, so that the time is not wasted.



Input data



In general, there is a standard table "employee position" with the primary key id . Positions have a hierarchy of subordination "chief-employee". The idea is that the links between posts are stored in a separate entity graph.

Dijkstra's algorithm was used to find a path in the graph, a general description can be found, for example, here



Output



  • List of subordinate employees for this
  • List of bosses for this employee
  • Is the employee a subordinate for this
  • List of subordinate employees for this (path from boss to employee)


Implementation using PL / pgSQL



Storing a graph as a table of edges



The vertices are the id values โ€‹โ€‹of the target table.



CREATE TABLE graph
(  
  vertex integer NOT NULL ,         --id    
  edge integer NOT NULL ,           --id 
  vertex_order integer NOT NULL --  ( 0 , 1 )
);


To generate the id of the edge, use the sequence



CREATE SEQUENCE enge_seq OWNED BY graph.edge; 


Filling the edge table



Two INSERT operations are used to insert a new edge (vertices id0, id1 )



--  id 
new_id = nextval('enge_seq');


-- 
INSERT INTO  graph (vertex  , edge , vertex_order  ) VALUES ( id0 , new_id , 0  );


-- 
INSERT INTO  graph (vertex  , edge , vertex_order  ) VALUES ( id1 , new_id , 1  );


Retrieving a list of subordinate employees for a given current_id



SELECT 
  id 
FROM 
  graph
WHERE 
  vertex_order  = 1 AND 
  edge IN (SELECT edge FROM graph WHERE vertex  = current_id AND vertex_order = 0  );


Getting the list of bosses for a given current_id



SELECT 
  id 
FROM 
  graph
WHERE 
  vertex_order  = 0 AND 
  edge IN (SELECT edge FROM graph WHERE vertex  = current_id AND vertex_order = 1  );


Generation of an availability matrix (Dijkstra's algorithm) from a given vertex start_id



Creating a temporary table tmp_matrix
CREATE TEMPORARY TABLE tmp_matrix
AS
(
  SELECT 
    DISTINCT vertex  , 
    FALSE AS label ,
    999999 AS distance ,
    FALSE AS is_finish 
  FROM 
   graph
);


Initial population of the tmp_matrix table
UPDATE tmp_matrix 
SET label = TRUE , distance = 0 
WHERE vertex = current_id ; 

current_id = start_id ;

SELECT 
  distance
INTO 
  current_distance
FROM 
  tmp_matrix
WHERE 
  vertex = current_id;

SELECT 
  EXISTS (SELECT 1 FROM tmp_matrix WHERE label = FALSE )
INTO
  not_visit;


Filling in the availability matrix
WHILE not_visit 
LOOP
  FOR v_rec IN
    SELECT 
      *
   FROM 
     tmp_matrix
  WHERE
     NOT label AND 
    --    
     vertex IN ( SELECT 
                       id 
                     FROM 
                      graph
                    WHERE 
                      vertex_order  = 1 AND 
                      edge IN (SELECT 
                                     edge 
                                    FROM 
                                      graph 
                                    WHERE 
                                      vertex  = current_id AND vertex_order = 0  ))
  LOOP    
    --  
    IF v_rec.distance > (current_distance +1)
    THEN 
      UPDATE tmp_matrix SET distance = (current_distance +1) WHERE vertex = v_rec.vertex;
    END IF ;
    
   --  
   IF SELECT 
        NOT EXISTS 
        (
          SELECT 1 FROM graph WHERE vertex = v_rec.vertex AND vertex_order = 0 
        )
   THEN
     UPDATE tmp_matrix SET label = TRUE , is_finish = TRUE WHERE vertex = v_rec.vertex;
   END IF ;   
  END LOOP;  

  UPDATE tmp_matrix SET label = TRUE WHERE vertex = current_id ;

  -- 
  SELECT 
    vertex
  INTO
    current_id 
  FROM 
    tmp_matrix 
  WHERE 
    NOT label AND
    distance < 999999 ;

  SELECT 
    distance
  INTO 
    current_distance
  FROM 
    tmp_matrix 
  WHERE 
     vertex = current_id ;

  SELECT 
    EXISTS (SELECT 1 FROM tmp_matrix  WHERE label = FALSE )
  INTO
   not_visit ;

  IF current_id IS NULL 
  THEN
     not_visit = FALSE ; 
  END IF;
END LOOP;


Return the resulting availability matrix
SELECT
   vertex ,
   label , 
   distance ,
   is_finish
FROM 
  tmp_matrix 
WHERE 
  distance != 999999 ;




Whether the employee with check_id is a subordinate for the given start_id



Get the availability matrix for start_id (see above).



IF EXISTS 
  ( SELECT distance FROM tmp_matrix WHERE vertex = check_id  )
THEN 
   RETURN TRUE;
ELSE
   RETURN FALSE;
END IF;


List of subordinate employees for this one (path from boss start_id to employee finish_id)



Get the availability matrix for start_id (see above).



Populate path table between start_id and finish_id tables
current_id = finish_id ;
result[1] = finish_id ; 
WHILE current_id != start_id 
LOOP
  SELECT 
    par.id 
  FROM 
   ( SELECT 
       id 
     FROM 
      graph
    WHERE 
      vertex_order  = 0 AND 
      edge IN (SELECT 
                     edge 
                    FROM 
                      graph 
                    WHERE 
                      vertex  = current_id AND vertex_order = 1  )) par
  JOIN  tmp_matrix m ON ( par.id = m.vertex  )
  INTO 
    parent_id
  LIMIT 1 ;

  SELECT 
    array_append( result , parent_id )
  INTO 
    result ;

 -- 
current_id = parent_id ;   
END LOOP;


As a result , the path in the result array will be formed as a set of id in reverse order. For ease of viewing, the array can be reversed in any way.



All Articles