Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Pgsql Hackers > Transitive clos...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 1 of 2 Topic 3087 of 9770
Post > Topic >>

Transitive closure of a directed graph

by sgunderson@[EMAIL PROTECTED] ("Steinar H. Gunderson") Nov 2, 2005 at 06:31 PM

Hi,

I was asked to post this here for any interested parties -- please Cc me
on
any comments/followups as I'm not subscribed to -hackers.

This is a sort-of writeup on how to find the transitive closure[1] of a
directed graph G(V,E), or G+(V,E) if you want to get into notation. 
(There
are multiple ways to do this, though, and this only describes one of them.
I've been told it could be possible to do this with WITH RECURSIVE, but
I've
never used it myself and PostgreSQL doesn't sup****t it, so it's left as an
excercise for the reader :-) ) 

In short, the transitive closure of a graph is defined as follows: The
transitive closure G+(V,E) contains an edge A -> B if and only If there is
a
path from A to B (possibly via other vertices) in the graph G(V,E). In
other
words, one question the transitive closure can answer quickly (to other
queries) is "is there a path from A to B?". (We use this in our door card
access system, which sup****ts group inheritance -- a group can be a
subgroup
of another group and thus inherits all its door access.)

A relational database is unable to store and operate on a graph directly,
but
you can store the vertices and edges separately, as in:

  CREATE TABLE vertices (
    id INTEGER PRIMARY KEY
    <any other fields you'd want here>
  );
  CREATE TABLE edges (
    upper INTEGER REFERENCES vertices(id),
    lower INTEGER REFERENCES vertices(id)
  );

We'll be using PL/PgSQL, which unfortunately can't handle tem****ary tables
very well at the moment, so we'll also define a tem****ary copy we can work
on
from our function, identical to edges:

  CREATE TABLE temp (
    upper INTEGER,
    lower INTEGER
  );

The basic idea of the technique is to grow the closure outwards from the
original graph; this makes it somewhat like the Bellman-Ford algorithm
implemented in SQL. It is not particularily effective on dense graphs
(since
it generates potentially V^2 records in the tem****ary table) or in graphs
with very long paths (since it needs one iteration for every step of the
longest path), but works well on sparse graphs without too long paths.
It's also very simple, and tries to leave as much as possible of the work
to
PostgreSQL itself. (I implemented a variant using depth-first search and
simulated queues once, and it was 10-20 times slower, possibly due to
PL/PgSQL overhead -- an implementation in C or Perl might fare a lot
better.)

So, for every iteration we basically do this: For every edge (A,B) in our
table, find all edges (B,C) that we don't already have found. Since
there's a
path between A and C, insert (A,C) as an edge. Repeat until there are no
more
edges inserted, and we have our transitive closure.

The function goes as follows:

CREATE FUNCTION transitive_closure() RETURNS SETOF edges AS
'
DECLARE
        r record;
BEGIN
        -- Start with the initial set of edges as a base
        INSERT INTO temp SELECT * FROM edges;

        -- Grow the graph until nothing more happens
        LOOP
                INSERT INTO temp
                    SELECT e1.upper, e2.lower
                    FROM temp e1 JOIN temp e2 ON e1.lower=e2.upper
                    WHERE (e1.upper, e2.lower) NOT IN (
                        SELECT * FROM temp
                    );
                EXIT WHEN NOT FOUND;
        END LOOP;
        -- Return the entire data set
        FOR r in SELECT * FROM temp LOOP
                RETURN NEXT r;
        END LOOP;             

        -- Clean up
        TRUNCATE temp;
        RETURN;
END;
' LANGUAGE plpgsql;

Also note that after this, if there's any edge (A,A) in the graph, there
is
a loop somewhere, and the original graph is not a DAG.

A simple example, just for reference:

  graph=# select * from vertices;
   id 
  ----
    1
    2
    3
    4
    5
  (5 rows)

  graph=# select * from edges;
   upper | lower 
  -------+-------
       1 |     2
       2 |     3
       2 |     4
       1 |     5
       5 |     4
  (5 rows)

The matching graph in ASCII art looks something like (I haven't drawn the
arrows):
 
         1
       /   \
      2     5
    /  \   /
   3     4
  
The matching transitive closure then becomes:

  mdb2=# select * from transitive_closure();
   upper | lower 
  -------+-------
       1 |     2
       2 |     3
       2 |     4
       1 |     5
       5 |     4
       1 |     3
       1 |     4
       1 |     4
  (8 rows)

Or, again in ASCII (a bit harder this time :-) )
  
     --- 1
    /  / | \
   /  2  |  5
   |/  \ | /
   3     4

Note that there's a duplicate in there (because both the path 1 -> 2 -> 4
and 1 -> 5 -> 4 are added in the same iteration and thus produce the edge
1 -> 4); you might want to add a DISTINCT (either at the end or in each
iteration) if that's a problem.

/* Steinar */

[1] http://en.wikipedia.org/wiki/Transitive_closure
-- 
Homepage: http://www.sesse.net/

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
 




 2 Posts in Topic:
Transitive closure of a directed graph
sgunderson@[EMAIL PROTECT  2005-11-02 18:31:56 
Re: Transitive closure of a directed graph
sgunderson@[EMAIL PROTECT  2005-11-10 20:18:51 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan13V112 Sun Jul 20 4:32:34 CDT 2008.