What is the best way to model hierarchical data in SQL?

549 Views Asked by At

I have relationship data in the form:

Parent ID   ParentName  ParentType  RelatedToID RelatedToName   RelatedType
----------------------------------------------------------------------------
     1      A           Business           2        B           Individual
     1      A           Business           4        D           Business
     1      A           Business           3        C           Business
     1      A           Business           6        F           Business
     1      A           Business           3        C           Business
     1      A           Business           9        I           Business
     1      A           Business           9        I           Business
     1      A           Business           3        C           Business
     1      A           Business          12        L           Business
     1      A           Business           5        E           Business
     2      B           Individual         1        A           Business
     2      B           Individual         3        C           Business
     2      B           Individual         3        C           Business
     2      B           Individual         6        F           Business
     2      B           Individual         3        C           Business
     2      B           Individual         4        D           Business
     2      B           Individual         4        D           Business
     3      C           Business           1        A           Business
     3      C           Business           1        A           Business
     3      C           Business           2        B           Individual
     3      C           Business          10        J           Business
     3      C           Business           6        F           Business
     3      C           Business          14        N           Business
     3      C           Business           4        D           Business
     3      C           Business           7        G           Business
     3      C           Business           1        A           Business
     3      C           Business           2        B           Individual
     4      D           Business           2        B           Individual
     4      D           Business           3        C           Business
     4      D           Business           3        C           Business
     4      D           Business          10        J           Business
     4      D           Business           1        A           Business
     4      D           Business           1        A           Business
     4      D           Business           7        G           Business
     5      E           Business           1        A           Business
     5      E           Business           1        A           Business
     6      F           Business           2        B           Individual
     6      F           Business           1        A           Business
     6      F           Business           3        C           Business
     6      F           Business           3        C           Business
     6      F           Business           1        A           Business
     7      G           Business           3        C           Business
     7      G           Business           4        D           Business
     7      G           Business           3        C           Business
     7      G           Business           3        C           Business
     8      H           Individual         9        I           Business
     8      H           Individual         9        I           Business
     9      I           Business           1        A           Business
     9      I           Business           8        H           Individual
    10      J           Business           3        C           Business
    10      J           Business           3        C           Business
    10      J           Business          13        M           Business
    10      J           Business           3        C           Business
    10      J           Business           4        D           Business
    10      J           Business          11        K           Individual
    11      K           Individual        10        J           Business
    11      K           Individual        13        M           Business
    11      K           Individual        10        J           Business
    11      K           Individual        13        M           Business
    12      L           Business           1        A           Business
    13      M           Business          11        K           Individual
    13      M           Business          10        J           Business

I'm using DiagrammeR to make a relationship chart based on this data. I need to transform this data in SQL to feed into graphviz. i.e:

  • A column in my final table in the form of ParentID "->" RelatedToID (e.g "A->B")
  • exclude any redundant backwards relationships i.e A->B & B->A should be reduced to A->B

I'm willing to go 5 levels into the relationship tree in my data prep step. Ultimately this example above should reduce to this:

ReadySet

A->B
A->D
A->C
A->F
A->I
A->L
A->E
B->C
B->F
B->D
C->J
C->F
C->N
C->D
C->G
D->J
D->G
H->I
J->M
J->K
K->M

which in GraphViz results in:

enter image description here

What I have tried and my difficulty:

I started with defining the parents in which the ParentType = 'Individual' I then used self joining to obtain the hagiarchy on a row level.

What I want (and can't seem to do) is, produce a single SQL table that will produce ReadySet if a user makes a selection Name that is contained within the breadth of the relationship tree i.e if A is selected then ReadySet or if M is selected then ReadySet… There are obviously more parent ID's/names in my entire dataset.

1

There are 1 best solutions below

0
On

You need a table to define the edges between the nodes and their direction.

create table edges (
  from_id bigint not null references nodes(id),
  to_id bigint not null references nodes(id),
  primary key(from_id, to_id);
);

"nodes" here is whatever table actually holds the data.

If the relationship is two way, we need to rows, one for A -> B and one for B -> A.

Then do a recursive CTE to find all the matching rows.

with recursive ready_set as (
  select *
  from edges
  where from_id = ?
  union
  select e.*
  from edges e
  inner join ready_set rs on e.from_id = rs.to_id
)
select *
from ready_set;

The first part of the union is the starting condition, and the second is the recursion which joins with the CTE.

For example, if we set up our edges like so:

1 <-> 2 <-> 3 -> 4
1 <-> 5 <- 6

insert into edges values
  (1, 2), (2, 1),
  (2, 3), (3, 2),
  (3, 4),
  (1, 5), (5, 1),
  (6, 5);

Everything has a path from each other, except 6 which is one way to 5. If we ask for 5, we'll get all the edges except 6, 5. If we ask for 4, we'll get nothing because 4 has no outgoing connections.

And if we select distinct to_id from ready_set; starting at 5 we'll get just the node IDs 1, 2, 3, 4, and 5. No 6.

Try it.