Is it practical to store unique paths through a directed acyclic graph?

140 Views Asked by At

A client has a database of composable objects stored as a simple DAG.

item (id, info...)
link (parent_id, child_id)

For archaic (and unfortunately non-negotiable) certification purposes they now want to store additional information for every combination of source and vertex. e.g. for a composition

  A
  |
  B
 / \
C   D

They want a cert (A,B) cert (A,C) and cert (A,D). Easy right?

cert(parent_id, child_id, info...)

Now the thing is these certs are supposed to be for each path not each node. So for items which have common ancestors

  A
 / \
B   C
 \ /
  D

We would need certs (A,B), (A,C) and (A,D)[via B] and (A,D)[via C]

I cannot think of a way to store these certs that doesn't involve storing a reference to each vertex in the path that it represents, but that seems frighteningly exponential. There are hundreds of thousands of records and if a few graphs incorporate just a few common ancestors things could get out of hand fast.

Is there a better way to store these paths than just referencing every vertex in every path for every cert?

1

There are 1 best solutions below

1
On BEST ANSWER

You will certainly have as many records as there are paths if there is need to store them. No way around here.

If I understand correctly you seek for a design that prevents you from storing all vertices as a single attribute ('A,B,D') or a separate table.

In this case, take a hierarchical approach:

path_id  segment_id  next_vertex      -- which path it represents
1                    A
2                    B
3                    C
4                    D
5         1          B                 -- A B
6         1          C                 -- A C
7         2          D                 -- B D
8         3          D                 -- C D
9         5          D                 -- A B D
10        6          D                 -- A C D

Here segment_id represents the subpath without last vertex. Now, for example, path 10 represents "A->C->D", which consists of "A->C" (path 6) with vertex D added at the end. You can traverse the hierarchy by going down the tree same way.

Note we also got here 'degenerate paths', containing just a single vertex, where segment_id is null.