Find the minimum set of vertices in a DAG that disconnects a certain fraction of paths

Question

Find the minimum set of vertices in a DAG that disconnects a certain fraction of paths

2.3k Views Asked by Dant3 At 21 September 2015 at 10:56

The problem is given as follows: Given a DAG and a number 0 < p ≤ 1, return the minimum-cardinality set of vertices that disconnects at least a p-fraction of paths from a source (i.e., no incoming arcs) to a sink (i.e., no outgoing arcs). For p = 1, the problem is equivalent to minimum cut. For other values of p, however, I'm not sure what the answer will be.

An algorithm I'm thinking of is to first compute the min-cut set for a DAG and then try to prune it to satisfy the criteria. This by itself is interesting to see if the subset that we find is actually the minimum cut-set for the specific p that is given. The problem with this algorithm is that it is wasteful, because it computes many nodes that we don't need in the final answer and in fact, is solving "bigger" problem in the first place.

Any pointers to the solution of this problem? wouldn't it be possible that for some of the algorithms of min-cut, we put this extra constraint as an early-stopping criterion?

For checking how many paths are removed, suppose we have indexed each vertex (and will keep it updated if needed) so that we know how many paths get disconnected by their removal. Please do not worry for the complexity of the index being updated. One last thing, there is no constraint on the resulting components in terms of size or anything.

Original Q&A

There are 2 best solutions below

**A.S.H** · Answer 1 · 2015-09-21T12:35:08.907000

EDIT1: after seeing the OP's update, this solution applies to the case of one source u and one sink v.

EDIT2: this is actually a heuristic, see the counter-example in the comments below.

This is a O(V+E) solution based on the method of counting paths provided in the following thread (pointed out by David Eisenstat, thanks) :

Number of paths between two nodes in a DAG

In a first phase, we will apply exactly the “backward” method suggested by stalker. At the end of this phase, we will obtain and store the following information:

For each vertex i, the number F(i, v) of paths from i to v
F(u, v), the number of paths from source u to sink v.

In a second phase, we proceed forward with the same method: we simulate the algorithm as if the question was to find the “backward paths” from v to u. At the end, we obtain:

For each vertex i, the number B(i, u) of backward paths from i to u. Obviously B(i, u) = F(u, i)
B(v, u) which is equal to F(u, v).

In the third phase, we compute for each vertex i, the number P(i) = F(u, i) * F(i, v). It is easy to prove that the number of (u, v) paths that traverse i is P(i). Therefore, removing the vertex i results in removing P(i) paths.

In the fourth and last phase, we proceed in a greedy way: remove the vertex that has the highest P(i) until the total number of removed paths exceeds p*F(u, v)

The overall algorithm is O(V+E), since:

According to the reference post, the first two phases are O(V+E).
The third and fourth phases are obviously O(V), hence O(V+E)

**David Eisenstat** · Answer 2 · 2015-09-21T13:24:18.653000

Here's an idea for getting approximately optimal solutions.

There's a variant of set cover that I want to call partial set cover where we have sets and elements and want the minimum number of sets whose union contains a p-fraction of elements. To apply to this problem, sets correspond to nodes, and elements correspond to maximal paths. (Yes, there are too many paths to do this naively; see below.) A set contains an element if and only if the node is contained in the path.

It's not too hard to write partial set cover as an integer program.

minimize sum_{sets S} x_S
subject to
for all elements e, sum_{sets S containing e} x_S >= y_e
sum_{elements e} y_e >= p * number of elements
for all sets S, x_S in {0, 1}      # 1 if set S is chosen
for all elements e, y_e in [0, 1]  # 1 if element e is covered

This program can be solved by an integer program solver. The solvers are surprisingly good, though they of course cannot promise optimal solutions to this NP-hard set cover generalization.

In an interesting DAG, there are of course far too many paths to be enumerated. Sampling to the rescue! Since it's easy to count maximal paths in DAGs, it's easy to sample them uniformly at random. Sample a number of paths and use these as the elements in the integer program.

The tradeoff is that with more paths, the estimate of the objective is better, but the time to solve the integer program is worse. Hoeffding's inequality gives some hint as to the proper choice of the number of samples. With n vertices, there are 2^n possible solutions. We want the estimated fraction for each of these to be accurate to within some epsilon. Hoeffding says that we need to choose the number of samples to be Theta(n/epsilon^2) so that, almost all of the time, all of the estimates are approximately correct. I'd work out the exact constant, but I doubt that it's practically relevant.

Find the minimum set of vertices in a DAG that disconnects a certain fraction of paths

There are 2 best solutions below

Related Questions in ALGORITHM

Related Questions in MATH

Related Questions in COMPLEXITY-THEORY

Related Questions in GRAPH-THEORY

Related Questions in MINIMUM-CUT

Trending Questions

Popular # Hahtags

Popular Questions