Combinatorial optimization with interdependent variables

240 Views Asked by At

This is my first post on these forums, thanks in advance for all responses.

I am developing a java application in which I have encountered what I believe is termed a “Combinatorial Optimization Problem”. I have only basic math skills, so trying to investigate the setup of such a problem have not been fruitful so far.

Basically, what I would like to do is to program the most efficient way of finding the optimal subset n of a larger set N with variables v1, v2, v3 etc. The variables all have a definite value/score in addition to a value/score that is dependant on certain other variables that may or may not be included in the subset.

I am interested in selecting the subset which gives the maximum total value/score.

So, for example, the full set N consists of the following variables and have the following definite values as well as relations to the other variables:

v1  8   { v2 ; v8 }
v2  6   { v1 ; v4 }
v3  9   { }
v4  7   { v2 ; v5 ; v8 }
v5  6   { v4 ; v10 }
v6  8   { v7 }
v7  5   { v6 }
v8  9   { v1 ; v4 }
v9  6   { } 
v10 7   { v5 }

Having a relation to one or more other chosen variable means that the total value will have a certain “take-off” – for the sake of this example let’s say one for each relation. So selecting the first five variables as a subset will give a total value of 30:

v1  8   { v2 ; v8 }      = 8 - 1 = 7
v2  6   { v1 ; v4 }      = 6 - 2 = 4
v3  9   { }              = 9 - 0 = 9
v4  7   { v2 ; v5 ; v8 } = 7 - 2 = 5
v5  6   { v4 ; v10 }     = 6 - 1 = 5

This is of course not a problem for such small sets, but I am currently facing sets of 100K and subsets of 10K – with the current algorithm, my computer calculated the solution in 3 days!

I do not necessarily need a code for solving this, but rather the optimal mathematical way to do it (if there are any!). Keep in mind though that I’m having a hard time understanding mathematical notation above basic level.

Again, thanks in advance for all responses!

3

There are 3 best solutions below

1
On BEST ANSWER

Unfortunately this problem is NP-hard to find the optimal solution.

If you could solve this problem efficiently then you could solve the NP-hard maximum independent set problem by setting the value for each vertex to 1, and a very large penalty for each edge.

So you should instead look for approximate solutions. You may find you get a reasonable answer with simulated annealing or genetic algorithms.

0
On

You can see your set as a graph. Each vX is a node/vertice with the respective value. Example v1 is a node/vertice with value 8, v2 is a node/vertice with value 6, etc. Then there are edges between them. Example v1 has 2 edges: one for v2 and another for v8. Each edge can also have a value (in your case -1).

So if you use graphs, and select v1 to v5: you have 8 + 6 + 9 + 7 + 6 (vertice value) -1 -1 -1 -1 -1 -1 (edges value).

Try to see this http://jgrapht.org/ to see if it helps you.

Also see some Graph Teory: http://www.people.vcu.edu/~gasmerom/MAT131/graphs.html. Observe Longest/Shortest Path algoritms (example: http://www.maxburstein.com/blog/introduction-to-graph-theory-finding-shortest-path/)

0
On

To a linear program solver, take an input like

v1  8   { v2 ; v8 }
v2  6   { v1 ; v4 }
v3  9   { }
v4  7   { v2 ; v5 ; v8 }
v5  6   { v4 ; v10 }
v6  8   { v7 }
v7  5   { v6 }
v8  9   { v1 ; v4 }
v9  6   { }
v10 7   { v5 }

and convert it to a integer program like

maximize   8*v1 - v1v2 - v1v8
         + 6*v2 - v2v1 - v2v4
         + 9*v3
         + 7*v4 - v4v2 - v4v5 - v4v8
         + 6*v5 - v5v4 - v5v10
         + 8*v6 - v6v7
         + 5*v7 - v7v6
         + 9*v8 - v8v1 - v8v4
         + 6*v9
         + 7*v10 - v10v5

subject to
v1 + v2 - v1v2 <= 1
v1 + v8 - v1v8 <= 1
v2 + v1 - v2v1 <= 1
v2 + v4 - v2v4 <= 1
v4 + v2 - v4v2 <= 1
v4 + v5 - v4v5 <= 1
v4 + v8 - v4v8 <= 1
v5 + v4 - v5v4 <= 1
v5 + v10 - v5v10 <= 1
v6 + v7 - v6v7 <= 1
v7 + v6 - v7v6 <= 1
v8 + v1 - v8v1 <= 1
v8 + v4 - v8v4 <= 1
v10 + v5 - v10v5 <= 1

binary v1, v1v2, v1v8,
       v2, v2v1, v2v4,
       v3,
       v4, v4v2, v4v5, v4v8,
       v5, v5v4, v5v10,
       v6, v6v7,
       v7, v7v6,
       v8, v8v1, v8v4,
       v9,
       v10, v10v5

Your instance size is likely too much for an optimal solution, but one never knows...