I am trying to solve the following problem:
- I have several blocked sets (which may contain duplicate elements).
- I must pick a (varying) number of elements from each blocked set to unblock it.
- I am only allowed to pick elements that also occur in the picking set.
- Whenever I remove an element from a blocked set I must also remove it from the picking set.
- The picking set may contain more or less elements than strictly required.
Minimal example:
//Syntax for blocking sets:
//Name (number of elements required to pick): {elements in set}
//Syntax for picking set:
//Name: {elements in set}
BS1 (1): {0, 1}
BS2 (1): {1, 2}
Picking Set: {1, 2}
Possible solution:
BS1 (1): {0, 1} <- take 1
BS2 (1): {1, 2} <- take 2
Picking Set: {1, 2} <- remove 1, 2
If picking 1 from BS2, the problem becomes unsolvable. The picking set will reduce to {2} and BS1 only contains {0, 1}, making further picks impossible.
A more difficult scenario:
BS1 (1): {1, 2, 4}
BS2 (2): {2, 3, 4}
BS3 (3): {1, 3, 4, 4}
Picking Set: {1, 2, 3, 4, 4, 4}
Possible Solution:
BS1 (1): {1, 2, 4} <- take 1
BS2 (2): {2, 3, 4} <- take 2, 4
BS3 (3): {1, 3, 4, 4} <- take 3, 4, 4
Picking Set: {1, 2, 3, 4, 4, 4} <- remove all
This scenario is solvable in multiple ways, but certain picks will lead to a dead end:
BS1 (1): {1, 2, 4} <- take 1
BS2 (2): {2, 3, 4} <- take 2, 3
BS3 (3): {1, 3, 4, 4} <- take 4, 4, and then dead end
Picking Set: {1, 2, 3, 4, 4, 4} <- remove all but one 4
My solution:
I wrote a recursive brute force algorithm that tests all picking combinations, then all following combinations of the next set based on that, etc. It works, but is slow. Number of combinations explodes but roughly half of the branches are successful, even for large problem instances. This makes me hopeful a heuristic or other method exists which can construct a valid solution directly.
My questions:
- Is there a name for this problem?
- What is the fastest way to construct a valid solution?
- Can we do better than brute force (generate guaranteed valid solutions without trial and error)?


I would place this problem in a class of maximum bipartite matching with additional constraints problems. It's likely impossible to give "the fastest way to construct a valid solution" without knowing limits on set sizes, blocking set amount, number of distinct elements and their multiplicity. But I'll give an efficient polynomial-time solution: reduction to the maximum flow problem, similar to other problems in that class. This should allow solving your problem even for total size of multisets on the order of 100'000.
Definitions
Let's define a graph describing our problem.
The picking multiset will be represented by one vertex,
a_i, per each distinct elementv_ihaving a multiplicitypickcnt_i. Thej-th blocked multiset will be represented by one vertex,b_jk, per each distinct elementu_jkhaving a multiplicity ofblockcnt_jk, and an auxiliary "bottleneck" vertexc_j.price_jwill designate the necessary amount of elements to unblock thej-th blocked multiset. Additionally verticesS, source, andT, sink, are defined.v_iis usablepickcnt_itimes, soSis connected toa_iby an edge with capacitypickcnt_i. Similarlyb_jkis connected toc_jwith capacityblockcnt_jk.c_jis connected toTwith capacityprice_jto limit the progress of "partial unblocking".a_iandb_jkare connected by an edge iffv_i == u_jk, with unlimited capacity.Interpreting a flow
This graph represents a flow network. Let's look at an arbitrary feasible flow. Each unit of flow consumes: a unit capacity on
S->a_i, modeling removal of a singlev_ifrom the picking multiset; a unit capacity onb_jk->c_j, modeling removal of a singleu_jkfrom thej-th blocked multiset; a unit capacity onc_j->T, modeling a single partial unblocking. Hence it is trivial to convert between a feasible flow and a matching of picking and blocking sets elements.Let's look at a maximum flow. It doesn't violate any constraints from our original problem, and its value corresponds to the number of matched elements. So its value can't be higher than
Σprice_j, can reachΣprice_jonly by unblocking all sets, and must reach it if all sets can be unblocked. Therefore maximum flow gives a solution to the original problem if it satiates allc_j->T, and otherwise there is no solution.Complexity
There are many algorithms for finding a maximum flow with complexities favoring dense or sparse graphs with small or unlimited capacities. Many perform in practice better than their complexity would suggest, particularly on special graphs like those produced by a bipartite matching problem. For some such graphs there are additional theorems proving a better complexity. Not knowing the limits I can't suggest a specific algorithm, only describe the size of the reduced problem.
The number of vertices is dominated by the sum of unique element counts for each set. The number of edges — by the number of valid "initial moves": what element can be used to partially unblock what multiset. The maximum flow is the maximum number of "moves" than can be performed.
To give an example of prewritten, ready to be used maxflow implementations, you can take a look at Dinic and PushRelabel here.