Maximum sum, min length subset


I am trying to solve this popular problem:

"Given an integer array, divide the array into two subsets, A and B, while respecting the following conditions.

  1. the intersection of A and B is null.

  2. the union of A and B is equal to the original array.

  3. the number of elements in subset A is minimal.

  4. the sum of A's elements is greater than the sum of B's elements.

Return the subset A in increasing order, where the sum of A's elements is greater than the sum of B's elements. If more than one subset exists, return the one with the maximal sum."

I found several solutions online, but none of them handles the test case [2,2,2,5].

I wrote the following code:

def subsetA(nums):
    nums.sort(reverse=True)
    subset_a = []
    sum_a = 0
    sum_b = 0

    for num in nums:
        if sum_a <= sum_b:
            sum_a += num
            subset_a.append(num)
        else:
            sum_b += num

    return sorted(subset_a)

This solution works for a lot of cases, but it doesn't work for the test case I mentioned above.

The answer should be [2,2,2], but I cannot think of any solution that will return that answer.
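To show the failure concretely, here is the same greedy function run on that input: it puts 5 into A on the first step, after which A's sum is never below B's again, so A ends up as [5] with sum 5 while B sums to 6, violating condition 4:

```python
def subsetA(nums):
    # Same greedy as above: always give the next (largest) number
    # to whichever side currently has the smaller sum.
    nums.sort(reverse=True)
    subset_a = []
    sum_a = 0
    sum_b = 0
    for num in nums:
        if sum_a <= sum_b:
            sum_a += num
            subset_a.append(num)
        else:
            sum_b += num
    return sorted(subset_a)

result = subsetA([2, 2, 2, 5])
print(result, sum(result))  # prints: [5] 5  -- but B sums to 6, so condition 4 fails
```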

How can I improve this code?


Answer by Cary Swoveland:

One way to solve this is as a linear program with binary variables. Suppose the array is held by the variable arr, where arrᵢ is the value of the element at index i.

First define the binary variables.

xᵢ equals 1 if arrᵢ is assigned to A, and equals 0 if it is assigned to B.

The objective (function) and constraints are as follows:

min ∑ᵢ xᵢ

subject to:

∑ᵢ arrᵢ·xᵢ ≥ ∑ᵢ arrᵢ·(1 − xᵢ) + t

t is a pre-defined tolerance to ensure that a strict inequality is achieved (linear programs cannot have strict inequalities).

Notice that 1 − xᵢ equals 1 if arrᵢ is assigned to B.

The constraint can be rearranged as follows:

∑ᵢ arrᵢ·xᵢ ≥ (∑ᵢ arrᵢ + t)/2


An integer linear programming (ILP) package can then be used to obtain optimal values for the variables.
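Lacking an ILP package, the same 0/1 model can be checked by brute force for small arrays: enumerate every binary assignment x, keep the feasible ones, and minimize the item count with sum as tiebreaker. This is only a sketch of the formulation above (exhaustive search in place of a solver; `subset_a_ilp_style` and the default tolerance t = 1 are my own names/choices, not from the answer):

```python
from itertools import product

def subset_a_ilp_style(arr, t=1):
    # Enumerate all 0/1 assignments x (x[i] = 1 means arr[i] goes to A),
    # mimicking the ILP: minimize sum(x) subject to
    #   sum(arr[i] * x[i]) >= (sum(arr) + t) / 2,
    # breaking ties by maximal sum of A.
    total = sum(arr)
    best = None  # ((count, -sum_a), sorted subset)
    for x in product((0, 1), repeat=len(arr)):
        chosen = [a for a, xi in zip(arr, x) if xi]
        sum_a = sum(chosen)
        if sum_a >= (total + t) / 2:       # feasibility constraint
            key = (len(chosen), -sum_a)    # lexicographic objective
            if best is None or key < best[0]:
                best = (key, sorted(chosen))
    return best[1] if best else None

print(subset_a_ilp_style([2, 2, 2, 5]))   # prints: [2, 5]
```

Note this is O(2ⁿ), so it is only a way to validate the model on small inputs; for real sizes an ILP solver is the right tool.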

Answer by MBo:

You are using a greedy algorithm, which does not guarantee an optimal solution. If the sum of the list is not too large, you can apply dynamic programming.

The k-th entry of the list a contains the tuple (last index used, smallest number of items that compose sum k).

We fill the table a: for every value n from nums we check, for each entry, whether a) the sum k can be composed as n plus an already reachable sum k − n, and b) the number of items for this sum is smaller than the previous best.

Then we look for the smallest number of items in the second half of the table (that entry is the best sum satisfying the condition), and finally walk back through the table to retrieve the items used.

def bestsubsetsum(nums):
    l = len(nums)
    S = sum(nums)
    nums.sort()
    # a[k] = (1-based index of the last item used to reach sum k,
    #         smallest number of items that sum to k).
    # (0, l + 1) marks "sum k not reachable yet"; the first field is
    # non-zero (truthy) exactly when sum k is reachable.
    a = [(0, l + 1)] * (S + 1)
    a[0] = (-1, 0)          # sum 0 is reachable with no items

    for j, n in enumerate(nums):
        # Walk k downwards so each item is used at most once (0/1 knapsack).
        for k in range(S, n - 1, -1):
            if a[k - n][0] and a[k - n][1] + 1 < a[k][1]:
                a[k] = (j + 1, a[k - n][1] + 1)

    a[0] = (-1, l + 1)      # exclude the empty subset from the search below
    imin = 0
    # Scan sums strictly greater than S/2; "<=" prefers larger sums on ties.
    for k in range(S // 2 + 1, S + 1):
        if a[k][1] <= a[imin][1]:
            imin = k
    # Walk back through the table to recover the chosen items.
    A = []
    while imin:
        A.insert(0, nums[a[imin][0] - 1])
        imin -= nums[a[imin][0] - 1]
    return A


print(bestsubsetsum([3,7,12,1]))
print(bestsubsetsum([2,2,2,5]))
print(bestsubsetsum([2,4,3,7,3,7,3]))
print(bestsubsetsum([2,4,3,9,3,7,3]))

>>
[12]
[2, 5]
[4, 7, 7]
[7, 9]