Scipy's linear_sum_assignment giving incorrect result

1.1k Views Asked by At

When I tried using scipy.optimize.linear_sum_assignment as shown, it gives the assignment vector [0 2 3 1] with a total cost of 15.

However, from the cost matrix c, you can see that for the second task, the 5th agent has a cost of 1. So the expected assignment should be [0 3 None 2 1] (total cost of 9)

Why is linear_sum_assignment not returning the optimal assignments?

from scipy.optimize import linear_sum_assignment

c = [
    [1, 5, 9, 5],
    [5, 8, 3, 2],
    [3, 2, 6, 8],
    [7, 3, 5, 4],
    [2, 1, 9, 9],
]

results = linear_sum_assignment(c)
print(results[1]) # [0 2 3 1]
1

There are 1 best solutions below

0
On

linear_sum_assignment returns a tuple of two arrays. These are the row indices and column indices of the assigned values. For your example (with c converted to a numpy array):

In [51]: c
Out[51]: 
array([[1, 5, 9, 5],
       [5, 8, 3, 2],
       [3, 2, 6, 8],
       [7, 3, 5, 4],
       [2, 1, 9, 9]])

In [52]: row, col = linear_sum_assignment(c)

In [53]: row
Out[53]: array([0, 1, 3, 4])

In [54]: col
Out[54]: array([0, 2, 3, 1])

The corresponding index pairs from row and col give the selected entries. That is, the indices of the selected entries are (0, 0), (1, 2), (3, 3) and (4, 1). It is these pairs that are the "assignments".

The sum associated with this assignment is 9:

In [55]: c[row, col].sum()
Out[55]: 9

In the original version of the question (but since edited), it looks like you wanted to know the row index for each column, so you expected [0, 4, 1, 3]. The values that you want are in row, but the order is not what you expect, because the indices in col are not simply [0, 1, 2, 3]. To get the result in the form that you expected, you have to reorder the values in row based on the order of the indices in col. Here are two ways to do that.

First:

In [56]: result = np.zeros(4, dtype=int)

In [57]: result[col] = row

In [58]: result
Out[58]: array([0, 4, 1, 3])

Second:

In [59]: result = row[np.argsort(col)]

In [60]: result
Out[60]: array([0, 4, 1, 3])

Note that the example in the linear_sum_assignment docstring is potentially misleading; because it displays only col_ind in the python session, it gives the impression that col_ind is "the answer". In general, however, the answer involves both of the returned arrays.