Create an origin-destination matrix from a DataFrame in Python

I would like to create an origin-destination matrix from the following data frame in Python:

Origin  Destination
1         2
1         3
1         4
2         3
3         4

I expect the following matrix:

   1  2  3  4
1  0  1  1  1
2  0  0  1  0
3  0  0  0  1
4  0  0  0  0

I know that it could be done in R using the table() function, but I don't know how to do it in Python. Many thanks for any help.

You could use pivot_table with an aggregate function of len to build the matrix:

df.pivot_table(values='Destination', index="Origin", columns='Destination',
           fill_value=0, aggfunc=len)

which gives:

Destination  2  3  4
Origin              
1            1  1  1
2            0  1  0
3            0  0  1

But this only contains the origins and destinations that actually occur in the original data frame.
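To make the missing labels concrete, here is a minimal, self-contained sketch that rebuilds the sample data from the question and counts the pairs. It is a variant, not the pivot_table call above: it uses groupby(...).size().unstack(...) so that Destination does not have to serve as both the values and the columns argument. The name counts is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Origin": [1, 1, 1, 2, 3],
                   "Destination": [2, 3, 4, 3, 4]})

# Count each (Origin, Destination) pair, then spread destinations into columns.
counts = df.groupby(["Origin", "Destination"]).size().unstack(fill_value=0)

print(counts)
print(list(counts.index))    # origins that occur in the data
print(list(counts.columns))  # destinations that occur in the data
```

Origin 4 and destination 1 never appear in the data, so the result has no row 4 and no column 1.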

If you want a row and a column for every possible endpoint, first build a zero-filled matrix over all endpoints and then add the matrix above to it:

# Endpoints 1..4: every label that should appear as a row and a column.
nodes = list(range(1, 5))
result = pd.DataFrame(0, index=nodes, columns=nodes).add(
    df.pivot_table(values='Destination', index='Origin',
                   columns='Destination', aggfunc=len),
    fill_value=0).astype('int')

which gives the expected matrix:

   1  2  3  4
1  0  1  1  1
2  0  0  1  0
3  0  0  0  1
4  0  0  0  0
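An alternative sketch, closer in spirit to R's table(), uses pd.crosstab and then reindexes both axes to the full set of endpoints. It assumes the set of endpoints should be derived from the data rather than hard-coded as range(1, 5). The names nodes and od are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Origin": [1, 1, 1, 2, 3],
                   "Destination": [2, 3, 4, 3, 4]})

# Every endpoint that appears as an origin or a destination.
nodes = sorted(set(df["Origin"]) | set(df["Destination"]))

# Cross-tabulate the pairs, then force a square matrix over all endpoints,
# filling rows and columns that never occur with zeros.
od = (pd.crosstab(df["Origin"], df["Destination"])
        .reindex(index=nodes, columns=nodes, fill_value=0))

print(od)
```

Deriving nodes from the data keeps the matrix square even when the endpoint labels are not consecutive integers. If some endpoints never appear in any row, they would have to be added to nodes by hand.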