I have two columns (such as):
from | to |
---|---|
1 | 2 |
1 | 3 |
2 | 4 |
4 | 2 |
4 | 2 |
4 | 3 |
3 | 3 |
And I want to create a transition matrix (where sum of rows in a columns add up to 1):
1. 2. 3. 4.
1. 0 0 0 0
2. 0.5* 0 0 2/3
3. 0.5 0.5 1 1/3
4. 0 0.5 0 0
where 1 -> 2 would be : (the number of times 1 (in 'from') is next to 2 (in 'to)) / (total times 1 points to any value).
You can create this kind of transition matrix using a window and pivot.
First some dummy data:
To create a pct column, first group the data by unique combinations of from/to and get the counts. With that aggregated dataframe, create a new column,
pct
that uses theWindow
to find the total number of records for eachfrom
group which is used as the denominator.Lastly, pivot the table to make the
to
values columns and thepct
data the values of the matrix.