I have two columns (such as):
| from | to |
|---|---|
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 4 | 2 |
| 4 | 2 |
| 4 | 3 |
| 3 | 3 |
And I want to create a transition matrix (where sum of rows in a columns add up to 1):
1. 2. 3. 4.
1. 0 0 0 0
2. 0.5* 0 0 2/3
3. 0.5 0.5 1 1/3
4. 0 0.5 0 0
where 1 -> 2 would be : (the number of times 1 (in 'from') is next to 2 (in 'to)) / (total times 1 points to any value).
You can create this kind of transition matrix using a window and pivot.
First some dummy data:
To create a pct column, first group the data by unique combinations of from/to and get the counts. With that aggregated dataframe, create a new column,
pctthat uses theWindowto find the total number of records for eachfromgroup which is used as the denominator.Lastly, pivot the table to make the
tovalues columns and thepctdata the values of the matrix.