I have a dataset (nba_data) which I'm having trouble transposing. What I want is to transform the following,
TEAM_ABBREVIATION GAME_DATE WinLoss HomeAway
ATL 2016-10-27 W H
ATL 2016-10-29 W A
ATL 2016-10-31 W H
ATL 2016-11-02 L H
BKN 2016-10-26 L A
BKN 2016-10-28 W H
BKN 2016-10-29 L A
BKN 2016-10-31 L H
to the following,
TEAM_ABBREVIATION GAME_DATE HomeWin HomeLoss AwayWin AwayLoss
ATL 2016-10-27 1 0 0 0
ATL 2016-10-29 1 0 1 0
ATL 2016-10-31 2 0 1 0
ATL 2016-11-02 2 1 1 0
BKN 2016-10-26 0 0 0 1
BKN 2016-10-28 1 0 0 1
BKN 2016-10-29 1 0 0 2
BKN 2016-10-31 1 1 0 2
If you could please help that would be great.
Thanks, Tom
yields
The first idea is that there are 4 kinds of "events" corresponding to the 4 combinations of possible values from the
WinLossandHomeAwaycolumns:(W,H),(W,A),(L,H)and(L,A).Thus it is natural to want to combine the
WinLossandHomeAwaycolumns into a single column:and then use
get_dummiesto convert this Series into a table of 1's and 0's:Now by comparison with your desired result, we can see we also want to take a cumulative sum, grouped by
TEAM_ABBREVIATION:The next two lines reorders and renames the columns:
Finally, we can use
pd.concatto concatenatedfwithresultand build the desired DataFrame: