I have a dataset (nba_data) which I'm having trouble transposing. What I want is to transform the following,
TEAM_ABBREVIATION GAME_DATE WinLoss HomeAway
ATL 2016-10-27 W H
ATL 2016-10-29 W A
ATL 2016-10-31 W H
ATL 2016-11-02 L H
BKN 2016-10-26 L A
BKN 2016-10-28 W H
BKN 2016-10-29 L A
BKN 2016-10-31 L H
to the following,
TEAM_ABBREVIATION GAME_DATE HomeWin HomeLoss AwayWin AwayLoss
ATL 2016-10-27 1 0 0 0
ATL 2016-10-29 1 0 1 0
ATL 2016-10-31 2 0 1 0
ATL 2016-11-02 2 1 1 0
BKN 2016-10-26 0 0 0 1
BKN 2016-10-28 1 0 0 1
BKN 2016-10-29 1 0 0 2
BKN 2016-10-31 1 1 0 2
If you could please help that would be great.
Thanks, Tom
yields
The first idea is that there are 4 kinds of "events" corresponding to the 4 combinations of possible values from the
WinLoss
andHomeAway
columns:(W,H)
,(W,A)
,(L,H)
and(L,A)
.Thus it is natural to want to combine the
WinLoss
andHomeAway
columns into a single column:and then use
get_dummies
to convert this Series into a table of 1's and 0's:Now by comparison with your desired result, we can see we also want to take a cumulative sum, grouped by
TEAM_ABBREVIATION
:The next two lines reorders and renames the columns:
Finally, we can use
pd.concat
to concatenatedf
withresult
and build the desired DataFrame: