Say I have a dataset like this:
is_a  is_b  is_c  population  infected
1     0     1     50          20
1     1     0     100         10
0     1     1     20          10
...
How do I reshape it to look like this?
feature  0       1
a        10/20   30/150
b        20/50   20/120
c        10/100  30/70
...
In the original dataset, I have features a, b, and c as their own separate columns. In the transformed dataset, these same variables are listed under the column feature, and two new columns, 0 and 1, are produced, corresponding to the values that these features can take.
For the rows in the original dataset where is_a is 0, sum the infected values and divide by the sum of the population values. Where is_a is 1, do the same: sum the infected values and divide by the sum of the population values. Repeat for is_b and is_c. The new dataset will have these fractions (or decimals) as shown. Thank you!
I've tried pd.pivot_table and pd.melt, but nothing comes close to what I need.
After reshaping with wide_to_long, your question becomes clearer.
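A minimal sketch of that approach, assuming only the three sample rows shown in the question. The helper column index (created by reset_index) and the names long_df, sums, ratio, and frac are illustrative and do not come from the original post. The idea is to reshape so each row carries one feature and its 0/1 value, then sum infected and population per (feature, value) pair before dividing.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "is_a": [1, 1, 0],
    "is_b": [0, 1, 1],
    "is_c": [1, 0, 1],
    "population": [50, 100, 20],
    "infected": [20, 10, 10],
})

# wide_to_long needs an id column that uniquely identifies each row,
# so the row index is exposed as a column named "index" (an assumption
# made here for illustration).
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames="is",   # matches is_a, is_b, is_c
    i="index",
    j="feature",      # will hold "a", "b", "c"
    sep="_",
    suffix=r"\w+",    # default suffix only matches digits
).reset_index()

# Sum infected and population for every (feature, value) pair.
sums = long_df.groupby(["feature", "is"])[["infected", "population"]].sum()

# Decimal form: one row per feature, one column per value (0 and 1).
ratio = (sums["infected"] / sums["population"]).unstack("is")

# Fraction form, as written in the question.
frac = (
    sums["infected"].astype(str) + "/" + sums["population"].astype(str)
).unstack("is")

print(frac)
print(ratio)
```

With the sample rows, frac holds 10/20 and 30/150 for feature a, matching the table in the question. Summing before dividing matters here: averaging per-row ratios would give a different result than sum(infected) / sum(population).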