reshaping and rearranging a pandas table

451 Views Asked by At

I have the following dataframe (pandas version 0.13.1)

>>> import pandas as pd
>>> DF = pd.DataFrame({'Group':['G1','G1','G2','G2'],'Start':['10','10','12','13'],'End':['13','13','14','15'],'Sample':['S1','S2','S3','S3'],'Status':['yes','yes','no','yes'],'pValue':[0.13,0.12,0.96,0.76],'pValueString':['13/100','12/100','96/100','76/100'],'desc':['aaaaaa','bbbbbb','aaaaaa','cccccc']})
>>> DF
  End Group Sample Start Status  pValue pValueString desc
0  13    G1     S1    10    yes    0.13       13/100 aaaaaa   
1  13    G1     S2    10     no    0.12       12/100 bbbbbb
2  14    G2     S3    12     no    0.96       96/100 aaaaaa
3  15    G2     S3    13    yes    0.76       76/100 cccccc

[4 rows x 8 columns]

To the dataframe above

  1. I would like to groupby 'Group'.
  2. Then groupby a Start-End couplet.
  3. Pivot the sample values for each group. aggregate by max(pValue)
  4. Get the corresponding Status, desc corresponding to the sample with the higher pvalue and replace its value with a pValueString.

I need to ultimately get this to the following format

Group Start End Sample           Status  desc
                    S1   S2
G1    10    13    13/100 12/100  yes     aaaaaa
                    S3
G2    12    14    96/100         no      aaaaaa
      13    15    76/100         yes     cccccc

I have tried to use pivot_table and groupby but to no avail. Any help would be much appreciated.

I have

grouped=DF.groupby('Group')

for g,v in grouped: pandas.pivot_table(data=v,values=['pValue','pValueString']),rows=['Group','Start','End'],cols=['Sample'])['pValueString']

How do I get the corresponding desc and Status?

1

There are 1 best solutions below

1
On BEST ANSWER

For pandas pivot table, you pass the rows you want as index and the columns you want as colums:

pvt = DF.pivot_table(index = ['Group','Start','End','Status'], columns = ['Sample'])
pvt
Out[209]: 
                       pValue            
Sample                     S1    S2    S3
Group Start End Status                   
G1    10    13  yes      0.13  0.12   NaN
G2    12    14  no        NaN   NaN  0.96
      13    15  yes       NaN   NaN  0.76

Then for your