Groupby with multiple conditions and count

72 Views Asked by At

Hi would be great to get some help on this for Python Pandas.TIA I have a dataframe which has 1M rows with below columns:

PID lurn_fls locality Comparision
ACT933 2 Kambah mbn:match both non-empty
ACT934 3F Charwood xne:mismatch neither empty
ACT935 3R Glenden mbe:match both empty
. . . .
. . . .
ACT155 4 Glebe xhe:mismatch h_empty

I need to do a groupby "lurn_fls" and also count the different 'Comparision' column results per group, so that my result should look like below table. for eg:

lurn_fls mbn:match both non-empty xhe:mismatch h_empty xne:mismatch neither empty Total
1 600 12 15 XXX
2 700 10 14 XXX
3F 800 8 10 XXX
3R 900 6 12 XXX
4 500 4 10 XXX
5 400 2 14 XXX
2

There are 2 best solutions below

2
Andrej Kesely On BEST ANSWER

IIUC, you can use pd.crosstab:

out = pd.crosstab(df["lurn_fls"], df["Comparision"])
out["Total"] = out.sum(axis=1)

print(out)

Prints:

Comparision  mbe:match both empty  mbn:match both non-empty  xhe:mismatch h_empty  xne:mismatch neither empty  Total
lurn_fls                                                                                                            
2                               0                         1                     0                           0      1
3F                              0                         0                     0                           1      1
3R                              1                         0                     0                           0      1
4                               0                         0                     1                           0      1
0
Parvez Alam On

You can try:

df = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                   'height': [9.1, 6.0, 9.5, 34.0],
                   'weight': [7.9, 7.5, 9.9, 198.0]
    }) 
df.groupby('kind').agg(
max_height=('height', 'max'), min_weight=('weight', 'min'),)