counting consequtive duplicate elements in a dataframe and storing them in a new colum

49 Views Asked by At

I am trying to count the consecutive elements in a data frame and store them in a new column. I don't want to count the total number of times an element appears overall in the list but how many times it appeared consecutively, i used this:

a=[1,1,3,3,3,5,6,3,3,0,0,0,2,2,2,0]
df = pd.DataFrame(list(zip(a)), columns =['Patch']) 
df['count'] = df.groupby('Patch').Patch.transform('size') 
print(df)

this gave me a result like this:

Patch  count
0       1      2
1       1      2
2       3      5
3       3      5
4       3      5
5       5      1
6       6      1
7       3      5
8       3      5
9       0      4
10      0      4
11      0      4
12      2      3
13      2      3
14      2      3
15      0      4

however i want the result to be like this:

    Patch  count
0       1      2
1       3      3
2       5      1
3       6      1
4       3      2
5       0      3
6       2      3
7       0      1
1

There are 1 best solutions below

1
On
df = (
    df.groupby((df.Patch != df.Patch.shift(1)).cumsum())
    .agg({"Patch": ("first", "count")})
    .reset_index(drop=True)
    .droplevel(level=0, axis=1)
    .rename(columns={"first": "Patch"})
)
print(df)

Prints:

   Patch  count
0      1      2
1      3      3
2      5      1
3      6      1
4      3      2
5      0      3
6      2      3
7      0      1