python Count Column

101 Views Asked by At

I'm trying to create new column with the count of each attribute (Number) but it's given me the total count of row as shown below

POM = pd.read_excel('POM.xlsx' , sheet_name='Sheet1')
POM['Merged']=POM['ACD ID'].astype(str)+PG['Date'].astype(str)
POM['Count']=POM['Merged'].count()
print(POM.head(2))
POM.to_excel('Test.xlsx' , index=False)

Result

        Date  ACD ID           Merged  Count
0 2022-03-08   14004  140042022-03-08    986
1 2022-03-09   14004  140042022-03-09    986

How can I count the unique code which is repetition found in column (Merged)?

# Date              ACD ID     Merged     Count
2022-03-08 00:00:00 14004   140042022-03-08 10
2022-03-09 00:00:00 14004   140042022-03-09 49
2022-03-10 00:00:00 14004   140042022-03-10 62

i have try the bellow code as answered & i found accurate answer but in same time the Merged New coulmn that i created by group(date+ ACD ID) are not accuaret as shown below , that date in ( merged ) not matched the date

POM = pd.read_excel('POM.xlsx' , sheet_name='Sheet1')
POM['Merged']=POM['ACD ID'].astype(str)+PG['Date'].astype(str)
#POM['Count'] = POM['Merged'].count()
POM['Count'] = POM['Merged'].copy().replace(POM['Merged'].value_counts().to_dict())
#POM['Count']=POM.groupby(["Merged"])["Merged"].transform("count")
#POM['counts'] = POM['Merged'].map(POM['Merged'].value_counts())

print(POM.head(6))
#POM.to_excel('Test.xlsx' , index=False)

result

        Date  ACD ID           Merged  Count
0 2022-03-08   14004  140042022-03-08      1
1 2022-03-08   14005  140052022-03-09      1
2 2022-03-08   14006  140062022-03-10      1
3 2022-03-08   14007  140072022-03-15      1
2

There are 2 best solutions below

1
On

Try groupby() and then transform("count"):

import pandas

df = pandas.DataFrame({"date": ["2022", "2023", "2023", "2024"],
                       "id": ["a", "a", "a", "c"]})
df["merged"] = df["date"] + "_" + df["id"]
print(df)
>    date id  merged
  0  2022  a  2022_a
  1  2023  a  2023_a
  2  2023  a  2023_a
  3  2024  c  2024_c

df["counts"] = df.groupby(["merged"])["merged"].transform("count")
print(df)
>    date id  merged  counts
  0  2022  a  2022_a       1
  1  2023  a  2023_a       2
  2  2023  a  2023_a       2
  3  2024  c  2024_c       1

Check out this thread.

1
On

You could do something like this:

df['Count'] = df['Merged'].copy().replace(df['Merged'].value_counts().to_dict())

You create a copy of Merged, then count the values in Merged and convert it to a dict of {id: count}, then use that dictionary to replace the id values in the copied Merged column.

This has the added benefit over groupby that you can retain your original indexing and shape of the dataframe while still filling in the counts of each id.

Ex:

df = pd.DataFrame({"date": ["2022", "2023", "2023", "2024"],
                       "id": ["a", "a", "a", "c"]})
df["merged"] = df["date"] + "_" + df["id"]
df['Count'] = df['merged'].copy().replace(df['merged'].value_counts().to_dict())

   date id  merged  Count
0  2022  a  2022_a      1
1  2023  a  2023_a      2
2  2023  a  2023_a      2
3  2024  c  2024_c      1