Suppose I have a dataset (df):
Group | Employee_Title | Employee_Name
A | Manager | John
A | Analyst | Adam
A | Analyst | Smith
B | Manager | Bill
B | Analyst | Ed
B | Analyst | Jay
I want to create a new column "Group_Manager" so that the new dataset would be:
Group | Employee_Title | Employee_Name | Group_Manager
A | Manager | John | John
A | Analyst | Adam | John
A | Analyst | Smith | John
B | Manager | Bill | Bill
B | Analyst | Ed | Bill
B | Analyst | Jay | Bill
I am looking for Python code that can do this in some "cumulative" way, like the following (which does not work right now):
df['Group_Manager']=df.groupby('Group').apply(lambda Employee_Title,Employee_Name: Employee_Name if Employee_Title=="Manager" else keep previous Group_Manager)
You can achieve the desired result by retrieving the manager's name for each group and then reindexing it against the 'Group' column of the main DataFrame.
This results in the Group_Manager column shown in the question's desired output.
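A minimal sketch of the lookup-and-reindex idea, assuming the column names from the question and exactly one "Manager" row per group (the variable name `managers` is illustrative):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B", "B"],
    "Employee_Title": ["Manager", "Analyst", "Analyst",
                       "Manager", "Analyst", "Analyst"],
    "Employee_Name": ["John", "Adam", "Smith", "Bill", "Ed", "Jay"],
})

# Keep only the manager rows and index their names by group.
# Assumption: each group has exactly one manager, so the index is unique.
managers = (
    df.loc[df["Employee_Title"] == "Manager"]
      .set_index("Group")["Employee_Name"]
)

# Reindex the per-group names by every row's group label, then drop the
# index so the values line up positionally with df.
df["Group_Manager"] = managers.reindex(df["Group"]).to_numpy()

print(df)
```

A group with no manager row would get NaN in Group_Manager, and a group with more than one manager would make the reindex fail, so this sketch only covers the one-manager-per-group case.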
Another approach uses itertools.accumulate, which results in the same output.