I have the pandas DataFrame below.
Steps to create the DataFrame:
import pandas as pd
data = [['1','0','0','0','0'], ['2','1','1','0','0|0'], ['3','1','1','1','0|1'],
        ['4','2','2','0','0|0|0'], ['5','2','2','1','0|0|1'], ['6','2','2','2','0|0|2'],
        ['7','3','2','0','0|1|0'], ['8','3','2','1','0|1|1'], ['9','3','2','2','0|1|2'],
        ['10','3','2','3','0|1|3'], ['11','4','3','0','0|0|0|0'], ['12','4','3','1','0|0|0|1'],
        ['13','10','3','0','0|1|3|0']]
df = pd.DataFrame(data, columns=['eid', 'm_eid', 'level', 'path_variable', 'complete_path'])
df = df.drop('complete_path', axis=1)
Here:
eid = employee id
m_eid = manager id
level = level in the org (0 being the top boss)
path_variable = incremental number assigned to an employee based on their level; this number resets for each manager. For example, eid [4,5,6,7,8,9,10] all belong to level 2, but eid [4,5,6] share the same manager (m_eid=2), so their path_variable is 0,1,2, whereas eid [7,8,9,10] have a different manager (m_eid=3), so the path_variable restarts from 0.
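To illustrate the numbering, here is a small sketch on a subset of the data. It assumes the rows are already in the desired order within each manager; `groupby(...).cumcount()` is only one way to reproduce such a counter, not necessarily how the column was originally generated.

```python
import pandas as pd

# Subset of the data above: employee id and manager id only.
df = pd.DataFrame({
    'eid':   ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    'm_eid': ['0', '1', '1', '2', '2', '2', '3', '3', '3', '3'],
})

# cumcount() numbers the rows of each m_eid group 0, 1, 2, ... in row order,
# so the counter restarts for every distinct manager.
df['path_variable'] = df.groupby('m_eid').cumcount().astype(str)
print(df)
```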
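I want to create a new column which shows the complete path up to level 0 for each eid (the complete_path column in the data above shows the expected result).
The complete path is the concatenation of the path_variable values from level 0 (top boss) down to the employee.
It's like the path from the root node to the node in question. For example, take eid 10: its manager is eid 3 (path_variable 1), whose manager is eid 1 (path_variable 0), so the complete path is 0|1|3.
There can be level skips between immediate managers. I'm trying to avoid iterrows() due to performance constraints.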



IIUC, you can build a directed graph with networkx, then find the shortest_path between each node and '0', then use that to map the path_variable:
Output:
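A minimal sketch of that approach, assuming the df from the question (rebuilt here without complete_path so the snippet runs on its own). The names `G` and `mapper` are illustrative, and `'0'` works as the root only because the top boss has m_eid `'0'`:

```python
import networkx as nx
import pandas as pd

data = [['1', '0', '0', '0'], ['2', '1', '1', '0'], ['3', '1', '1', '1'],
        ['4', '2', '2', '0'], ['5', '2', '2', '1'], ['6', '2', '2', '2'],
        ['7', '3', '2', '0'], ['8', '3', '2', '1'], ['9', '3', '2', '2'],
        ['10', '3', '2', '3'], ['11', '4', '3', '0'], ['12', '4', '3', '1'],
        ['13', '10', '3', '0']]
df = pd.DataFrame(data, columns=['eid', 'm_eid', 'level', 'path_variable'])

# One edge per row, pointing from manager to employee.
G = nx.from_pandas_edgelist(df, source='m_eid', target='eid',
                            create_using=nx.DiGraph)

# Lookup table: employee id -> path_variable (assumes eid is unique).
mapper = df.set_index('eid')['path_variable']

# Path from the virtual root '0' down to each employee; drop '0' itself,
# then translate the remaining node ids into their path_variable.
df['complete_path'] = [
    '|'.join(mapper[node] for node in nx.shortest_path(G, '0', eid)[1:])
    for eid in df['eid']
]
print(df)
```

In a tree there is exactly one path from the root to each node, so `shortest_path` simply returns that path. The list comprehension still loops in Python, but it iterates over a single column rather than using iterrows().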
Graph:
If you already have unique values in eid you could avoid the mapper and use:
To make it easier to understand, here is a more classical path with the node ids (not the path_variables):
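A sketch of what that could look like, assuming the same manager-to-employee graph and dropping the virtual root `'0'` from each path. The column name `path` is a hypothetical choice for illustration:

```python
import networkx as nx
import pandas as pd

# Only the two id columns are needed to build the graph.
df = pd.DataFrame({
    'eid':   ['1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13'],
    'm_eid': ['0', '1', '1', '2', '2', '2', '3',
              '3', '3', '3', '4', '4', '10'],
})

# Directed edges from manager to employee.
G = nx.from_pandas_edgelist(df, source='m_eid', target='eid',
                            create_using=nx.DiGraph)

# Join the node ids on the path from the root to each employee,
# skipping the virtual root '0' at position 0.
df['path'] = ['|'.join(nx.shortest_path(G, '0', eid)[1:])
              for eid in df['eid']]
print(df)
```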
Output: