I am trying to build a tree diagram from a DataFrame that holds scripts which must be executed in a certain order, depending on their parent/child relationships. However, the tree does not contain the correct number of children. My DataFrame looks like this:
And this is the code I have so far:
from anytree import Node, RenderTree

def add_nodes(nodes, parent, child):
    if parent not in nodes:
        nodes[parent] = Node(parent)
    if child not in nodes:
        nodes[child] = Node(child)
    nodes[child].parent = nodes[parent]

nodes = {}
for parent, child in zip(df["parent_script"], df["child_script"]):
    add_nodes(nodes, parent, child)

roots = list(df[~df["parent_script"].isin(df["child_script"])]["parent_script"].unique())
for root in roots:
    for pre, _, node in RenderTree(nodes[root]):
        print("%s%s" % (pre, node.name))
I have 3 roots and several parents that are displayed correctly, but the children seem to go missing after they have been included once.
Results I get:
I think the following logic is the reason I am not getting the full results I expect:
if child not in nodes:
    nodes[child] = Node(child)
nodes[child].parent = nodes[parent]
I need all the children to be inside the tree. For example, the 3rd root (00_06_MaxBIS_v2-2.sql) has 00_07_AnlagenDatenLaden_v1-7.sql as a child but is missing 00_Gebietsstrukturen_v1-2.sql.
Is there any way to include duplicate children and have them inside the tree correctly? And what kind of logic can I use to receive the expected results?