I have excel file which looks like this:
currently i have excel file with this data: column 1 - date, column 2 - user id, column 3 - parent of user, column 4 - grandparent of user(i.e. parent of previous user), column 5 - parent of previous user, column 6 - parent of previous user, column 7 - parent of previous user. data contains monthly information for every user, i.e. each user id repeats maximum of 12 times.
My desired output is to count multiple levels of referrals.
I want to calculate how many users did every user invited in the first column, then I need to calculate number of users which was invited by the users which was invited by our user(i.e. grandchildren of first user) and so on. finally I want output which would look like pyramid if it was visible, first member of the pyramid will be user, next members will be users which were invited by first user. next members will be users which were invited by the users from previous level(i.e. users which were invited by first user). I am working with python pandas but can't do anything yet. I can work with excel as well.
I've tried too many variants, I don't remember any of them right now, my last code was this:
import pandas as pd
df = pd.read_excel("L0-L5.xlsx", sheet_name="Sheet1")
df["invited_at"] = pd.to_datetime(df["Rem Date"], format="%d.%m.%Y")
levels = ["L0", "L1", "L2", "L3", "L4", "L5"]
df_list = []
for i, level in enumerate(levels):
temp_df = df[df[level].notnull()]
temp_df["level"] = level
temp_df["user_id"] = temp_df[levels[i]]
df_list.append(temp_df)
df = pd.concat(df_list)
result = df.pivot_table(index="user_id", columns="level", values="invited_at", aggfunc="count",
fill_value=0)
result = result[["L1", "L2", "L3", "L4", "L5"]]
result.to_excel("L5-L0.xlsx")
and guess, it didn't work.