How to check column values of a dataframe with dictionary items


I have a dataframe as follows:

>>> df
     ID   first    last
0   123     Joe  Thomas
1   456   James   Jonas
2   675   James   Jonas
3   457   James  Thomas
4   676  Joseph  Thomas
5   678    Joey  Thomas
6   670     Jim   Jonas
7   671    Katy   Perry

I also have a dictionary whose keys are nicknames and whose values are lists of the proper names each nickname can stand for:

nicknames =  {'KATY': ['KATHERINE', 'KATHLEEN'], 'CHET': ['CHESTER'], 'PENNY': ['PENELOPE'], 'PAT': ['PATRICIA', 'PATRICK'], 'BART': ['BARTHOLOMEW'], 'BELLE': ['ARABELLA', 'BELINDA', 'ISABEL', 'ISABELLE', 'ROSABEL'], 'JOE': ['JOSEPH', 'JOSHUA'], 'JOEY': ['JOSEPH', 'JOSOPHINE'], 'JIM': ['JAMES']}

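From the dataframe, I want to find every row whose first name is a nickname and for which the corresponding proper name appears in another row (with the same last name). The IDs of rows that resolve to the same proper name should be grouped together, so the output is: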

output = [[123, 678], [670]]

How do I get that? Thanks!

ANSWER:

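    from collections import Counter

    def count_nickname_matches(df, nicknames):
        # Materialize the rows once; zip() is a one-shot iterator in Python 3.
        tuplist = list(zip(df['ID'], df['first'], df['last']))
        final = []
        for row_id, first, last in tuplist:
            # Proper names this first name may stand for (empty if not a nickname).
            for item in nicknames.get(first.upper(), []):
                # Last names of every row whose first name is that proper name.
                l2 = [j[2] for j in tuplist if j[1].upper() == item]
                if last in l2:
                    final.append((row_id, item))
                    break
        # Count how many nickname rows resolved to each proper name.
        c = Counter(y[1] for y in final)
        # Map each matching ID to the size of its proper-name group.
        return {row_id: c[item] for row_id, item in final}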
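Note that the code above returns a dict mapping each matching ID to a count, not the grouped list shown in the question. A minimal sketch of one way to get the grouped form follows. It assumes that "the same person" means the same proper name plus the same last name, and that the first proper name that matches wins. The function name `group_nickname_ids` and the plain list of tuples standing in for the dataframe are illustrative assumptions, not part of the original post.

```python
from collections import defaultdict

# Rows copied from the question's dataframe as (ID, first, last) tuples.
rows = [
    (123, "Joe", "Thomas"), (456, "James", "Jonas"), (675, "James", "Jonas"),
    (457, "James", "Thomas"), (676, "Joseph", "Thomas"), (678, "Joey", "Thomas"),
    (670, "Jim", "Jonas"), (671, "Katy", "Perry"),
]

nicknames = {
    "KATY": ["KATHERINE", "KATHLEEN"], "CHET": ["CHESTER"],
    "PENNY": ["PENELOPE"], "PAT": ["PATRICIA", "PATRICK"],
    "BART": ["BARTHOLOMEW"],
    "BELLE": ["ARABELLA", "BELINDA", "ISABEL", "ISABELLE", "ROSABEL"],
    "JOE": ["JOSEPH", "JOSHUA"], "JOEY": ["JOSEPH", "JOSOPHINE"],
    "JIM": ["JAMES"],
}


def group_nickname_ids(rows, nicknames):
    """Group IDs of nickname rows by the (proper name, last name) they resolve to."""
    rows = list(rows)  # the rows are scanned twice, so materialize any iterator
    # Every (FIRST, last) pair that occurs, for constant-time membership tests.
    present = {(first.upper(), last) for _, first, last in rows}
    groups = defaultdict(list)
    for row_id, first, last in rows:
        for proper in nicknames.get(first.upper(), []):
            if (proper, last) in present:
                groups[(proper, last)].append(int(row_id))
                break  # assumption: the first matching proper name wins
    # Dicts keep insertion order, so groups come out in first-seen order.
    return list(groups.values())


output = group_nickname_ids(rows, nicknames)
print(output)  # [[123, 678], [670]]
```

With the actual dataframe, the same rows can be produced as in the answer above, with `zip(df['ID'], df['first'], df['last'])`, and passed straight to the function.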