I'm trying to combine all URL fields returned by an id in the same DF into a comma delimitted field. What I can't figure out is why my assign is not working and my for loop is..
The table looks like as follows:
| Ref | Option1_Ref | URL |
|---|---|---|
| 1 | 2,3,4 | /path1 |
| 2 | 1,4,5 | /path2 |
| 3 | 1,6 | /path3 |
| 4 | 1,5 | /path4 |
| 4 | 2,5 | /path5 |
| 5 | 3,1 | /path6 |
| 7 | 2,5 | /path7 |
using assign :
df=df.assign(options_url1=df.loc[df['Ref'].isin(df["Option1_Ref"].str.split(',')),['URL','Ref']].drop_duplicates('Ref')['URL'].str.join(','))
This is just returning an empty column.
for loop:
for x in df.index: df.loc[x,'options_url1']=','.join(df.loc[df['Ref'].isin(df.loc[x,"Option1_Ref"].split(',')),['URL','Ref']].drop_duplicates('Ref')['URL'].array)
This is working as expected
I'm expecting:
| Ref | Option1_Ref | URL | options_url1 |
|---|---|---|---|
| 1 | 2,3,4 | /path1 | /path2,/path3,/path4 |
| 2 | 1,4,5 | /path2 | /path1,/path4,/path6 |
| 3 | 1,6 | /path3 | /path1 |
| 4 | 1,5 | /path4 | /path1,/path6 |
| 4 | 2,5 | /path5 | /path2,/path6 |
| 5 | 3,1 | /path6 | /path1,/path3 |
| 7 | 2,5 | /path5 | /path2,/path6 |
Create
Seriesby remove duplicates byDataFrame.drop_duplicates, convertRefto strings and map splitted values withDataFrame.explodebySeries.map, last aggregatejoin:Or use list comprhension for mapping and
join, if possible some values not exist addifstatement: