I have a tennis dataset that looks like the following:
tourney_id = ['French Open 2018','French Open 2018','Wimbledon 2018','Wimbledon 2018','Australian Open 2019','Australian Open 2019','US Open 2019','US Open 2019']
player_name = ['Novak Djokovic','Roger Federer','Andy Murray','Rafael Nadal','John Isner','Novak Djokovic','Andy Murray','Roger Federer']
match_num = [103, 103, 217, 217, 104, 104, 243, 243]
df = pd.DataFrame(list(zip(tourney_id, player_name, match_num)),
columns =['TournamentID','Name','MatchID'])
I want to create a dictionary where the keys are the players and the items are also the players (opponents). So it would look like the following based on my dataset:
{'Novak Djokovic': ['Roger Federer','John Isner'],
'Roger Federer': ['Novak Djokovic','Andy Murray'],
'Andy Murray': ['Rafael Nadal','Roger Federer'],
'Rafael Nadal': ['Andy Murray'],
'John Isner': ['Novak Djokovic']}
I want to identify players who have played each other when they have the same values for both TournamentID and MatchID.
The last thing I tried was: df.set_index(['TournamentID','MatchID'])['Name'].to_dict()
but that's not quite what I'm looking for.
What can I try next?
Here's how you can do it:
Output: