from difflib import SequenceMatcher

# Merge rows that refer to the same match, keeping the best odds of each kind.
x = 0
while x < len(fclub1):
    y = x + 1
    while y < len(fclub1):
        same_home = SequenceMatcher(None, fclub1[x], fclub1[y]).ratio() > 0.4
        same_away = SequenceMatcher(None, fclub2[x], fclub2[y]).ratio() > 0.4
        if same_home and same_away:
            for odds in (fbest_odds_1, fbest_odds_x, fbest_odds_2):
                # Keep the higher of the two odds in row x.
                if float(odds[x]) < float(odds[y]):
                    odds[x] = odds[y]
                # Drop the duplicate row from this odds list.
                odds.pop(y)
            fclub1.pop(y)
            fclub2.pop(y)
        else:
            # Advance only when nothing was removed, so no row is skipped.
            y += 1
    x += 1
It can't reliably match club names from different bookmakers, for example "Manchester United" and "Man. Utd."
I tried fixing it with SequenceMatcher, letting it match on at least part of the club name, but then it started treating different clubs as the same: Aston Villa and Atherton Collieries, or Leeds and Liversedge.
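The false positives are easy to reproduce. Here is a minimal sketch, assuming plain `difflib.SequenceMatcher` and the same 0.4 threshold as in the loop above; the `similarity` helper and the `pairs` list are names made up for illustration:

```python
from difflib import SequenceMatcher

THRESHOLD = 0.4  # same cut-off as in the loop above


def similarity(a, b):
    """Return the SequenceMatcher ratio (0.0 to 1.0) for two club names."""
    return SequenceMatcher(None, a, b).ratio()


pairs = [
    ("Manchester United", "Man. Utd."),      # same club, different spelling
    ("Aston Villa", "Atherton Collieries"),  # different clubs
    ("Leeds", "Liversedge"),                 # different clubs
]

for a, b in pairs:
    score = similarity(a, b)
    verdict = "match" if score > THRESHOLD else "no match"
    print(f"{a!r} vs {b!r}: {score:.2f} -> {verdict}")
```

Both pairs of different clubs score above 0.4, so a threshold loose enough to accept abbreviations also accepts unrelated names that happen to share a few letters.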
The best solution is probably the most boring one: just make a list of commonly used names for each team and use that, such as: