How to resolve the "Boolean Series key will be reindexed" warning

I have a dataframe:

   Unnamed: 0                       game score home_odds draw_odds away_odds country                 league             datetime
0           0  Sport Recife - Imperatriz   2:2      1.36      4.31      7.66  Brazil  Copa do Nordeste 2020  2020-02-07 00:00:00
1           1           ABC - America RN   2:1      2.62      3.30      2.48  Brazil  Copa do Nordeste 2020  2020-02-02 22:00:00
2           2  Frei Paulistano - Nautico   0:2      5.19      3.58      1.62  Brazil  Copa do Nordeste 2020  2020-02-02 00:00:00
3           3    Botafogo PB - Confianca   1:1      2.06      3.16       3.5  Brazil  Copa do Nordeste 2020  2020-02-02 22:00:00
4           4          Fortaleza - Ceara   1:1      2.19      2.98      3.38  Brazil  Copa do Nordeste 2020  2020-02-02 22:00:00

I am performing the following operations:

df['game'] = df['game'].astype(str).str.replace(r'(\(\w+\))', '', regex=True)
df['league'] = df['league'].astype(str).str.replace(r'(\s\d+\S\d+)$', '', regex=True)
df['game'] = df['game'].astype(str).str.replace(r'(\s\d+\S\d+)$', '', regex=True)
df[['home_team', 'away_team']] = df['game'].str.split(' - ', expand=True, n=1)
df[['home_score', 'away_score']] = df['score'].str.split(':', expand=True)
df['away_score'] = df['away_score'].astype(str).str.replace(r'[a-zA-Z\s\D]', '', regex=True)
df['home_score'] = df['home_score'].astype(str).str.replace(r'[a-zA-Z\s\D]', '', regex=True)
df = df[df.home_score != "."]
df = df[df.home_score != ".."]
df = df[df.home_score != "."]
df = df[df.home_odds != "-"]
df = df[df.draw_odds != "-"]
df = df[df.away_odds != "-"]
m = df[['home_odds', 'draw_odds', 'away_odds']].astype(str).agg(lambda x: x.str.count('/'), 1).ne(0).all(1)
n = df[['home_score']].agg(lambda x: x.str.count('-'), 1).ne(0).all(1)
o = df[['away_score']].agg(lambda x: x.str.count('-'), 1).ne(0).all(1)
df = df[~m]
df = df[~n]
df = df[~o]
df = df[df.home_score != '']
df = df[df.away_score != '']
df = df.dropna()

However, when I do that, I get this warning:

UserWarning: Boolean Series key will be reindexed to match DataFrame index.
  df = df[~n]
UserWarning: Boolean Series key will be reindexed to match DataFrame index.
  df = df[~o]

How do I resolve this?

Answer by jezrael:

I think you can try changing:

n = df[['home_score']].agg(lambda x: x.str.count('-'), 1).ne(0).all(1)
o = df[['away_score']].agg(lambda x: x.str.count('-'), 1).ne(0).all(1)

to:

n = df['home_score'].str.count('-').ne(0)
o = df['away_score'].str.count('-').ne(0)

This should be the same as (without negation, since the masks are negated later in df[~n] and df[~o]):

n = df['home_score'].str.contains('-')
o = df['away_score'].str.contains('-')
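As a quick sanity check, str.count('-').ne(0) marks the rows that contain a hyphen, which is what str.contains('-') returns without any negation. The sketch below compares the two on a few made-up score strings (the values are hypothetical, not taken from the question's data). Note that str.contains treats its pattern as a regular expression by default; that is harmless for '-' but would matter for a character like '.', so regex=False is passed here.

```python
import pandas as pd

# Hypothetical score strings, only for comparing the two mask styles.
s = pd.Series(["2", "-", "1-0", ""])

# True where the string contains at least one hyphen.
by_count = s.str.count("-").ne(0)

# Same mask via a literal (non-regex) substring test.
by_contains = s.str.contains("-", regex=False)

print(by_count.tolist())
print(by_contains.tolist())
```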

You should also change:

df = df[df.home_score != "."]
df = df[df.home_score != ".."]
df = df[df.home_score != "."]
df = df[df.home_odds != "-"]
df = df[df.draw_odds != "-"]
df = df[df.away_odds != "-"]

to:

df = df[~df.home_score.isin([".", ".."]) &
        df[['home_odds', 'draw_odds', 'away_odds']].ne("-").all(axis=1)]
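For context, the warning itself seems to come from the order of operations in the question: m, n and o are all computed from the same df, but df is then narrowed by df[~m] before ~n and ~o are applied, so those masks no longer share the frame's index and pandas has to reindex them. One way to avoid that, sketched here on a small made-up frame (the column values are illustrative, and each mask checks a single column for brevity), is to combine the masks and filter only once:

```python
import pandas as pd

# Hypothetical sample data, only to illustrate the index mismatch.
df = pd.DataFrame({
    "home_odds": ["1.36", "5/2", "2.06", "2.19"],
    "home_score": ["2", "1", "-", "1"],
    "away_score": ["2", "1", "1", "-"],
})

# All three masks are built from the same, unfiltered frame.
m = df["home_odds"].str.contains("/", regex=False)
n = df["home_score"].str.contains("-", regex=False)
o = df["away_score"].str.contains("-", regex=False)

# Filtering once with the combined mask keeps the mask and the frame
# on the same index, so there is nothing for pandas to reindex.
clean = df[~(m | n | o)]
print(clean)
```

An alternative with the same effect is to recompute each mask from the already filtered df immediately before using it.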