Both df[booleanMask] and df.loc[booleanMask] are working for me but I don't understand why. The shorthand df[] without using .loc I thought applied to the column whereas I am trying to apply to the row, so I thought I needed to use .loc
Here is the specific code:
# Boolean operators
# All the games where a team scored at least 4 goals and won to nil
hw_4_0 = (pl23['FTHG'] >= 4) & (pl23['FTAG'] == 0)
aw_0_4 = (pl23['FTHG'] == 0) & (pl23['FTAG'] >= 4)
pl23.loc[aw_0_4 | hw_4_0]
For example, pl23.loc[aw_0_4 | hw_4_0, :] also works, but pl23.loc[:, aw_0_4 | hw_4_0] doesn't. I thought that df[boolean mask] was short hand for the latter (as with indexing), so why does it work in this instance?
Used pl23.loc[aw_0_4 | hw_4_0] which returned the data frame the query was designed for, whereas I was expecting IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
df[…]
vsdf.loc[…]
applies on columns vs index, when you use labels.If you pass a boolean Series (or other iterable) for boolean indexing, then they both act on the index level. To perform boolean indexing on columns, you need
df.loc[:, …]
Example: