I'm trying to remove outliers from the 'Price' column in a dataset. I have been able to create a data frame of the outliers with their corresponding values in the other columns, but I'm struggling to exclude these entries from the parent dataset. How do I go about this?
This is the code I used to create the new dataframe described above:
lower_limit = pq1 - 1.5 * iqr
upper_limit = pq3 + 1.5 * iqr
newdf = df[(df['price'] < lower_limit) | (df['price'] > upper_limit)]
newdf
I tried putting the tilde (~) sign before the boolean conditions, but that didn't give the desired result.
You could use the .loc attribute to get a subset of your original dataframe that excludes the rows of the newdf dataframe through their indices:
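A minimal sketch of that idea, not necessarily the exact code intended here. The sample data, the quantile-based definitions of pq1, pq3 and iqr, and the name cleaned are assumptions made for illustration; only df, newdf, the 'price' column and the two limits come from the question. It also assumes the index labels of df are unique:

```python
import pandas as pd

# Hypothetical sample data; the 'price' column name follows the question's code.
df = pd.DataFrame({
    "price": [10, 12, 11, 13, 12, 300, 9, 11, -200],
    "item": list("abcdefghi"),
})

# Assumed definitions for the question's pq1, pq3 and iqr.
pq1 = df["price"].quantile(0.25)
pq3 = df["price"].quantile(0.75)
iqr = pq3 - pq1
lower_limit = pq1 - 1.5 * iqr
upper_limit = pq3 + 1.5 * iqr

# Outlier rows, built the same way as in the question.
newdf = df[(df["price"] < lower_limit) | (df["price"] > upper_limit)]

# Keep only the rows whose index label does not appear in newdf.
# 'cleaned' is an illustrative name, not from the question.
cleaned = df.loc[~df.index.isin(newdf.index)]
```

If the index contains duplicate labels, isin will drop every row sharing a label with an outlier, so resetting the index first may be safer. As for the tilde attempt, ~ binds more tightly than |, so it has to be applied to the whole condition wrapped in one extra pair of parentheses, for example df[~((df['price'] < lower_limit) | (df['price'] > upper_limit))].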