I am using this code to remove outliers.
import pandas as pd
import numpy as np
from scipy import stats
df = pd.DataFrame(np.random.randn(100, 3))
df[np.abs(stats.zscore(df[0])) < 1.5]
This works: the number of rows in the data frame is reduced. However, I need to remove outliers from the percentage-change values of a similar data frame.
df = df.pct_change()
df.plot.line(subplots=True)
df[np.abs(stats.zscore(df[0])) < 1.5]
This results in an empty data frame. What am I doing wrong? Should the value 1.5 be adjusted? I tried several values. Nothing works.
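It's because the first row of your dataframe is NaN after pct_change. stats.zscore propagates that NaN to every value by default, so every comparison against 1.5 is False and the result is empty. Changing the threshold cannot help. Use fillna (or dropna) to remove the NaN values before computing the z-score.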
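Here is a minimal sketch of the fix, not necessarily the exact code the answer originally showed. The fixed seed and the variable names (`clean`, `filled`, `filtered_drop`, `filtered_fill`) are my own assumptions for illustration. It first confirms that the z-scores are all NaN, then filters after removing the NaN row with either dropna or fillna(0):

```python
import numpy as np
import pandas as pd
from scipy import stats

# Fixed seed is an assumption, used only to make the sketch reproducible
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 3))).pct_change()

# pct_change leaves NaN in the first row; zscore then returns NaN everywhere
z_raw = np.asarray(stats.zscore(df[0]))
print(np.isnan(z_raw).all())

# Option 1: drop the NaN row, then filter on the z-score of column 0
clean = df.dropna()
filtered_drop = clean[np.abs(stats.zscore(clean[0])) < 1.5]

# Option 2 (as suggested above): replace NaN with 0, then filter
filled = df.fillna(0)
filtered_fill = filled[np.abs(stats.zscore(filled[0])) < 1.5]

print(len(filtered_drop), len(filtered_fill))
```

dropna is probably the safer choice here, since fillna(0) adds an artificial 0% change that takes part in the mean and standard deviation.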