I have been studying S&P 500 yearly returns using data downloaded through my Quandl subscription. I used resample() and pct_change() on the data, but the results do not match what I expect.
import quandl
sp500_df = quandl.get("MULTPL/SP500_REAL_PRICE_MONTH", authtoken="YOUR OWN AUTH KEY")
# yearly mean of the monthly prices, then year-over-year percent change
sp500_Y_ret_df = sp500_df['Value'].resample('Y').mean().pct_change().dropna()
The S&P 500 return for the year ending 2008 should be -38.5%, but my code shows about -17%. If you cannot access the data, I can provide a .csv file. Thanks a million for the help.
sp500_Y_ret_df.loc['2008-12-31']
output:
-0.17319465450687388
last 20 years:
sp500_Y_ret_df.tail(20)
output:
2001-12-31 -0.164631
2002-12-31 -0.164795
2003-12-31 -0.032081
2004-12-31 0.173145
2005-12-31 0.067678
2006-12-31 0.085836
2007-12-31 0.126625
2008-12-31 -0.173195
2009-12-31 -0.224552
2010-12-31 0.203406
2011-12-31 0.113738
2012-12-31 0.087221
2013-12-31 0.190603
2014-12-31 0.175436
2015-12-31 0.067610
2016-12-31 0.014868
2017-12-31 0.170363
2018-12-31 0.121093
2019-12-31 0.065247
2020-12-31 0.061747
Freq: A-DEC, Name: Value, dtype: float64
Using randomly generated data:
import numpy as np
import pandas as pd
aapl_df = pd.DataFrame({
    'ticker': np.repeat(['aapl'], 2500),
    'date': pd.date_range('1/1/2011', periods=2500, freq='D'),
    'price': (np.random.randn(2500).cumsum() + 10)}).set_index('date')
aapl_df.head()
date
2011-01-01 aapl 9.011290
2011-01-02 aapl 9.092603
2011-01-03 aapl 9.139830
2011-01-04 aapl 7.782112
2011-01-05 aapl 8.316270
Using 'last' as suggested yielded closer results, but I am not sure whether that is pure luck:
aapl_Y_ret_df = aapl_df['price'].resample('Y').last()
aapl_Y_ret_df.tail()
output
date
2013-12-31 18.535328
2014-12-31 15.201832
2015-12-31 36.040411
2016-12-31 42.272464
2017-12-31 20.421079
Freq: A-DEC, Name: price, dtype: float64
--
aapl_Y_ret_df = aapl_df['price'].resample('Y').last().pct_change()
aapl_Y_ret_df.tail()
date
2013-12-31 0.569359
2014-12-31 -0.179846
2015-12-31 1.370794
2016-12-31 0.172918
2017-12-31 -0.516918
Freq: A-DEC, Name: price, dtype: float64
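One way to check that 'last' is not luck is to compare it with compounding the daily returns inside each year. The two should agree, because the daily ratios within a year multiply out to the year-end price divided by the previous year-end price. The sketch below is only an illustration on made-up data. The seed, the price model, and the variable names are assumptions, and it groups on the index year rather than calling resample('Y').

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices (always positive), not real market data.
rng = np.random.default_rng(0)
dates = pd.date_range("2011-01-01", periods=1500, freq="D")
price = pd.Series(100.0 * np.exp(rng.normal(0.0, 0.01, 1500).cumsum()), index=dates)

# Year-end to year-end return: last price of each year vs. last price of the prior year.
yearly_last = price.groupby(price.index.year).last().pct_change().dropna()

# Same quantity built from daily returns: compound (1 + r) within each year.
daily = price.pct_change()
compounded = (1.0 + daily).groupby(daily.index.year).prod() - 1.0

# The first year has no prior year-end, so compare only the years both series cover.
print(np.allclose(yearly_last, compounded.loc[yearly_last.index]))
```

The first year is left out of the comparison because there is no earlier year-end price to measure it against.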
... 'Adj Close', which is what the OP wants.
... Close or Adj Close, and then sum and multiply by 100.
Use groupby and DataFrameGroupBy.pct_change to get the values by year.
df['Adj Close'].resample('Y').mean() returns the mean of the 'Adj Close' values for each year, which is not how to determine the yearly return.
... -17.4%. This is not the return.
Tested in python 3.11.2, pandas 2.0.0
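The difference between averaging and taking the year-end value can be seen on a small made-up series. This is a minimal sketch, not the S&P 500 data. The prices are hypothetical (flat at 100 through 2007, then falling 5 per month in 2008), and it groups on the index year instead of resample('Y') so that it does not depend on the 'Y' alias, which newer pandas releases deprecate.

```python
import pandas as pd

# Hypothetical monthly prices: flat in 2007, steadily falling in 2008.
dates = pd.to_datetime([f"{y}-{m:02d}-01" for y in (2007, 2008) for m in range(1, 13)])
prices = pd.Series([100.0] * 12 + [100.0 - 5.0 * m for m in range(1, 13)], index=dates)

by_year = prices.groupby(prices.index.year)

# Percent change of the yearly average price (what .mean() computes).
mean_based = by_year.mean().pct_change().dropna()

# Percent change from one year-end price to the next (the yearly return).
last_based = by_year.last().pct_change().dropna()

print(mean_based.loc[2008])  # change in the average price level
print(last_based.loc[2008])  # year-end to year-end return
```

Here the average price falls from 100 to 67.5 (about -32.5%), while the year-end price falls from 100 to 40 (about -60%). Averaging smooths over the decline within the year, which is the same kind of gap as the one between -17% and -38.5% in the question.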