I am trying to use various charting packages for OHLC bar charts. I have had some success, but I keep getting stuck on "TypeError: Expect data.index as DatetimeIndex". The samples that I copy work perfectly fine, like the one below:
import yfinance as yf
import mplfinance as mpf
symbol = 'AAPL'
df = yf.download(symbol, period='6mo')
mpf.plot(df, type='candle')
which has the following type of index for the df:
DatetimeIndex(['2022-06-30', '2022-07-01', '2022-07-05', '2022-07-06',
'2022-12-29', '2022-12-30'],
dtype='datetime64[ns]', name='Date', length=128, freq=None)
So I am trying to get my dataframe index to look the same, with a DatetimeIndex format. My index looks like this:
0 2022-11-09 14:30:00+00:00
1 2022-11-09 14:35:00+00:00
2 2022-11-09 14:40:00+00:00
3 2022-11-09 14:45:00+00:00
4 2022-11-09 14:50:00+00:00
...
2299 2022-12-21 20:35:00+00:00
2300 2022-12-21 20:40:00+00:00
2301 2022-12-21 20:45:00+00:00
2302 2022-12-21 20:50:00+00:00
2303 2022-12-21 20:55:00+00:00
Name: date, Length: 2304, dtype: object
Note the default integer index on the left. I believe that I don't need to format it exactly the same, as long as the internal datatype is datetime64 in a DatetimeIndex.
Thanks for any help.
So I tried this (and a whole lot of other ideas):
df['timestamp'] = pd.to_datetime(df.date)
new = pd.DataFrame(index=[df.timestamp])
which gives
MultiIndex([('2022-11-09 14:30:00+00:00',),
...
('2022-12-21 20:55:00+00:00',)],
names=['timestamp'], length=2304)
as well as this:
df['timestamp'] = mpl_dates.datestr2num(df.date)
which gives:
MultiIndex([(19305.604166666668,),
( 19305.60763888889,),
(19347.868055555555,),
(19347.871527777777,)],
names=['timestamp'], length=2304)
and neither works.
Am I on the right track, and what is the correct way to do this? How do I get rid of the MultiIndex? And how do I get it to be of type DatetimeIndex?
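For context on the MultiIndex: it appears to come from wrapping the Series in a list (`index=[df.timestamp]`), which pandas treats as a list of index levels rather than a single index. Below is a minimal sketch of the difference, using a made-up two-row frame in place of the real data. The sample values and the names `wrapped` and `flat` are illustrative assumptions, not part of the original code.

```python
import pandas as pd

# Hypothetical two-row stand-in for the real bars data.
df = pd.DataFrame({
    "date": ["2022-11-09 14:30:00+00:00", "2022-11-09 14:35:00+00:00"],
    "close": [174.05, 173.62],
})

# Parse the strings into timezone-aware timestamps.
df["timestamp"] = pd.to_datetime(df["date"], utc=True)

# Passing a list that contains one Series is what produced the MultiIndex.
wrapped = pd.DataFrame(index=[df["timestamp"]])
print(type(wrapped.index).__name__)

# Setting the column as the index keeps it a single-level DatetimeIndex.
flat = df.set_index("timestamp")
print(type(flat.index).__name__)
```

`set_index` also keeps the other columns attached to the new index, whereas `pd.DataFrame(index=...)` creates an empty frame.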
Responding to the question about the source of the data: it is from IBKR, retrieved using the API routines, and I am storing the data in an intermediate CSV file. It has the following format:
,date,open,high,low,close,volume,barCount,average
0,2022-11-09 14:30:00+00:00,174.44,174.44,173.8,174.05,994,64,174.408
1,2022-11-09 14:35:00+00:00,174.11,174.38,173.58,173.62,160,123,173.95
2,2022-11-09 14:40:00+00:00,173.59,173.6,173.14,173.56,98,73,173.363
3,2022-11-09 14:45:00+00:00,173.55,174.02,173.52,173.96,88,53,173.716
I was reading it in with the following: `bars = pd.read_csv(name, header=0, index_col=0, sep=",")`
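Given that CSV layout, one possible way to end up with a DatetimeIndex is to convert the `date` column after reading and then make it the index. This is a minimal sketch, not a definitive fix: it reads the four sample rows from an in-memory string (`csv_text` is a stand-in for the real file) and assumes every timestamp carries a UTC offset, as in the sample.

```python
import io
import pandas as pd

# Stand-in for the intermediate CSV file, using the sample rows from the question.
csv_text = """,date,open,high,low,close,volume,barCount,average
0,2022-11-09 14:30:00+00:00,174.44,174.44,173.8,174.05,994,64,174.408
1,2022-11-09 14:35:00+00:00,174.11,174.38,173.58,173.62,160,123,173.95
2,2022-11-09 14:40:00+00:00,173.59,173.6,173.14,173.56,98,73,173.363
3,2022-11-09 14:45:00+00:00,173.55,174.02,173.52,173.96,88,53,173.716
"""

# Read as before: the unnamed first column becomes the integer index.
bars = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0, sep=",")

# Convert the date strings to timezone-aware timestamps.
bars["date"] = pd.to_datetime(bars["date"], utc=True)

# Replace the integer index with the parsed dates.
bars = bars.set_index("date")

print(type(bars.index).__name__)
print(bars.index.dtype)
```

A timezone-aware index is still a DatetimeIndex, so the type check in the error message should pass. Note that mplfinance's documented default column names are capitalized (`Open`, `High`, `Low`, `Close`, `Volume`), so depending on the installed version the lowercase columns here may need renaming before calling `mpf.plot`.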