I am successfully requested multiple stocks real-time data from realtimeBar from ibapi library. Like this code:
def stream_data(self, reqId, time, open, high, low, close, volume, wap, count):
print(reqId, time, open, high, low, close, volume, wap, count)
Output looks like this:
1 1647459265 158.06 158.09 158.04 158.08 175 158.06 82
0 1647459265 832.66 832.86 832.44 832.66 4 832.64 4
2 1647459265 14.89 14.9 14.89 14.9 7 14.896 5
1 1647459270 158.08 158.14 158.08 158.13 76 158.11 43
2 1647459270 14.91 14.9237 14.91 14.91 65 14.9179 46
0 1647459270 832.98 833.7 832.98 833.36 36 833.48 12
And it goes forever until I stopped the script. I have a problem to insert/adapt this print in DataFrame from pandas. How should I do this? Could you show me a snippet, please? I was trying different options but different errors occurred, that's why I don't show my error type here. I want to achieve table like this:
0 1647459265 832.66 832.86 832.44 832.66 4 832.64 4 1 1647459265 158.06 158.09 158.04 158.08 175 158.06 82 2 1647459265 14.89 14.9 14.89 14.9 7 14.896 5
0 1647459270 832.98 833.7 832.98 833.36 36 833.48 12 1 1647459270 158.08 158.14 158.08 158.13 76 158.11 43 2 1647459270 14.91 14.9237 14.91 14.91 65 14.9179 46
The goal of this DataFrame is to sort real time (every 5 sec) streaming data by reqId (because this is different stocks in fact), and make calculations (like average price, percentage change, etc.) on let's say 10 last bar (recognized by unixtime) I received. Should I use MultiIndex? How? Or multiple DataFrames? Or maybe do you have another method, library, solution? Any help I would highly appreciated :-)
If I understood correctly, every 5 seconds you get values from
reqId, time, open, high, low, close, volume, wap, count
. Then, you want to pass them to a dataframe and sort byreqId
and put all values with the sametime
on a single row.I don't get why you want to combine the values with the same
time
on a single row, but I suspect you don't need to do it to achieve what you want.I think this may help:
This will get the stream values into a dataframe and convert the unixtime to a more friendly datetime format.
Then, you can calculate whatever you want for a given frequency. For example, for your example dataframe, we can calculate the mean of every column every minute:
After your comment, if I understood correctly, you want (or that is already happening, not sure here) to have your
stream_data
function called every 5 seconds and spit out new rows. But you are re-creating every time a dataframe with these rows that come fromstream_data
, and you want these lines added to a dataframe that has all the rows thatstream_data
has returned before.You can just create an empty dataframe and then keep appending new rows to it. Then, you filter the rows you want do do your calculations as I suggested above.