Problem with websocket output into dataframe with pandas

279 Views Asked by At

I have a websocket connection to binance in my script. The websocket runs forever as usual. I got each pair's output as seperate outputs for my multiple stream connection.

for example here is the sample output:

{'stream': 'reefusdt@kline_1m', 'data': {'e': 'kline', 'E': 1651837066242, 's': 'REEFUSDT', 'k': {'t': 1651837020000, 'T': 1651837079999, 's': 'REEFUSDT', 'i': '1m', 'f': 95484416, 'L': 95484538, 'o': '0.006620', 'c': '0.006631', 'h': '0.006631', 'l': '0.006619', 'v': '1832391', 'n': 123, 'x': False, 'q': '12138.640083', 'V': '930395', 'Q': '6164.398584', 'B': '0'}}}
{'stream': 'ethusdt@kline_1m', 'data': {'e': 'kline', 'E': 1651837066253, 's': 'ETHUSDT', 'k': {'t': 1651837020000, 'T': 1651837079999, 's': 'ETHUSDT', 'i': '1m', 'f': 1613620941, 'L': 1613622573, 'o': '2671.86', 'c': '2675.79', 'h': '2675.80', 'l': '2671.81', 'v': '1018.530', 'n': 1633, 'x': False, 'q': '2723078.35891', 'V': '702.710', 'Q': '1878876.68612', 'B': '0'}}}
{'stream': 'ancusdt@kline_1m', 'data': {'e': 'kline', 'E': 1651837066257, 's': 'ANCUSDT', 'k': {'t': 1651837020000, 'T': 1651837079999, 's': 'ANCUSDT', 'i': '1m', 'f': 10991664, 'L': 10992230, 'o': '2.0750', 'c': '2.0810', 'h': '2.0820', 'l': '2.0740', 'v': '134474.7', 'n': 567, 'x': False, 'q': '279289.07500', 'V': '94837.8', 'Q': '197006.89950', 'B': '0'}}}

is there a way to edit this output like listed below. Main struggle is each one of the outputs are different dataframes. I want to merge them into one single dataframe. Output comes as a single nested dict which has two columns: "stream" and "data". "Data" has 4 columns in it and the last column "k" is another dict of 17 columns. I somehow managed to get only "k" in it:

 json_message = json.loads(message)
result = json_message["data"]["k"]

and sample output is:

{'t': 1651837560000, 'T': 1651837619999, 's': 'CTSIUSDT', 'i': '1m', 'f': 27238014, 'L': 27238039, 'o': '0.2612', 'c': '0.2606', 'h': '0.2613', 'l': '0.2605', 'v': '17057', 'n': 26, 'x': False, 'q': '4449.1499', 'V': '3185', 'Q': '831.2502', 'B': '0'}
{'t': 1651837560000, 'T': 1651837619999, 's': 'ETCUSDT', 'i': '1m', 'f': 421543741, 'L': 421543977, 'o': '27.420', 'c': '27.398', 'h': '27.430', 'l': '27.397', 'v': '2988.24', 'n': 237, 'x': False, 'q': '81936.97951', 'V': '1848.40', 'Q': '50688.14941', 'B': '0'}
{'t': 1651837560000, 'T': 1651837619999, 's': 'ETHUSDT', 'i': '1m', 'f': 1613645553, 'L': 1613647188, 'o': '2671.38', 'c': '2669.95', 'h': '2672.38', 'l': '2669.70', 'v': '777.746', 'n': 1636, 'x': False, 'q': '2077574.75281', 'V': '413.365', 'Q': '1104234.98707', 'B': '0'}

I want to merge these outputs into a single dataframe of 6 columns and (almost 144 rows) which is closer to ss provided below. The only difference is my code creates different dataframes for each output.

DESIRED OUTPUT

1

There are 1 best solutions below

3
On

Create a list of your messages. Your messages list should be like below:

message_list = [message1,message2,message3]

df = pd.DataFrame()
for i in range(len(message_list)):
    temp_df = pd.DataFrame(message_list[i], index=[i,])
    df = df.append(temp_df, ignore_index = True)
    
print(df)

    t   T   s   i   f   L   o   c   h   l   v   n   x   q   V   Q   B
0   1651837560000   1651837619999   CTSIUSDT    1m  27238014    27238039    0.2612  0.2606  0.2613  0.2605  17057   26  False   4449.1499   3185    831.2502    0
1   1651837560000   1651837619999   ETCUSDT 1m  421543741   421543977   27.420  27.398  27.430  27.397  2988.24 237 False   81936.97951 1848.40 50688.14941 0
2   1651837560000   1651837619999   ETHUSDT 1m  1613645553  1613647188  2671.38 2669.95 2672.38 2669.70 777.746 1636    False   2077574.75281   413.365 1104234.98707   0

You can manipulate the dataframe later as needed.