Apply a function to each level of a grouping factor and create a new column in an existing data frame


I have a data frame (daily_df) that looks like this:

           timestamp                  datetime        date      time    open  \
0      1667520000000 2022-11-04 00:00:00+00:00  2022-11-04  00:00:00  0.2186   
1      1667606400000 2022-11-05 00:00:00+00:00  2022-11-05  00:00:00  0.2589   
2      1667692800000 2022-11-06 00:00:00+00:00  2022-11-06  00:00:00  0.2459   
3      1667779200000 2022-11-07 00:00:00+00:00  2022-11-07  00:00:00  0.2315   
4      1667865600000 2022-11-08 00:00:00+00:00  2022-11-08  00:00:00  0.2353   
              ...                       ...         ...       ...     ...   
15012  1675728000000 2023-02-07 00:00:00+00:00  2023-02-07  00:00:00  0.2449   
15013  1675814400000 2023-02-08 00:00:00+00:00  2023-02-08  00:00:00  0.2610   
15014  1675900800000 2023-02-09 00:00:00+00:00  2023-02-09  00:00:00  0.2555   
15015  1675987200000 2023-02-10 00:00:00+00:00  2023-02-10  00:00:00  0.2288   
15016  1676073600000 2023-02-11 00:00:00+00:00  2023-02-11  00:00:00  0.2317   
         high     low   close        volume              symbol  
0      0.2695  0.2165  0.2588  1.239168e+09  1000LUNC/USDT:USDT  
1      0.2788  0.2414  0.2458  1.147000e+09  1000LUNC/USDT:USDT  
2      0.2554  0.2292  0.2315  5.137089e+08  1000LUNC/USDT:USDT  
3      0.2398  0.2263  0.2352  4.754763e+08  1000LUNC/USDT:USDT  
4      0.2404  0.1320  0.1895  1.618936e+09  1000LUNC/USDT:USDT  
       ...     ...     ...           ...                 ...  
15012  0.2627  0.2433  0.2611  8.097549e+07       ZRX/USDT:USDT  
15013  0.2618  0.2432  0.2554  7.009100e+07       ZRX/USDT:USDT  
15014  0.2651  0.2209  0.2287  1.217487e+08       ZRX/USDT:USDT  
15015  0.2361  0.2279  0.2317  6.072029e+07       ZRX/USDT:USDT  
15016  0.2418  0.2300  0.2409  2.178281e+07       ZRX/USDT:USDT 

I want to apply the bbands function from pandas_ta to each level of symbol, using the 'close' column as the input. The function returns multiple columns, but I only want to keep the one labeled 'BBM_20_2.0' and store it as another column in the data frame.

If I were to just apply the function to the entire data frame, ignoring the fact that each symbol has to be treated separately, I would do this:

daily_df['bbm'] = bbands(daily_df.close, 20, 2)['BBM_20_2.0']

I have tried to use groupby like this:

daily_df['bbm'] = daily_df.groupby(["symbol"]).apply(bbands(daily_df.close, 20, 2)['BBM_20_2.0'])

but I'm getting errors. Can anyone help?

Best answer

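Your groupby attempt fails because apply expects a function, but bbands(daily_df.close, 20, 2)['BBM_20_2.0'] is evaluated first, on the whole column, and the resulting Series is passed to apply instead. Wrap the call in a lambda so that it runs once per symbol. Did you try

daily_df['bbm'] = daily_df.groupby("symbol")["close"].transform(
    lambda close: bbands(close, 20, 2)['BBM_20_2.0']
)

This uses transform rather than apply followed by reset_index and merge. transform returns a Series aligned to the original index, so it can be assigned directly as a new column. Merging back on 'symbol' alone would instead pair every row of a symbol with every bbm value for that symbol and multiply the rows.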
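
If you want to check the pattern without installing pandas_ta, here is a minimal sketch on made-up data. The middle_band helper is a hypothetical stand-in for bbands(...)['BBM_20_2.0']. It assumes the middle band is a plain simple moving average, which as far as I know is the pandas_ta default. It also uses a window of 3 instead of 20 to keep the toy data small.

```python
import pandas as pd

# Toy data: two symbols with different price levels (made up for illustration).
daily_df = pd.DataFrame({
    "symbol": ["AAA"] * 5 + ["BBB"] * 5,
    "close": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0],
})


def middle_band(close: pd.Series, length: int = 3) -> pd.Series:
    """Hypothetical stand-in for the bbands middle band: a simple moving average."""
    return close.rolling(length).mean()


# transform calls middle_band once per symbol and returns a Series aligned
# to daily_df's index, so it can be assigned directly as a new column.
daily_df["bbm"] = daily_df.groupby("symbol")["close"].transform(middle_band)

print(daily_df)
```

The first two rows of each symbol come out as NaN, which shows that the rolling window restarts for every symbol instead of carrying over values from the previous one.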