Is it possible to keep a shared state between windows when using UDFs in BigQuery?


This is a follow-up to my previous question about emulating aggregate functions (like in PGSQL) in BigQuery.

The solution proposed in the previous question does indeed work when the function applied to each window is independent of the previous window, such as a simple average. But it does not work for recursive functions like the exponential moving average, where each value depends on the previous result: EMA[i] = price[i]*k + EMA[i-1]*(1-k)
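To make the dependency explicit, the recurrence can be unrolled n steps. This is a derivation from the formula above, not part of the original example; p_i and E_i are shorthand introduced here for price[i] and EMA[i].

```latex
E_i = k \sum_{j=0}^{n-1} (1-k)^{j}\, p_{i-j} + (1-k)^{n}\, E_{i-n}
```

The last term is the carried-over state: a function that only sees the most recent n prices has no way to obtain E_{i-n}. With k = 0.05 and n = 40 that term still has a weight of about 0.13, so it cannot simply be dropped.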

Using the same example from the previous question,

CREATE OR REPLACE FUNCTION temp_db.ema_func(arr ARRAY<int64>, window_size int8)
RETURNS int64 LANGUAGE js AS """
    if(arr.length<=window_size){
        // calculate a simple moving average till end of first window
        var SMA = 0;
        for(var i = 0;i < arr.length; i++){
            SMA = SMA + arr[i]
        }
        return SMA/arr.length
    }else{
        // start calculation of EMA where EMA[i-1] is the SMA we calculated for the first window
        // note: hard-coded constant (k) for the sake of simplicity
        // the problem: where do I get EMA[i-1] or prev_EMA from?
        // in this example, we only need the most recent value, but in general case, we would 
        // potentially have to do other calculations with the new value 
        return arr[arr.length-1]*(0.05) + prev_ema*(1-0.05)
    }
""";

select s_id, temp_db.ema_func(ARRAY_AGG(s_price) over (partition by s_id order by s_date rows 40 preceding), 40) as temp_col
from temp_db.s_table;

Storing a state variable as a custom type is very easy in PGSQL, where it is part of the aggregate function's parameters. Would it be possible to emulate the same functionality in BigQuery?

There is 1 best solution below

BEST ANSWER

I don't think this can be done generically in BigQuery; I would rather look at the specific case and see whether a reasonable workaround is possible. Meanwhile, recursion and aggregate UDFs are not supported in BQ [hopefully just not yet], so you might want to submit the respective feature request(s).

In the meantime, check out BQ scripting, though I don't think your case will fit there.
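For the specific EMA case, one possible workaround is to stop calling the function once per window and instead hand it the whole ordered price list of a partition, so that the previous EMA is just a local variable. The sketch below is plain JavaScript under that assumption; the name emaSeries and its parameters are hypothetical, and the seeding rule (simple average until the first window is full) is copied from the question's code.

```javascript
// Hypothetical helper: computes the whole EMA series for one partition in a
// single pass, so the "shared state" is an ordinary local variable.
// Assumptions: prices is already ordered by date; k is the smoothing constant.
function emaSeries(prices, windowSize, k) {
  var result = [];
  var sum = 0;         // running sum used for the simple-average seed
  var prevEma = null;  // state carried from one row to the next
  for (var i = 0; i < prices.length; i++) {
    if (i < windowSize) {
      // Seed phase: simple average over the rows seen so far.
      sum += prices[i];
      prevEma = sum / (i + 1);
    } else {
      // Recursive phase: EMA[i] = price[i]*k + EMA[i-1]*(1-k).
      prevEma = prices[i] * k + prevEma * (1 - k);
    }
    result.push(prevEma);
  }
  return result;
}
```

For example, emaSeries([10, 20, 30, 40], 2, 0.5) returns [10, 15, 22.5, 31.25]. In BigQuery, a body like this could sit inside a LANGUAGE js UDF declared with ARRAY<FLOAT64> input and output, called once per s_id on ARRAY_AGG(s_price ORDER BY s_date) and expanded back into rows with UNNEST ... WITH OFFSET. The whole partition has to fit into a single UDF call, so this only suits moderately sized partitions and is not a general replacement for stateful aggregates.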