How to hash multiple columns of different types in Trino/Athena

169 Views Asked by At

I am trying to generate a hash based on multiple columns in a table. The columns could be of different types.

I currently have some complex code that one needs to remember to update anytime another column is added. In addition the coalesce ends up adding to the complexity, plus the need to cast non-string columns.

, murmur3(to_utf8(concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar))))) as hash1
, xxhash64(to_utf8(concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar))))) as hash2

The valuex fields could be of any type. In my example above, they are string, string, int. The coalesce is needed to make sure null values are handled correctly.

Is there a quicker/easier way to generate a hash from multiple fields without the use of coalesce and/or concat_ws?

I will be using the computed hash to determine if the record changed and only update the record if the record changed using a merge command.

merge into .....
WHEN MATCHED and src.hash <> tgt.hash
    THEN UPDATE SET  ....

I need this to work with Athena Engine 3.0

0

There are 0 best solutions below