Spark Iceberg table merge into update all

420 Views Asked by At

I am trying to update the entire row when matched. I want this command to be an upsert.

WHEN MATCHED THEN UPDATE SET target.* = updates.* is not the right syntax.
How can I update the entire row?

MERGE INTO glue_catalog.{DATABASE_NAME}.{TABLE_NAME} target
USING (SELECT * FROM upsert_items) updates
ON {join_condidtion}
WHEN MATCHED THEN UPDATE SET target.* = updates.*
WHEN NOT MATCHED THEN INSERT *

Running Spark version 3.3.0-amzn-1
Iceberg v1.0.0

2

There are 2 best solutions below

0
On

Based on the example given with the WHEN NOT MATCHED THEN INSERT *, it should be the same for UPDATE.

MERGE INTO glue_catalog.{DATABASE_NAME}.{TABLE_NAME} target
USING (SELECT * FROM upsert_items) updates
ON {join_condidtion}
WHEN MATCHED THEN UPDATE *
WHEN NOT MATCHED THEN INSERT *
0
On

SparkSqlParser's grammar file has the following definition

matchedAction
    : DELETE
    | UPDATE SET ASTERISK
    | UPDATE SET assignmentList
    ;

notMatchedAction
    : INSERT ASTERISK
    | INSERT LEFT_PAREN columns=multipartIdentifierList RIGHT_PAREN
        VALUES LEFT_PAREN expression (COMMA expression)* RIGHT_PAREN
    ;

notMatchedBySourceAction
    : DELETE
    | UPDATE SET assignmentList
    ;

According to it, we can infer the SQL to be

MERGE INTO glue_catalog.{DATABASE_NAME}.{TABLE_NAME} target
USING (SELECT * FROM upsert_items) updates
ON {join_condidtion}
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

It works on Spark 3.3.4 with iceberg-spark-runtime-3.3_2.12:1.4.3