I would like to use Upsolver MERGE command in my new transformations to populates S3/Athena and Snowflake tables. Since Snowflake is supporting Upsert command, while defining my transformation job, do I rely on Snowflake functionality and use Upsolver INSER statement or define Upsolver MERGE transformation in the same way I do it for Athena , i.e.
CREATE JOB my_job_upsert
START_FROM = BEGINNING
ADD_MISSING_COLUMNS = TRUE
RUN_INTERVAL = 1 MINUTE
AS MERGE INTO default_glue_catalog.upsolver_samples.test_upsert_with_merge AS target
/*
Use the SELECT statement below to choose your columns and performed the desired transformations.
In this example, we aggregate the sample orders data by customer and filter it to only include repeat purchasers.
*/
USING (SELECT field1 AS email,
COUNT(DISTINCT field2) AS count
MIN(field3) AS min_number,
MAX(date) AS last_date
FROM default_glue_catalog.upsolver_samples.test_raw_data
WHERE $commit_time BETWEEN run_start_time() AND run_end_time()
GROUP BY 1
HAVING COUNT(DISTINCT field2) > 1) source
ON (target.email = source.email)--primary key
WHEN MATCHED THEN REPLACE -- Update if primary keys match
WHEN NOT MATCHED THEN INSERT MAP_COLUMNS_BY_NAME; -- Insert if primary key is unique (new record)
It would be nice to know in general if MERGE command syntax is consistent across various target platforms.
I already build Athena transformation and it works as expected
You can use the same way you did for Athena. Upsolver INSERT command will insert new keys (APPEND) and if the table had primary key defined, INSERT command will update the existing keys (Upsert) as its default behavior.
MERGE as its definition is for UPSERT and can handle Deletes as well. And the syntax is consistent across all database/data warehouse/catalog targets