Problem updating table A using data from table B

28 Views Asked by At

In DB2, I have a "live" table with data that is copied to an identically structured backup table daily. Since some of the live data may change after the backup, the respective columns need to be updated later. I was trying to achieve this with the following SQL:

UPDATE
    table_a hist
SET
    someCol = (
    SELECT
        someCol
    FROM
        table_b doc
    WHERE
        hist.someCol = doc.someCol)
WHERE
    EXISTS (
    SELECT
        *
    FROM
        table_b doc
    WHERE
        hist.ID = doc.ID
        AND (HIST.someCol != Doc.someCol
            OR (HIST.someCol IS NULL
                AND doc.someCol IS NOT NULL)));

This results in the following error:

The result of a scalar fullselect, SELECT INTO statement, or VALUES INTO statement is more than one row.

Since I could not find any mistakes in the query despite extended search, I tried a merge instead:

MERGE INTO table_a hist
USING (SELECT DISTINCT someCol FROM table_b) doc
ON (hist.someCol != doc.someCol)
WHEN MATCHED THEN UPDATE SET hist.someCol = doc.someCol 

This gives another error:

The statement was not processed because a row of target table "MTCS.MOMS_SPLIT_DOC_HIST" was identified more than once for update, delete or insert

So, which approach is better first of all? I think merge is faster, but apparently I haven't quite understood how it works. The update is also harder to read. And what is wrong with the merge? As I read it, it should compare both tables using someCol field and update the field in table_a if both fields are not equal in both tables.

1

There are 1 best solutions below

0
JoshuaG On

I think even though your Merge/Update works, it is still gonna be pretty heavy, especially when your tables are getting larger and larger. So instead of doing the heavy query, I think the following steps could help.

  1. It would be better to have some indexes on both a and b, that would help on reading when you join.
  2. You could use HASH_BYTE function as another column, this function would help you to find if any column get changed since last update. So next time, you don't have to check all columns, just compare the single value. The Syntax is like this:
   select *, CAST(HASHBYTES  
        ('SHA2_256'  
        , ISNULL(some_cols, '') + '|'  
        + ISNULL(some_other_cols, '') + '|'  
        ) AS bigint) AS TableHash
    from table_A

So next time all you need to do is to compare this hashvalue.

update hist
set col = doc.col
from table_a hist
join table_b doc
    on hist.some_col = doc.some_col 
where hist.TableHash <> doc.TableHash