SQL - Compare rows in a table to find column differences - self join


I have the following table:

DECLARE @TABLE_A TABLE (
   id int identity, 
   name varchar(20), 
   start_date datetime, 
   end_date datetime, 
   details nvarchar(500), 
   copied_from int)

Users can clone a row and re-insert it into the same table, and we record which row it was copied from. So if you have a row with ID = 1 and you copy all of its columns and re-insert it (from the UI), you get a new row with ID = 5 whose copied_from field has the value 1.

After this, users can update the values of the new row (ID 5 in this example), and we need a way to see the differences between the two rows. I have written the query below to get the differences between the columns of ID 1 and ID 5.

DECLARE @id int = 5
DECLARE @TABLE_A TABLE (id int identity, name varchar(20), start_date datetime, end_date datetime, details nvarchar(500), copied_from int)

INSERT INTO @TABLE_A (name, start_date, end_date, details, copied_from)
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'John', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up - changed</p>', 1 

SELECT 
    'Name' AS column_name,
    ISNULL(s.name, '') AS value_before, 
    ISNULL(t.name, '') AS value_after, 
    t.id, 
    t.copied_from
FROM @TABLE_A s
FULL OUTER JOIN @TABLE_A t ON s.id = t.copied_from
WHERE t.id = @id AND ISNULL(s.name, '') <> ISNULL(t.name, '')
UNION ALL
SELECT 
    'Details' AS column_name,
    ISNULL(s.details, '') AS value_before, 
    ISNULL(t.details, '') AS value_after, 
    t.id, 
    t.copied_from
FROM @TABLE_A s
FULL OUTER JOIN @TABLE_A t ON s.id = t.copied_from
WHERE t.id = @id AND ISNULL(s.details, '') <> ISNULL(t.details, '')

.......

As you can see, there is a self join on the ID and COPIED_FROM fields, and for each column I check whether there is a difference.

This works, but I am not happy with the repeated UNIONs for each column. Is there another way of achieving this?

Thanks

There are 4 best solutions below

Try the script below; it may help. Using a CASE WHEN expression, we can identify which columns were modified. Note that this returns a single row containing all the details (the before value, the after value, and a status flag: 1 = modified, 0 = not modified).

SELECT  ISNULL(s.name, '')                                      AS name_before, 
        ISNULL(t.name, '')                                      AS name_after, 
        -- compare the ISNULL-wrapped values so a change to or from NULL is also flagged
        (case when ISNULL(s.name, '') <> ISNULL(t.name, '') then 1 else 0 end)       AS name_status,

        ISNULL(s.details, '')                                   AS details_before, 
        ISNULL(t.details, '')                                   AS details_after, 
        (case when ISNULL(s.details, '') <> ISNULL(t.details, '') then 1 else 0 end) AS details_status
FROM    @TABLE_A s 
INNER JOIN @TABLE_A t ON s.id = t.copied_from
WHERE   t.id = @id
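
Comparing two columns with `<>` directly gives UNKNOWN if either side is NULL, and wrapping both sides in ISNULL(..., '') treats NULL and an empty string as the same value. If that distinction matters, one NULL-aware alternative is the EXISTS (SELECT ... EXCEPT SELECT ...) idiom. The following is only a minimal sketch: it assumes a trimmed version of the question's table, and the two sample rows are made up for illustration.

```sql
-- Illustrative data (not from the question): row 2 is a copy of row 1
-- whose details were cleared to NULL.
DECLARE @id int = 2;
DECLARE @TABLE_A TABLE (id int identity, name varchar(20), details nvarchar(500), copied_from int);

INSERT INTO @TABLE_A (name, details, copied_from) VALUES ('Tom', N'<p>original</p>', NULL);
INSERT INTO @TABLE_A (name, details, copied_from) VALUES ('Tom', NULL, 1);

SELECT  s.name    AS name_before,
        t.name    AS name_after,
        -- EXCEPT treats two NULLs as equal and NULL vs. a value as different
        CASE WHEN EXISTS (SELECT s.name EXCEPT SELECT t.name) THEN 1 ELSE 0 END       AS name_status,
        s.details AS details_before,
        t.details AS details_after,
        CASE WHEN EXISTS (SELECT s.details EXCEPT SELECT t.details) THEN 1 ELSE 0 END AS details_status
FROM    @TABLE_A s
INNER JOIN @TABLE_A t ON s.id = t.copied_from
WHERE   t.id = @id;
```

For these two sample rows, name_status comes back as 0 and details_status as 1, even though details_after is NULL.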

You can also achieve this with an INNER JOIN. I would say the proper way is to show the changed values in a single row, which is also more readable:

SELECT s.name    AS [name_before], 
       t.name    AS [name_after], 
       s.details AS [detail_before], 
       t.details AS [detail_after], 
       t.id, 
       s.id      AS [copied_from]
FROM @TABLE_A s 
INNER JOIN @TABLE_A t ON s.id = t.copied_from
WHERE t.id = @id
  AND (s.name <> t.name OR s.details <> t.details)

We use OR so that the row is returned when either name or details (or both) has changed.

You can use a dynamic script; it works even if the table has hundreds of columns.

    CREATE TABLE #tt (id int identity, name varchar(20), start_date datetime, end_date datetime, details nvarchar(500), copied_from int)

    INSERT INTO #tt (name, start_date, end_date, details, copied_from)
    SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
    SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
    SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
    SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
    SELECT 'John', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up - changed</p>', 1 

    DECLARE @cols NVARCHAR(max), @sql NVARCHAR(max)

    -- Build one VALUES row per column: ('column', original value, copied value).
    -- Convert to NVARCHAR(MAX): CONVERT(VARCHAR, ...) without a length cuts values off at 30 characters.
    SELECT @cols = ISNULL(@cols + ',', '')
                 + '(''' + c.name + ''',CONVERT(NVARCHAR(MAX),o.' + QUOTENAME(c.name) + '),CONVERT(NVARCHAR(MAX),c.' + QUOTENAME(c.name) + '))'
    FROM tempdb.sys.columns AS c WHERE c.object_id = OBJECT_ID('tempdb..#tt')
    PRINT @cols

    -- c is the copied row, o is the original row it was copied from
    SET @sql = '
    SELECT x.* FROM #tt AS c
    LEFT JOIN #tt AS o ON c.copied_from=o.id
    CROSS APPLY(values' + @cols + ') AS x(columnName,OriginalValue,CopiedValue) 
    WHERE c.copied_from IS NOT NULL'
    PRINT @sql
    EXEC(@sql)

Output:

columnName   OriginalValue                                CopiedValue
------------ -------------------------------------------- -----------------------------------------------------
id           1                                            5
name         Tom                                          John
start_date   Jan  1 2017 12:00AM                          Jan  1 2017 12:00AM
end_date     Feb  1 2017 12:00AM                          Feb  1 2017 12:00AM
details      <p>this column can contain html mark up</p>  <p>this column can contain html mark up - changed</p>
copied_from  NULL                                         1
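
For a table with a fixed, known set of columns, the same CROSS APPLY (VALUES ...) pattern can also be written by hand without dynamic SQL, and a WHERE clause can keep only the columns that actually changed. This is a minimal sketch rather than a drop-in replacement: it assumes the question's table layout, and the two sample rows and the ISO 8601 date style (126) are illustrative choices.

```sql
DECLARE @id int = 2;
DECLARE @TABLE_A TABLE (id int identity, name varchar(20), start_date datetime, end_date datetime, details nvarchar(500), copied_from int);

-- Trimmed sample data: row 2 is an edited copy of row 1.
INSERT INTO @TABLE_A (name, start_date, end_date, details, copied_from)
VALUES ('Tom', '2017-01-01', '2017-02-01', N'<p>this column can contain html mark up</p>', NULL);
INSERT INTO @TABLE_A (name, start_date, end_date, details, copied_from)
VALUES ('John', '2017-01-01', '2017-02-01', N'<p>this column can contain html mark up - changed</p>', 1);

SELECT  x.column_name,
        ISNULL(x.value_before, '') AS value_before,
        ISNULL(x.value_after, '')  AS value_after,
        t.id,
        t.copied_from
FROM    @TABLE_A AS t
INNER JOIN @TABLE_A AS s ON s.id = t.copied_from
-- One VALUES row per compared column; everything is converted to nvarchar(500)
-- so the three derived columns have a single, consistent type.
CROSS APPLY (VALUES
        ('name',       CONVERT(nvarchar(500), s.name),            CONVERT(nvarchar(500), t.name)),
        ('start_date', CONVERT(nvarchar(500), s.start_date, 126), CONVERT(nvarchar(500), t.start_date, 126)),
        ('end_date',   CONVERT(nvarchar(500), s.end_date, 126),   CONVERT(nvarchar(500), t.end_date, 126)),
        ('details',    s.details,                                 t.details)
    ) AS x (column_name, value_before, value_after)
WHERE   t.id = @id
  AND   ISNULL(x.value_before, '') <> ISNULL(x.value_after, '');
```

With these sample rows, only the name and details columns are returned, in the same (column_name, value_before, value_after, id, copied_from) shape the question asks for.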

Well, I guess the original requirement is to aggregate all the changes into one result set with the columns (column_name, value_before, value_after, id, copied_from).

I think you may wish to use UNPIVOT, for example:

DECLARE @id int = 5
DECLARE @TABLE_A TABLE (id int identity, name varchar(20), start_date datetime, end_date datetime, details nvarchar(500), copied_from int)

INSERT INTO @TABLE_A (name, start_date, end_date, details, copied_from)
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'Tom', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up</p>', null UNION ALL
SELECT 'John', '2017-01-01', '2017-02-01', '<p>this column can contain html mark up - changed</p>', 1 ;

WITH    t AS ( SELECT   id,
                        -- UNPIVOT skips NULL values, so map them to '' to keep every column comparable
                        CAST(ISNULL(name, '') AS NVARCHAR(500)) name,
                        ISNULL(CAST(start_date AS NVARCHAR(500)), '') start_date,
                        ISNULL(CAST(end_date AS NVARCHAR(500)), '') end_date,
                        ISNULL(details, '') details,
                        copied_from
               FROM     @TABLE_A
             ),
        m AS ( SELECT   u.id,
                        u.copied_from,
                        u.column_name,
                        u.data
               FROM     t UNPIVOT( data FOR column_name IN ( name, start_date,
                                                             end_date, details ) ) u
             )
    SELECT  toT.column_name,
            fromT.data value_before,
            toT.data value_after,
            toT.id,
            toT.copied_from
    FROM    m fromT
    INNER JOIN m toT ON toT.copied_from = fromT.id AND
                        toT.column_name = fromT.column_name AND
                        toT.data <> fromT.data
    WHERE   toT.id = @id;   -- restrict to the copied row being inspected

Note: I have to cast all fields to NVARCHAR(500), because the columns being unpivoted must all have the same type and length; otherwise UNPIVOT will not work.
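
One more thing to keep in mind with UNPIVOT is that it leaves out rows whose value is NULL, so a column that changed to or from NULL would not appear in the comparison unless it is wrapped in ISNULL first. The small sketch below shows the effect; the table variable @t and its single row are hypothetical and only serve the demonstration.

```sql
-- Hypothetical one-row table: details is NULL.
DECLARE @t TABLE (id int, name nvarchar(500), details nvarchar(500));
INSERT INTO @t (id, name, details) VALUES (1, N'Tom', NULL);

-- Without ISNULL: only the 'name' row comes back, because details is NULL.
SELECT u.id, u.column_name, u.data
FROM   (SELECT id, name, details FROM @t) AS p
UNPIVOT (data FOR column_name IN (name, details)) AS u;

-- With ISNULL: both columns produce a row, details as an empty string.
SELECT u.id, u.column_name, u.data
FROM   (SELECT id, name, ISNULL(details, N'') AS details FROM @t) AS p
UNPIVOT (data FOR column_name IN (name, details)) AS u;
```

Mapping NULL to an empty string keeps the row, at the cost of no longer distinguishing NULL from ''.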