I am using 2 separate query to check for duplicates and both query were supposed to give the same answer when executed, however it did not. Below is an example of my database table data:
id Name Age Group SeatNo
------------------------------------------
1 Alpha 11 A NULL
2 Bravo 12 A 1
3 Alpha 11 B NULL
This is my first query where it shows there are duplicates
SELECT count(*)
FROM Test AS ta
JOIN Test AS tb ON ta.name=tb.name AND ta.age=tb.age
WHERE ta.GroupName='A'
AND tb.GroupName='B'
This is my second Query that shows zero duplicates
SELECT count(*)
FROM Test AS ta
JOIN Test AS tb ON ta.name=tb.name AND ta.age=tb.age AND ta.SeatNo=tb.SeatNo
WHERE ta.GroupName='A'
AND tb.GroupName='B
After looking through both query and the data in the table, it seems that NULL in the seatNo affected the second query and cause it to return 0 duplicates. Is there any solution to search for duplicates even though it is NULL?
You have learned that
NULL <> NULL
. What can you do about it?Well, ANSI SQL has a comparator for this purpose
IS DISTINCT FROM
. So, you could write:However, most databases do not support that. A more general form is:
Another method uses
UNION ALL
and aggregation. The following gets the duplicates:You can then use this as a subquery to get the count: