Does direction of join condition matter in left joins?

142 Views Asked by At

There are two postgres tables with roughly the following structures:

products
+----+-----------+
| id | rawDataId |
+----+-----------+
| 1  |     a     |
+----+-----------+
| 2  |     null  |
+----+-----------+
| 3  |     b     |
+----+-----------+
| 4  |     null  |
+----+-----------+
| 5  |     c     |
+----+-----------+

rawData
+----+-----------+
| id | productId |
+----+-----------+
| a  |     1     |
+----+-----------+
| b  |     3     |
+----+-----------+
| c  |     5     |
+----+-----------+
| d  |     6     |
+----+-----------+
| e  |     7     |
+----+-----------+
| f  |     8     |
+----+-----------+
| g  |     9     |
+----+-----------+

While FK constraints were not properly defined when these were set up, in application logic products.rawDataId maps to rawData.id and rawData.productId maps to products.id. Each record in products can have 1 or 0 records in rawData, meanwhile, there are some orphaned records in rawData whose associated products records have been deleted.

I'm trying to find all orphaned rawData records by doing a left join. However, I get different number of rows depending on my join condition:

psql> SELECT count(*)
       FROM "rawData" "r"
       LEFT JOIN "products" "p"
       ON "r"."productId" = "p"."id"
       WHERE "p"."id" IS NULL;

psql> 7000


psql> SELECT count(*)
       FROM "rawData" "r"
       LEFT JOIN "products" "p"
       ON "r"."id" = "p"."rawDataId"
       WHERE "p"."id" IS NULL;

psql> 9239

These tables have ~100k rows, so it's not easy to traverse them manually, but I was wondering under what scenarios I would be getting different results based on the join condition. Should they not return the same number of rows?

Are there any other queries I can execute to find the diff, or get the number of orphaned rows more accurately, given these table structures and constraints?

The output I would like to get, based on the sample tables is:

+----+-----------+------+-----------+
| id | productId |  id  | rawDataId |
+----+-----------+------+-----------+
| d  |     6     | null |   null    |
+----+-----------+------+-----------+
| e  |     7     | null |   null    |
+----+-----------+------+-----------+
| f  |     8     | null |   null    |
+----+-----------+------+-----------+
| g  |     9     | null |   null    |
+----+-----------+------+-----------+
0

There are 0 best solutions below