What actually happens during table JOINs?

150 Views Asked by At

I'm trying to see if my understanding of JOINs is correct.

For the following query:

SELECT * FROM tableA 
   join tableB on tableA.someId = tableB.someId
   join tableC on tableA.someId = tableC.someId;

Does the RDMS basically execute similar pseudocode as follows:

List tempResults
for each A_record in tableA
    for each B_record in tableB
        if (A_record.someId = B_record.someId)
            tempResults.add(A_record)

List results 
for each Temp_Record in tempResults
    for each C_record in tableC
        if (Temp_record.someId = C_record.someId)
            results.add(C_record)

return results;

So basically the more records with the same someId tableA has with tableB and tableC, the more records the RDMS have the scan? If all 3 tables have records with same someId, then essentially a full table scan is done on all 3 tables?

Is my understanding correct?

1

There are 1 best solutions below

2
On

Each vendor's query processor is of course written (coded) slightly differently, but they probably share many common techniques. Implementing a join can be done in a variety of ways, and which one is chosen, in any vendor's implementation, will be dependent on the specific situation, but factors that will be considered include whether the data is already sorted by the join attribute, the relative number of records in each table (a join between 20 records in one set of data with a million records in the other will be done differently than one where each set of records is of comparable size). I do not know the internals for MySQL, but for SQL server, there are three different join techniques, a Merge Join, a Loop Join, and a Hash Join. Take a look at this.