I was given the following question:
Write a SQL statement to make a join on the tables salesman, customer and orders in such a form that the same column of each table will appear once and only the relational rows will come.
I executed the following query:
SELECT * FROM orders NATURAL JOIN customer NATURAL JOIN salesman;
However, I was not expecting the following result:
My doubt lies in step 2.
Why didn't I get the rows with salesman_id 5002, 5003 & 5007?
I know that natural join uses the common columns to finalize the rows.
Here all the Salesman_ids are present in the result from step 1.
Why isn't the final result equal to the table resulting from step 1 with non duplicate columns from salesman added to it?
Yes
Natural Join
does that.I don't know what that means.
I disagree with those who are saying: do not use
Natural Join
. But it is certainly true that if you plan to useNatural Join
for your queries, you must design the schema so that (loosely speaking) 'same column name means same thing'.Then this exercise is teaching you the dangers of having same-named columns that do not mean the same thing. The danger is sometimes called the 'connection trap' or 'join trap'. (Not really a trap: you just need to learn ways to write queries over poorly-designed schemas.)
A more precise way to put that: if you have columns named the same in two different tables, the column must be a key of at least one of them. So:
city
is not a key in any of those tables, so should not get 'captured' in aNatural Join
.salesman_id
is not a key in tablecustomer
, so should not get 'captured' in the join from tableorders
.The main way to fix up this query is by renaming some columns to avoid 'capture' (see below). It's also worth mentioning that some dialects of SQL allow:
The
ON column(s)
phrase means: validate that the only columns in common between the two tables are those named. Otherwise reject the query. So your query would be rejected.Renaming means that you shouldn't use
SELECT *
. (Anyway, that's dangerous for 'production code' because your query might produce different columns each time there's a schema change.) The easiest way to tackle this might be to create three Views for your three base tables, with the 'accidental' same-named columns given some other name. For this one query:I'm using explicit
AS
to show renaming. Most dialects of SQL allow you to omit that keyword; just putcity cust_city, ...
.