|X| represents number of tuples in X
bold letters represent keys in the relation
Consider the relations R(A, B) and S(A, C), and that R has a foreign key on A that
references S.
|R ✶ S| (where ' * ' represents natural join) is:
The options are:
1. |R|
2. |S|
3. |R|.|S|
4. max(|R|, |S|)
5. min(|R|, |S|)
What I understand about the cardinality of natural join is that if there is no common attribute among the two relations then natural join will act like a cross-product and the cardinality will be r * s. But I don't understand how key constraints play a role in determining the cardinality.
Can someone please explain?
Find the Cardinality of Natural Join
1.3k Views Asked by rdr2 At
2
There are 2 best solutions below
1
On
There is absolutely not enough information to answer this question. The "natural" join can return any almost any value between 0 and R*S. The following are examples.
This example returns 12:
create table s1 (id int primary key);
create table r1 (s1_id int references s1(id));
insert into s1 (id) values (1), (2), (3);
insert into r1 (s1_id) values (1), (2), (2), (3);
This example returns 0:
create table s2 (id int primary key, x int default 2);
create table r2 (s2_id int references s2(id), x int default 1);
insert into s2 (id) values (1), (2), (3);
insert into r2 (s2_id) values (1), (2), (2), (3);
This examples returns 4:
create table s3 (id int primary key, y int default 2);
create table r3 (id int references s3(id), x int default 1);
insert into s3 (id) values (1), (2), (3);
insert into r3 (id) values (1), (2), (2), (3);
In all of these, r has a foreign key relationship to s. And a "natural" join is being used. Here is a db<>fiddle.
Even if you assume that the "A"s are the primary keys AND that there are no other columns, the number of rows still varies:
-- returns 4
create table s5 (id int primary key);
create table r5 (id int references s4(id));
insert into s5 (id) values (1), (2), (3);
insert into r5 (id) values (1), (1), (2), (2);
Versus:
-- returns 0
create table s4 (id int primary key);
create table r4 (id int references s4(id));
insert into s4 (id) values (1), (2), (3);
insert into r4 (id) values (NULL), (NULL), (NULL), (NULL);
Presuming the bold A in each schema means it is a key; and presuming the Foreign Key constraint holds -- that is, the
Avalue for every row inRdoes correspond to anAvalue inS:Rnaturally joins to a row inSonA.Sthat don't join toR(because there's no Foreign Key constraint to enforce that).R, answer 1.Is there are real-life use for a schema like this? Consider
Sis Customer Name inC, keyed by Customer number inA.Rholds date of birth inB, also keyed by Customer number inA. Every Customer must have a name; it's true every Customer (person) must have a d.o.b., but we don't need to record that unless/until they purchase age-restricted items.