BCNF normalization

383 Views Asked by At

Could you please provide me with an article that gives an example of a DB design that is in 3NF but not in BCNF and then illustrates how to convert it to BCNF? All the articles that I saw which try to explain BCNF give examples of tables that are in 1NF and then convert them to BCNF. This doesn't let me see the difference between 3NF and BCNF.

Thanks in advance

1

There are 1 best solutions below

0
On

An example with overlapping keys reveals the difference; having the predicate [P] and matching constraints (c x.y).

[P] Employee EMP, with email EMAIL, took course CRS in year YR.

(c 1.1) For each employee and course; that employee took that course at most once; it is possible that more than one employee took that course.

(c 1.2) For each each employee and course; that employee took that course in exactly one year.

(c 1.3) For each employee and year; it is possible that the employee took more than one course in that year.

(c 1.4) For each course and year; it is possible that more than one employee took that course in that year.

(c 2.1) For each employee, that employee has exactly one email.

(c 2.2) For each email, exactly one employee has that email.

(c 3.1) For each email and course; employee with that email took that course at most once; it is possible that more than one employee with that email took that course.

(c 3.2) For each each email and course; employee with that email took that course in exactly one year.

(c 3.3) For each email and year; it is possible that employee with that email took more than one course in that year.

(c 3.4) For each course and year; it is possible that more than one employee with specific email took that course in that year.

Note how verbalizing constraints intuitively reveals the problem. See how constraints c 3.x match (repeat) c1.x due to c 2.x.


R {EMP, EMAIL, CRS, YR}

KEY {EMP,   CRS}
KEY {EMAIL, CRS}

The FDs for this are

FD {EMP,  CRS} --> {YR}

FD {EMAIL, CRS} --> {YR}

FD {EMP} --> {EMAIL}

FD {EMAIL} --> {EMP}

So, considering each one of these as FD X --> Y it holds that either,

  • X is a superkey,
  • Y is a subkey.

Therefore the R is in the 3rd NF.

For BCNF, the requirement is that for any nontrivial FD X --> Y in R, X is a superkey.

Here is a check-list for 2NF to BCNF

---------------------------------------
For each nontrivial |        NF
    FD X --> Y      |
at least one holds  | 2nd   3rd   BCNF
---------------------------------------
X is a superkey        ✔     ✔     ✔

Y is a subkey          ✔     ✔

X is not a subkey      ✔
---------------------------------------

FD X --> Y is trivial iff Y ⊆ X

Now we could decompose R into:

{EMP, EMAIL} {EMP, CRS, YR}

OR

{EMP, EMAIL} {EMAIL,  CRS, YR}

which eliminates those two FDs to subkeys.


And finally, note that after decomposition into {EMP, EMAIL} {EMP, CRS, YR} or into {EMP, EMAIL} {EMAIL, CRS, YR} these tables are now all in 5NF -- actually in 6NF, but that's not important now. It is important to observe that it is possible to get into 5NF -- and hence into: (4, BCNF, ..., 1) -- just by using logic, verbalizing predicate and constraints. In other words, for a developer:

your tables can be in high NF even if you have no idea what all this terminology means.