I found a case in ER where for the life of me I can't figure out how to achieve referential integrity. The classical Employee, Manager, Department relationship can illustrate this problem.
With the following constraints:
- Employee can work in only one Department.
- Department can have many Employees.
- Employee can have one Manager working in the same Department.
- Manager can have many Employees working in the same Department.
- Employee that doesn't have a Manager is a Manager.
This diagram illustrates the concept.
Before normalisation I end up with the following table.
After normalisation I end up with these tables.
However, there is still nothing stopping me from accidentally assigning a manager working in one department to an employee working in a different department in the EmployeeManager
table.
One possible solution that I found was to put Department into the EmployeeManager
table and define a reference integrity constraint so that {Manager, Department}
refers {Employee, Department}
in the EmployeeDepartment
table.
However, for this to work doesn't {Manager, Department}
have to be a candidate key? Is there a different design that can solve this?
Update
Ok to answer my first question, doesn't {Manager, Department}
have to be a candidate key? It turns out that the {Manager, Department}
in the EmployeeManager
table doesn't have to be a candidate key or a unique key. It simply has to be a foreign key referencing the {Employee, Department}
in the EmployeeDepartment
table. The uniqueness of {Employee, Department}
key isn't well defined and may differ between different engines. MySQL for example advises that the foreign keys reference only unique keys.
Additionally, MySQL requires that the referenced columns be indexed for performance reasons. However, the system does not enforce a requirement that the referenced columns be UNIQUE or be declared NOT NULL. The handling of foreign key references to nonunique keys or keys that contain NULL values is not well defined for operations such as UPDATE or DELETE CASCADE. You are advised to use foreign keys that reference only UNIQUE (including PRIMARY) and NOT NULL keys.
In my case it will work because Employee can only work in one Department however if the constraint chances to allow Employees work in many Departments it won't work because {Employee, Department}
will no longer be unique.
It should work in all cases including if the constraint chances to allow Employees work in many Departments.
Is there a different design that can solve this? I also thought about replacing EmployeeDepartment
with ManagerDepartment
table with {Manager}
as a primary key and going back to a previous EmployeeManager
table with (Employee, Manager)
columns. So now to find out which Department an Employee works you need to join EmployeeManager
with ManagerDepartment
table.
Do you see any bad practises or anomalies with this design?
Assuming all these columns are declared NOT NULL . . .
Yes, add a column for "department" to the "EmployeeManager" table. But you need two foreign key constraints that overlap. (But see below . . .)
Since EmployeeDepartment.Employee is unique, the pair of columns EmployeeDepartment.Employee and EmployeeDepartment.Department is also unique. So you can declare "Employee" as a primary key, and also declare a unique constraint on the pair of columns (Employee, Department). Should the requirements change and allow employees to work in multiple departments, you can drop the single-column primary key. I would probably drop both the primary key and unique constraints, and create a new primary key constraint that included both columns, but all that's strictly necessary is to drop the primary key constraint.
In systems like yours, it's usually a good idea to have a table of managers, with the obvious foreign key references. Right now, if you delete the employee Will, you lose the fact that Steve is a manager.