I have a set of elements. There are various ways I would like to partition the set, in the mathematical sense. A partition is defined as a collection of subsets such that every element is in exactly one subset.
I want to represent this situation in a relational database, and I want the relational constraints to enforce the mathematical definition of a partition.
The closest I have gotten is a schema like this:
CREATE TABLE element (
id integer PRIMARY KEY
);
CREATE TABLE partition (
id integer PRIMARY KEY
);
CREATE TABLE subset (
id integer PRIMARY KEY,
partition_id integer REFERENCES partition (id)
);
CREATE TABLE element_subset_mapping (
element_id integer REFERENCES element (id) NOT NULL,
partition_id integer REFERENCES partition (id) NOT NULL,
subset_id integer REFERENCES subset (id) NOT NULL,
CONSTRAINT uq_element_partition UNIQUE (element_id, partition_id)
);
This represents correctly that an element can only occur in one subset for a given partition. But there are at least a couple problems:
- It does not enforce that, within each partition, every element belongs to some subset. I suppose I could take the semantics that a missing (element_id, partition_id) row indicates that the element belongs to a single element subset, but I would prefer to be explicit.
- It does not enforce the requirement that a subset only belongs to one partition. That is, I could have a subset with (id, partition_id) = (22, 37), but still have a element_subset_mapping with (element_id, partition_id, subset_id) = (4005, 42, 22). This should not be allowed because subset 22 belongs to partition 37, not to partition 42.
It feels like there should be a canonical way to do this, but my searches have not yielded any answers.
The requirement that a subset belong to no more than one partition is already being enforced by the primary key constraint on
subset; however, there is no enforcement that the combination ofsubset_idandpartition_idinelement_subset_mappingreference a corresponding pair insubset. The following DDL addresses these issues by making the combination of subset and partition unique insubsetand combining the foreign keys onsubset_idandpartition_idinelement_subset_mappinginto a single foreign key referencingsubset(id, partition_id):The requirement that each element belong to exactly one subset within each partition is a more complex constraint to enforce because it involves rows from multiple tables. Let's begin by creating a trigger function that raises an exception if any elements are missing from any partitions:
This function can then be used to define constraint triggers on each of the tables that could cause the constraint to be violated:
element,partition, andelement_subset_mapping.An important aspect of these triggers is that they are
DEFERRED, which means that they fire when the current transaction is committed instead of as each row or statement is executed. This allows a set of SQL commands to be executed within a transaction to achieve a state that is consistent with the relational requirements.A requirement that wasn't mentioned in the original post is that all subsets must be non-empty. The following SQL addresses this using the same approach that was used to ensure that all elements are included in every partition: