How to handle data vault hubs with no business key?

1.1k Views Asked by At

We have a project for loading data from and external source into a Data Vault Data Warehouse. The data are salary statements between and employer and an employee.

When starting to modelling this we find two business key the company id of the employer and the social security number (SSN) of the employee. Based on we get two hubs one for the employer and one for the employee. When adding a link between these two hubs we noticed that as there may (will) be more that one salary statement for each combination of employer and employee. This means we can't model this relationship with two hubs and one link.

Logically this could be handled by adding a third salary statement hub. Then we could have a link for all these three hubs. Our problem is that we don't have any business key for the salary statement!

My only thought as a workaround is to generate an artificial business key for the salary statement using company id, SSN and period of the salary statement. This don't really feel right to generate a business key in the Data Warehouse but do we have any other options? Could this maybe be modeled differently with Data Vault?

Any thoughts and ideas highly appreciated.

1

There are 1 best solutions below

0
On

What you've noticed here is a situation where Data Vault gets really difficult.

You have a situation where each data object, don't have a business key.

The Data Vault architecture needs business keys.

You generally have 4 options.

  1. Having a business object (in this case, a salary statements) without a business key is an anti-pattern. Convince the developer of the salary system to deliver a business key or unique transaction number for each salary statement.

  2. Create a composite key, like you mentioned. The biggest issue with this approach is: can you be sure that the composite key always is unique? Let's say you use company id, SSN and period. What if a mistake was made in the salary system and they had to make an extra payment in the same period? In this situation you would have 2 rows for the same composite key (company id, SSN and period).

  3. Create your own business key. Write a small program that takes the data from the salary system, and adds its own business key. This could be as simple as a database table with a primary key, and then use that primary key as a business key.

  4. Don't use Data Vault for this object. If an object don't fit in Data Vault, or if there is another structure that fits the data better, then use that.