With Singer taps and targets, why are columns created in the target even if they are deselected?

When using Singer taps and targets, we sometimes see the target creating columns for fields that are explicitly filtered out (deselected) in the tap.

Why does this happen and how can we resolve the issue?

A little background info

By way of background, Singer taps and targets communicate with each other using SCHEMA and RECORD message types.

SCHEMA messages are sent from the tap to the target first, and they tell the target what kind of tables need to be created. They allow the target to prepare the destination platform (if necessary) for the data which will arrive.

RECORD messages arrive after the SCHEMA message, and they contain the actual data.
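To make the two message types concrete, here is a minimal sketch of what a tap might write to stdout, one JSON object per line. The `users` stream and its field names are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical "users" stream; the stream and field names are assumptions.
schema_message = {
    "type": "SCHEMA",
    "stream": "users",
    "key_properties": ["id"],
    "schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "email": {"type": "string"},
        },
    },
}

record_message = {
    "type": "RECORD",
    "stream": "users",
    "record": {"id": 1, "email": "a@example.com"},
}


def emit(messages):
    """Serialize messages as one JSON object per line, the way a tap writes them."""
    return "\n".join(json.dumps(message) for message in messages)


# The SCHEMA message is emitted before any RECORD message for the stream.
output = emit([schema_message, record_message])
print(output)
```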

What's happening here

This symptom (columns being created even when the corresponding fields are deselected) occurs when the tap does not filter its SCHEMA messages and instead passes the schema through unmodified from the source's data catalog. Ideally, SCHEMA messages should be filtered using the same selection logic that is applied to RECORD messages, but this is not always the case.

Then, because the SCHEMA messages arrive before the RECORD messages, the target will go ahead and create a destination column for all fields, even those which are not going to have data when RECORD messages arrive.
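The effect can be reproduced with a toy target. This is only a sketch: `run_toy_target` is a hypothetical stand-in for a real target, and the `users` stream with a deselected `ssn` field is an assumed example. The tap here filters its RECORD messages but sends the full, unfiltered schema.

```python
def run_toy_target(messages):
    """Toy target: 'creates' columns on SCHEMA and tracks which ones receive data."""
    created, populated = {}, {}
    for msg in messages:
        stream = msg["stream"]
        if msg["type"] == "SCHEMA":
            # Columns are created from the schema alone, before any data arrives.
            created[stream] = set(msg["schema"]["properties"])
            populated.setdefault(stream, set())
        elif msg["type"] == "RECORD":
            populated[stream].update(msg["record"].keys())
    return created, populated


# Hypothetical tap output: "ssn" is deselected, so it is absent from the
# record, but the SCHEMA message still lists it.
messages = [
    {
        "type": "SCHEMA",
        "stream": "users",
        "key_properties": ["id"],
        "schema": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "email": {"type": "string"},
                "ssn": {"type": "string"},
            },
        },
    },
    {"type": "RECORD", "stream": "users", "record": {"id": 1, "email": "a@example.com"}},
]

created, populated = run_toy_target(messages)
empty_columns = sorted(created["users"] - populated["users"])
print(empty_columns)  # ['ssn']
```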

How to fix it

The most direct fix is for the tap developer to add filtering logic to SCHEMA messages, just as they have for RECORD messages. Most tap maintainers will accept an Issue or Pull Request on this topic. If the tap is built on Meltano's SDK, then SCHEMA messages will automatically be filtered along with RECORD messages. So another option is for the maintainer to port the tap to the SDK, or for the user to migrate to a variant of the tap that already uses the SDK.
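As a rough illustration of the tap-side fix, the sketch below applies one selection set to both the schema and the records. It is not the SDK's implementation: `filter_schema`, `filter_record`, and the `selected` set are hypothetical names, only top-level properties are handled, and a real tap would derive the selection from its catalog rather than hard-code it.

```python
import copy


def filter_schema(schema, selected_fields):
    """Return a copy of the schema keeping only the selected top-level properties."""
    filtered = copy.deepcopy(schema)
    filtered["properties"] = {
        name: spec
        for name, spec in filtered["properties"].items()
        if name in selected_fields
    }
    return filtered


def filter_record(record, selected_fields):
    """Return a copy of the record keeping only the selected fields."""
    return {name: value for name, value in record.items() if name in selected_fields}


# Hypothetical selection: "ssn" is deselected.
selected = {"id", "email"}

full_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "ssn": {"type": "string"},
    },
}
full_record = {"id": 1, "email": "a@example.com", "ssn": "000-00-0000"}

# Apply the same selection to both message payloads before emitting them.
schema_message = {
    "type": "SCHEMA",
    "stream": "users",
    "key_properties": ["id"],
    "schema": filter_schema(full_schema, selected),
}
record_message = {
    "type": "RECORD",
    "stream": "users",
    "record": filter_record(full_record, selected),
}

print(sorted(schema_message["schema"]["properties"]))  # ['email', 'id']
```

With the schema filtered this way, the target never sees the deselected field and so has nothing to create a column for.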


Full disclosure: I work for Meltano and I work on Meltano's SDK for Singer Taps and Targets (https://sdk.meltano.com). I am also the author of several taps and targets.