We are looking to see if there is a tool within the Foundry platform that will let us maintain a list of field descriptions and have those descriptions populated automatically when the dataset builds. Does this exist, and if so, what is the tool called?


There is 1 best solution below


This feature is available once you upgrade your Code Repository to version 1.184.0 or later.

To push output column descriptions, pass a new optional argument, column_descriptions, to your TransformOutput.write_dataframe() call.

This argument should be a dict mapping column names to column descriptions, each up to 200 characters long (a limit imposed for stability reasons).

The code will automatically compute the intersection of the column names available on your pyspark.sql.DataFrame and the keys in the dict you provide, so it won't try to put descriptions on columns that don't exist.

A transform using this argument looks like this:

from transforms.api import transform, Input, Output


@transform(
    my_output=Output("/my/output"),
    my_input=Input("/my/input"),
)
def my_compute_function(my_input, my_output):
    my_output.write_dataframe(
        my_input.dataframe(),
        # Attach a description to col_1 on the output dataset's schema
        column_descriptions={
            "col_1": "col 1 description"
        }
    )
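
If the goal is to keep one centralised list of field descriptions and reuse it across transforms, a minimal sketch of one approach follows. The FIELD_DESCRIPTIONS constant and its contents are assumed names for illustration, not part of the Foundry API; the sketch simply relies on the intersection behaviour described above, so the same dict can be passed to every output unchanged.

from transforms.api import transform, Input, Output

# Assumed: a shared catalogue of field descriptions maintained in one place,
# e.g. a small Python module inside the repository. Only entries whose keys
# match columns on the output DataFrame are applied; the rest are ignored.
FIELD_DESCRIPTIONS = {
    "col_1": "col 1 description",
    "col_2": "col 2 description",
    "col_3": "description for a column this particular output may not have",
}


@transform(
    my_output=Output("/my/output"),
    my_input=Input("/my/input"),
)
def my_compute_function(my_input, my_output):
    my_output.write_dataframe(
        my_input.dataframe(),
        # The full shared dict can be passed as-is; descriptions for columns
        # not present on the DataFrame are simply not applied.
        column_descriptions=FIELD_DESCRIPTIONS,
    )

Keep each description under the 200-character limit mentioned above, since longer values are not accepted.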