Concat all columns to one string in a mapping data flow


I want to concatenate every column of a table into one string, separated by a delimiter. In PySpark this would look like this:

concat_ws("#?", *columns)
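For reference, the semantics of Spark's `concat_ws` (cast each non-null value to a string, join with the separator, skip nulls entirely) can be sketched in plain Python; this is an illustration of the behavior, not Spark's actual implementation:

```python
def concat_ws(sep, *values):
    # Mimics Spark's concat_ws: cast each non-null value to a string
    # and join with the separator; nulls are skipped, not rendered.
    return sep.join(str(v) for v in values if v is not None)

print(concat_ws("#?", 1, "a", None, 2.5))  # -> 1#?a#?2.5
```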

What would be the equivalent in data flow expression language?

concatWS('#?', columns())

won't work. I get the following error:

"Error: Function 'concatWS' argument 3 should be StringType but was BinaryType"

Spark apparently converts every column to a string implicitly.

I even tried this, but it does not work either:

concatWS('#?', toString(byNames(columnNames())))

Thanks, Jan


Accepted answer by JayashankarGS:

Your expression is correct and works fine: concatWS('#?', columns())

If you are still not getting the expected results, you can use one of the alternate expressions below.

concatWS('#?', toString(columns()))

Or

rtrim(toString(reduce(map(array(columns()),toString(#item)+'#?'),'', #acc + toString(#item), #result)),'#?')
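The map/reduce expression above appends the separator to each column's string form, folds the pieces together, and trims the trailing separator. A plain-Python sketch of that logic (hypothetical helper, for illustration only):

```python
from functools import reduce

def concat_columns(values, sep="#?"):
    # Mirrors the data-flow expression: map each value to its string
    # form plus the separator, fold the pieces into one string, then
    # trim the trailing separator (the rtrim step).
    mapped = [str(v) + sep for v in values]          # map(..., toString(#item)+'#?')
    joined = reduce(lambda acc, item: acc + item, mapped, "")  # reduce with #acc + #item
    # Note: str.rstrip strips a set of characters, which works here
    # because the separator characters do not appear at the end of data.
    return joined.rstrip(sep)

print(concat_columns([1, "a", 2.5]))  # -> 1#?a#?2.5
```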

Below are the columns I am working with:

[Screenshot: sample input columns]

Output:

[Screenshot: concatenated output string]