Power Query - replace multiple substrings in a string from a list

5.2k Views Asked by At

I have a column with phrases like, which I need to clean from chars like ;,:! etc, by replacing them with a space (to avoid concatenation of words).

EXAMPLE OF INPUT AND OUTPUT:

"bla.blabla X&!:blablabla" needs to become "bla blabla X blablabla"

And I am not finding an easy simple solution!

source = data,  //which has Oldcol column with data to clean
CharsToRemove = { ".", "'", ":", ";", "!", "?", "<", ">", "{", "}", "[", "]", "'", "`", "^", """", "&", "-","/", "\", "|","(", ")", "," , "=", "+"}, 
//line below is where I'd like to get to, it is not working
Newcol = Table.AddColumn(source , "NEWCOL", each 
             Text.Replace([Oldcol], CharsToRemove , " ") ) 

i'm looking for the last line to work, basically.

The option of nested replace statements works but it's bulky since the actual list of chars to replace is long, and it happens in numerous parts of the code. I can turn it into the function, but I would love to know if there is a way to avoid all that, and just have a smart line-line that does that job?

thank you in advance

2

There are 2 best solutions below

4
On BEST ANSWER

For multiple text replacements, you can use List.Accumulate, e.g.

let
    CharsToRemove = Text.ToList(".':;!?<>{}[]'`^""&-/\|(),=+"),
    Source = #table({"Oldcol"},{{"bla.blabla X&!:blablabla"}}),
    Cleaned = Table.TransformColumns(Source,{{"Oldcol", each List.Accumulate(CharsToRemove,_,(String,Remove) => Text.Replace(String,Remove," "))}})
in
    Cleaned
1
On

For multiple columns, you can use Table.ReplaceValue.

let
    CharsToRemove = Text.ToList(".':;!?<>{}[]'`^""&-/\|(),=+"),
    Source = #table({"Oldcol1", "Oldcol2"},{{"bla.blabla X&!:blablabla","bla.blabla X&!:blablabla"}}),
    Cleaned = List.Accumulate(CharsToRemove,Source,(Table,Remove) => Table.ReplaceValue(Table,Remove," ",Replacer.ReplaceText,{"Oldcol1","Oldcol2"}))
in
    Cleaned