I have a tricky problem and I'm wondering if there's a clever regex solution. I have input data that consists of two columns, but the first column needs to be split into multiple lines with the second column intact. For example, a file called test:
cat_;_dog_;_rat animal
chair_;_desk object
The output needs to look like this:
cat animal
dog animal
rat animal
chair object
desk object
There can be an arbitrary number of _;_ separators on each line. There is probably a way to do this in a one-liner, which I would prefer since I'm piping the data in and out. I tried this:
perl -pe 's/(\w+)_;_(\w+)\t(.+)/$1\t$3\n$2\t$3/g' test
The first column has words (\w+) delimited by _;_, then a tab, and then the second column. But this only consumes one iteration of the data:
cat animal
dog_;_rat animal
chair object
desk object
I also tried the following, in case the /g (global) flag wasn't doing what I expected:
perl -pe 's/(\w+)(_;_(\w+))+\t(.+)/$1\t$4\n$3\t$4/g' test
It still only goes one round. Who's got some ideas?
-n reads the input line by line and runs the code for each line;
-l removes newlines from input and adds them to output;
-a splits each input line on whitespace into the @F array.
The code splits the first field ($F[0]) on _;_, and for each value, it prints it ($_) followed by the second column.