Substitute newlines with a string with awk


I need to parse stdin in the following way:

(1) all newline characters must be substituted with \n (a literal \ followed by n)

(2) nothing else in the input should be changed

I chose awk to do it, and I would like an answer that uses awk if possible.

I came up with:

echo -ne "A\nB\nC" | awk '{a[NR]=$0;} END{for(i=1;i<NR;i++){printf "%s\\n",a[i];};printf "%s",a[NR];}'

But it looks cumbersome.

Is there a better / cleaner way?


There are 5 best solutions below

1
Cyrus On BEST ANSWER

With awk:

echo -ne "A\nB\nC" | awk 'BEGIN{FS="\n"; OFS="\\n"; RS=ORS=""} {$1=$1}1'

Output:

A\nB\nC

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

0
Fravadona On
  • Handling malformed files (i.e., files that don't end with the record separator) with awk is tricky.

  • sed -z is GNU-specific and has the side effect of slurping the whole (text) file into RAM, which might be an issue for huge files.

Thus, for a robust and reasonably portable solution I would use perl:

perl -pe 's/\n/\\n/'
0
Ed Morton On

Using GNU awk for multi-char RS:

$ echo -ne "A\nB\n\nC" | awk -v RS='^$' -v ORS= -F'\n' -v OFS='\\n' '{$1=$1} 1'
A\nB\n\nC$
0
Daweo On

I would use GNU AWK for this task in the following way:

echo -ne "A\nB\nC" | awk '{printf "%s%s",$0,RT?"\\n":""}'

gives output

A\nB\nC

(without trailing newline)

Explanation: the output string is built from the current line's content ($0) followed by either a backslash and n or an empty string, depending on RT, which is the record terminator for the current record. RT is a newline for every line except the last, and an empty string for the last line when the input does not end with a newline, so in a boolean context it is true for all but the last line. I used the so-called ternary operator here: condition ? valueiftrue : valueiffalse.

(tested in GNU Awk 5.0.1)
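
For awk implementations that lack RT, the same idea can be sketched portably by printing the separator before every line except the first. This is only a minimal sketch: the function name nl_to_literal is hypothetical, and because POSIX awk cannot tell whether the last line ended with a newline, a trailing newline in the input is dropped rather than converted.

```shell
# Hypothetical helper: joins the lines of stdin with a literal backslash-n.
# Assumption: losing a trailing newline at end of input is acceptable.
nl_to_literal() {
  # Before every record except the first, print a literal \n;
  # then print the record itself without a terminating newline.
  awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

printf 'A\nB\n\nC' | nl_to_literal
```

Blank lines are preserved because every record boundary, including the one around an empty record, produces exactly one separator.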

0
RARE Kpop Manifesto On

This should solve the problem of blank lines in between:

gecho -ne "A\nB\n\nC" | 
{m,g,n}awk 'BEGIN {  RS = "^$" ; FS = "\n" 
                    ORS =  "" ; OFS = "\\n" } NF = NF' | gcat -b
     1  A\nB\n\nC%   

A gawk-specific way via RT:

 gawk 'BEGIN { _ = ""; ORS =__= "\\n" } (ORS = RT ? __ : _)^_'
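
The one-liner above is quite terse, so here is a more readable sketch of what it appears to do: set ORS per record from RT, then print the record. The function name nl_to_literal_gawk is hypothetical, and the sketch assumes gawk is installed, since RT is a gawk extension.

```shell
# Hypothetical helper; requires gawk because RT is gawk-specific.
# Usage: printf 'A\nB\n\nC' | nl_to_literal_gawk
nl_to_literal_gawk() {
  # RT holds the text that terminated the current record:
  # a newline for terminated lines, empty for an unterminated last line.
  # ORS becomes a literal backslash-n only when a real newline was read.
  gawk '{ ORS = (RT != "" ? "\\n" : "") } 1'
}
```

Unlike a join-based approach, this also converts a trailing newline at the end of the input, because the decision is made per record from RT.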